
Cezary Mazurek, Marcin Werla, Poznań Supercomputing and Networking Center (Poznań, Poland), 2009-09-30, ECDL


1 Cezary Mazurek (mazurek@man.poznan.pl), Marcin Werla (mwerla@man.poznan.pl)
Poznań Supercomputing and Networking Center (Poznań, Poland)
2009-09-30, ECDL 2009, Corfu, Greece


3 Main organizational models
- Regional digital libraries
  - Created and maintained by several institutions from a particular region
  - Gather mostly resources related to the region, its history and culture, but also academic educational materials and national cultural heritage
- Institutional digital libraries
  - Created and maintained by single institutions (such as universities)
  - Gather mostly resources related to the present activities of the institution (as institutional repositories do) and to its history
- In many cases the technical base and support for digital libraries is provided by local computing or networking centres (such as PSNC)

4 [Chart legend: regional digital libraries, institutional digital libraries]
- Overall number of digital objects: 285 thousand
- Number of active digital libraries: 19 regional, 21 institutional
- Number of cooperating institutions: several hundred libraries, museums and archives
- Plus several other digital libraries in the phase of planning, configuration or initial content uploading

5 Main aims
- To facilitate the use of resources from Polish digital libraries
- To increase the visibility of these resources on the Internet
- To create new, advanced network services, based on these resources, for both end users and creators of digital libraries

6 Basic assumptions
- No need or requirement to move resources to the Digital Libraries Federation (DLF)
- No fees for using the DLF or for being a part of it
- Open standards are the basis for cooperation
- Particular digital libraries can use different technological platforms

7 Basic functions
- Search in the available publications
  - Simple
  - Advanced
- Digitization plans
  - Searchable
  - Report API for the prevention of duplicated digitization
- Location of digital objects on the basis of their OAI identifiers
- Database of Polish digital libraries
- Statistics and reports
- Information in the DLF is updated daily (nightly)

8 See it: http://fbc.pionier.net.pl/

9 Digital Libraries Federation search plugin

10 [Diagram labels]
- Institutions: libraries, archives, museums, …
- Digital libraries: institutional, regional, national (exclude??), other
- Metadata aggregator: Digital Libraries Federation

11 We gather information about content providers and their information systems
- Database of Polish Digital Libraries in the DLF

12 We gather the metadata of objects that should be visible in Europeana
- Done with OAI-PMH
  - In most cases we require an OAI-PMH interface
  - In really special cases we can do it in a different way (e.g. Polish Internet Library)
- Currently we harvest only simple Dublin Core
- Work on a new national metadata schema started in September 2009
  - Approximate time of development: 3 months
  - Approximate time of deployment: ???
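As an illustration of this harvesting step, here is a minimal Python sketch of an OAI-PMH `ListRecords` request for simple Dublin Core (`oai_dc`) and of parsing the response. It is not the DLF's own code. The base URL, the function names and the sample response are hypothetical; only the `verb` and `metadataPrefix` parameters and the XML namespaces come from the OAI-PMH and Dublin Core specifications. Resumption tokens (paging) and error responses are not handled.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL (base_url is a placeholder)."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def parse_records(xml_text):
    """Return a list of (oai_identifier, {dc_element: [values]}) pairs."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        fields = {}
        for el in record.iter():
            # Keep only elements from the Dublin Core namespace.
            if el.tag.startswith(f"{{{DC_NS}}}") and el.text:
                name = el.tag[len(DC_NS) + 2:]  # strip the "{namespace}" prefix
                fields.setdefault(name, []).append(el.text.strip())
        records.append((identifier, fields))
    return records


# Hypothetical response, shortened to a single record.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example title</dc:title>
          <dc:type>czasopismo</dc:type>
          <dc:language>pol</dc:language>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://example.org/oai"))
    print(parse_records(SAMPLE))
```

A real harvester would fetch the URL over HTTP and keep requesting with the returned `resumptionToken` until the repository reports no more records.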

13 We will try to clean up, normalize and enrich the metadata
- At the DLF level, dictionaries are built automatically from the aggregated metadata
  - Separately for each metadata element
  - Separately for each metadata language
- Differences between the metadata from various digital libraries have a negative impact on the search possibilities of end users
  - That is why metadata normalization is so important
- A basic analysis shows which elements are crucial and which should be easy to clean up
  - The analysis was done in April 2009 on the metadata of 214 254 aggregated objects
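A minimal Python sketch of how such dictionaries could be built, assuming the aggregated metadata is available as (element, language, value) triples. That input shape, the function names and the sample values are illustrative assumptions, not the DLF implementation.

```python
from collections import Counter, defaultdict


def build_dictionaries(triples):
    """Count value usage separately for each (element, language) pair.

    triples: iterable of (element, language, value) tuples (assumed shape).
    Returns {(element, language): Counter mapping value -> number of uses}.
    """
    dictionaries = defaultdict(Counter)
    for element, language, value in triples:
        dictionaries[(element, language)][value] += 1
    return dictionaries


def average_uses(counter):
    """Average number of uses per unique value in one dictionary."""
    return sum(counter.values()) / len(counter)


if __name__ == "__main__":
    sample = [
        ("type", "pl", "gazeta"),
        ("type", "pl", "gazeta"),
        ("type", "pl", "Gazeta"),
        ("type", "en", "newspaper"),
    ]
    dictionaries = build_dictionaries(sample)
    for key, counter in dictionaries.items():
        print(key, dict(counter), average_uses(counter))
```

A low average number of uses per value suggests an element with many spelling variants or free-text entries, which is harder to clean up.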

14
DC element | Number of unique values | How many times values were used in metadata | Average number of uses per value
format | 39 | 209 789 | 5 379,2
language | 195 | 210 529 | 1 079,6
type | 822 | 211 816 | 257,7
rights | 1 192 | 246 093 | 206,5
coverage | 66 | 2 390 | 36,2
publisher | 18 002 | 310 764 | 17,3
contributor | 12 979 | 83 464 | 6,4
subject | 78 440 | 438 871 | 5,6
relation | 9 292 | 48 319 | 5,2
date | 47 581 | 209 589 | 4,4
identifier | 6 426 | 27 666 | 4,3
description | 43 657 | 180 391 | 4,1
source | 16 996 | 52 506 | 3,1
creator | 21 908 | 67 503 | 3,1
title | 210 745 | 227 039 | 1,1
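The last column appears to be the total number of uses divided by the number of unique values. The short Python check below recomputes it for three rows copied from the table; the function name is illustrative, and rounding to one decimal place is an assumption that matches the printed figures.

```python
# (unique values, total uses) copied from three rows of the table.
ROWS = {
    "format": (39, 209789),
    "language": (195, 210529),
    "title": (210745, 227039),
}


def uses_per_value(unique_values, total_uses):
    """Average number of uses per unique value, rounded to one decimal."""
    return round(total_uses / unique_values, 1)


if __name__ == "__main__":
    for element, (unique, uses) in ROWS.items():
        print(element, uses_per_value(unique, uses))
```

High ratios (format, language, type, rights) point to small, repetitive vocabularies, which are the natural first candidates for normalization; a ratio close to 1 (title) means almost every value is unique.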

15
- Format
  - In 99% of descriptions: MIME type (e.g. text/html, image/x.djvu)
- Language
  - In most cases: ISO 639-2 (pol, ger, lat, fre etc.)
  - Sometimes a single value "pol, ger" instead of two separate values "pol" and "ger"
- Rights
  - Name of the institution which holds the original object
- Type
  - …
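The combined language values mentioned above could be split as in the following Python sketch. The separator set (comma, semicolon, slash, whitespace) and the lower-casing are assumptions made for illustration; the slides do not say how the DLF handles these cases.

```python
import re


def split_language_values(values):
    """Split combined entries such as "pol, ger" into separate codes.

    Keeps the first-seen order and drops duplicates. The separators and the
    lower-casing are illustrative assumptions.
    """
    codes = []
    for value in values:
        for code in re.split(r"[,;/\s]+", value.strip()):
            code = code.lower()
            if code and code not in codes:
                codes.append(code)
    return codes


if __name__ == "__main__":
    print(split_language_values(["pol, ger"]))
    print(split_language_values(["POL; lat", "pol"]))
```

The result could then be checked against the list of ISO 639-2 codes to flag values that are still not valid after splitting.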

16 Values for Type (top 20)
Value | Number of objects with the value | % of aggregated objects | % of aggregated objects (after clean-up)
czasopismo | 44 709 | 20,9% | 33,8%
gazeta | 32 921 | 15,4% | 31,3%
gazety | 23 119 | 10,8% |
Czasopismo | 20 965 | 9,8% |
książka | 12 503 | 5,8% |
Gazeta | 11 098 | 5,2% |
pocztówka | 5 768 | 2,7% |
czasopisma | 4 962 | 2,3% |
text | 4 452 | 2,1% |
grafika | 3 863 | 1,8% |
fotografia | 3 596 | 1,7% |
artykuł z czasopisma | 3 164 | 1,5% | 2,6%
artykuł | 2 455 | 1,1% |
Czasopisma | 1 710 | 0,8% |
dzienniki urzędowe | 1 516 | 0,7% |
stary druk | 1 222 | 0,6% | 1,1%
starodruk | 1 221 | 0,6% |
rysunek | 1 094 | 0,5% |
rękopis | 1 062 | 0,5% |
mapa | 1 028 | 0,5% |
Sum | | 85,1% | 68,9%
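The "after clean-up" figures can be reproduced by merging case and number variants of the same term. The Python sketch below does this for the affected rows. The variant mapping is inferred from the percentages on the slide (for example, "artykuł" is merged into "artykuł z czasopisma" because only that yields 2,6%), and the denominator of 214 254 objects is taken from the April 2009 analysis; both are assumptions, not a documented DLF procedure.

```python
from collections import Counter

# Counts copied from the slide (only rows affected by the clean-up).
TYPE_COUNTS = {
    "czasopismo": 44709, "Czasopismo": 20965, "czasopisma": 4962, "Czasopisma": 1710,
    "gazeta": 32921, "gazety": 23119, "Gazeta": 11098,
    "artykuł z czasopisma": 3164, "artykuł": 2455,
    "stary druk": 1222, "starodruk": 1221,
}

# Assumed mapping of lower-cased variants to one canonical form.
VARIANTS = {
    "czasopisma": "czasopismo",
    "gazety": "gazeta",
    "artykuł": "artykuł z czasopisma",
    "starodruk": "stary druk",
}

TOTAL_OBJECTS = 214254  # assumed denominator (aggregated objects, April 2009)


def canonical(value):
    """Lower-case the value and map known variants to their canonical form."""
    key = value.strip().lower()
    return VARIANTS.get(key, key)


def merge_counts(counts):
    """Sum the counts of all variants under their canonical value."""
    merged = Counter()
    for value, n in counts.items():
        merged[canonical(value)] += n
    return merged


def share(n):
    """Percentage of aggregated objects, rounded to one decimal."""
    return round(100 * n / TOTAL_OBJECTS, 1)


if __name__ == "__main__":
    for value, n in merge_counts(TYPE_COUNTS).most_common():
        print(value, n, share(n))
```

Under these assumptions the merged shares come out as 33.8, 31.3, 2.6 and 1.1 percent, matching the last column of the table.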

17
DC element | Number of unique values | How many times values were used in metadata | Average number of uses per value
format | 39 | 209 789 | 5 379,2
language | 195 | 210 529 | 1 079,6
type | 822 | 211 816 | 257,7
rights | 1 192 | 246 093 | 206,5
coverage | 66 | 2 390 | 36,2
publisher | 18 002 | 310 764 | 17,3
contributor | 12 979 | 83 464 | 6,4
subject | 78 440 | 438 871 | 5,6
relation | 9 292 | 48 319 | 5,2
date | 47 581 | 209 589 | 4,4
identifier | 6 426 | 27 666 | 4,3
description | 43 657 | 180 391 | 4,1
source | 16 996 | 52 506 | 3,1
creator | 21 908 | 67 503 | 3,1
title | 210 745 | 227 039 | 1,1

18 (Polish version of objects description)
Value | No. of associations | % of all associations
gazety regionalne | 12214 | 2,56%
czasopisma | 7716 | 1,62%
prasa polska | 5424 | 1,14%
czasopisma niemieckie | 5009 | 1,05%
gazety sublokalne | 4968 | 1,04%
Grodków | 4962 | 1,04%
Grottkau | 4961 | 1,04%
Wielkopolska | 4422 | 0,93%
19 w. | 4249 | 0,89%
Prusy | 4164 | 0,87%
Czasopisma regionalne i lokalne polskie -19 w. | 4140 | 0,87%
wiadomości polityczne | 4094 | 0,86%
Gazety polskie - 1918-1939 r. | 4077 | 0,85%
kultura | 4071 | 0,85%
czasopisma sublokalne | 3813 | 0,80%
Górny Śląsk | 3731 | 0,78%
architektura | 3566 | 0,75%
Wrocław | 3515 | 0,74%
Śląsk | 3448 | 0,72%
budownictwo | 3388 | 0,71%
Confused with coverage: temporal, spatial

19 (Polish version of objects description)
Value | No. of associations | % of all associations
Poznań | 54943 | 12,62%
Telecomp Service na zlecenie PBI | 22310 | 5,12%
Kraków | 13662 | 3,14%
Warszawa | 11245 | 2,58%
Toruń | 11221 | 2,58%
Katowice | 8187 | 1,88%
Drukarnia Polska | 7998 | 1,84%
Drukarnia Dziennika Poznańskiego T.A. | 6828 | 1,57%
Warszawa : Telecomp Service na zlecenie PBI | 6824 | 1,57%
Drukarnia Dziennika Poznańskiego S.A. | 5785 | 1,33%
Nakładem F[ranciszka] T[adeusza] Rakowicza | 5406 | 1,24%
Kielce | 5292 | 1,22%
Krakowskie Wydawnictwo Prasowe RSW "Prasa" | 5137 | 1,18%
Breslau | 5130 | 1,18%
E. Neugebauer | 4959 | 1,14%
Wangefield | 4959 | 1,14%
Grottkau | 4959 | 1,14%
Bydgoszcz | 4752 | 1,09%
Drukarnia Dziennika Poznańskiego | 3923 | 0,90%
Drukarnia J. I. Kraszewskiego | 3869 | 0,89%
Geographical location…

20
- We have over 40 digital libraries in Poland, filled with content and metadata coming from hundreds of institutions from different domains
- We harvest the metadata and provide a single point of access to it
  - The PIONIER Network Digital Libraries Federation (http://fbc.pionier.net.pl/)
  - The software used for this service will be released as open source by the end of this year
- Cooperation with Europeana (but not only this) requires cleaning up and normalizing the metadata
  - This is currently our biggest challenge
  - But we do not want to solve it only by technical means at the level of our aggregator
  - Close cooperation with content providers, and some organizational changes made by them, should result in a more efficient and sustainable metadata improvement process than a purely technical solution

21 Thank you for your attention. Any questions?
Cezary Mazurek (mazurek@man.poznan.pl), Marcin Werla (mwerla@man.poznan.pl)
Poznań Supercomputing and Networking Center (Poznań, Poland)

