Digitization and scientific digital libraries Martin Lhoták Knihovna AV ČR, v. v. i. Academy of Sciences Library 3.6.2009 UISK, Universita Karlova v Praze.

1 Digitization and scientific digital libraries Martin Lhoták Knihovna AV ČR, v. v. i. Academy of Sciences Library 3.6.2009 UISK, Universita Karlova v Praze

2 Contents  Digitization Centre of the Academy of Sciences Library  Kramerius – software for dissemination  Digital Library of the Academy of Sciences  Software for metadata creation  „Digitization Registry CZ“ project

3 Digitization Centre of the AS Library: In operation since 1.1.2004. Built with support from the EU Solidarity Fund after the 2002 floods in Czechia. Main aim: to build a digital library of scientific publications (books, articles, …) published by the Academy of Sciences of the Czech Republic, the Digital Library of ASCR. Partner of the DML-CZ (Czech Digital Mathematical Library) project since 2005.

4 The Academy of Sciences of the Czech Republic: > 50 scientific institutes; 8000 employees (4000 in R&D); > 11 000 articles, reports, etc. a year; publishes > 90 journals (circa 3000 articles); > 100 years of history

5 Digitization Centre of the AS Library: 1 x A0 color scanner ProServ ScanTech 600i; 1 x A1 color scanner Digibook 10000; 2 x A2 bw scanners Zeutschel OS 7000; 1 x A4 fast production scanner Panasonic. Staff: 8 to 10 people. Also provides services to other institutions. Monthly production: 40,000 to 50,000 pages. Overall production: > 2,000,000 pages. Planned acquisition: ScanRobot http://www.treventus.com/

6 Image Adjusting Software: Book Restorer from i2S, designed to process scanned books. Geometrical correction, crop, blur, binarization, despeckle

20 Basic Metadata: XML (DTD of the Czech National Library); title and basic bibliographic data; book/journal structure; physical size of the book/journal; number of pages. Software: Sirius (CZ)
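
As a rough illustration of the kind of record slide 20 lists, the following Python sketch builds a small XML document with the standard library. The element names (`document`, `title`, `physicalSize`, `pageCount`) and the sample values are hypothetical placeholders; they are not taken from the Czech National Library DTD or from Sirius.

```python
import xml.etree.ElementTree as ET


def build_basic_record(title, doc_type, size_mm, page_count):
    """Build a minimal descriptive record; all element names are illustrative."""
    root = ET.Element("document", attrib={"type": doc_type})
    ET.SubElement(root, "title").text = title
    # Physical size is stored as "width x height" in millimetres (an assumption).
    size = ET.SubElement(root, "physicalSize", attrib={"unit": "mm"})
    size.text = f"{size_mm[0]}x{size_mm[1]}"
    ET.SubElement(root, "pageCount").text = str(page_count)
    return root


# Made-up sample values, used only to show the shape of the record.
record = build_basic_record("Sample journal, vol. 1", "periodical", (170, 240), 96)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A real record would have to follow the DTD's own element names and nesting, which the slide does not show.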

23 OCR: FineReader 8.1, 2 runs: 1. to recognize the language of each paragraph, 2. to do OCR with the right language. OCR workflow developed by the DML-CZ team of Dr. P. Sojka. Output: double-layer PDF: the 1st layer is the scanned picture, the 2nd layer is the „OCRed“ text
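
The two-run idea on slide 23 can be sketched in Python as follows. This is only an illustration of the control flow, not the DML-CZ workflow itself: the stopword scoring is a toy stand-in for real language recognition, the function names are hypothetical, and no OCR engine is actually called.

```python
from collections import defaultdict

# Tiny stopword lists used as a toy stand-in for real language recognition.
STOPWORDS = {
    "en": {"the", "and", "of", "is", "in"},
    "cs": {"a", "je", "se", "na", "že"},
    "de": {"der", "die", "und", "ist", "das"},
}


def guess_language(text, default="en"):
    """Run 1: pick the language whose stopwords occur most often."""
    words = text.lower().split()
    scores = {lang: sum(w in sw for w in words) for lang, sw in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


def plan_second_pass(paragraphs):
    """Group paragraph indices by language so run 2 can use the right language."""
    plan = defaultdict(list)
    for index, text in enumerate(paragraphs):
        plan[guess_language(text)].append(index)
    return dict(plan)


# Rough text as a first recognition run might return it, one string per paragraph.
rough_text = [
    "The library is in the centre of the city",
    "Knihovna je na rohu a otevírá se ráno",
    "Der Artikel ist kurz und das Buch ist alt",
]
plan = plan_second_pass(rough_text)
print(plan)
```

In a real pipeline the second run would pass each group of paragraphs to the OCR engine with the matching language setting.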

25 Kramerius – development group and used technology  Open source – development since 2003  Main purpose – access to and dissemination of digitized documents (monographs and periodicals)  Czech National Library, Academy of Sciences Library, Qbizm technologies, Moravian Library in Brno  Funded mostly by the Ministry of Culture and the Academy of Sciences Grant Agency  Technologies used: Java, Linux, Apache, Tomcat, PostgreSQL, Lucene

26 Kramerius – current status  version: 3.3.0, build: 29.7.2008

27 Kramerius – current status  DTD for periodicals and monographs  Import of XML, TXT and graphic files  Graphic formats: DjVu, JPG, PNG, PDF  Fulltext search (Lucene)  Replication of data between individual installations  OAI-PMH – for metadata harvesting  METS, PREMIS, MIX – metadata standards
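
To show what OAI-PMH harvesting looks like from the harvester's side, here is a minimal Python sketch that builds a `ListRecords` request URL. The verb and the `metadataPrefix`, `from` and `set` arguments come from the OAI-PMH protocol itself; the base URL is a hypothetical placeholder, not the address of any Kramerius installation, and no request is actually sent.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # selective harvesting by datestamp
    if set_spec:
        params["set"] = set_spec  # selective harvesting by set
    return f"{base_url}?{urlencode(params)}"


BASE_URL = "https://example.org/oai"  # hypothetical endpoint, for illustration only
url = list_records_url(BASE_URL, from_date="2009-01-01")
print(url)
# → https://example.org/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2009-01-01
```

A harvester would fetch this URL, parse the XML response, and keep requesting with the returned `resumptionToken` until the list is complete.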

39 Kramerius – current status  International and national connections: - The European Library http://www.theeuropeanlibrary.org - Uniform Information Gateway JIB http://www.jib.cz/  Links to library OPACs  Persistent URLs enable persistent linking

40 Kramerius – new plans of development  Fundamental change – use of the FEDORA repository (open source, USA)  Reasons – FEDORA is a robust engine with support for compound objects, and it is also useful for long-term preservation  Enhancement of administration – users and access rights  Batch operations with digitized documents  New types of documents (maps, audio, video, …)

41 Kramerius – institutional users  Czech National Library, Moravian Library in Brno, State Technical Library, Academy of Sciences Library  Regional scientific libraries: Havlíčkův Brod, Hradec Králové, Olomouc, Ostrava, Zlín  Museum libraries: UPM Praha, ŽM Praha, DA Praha, MVČ Hradec Králové  In total circa 5,500,000 pages (circa 500 periodical titles and 4500 monographs)

42 Academy of Sciences Digital Library  Funded by the Academy of Sciences (2004-2009)  Digitization of historical issues (1890-1990)  Circa 1 500 000 pages digitized  Development of the Kramerius system  1 000 000 pages accessible (no separation into articles)  Fulltext search  http://kramerius.knav.cz

43 Academy of Sciences Digital Library  New issues – different approach  Open source E-prints (University of Southampton)  Agreements with the Academy institutes – conditions of dissemination  Final goal – merging both digital libraries (solution probably Drupal/FEDORA – Islandora?)

48 Collaboration with Google  Digitized journals from the Kramerius system - indexing of fulltexts, automatic detection of articles, link from Google to the article’s first page or abstract  New articles in E-prints - indexing of fulltexts, link from Google

49 Academy of Sciences Central Data Repository  Huge amount of data from digitization  Disk array 30 TB with mirror  Tape library with up to 500 tapes  3 different locations for long-term storage  Long-term preservation of R&D outputs of the Czech Academy of Sciences  Institutional repository

50 System for journal publishing administration  Proven professional system (Manuscript Central, Editorial Manager)  Better price for implementation and annual service fees when purchased as a consortium  On-line submission system  Complete records of authors, reviewers and articles  Automated administration of peer review  Currently 8 journals

51 Software for metadata creation  Project by the Moravian Library in Brno – funded by the Ministry of Culture  Open-source tool intended for creating metadata in the Kramerius format  Metadata – descriptive, technical, administrative  Bibliographic record from the library information system  Outputs also in other formats – Manuscriptorium, DSpace, FEDORA

52 Project „Digitization Registry CZ“  Project partners: Academy of Sciences Library and National Library  Funded by the R&D program of the Ministry of Culture  Central registry of digitized documents in the CR  Monitoring of the digitization workflow  Linking with library OPACs  Possible move to the international level (EU project)

53 Thank you! Questions? Martin Lhoták lhotak@knav.cz www.knav.cz

