Croatian Internet serials. Croatian Electronic Publishing: Results of a survey on e-serials and usage of metadata. Sofija Klarin, Sonja Pigac, Damir Pavelić.


1 Croatian Internet serials. Croatian Electronic Publishing: Results of a survey on e-serials and usage of metadata. Sofija Klarin, Sonja Pigac, Damir Pavelić (sklarin@nsk.hr, spigac@nsk.hr, dpavelic@efzg.hr). National and University Library, Zagreb; Faculty of Economics, Zagreb.

2 Topics. Part 1: Context: facts, presumptions and questions. Part 2: Results of the survey: Croatian remote access e-serials. Part 3: Use of metadata in e-serials, possibilities for Croatia.

3 1. Electronic publishing using the Internet: the explosion of publishing activities since the 90s raises the problems of searching, retrieval, identification and preservation of electronic documents. World Wide Web (1995 -> ). Cataloguer-based management vs. author-based management (Koehler).

4 1.1 How big is the Web? Lawrence & Giles (1999): 800 million web pages, 15 TB of information, 6 TB of text. BrightPlanet, LexiBot software (2000): 19 TB in the “surface” Web, 7,500 TB in the “deep” Web. Kulturarw3 project (Sweden), web harvesting: 7.5 million files, 300 GB. Croatia (since 1991): 8000 .hr domains; types, number of files? types of resources? publishers? Too big?

5 1.2 Valuable material? Lawrence & Giles (1999): 83% of sites contain commercial content and 6% contain scientific or educational content on the Web. 05.08.2000: most visited Croatian sites (Proof).

6 1.3 Persistence of Web documents (Koehler, 1999). Web pages are unstable: –they undergo change (within a year, 99% of web pages show some degree of change) –they disappear –5% return within a specific period of time. Two types of change: –change of content (20% in a week) –change of structure (20% in a week). Too ephemeral?

7 1.4 Low use of metadata on the WWW. Lawrence & Giles (1999): the simple HTML "keywords" and "description" metatags are used on the homepages of only 34% of sites; only 0.3% of sites use the Dublin Core metadata standard. Who are Web “publishers”? –Can they accept standards for management and interchange of metadata? Search/retrieval? Reliability? Authenticity? Interchange? Publishers?

8 1.5 Products of electronic publishing: local access / hybrid / remote access resources; monograph publications (finite publications) / continuing resources? (–serials –integrating resources); data and/or programs; public access / restricted access; static / dynamic. New types of resources?

9 2. The survey (January 2000 - April 2001). The aim of the survey on e-serials: quantity, categories, persistence, publishers, metadata usage in Croatian web space... Sample: electronic publications which consist of successive parts with numerical or chronological designations, in Croatian or produced by Croatian publishers, available via WWW. Items excluded: OPACs or databases, lists/archives, web sites, online services, advertisements.

10 2.1 Identification. Lists, directories, portals, search engines: CroLinks (http://www.crolinks.com); www.hr - News, media, journals; Iskon - Net.hr portal (http://www.iskon.hr); Google, Yahoo. From their print versions. From publishers.

11 2.2 Numbers. Total number: 153; disappeared: 16; changed URL: 12; ceased: 2; changed the title: 1. NL Denmark - 1069 (2000); NL Norway - 299 (1999).

12 2.3 Categories: Religious magazines; Weekly/fortnightly magazines; Scientific journals; Student journals; Serials published by universities, scientific institutes; Serials published by civil services; Serials of unknown type; Newspapers; Serials published by societies; Serials published by companies; Journals. Counts: 28, 42, 9, 10, 8, 14, 4, 9, 11, 18. Sum: 153.

13 2.4 Editions: electronic, both electronic and printed: 110, 42 + 1. Both electronic and print: e.g. Vjesnik, Večernji list, Slobodna Dalmacija. Electronic only: e.g. Mountain Bikinig, Morsko prase. Print became electronic: Internet Monitor.

14 2.5 Place of publication: –Zagreb: 115 –Split: 6 –Rijeka: 5 –Osijek, Dubrovnik, Varaždin, Čakovec, Slavonski Brod: 2 –Karlovac, Zadar, Pula, Koprivnica, Ičići, Prelog, Sv. Ivan Zelina, Rovinj, Virovitica: 1 –other (AT): 1 –unknown: 4

15 2.6 URLs: Croatian domain or …? .hr 82%: www.vjesnik.hr, www.vecernji-list.hr, www.slobodnadalmacija.hr, www.nacional.hr, www.vef.hr/vetarhiv, www.nn.hr/Glasilo/index.htm, www.hi-fi.hr/hgz, wam.hi-fi.hr, www.agr.hr/smotra/index.htm, www.monitor.hr, www.gradst.hr/engmod, www.bug.hr etc. .com 17%: www.hrvatska.com/glas-podravine, duhovno-vrelo.com, www.win-ini.com, cyberdream.croadria.com, www.zarez.com, www.hrvatska.com/bilten.html, www.kapital.com etc. other 1%: www.moravek.net/kla, www.hrvatskenovine.at. 1 item -> 3 URLs / domains (.hr, .com, .net); 1 item -> 2 URLs / domains (.hr, .com).

16 2.6.1 Domains, URLs. 28 items have a top-level domain name, e.g. www.vjesnik.hr, www.morsko-prase.hr. 12 items changed URL: –5 from first/second... level domain to top-level domain name, e.g. http://www.hbk.hr/GK/gk.htm -> http://www.glas-koncila.hr –5 internal changes of the site (domain), e.g. http://www.kdb.hr/projekt/paedro/index.htm -> http://www.kdb.hr/paedro/ –1 from .hr -> .com –1 from .com -> .hr. 16 items disappeared: –11 .hr: 68.75% (total .hr: 82%) –5 .com: 31.25% (total .com: 17%)
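
The shares of disappeared items can be re-derived from the counts on this slide. A minimal sketch in Python, assuming only the figures quoted above (16 disappeared items, 11 under .hr and 5 under .com); the variable names are illustrative.

```python
# Counts taken from the slide; the names are illustrative, not from the survey.
disappeared = {".hr": 11, ".com": 5}
total = sum(disappeared.values())  # 16 items disappeared in all

# Share of each domain among the disappeared items, in percent.
shares = {domain: round(100 * n / total, 2) for domain, n in disappeared.items()}
print(shares)
```

Comparing these shares with the overall split (.hr 82%, .com 17%) suggests that .com titles disappeared more often than their share of the sample would predict.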

17 2.7 Chronological overview (titles by year): 1994: 2; 1995: 6; 1996: 11; 1997: 21; 1998: 26; 1999: 33; 2000: 24; 2001: 2; unknown: 26

18 2.8 Low metadata use. Croatian e-serials using the HTML metatags “keywords”, “description”, “author”: 32.8% (September 2000), 33.3% (April 2001). 1 title uses the DC metadata standard. Lawrence & Giles (1999): simple HTML metatags are used on the homepages of only 34% of sites; only 0.3% of sites use the Dublin Core metadata standard.
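
A check of this kind (does a homepage carry the simple metatags, or any Dublin Core elements?) can be sketched with Python's standard html.parser module. This is an illustration only, not the procedure used in the survey; the class name, the function name and the sample page are hypothetical.

```python
from html.parser import HTMLParser

class MetaTagCollector(HTMLParser):
    """Collect name/content pairs of <meta> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            # Lower-case the name so "DC.Title" and "dc.title" compare equal.
            self.meta[name.lower()] = attributes.get("content") or ""

def metatags_used(html, wanted=("keywords", "description", "author")):
    """Return (simple metatags with content, Dublin Core element names)."""
    collector = MetaTagCollector()
    collector.feed(html)
    simple = [name for name in wanted if collector.meta.get(name)]
    dublin_core = [name for name in collector.meta if name.startswith("dc.")]
    return simple, dublin_core

# Made-up sample page, used only to exercise the function.
sample = """<html><head>
<title>Sample e-serial</title>
<meta name="keywords" content="agriculture, journal">
<meta name="DC.Title" content="Sample e-serial">
</head><body></body></html>"""

result = metatags_used(sample)
print(result)
```

Run over the homepage of every title in a sample, a function like this yields the kind of percentage reported on the slide.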

19 2.9 Metadata: ACS-AGRICULTURAE CONSPECTUS SCIENTIFICUS

20 2.10 Metadata questionnaire: sent in April 2001 by e-mail to 160 e-publishers, editors, webmasters… to find out more about their familiarity with metadata, and their intentions to use metadata and cooperate with librarians; an effort to raise awareness among publishers of the need for an “electronic title page” to be included in their publications.

21 Do you know what metadata is? Do you use metadata? 27 answers representing 32 publications received (17.3% or 20.6%). 6 incorrect statements: 4 claim to use metadata (they don’t!), 2 claim not to use metadata (they do!).

22 The benefits of metadata: facilitates search and retrieval 69.6%; promotes the company/publ. 56.5%; helps identify the author and the content of the publication 52.2%; everybody uses metadata 13%; reliability and authenticity of publ. 8.7%; contains copyright information 4.3%. 95.7% 52.2% 60.9% 21.7%

23 Metadata is created by... 25.8% don’t use metadata because they: know nothing about metadata 50%; don’t have enough time 12.5%; don’t have enough employees 12.5%

24 Metadata generators? (DC-dot, TagGen, DC assist, EdNA, AHDS, Reggie, Nordic DC metadata generator, SAFARI) aware of their existence: 11%; not aware: 71% –would like to be informed: 100%

25 Metadata is contained in... homepage only: 26.1%; all pages (same metadata): 17.4%; all pages (different metadata): 47.8%

26 Metadata standardization? 1. Have you heard of metadata standardization? 2. Which metadata schema do you know of? 3. Would a metadata guideline help you? 4. Is standardization important for your work? 5. Would you like to have standardized metadata in your publ.?

27 Could librarians help you? librarians work on standardization of bibl. description 48%; I’d appreciate any help 44%; librarians describe print publ. 32%; librarians work on standardization of metadata 12%; we are already familiar with library activities (ISBN, ISSN, CIP…) 24%; librarians don’t know much about the Web 50%; webmasters should do that 44%; can do it by myself 25%

28 E-journals available through the library WebPAC? YES 93.8%: it’s useful information for users 75%; it’s important to treat both print and e-publ. in the same way 75%; it’s useful for publishers 46.4%. NO 6.3%: people prefer to use search engines; web publications often change their URLs - “I’m not sure librarians should catalogue them”

29 Dublin Core Metadata Initiative survey, from Feb. 20th to March 9th, 2001. The purpose of the questionnaire was to help achieve some of the DC Libraries Working Group’s objectives for 2001, including: (1) to collect and share examples of Dublin Core use in libraries and (2) to stimulate discussion that will feed into the process of drafting an application profile for the use of Dublin Core in libraries. DC-General and DC-Libraries lists, CORC Users List, and The Alberta Library Metadata List. 29 responses from 9 countries. Most used: creator, publisher, title, rights, type, identifier, format, description. Low use of qualifiers. http://dublincore.org

30 3. Use of metadata in e-serials and possibilities in Croatia. E-serials: - digital / hybrid libraries - databases (publishers, vendors) cooperation (BIBLINK) hosted.ukoln.ac.uk/biblink - separately (web pages)

31 3.1. Using metadata. 1. Inside the document – HTML (XML): metadata, followed by the document it describes (up to </body>). 2. Separate file - metadata records + links to e-serials (bibliography, similar serials…) - file containing metadata – link from web page with no metadata in the (DC web page)
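
The first option, metadata inside the HTML document, is commonly done with Dublin Core meta elements in the page head. A minimal sketch, assuming the usual "DC." name prefix with a schema.DC link declaration; the function name is hypothetical, and the sample values are the ones shown for ACS on slide 39.

```python
from html import escape

def dc_meta_tags(record):
    """Render a dict of Dublin Core elements as HTML <meta> tags (illustrative)."""
    # Declares which vocabulary the "DC." prefix refers to.
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, value in record.items():
        content = escape(value, quote=True)  # keep quotes and & safe inside attributes
        lines.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(lines)

# Values as quoted in the transcript for the one title using DC metadata.
record = {
    "title": "ACS-AGRICULTURAE CONSPECTUS SCIENTIFICUS",
    "publisher": "Faculty of Agriculture University of Zagreb",
    "identifier": "http://www.agr.hr/smotra",
}

tags = dc_meta_tags(record)
print(tags)
```

The resulting lines would be pasted into the head of the page; the second option keeps the same record in a separate file and links to it instead.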

32 3.2 Metadata schemes - before Internet and electronic publications (cataloguing, exchange – MARC, GILS, CIMI) - development of Internet (searching, cataloguing, exchange): Qualified Dublin Core (dublincore.org) - translated versions (21 languages) - no Croatian version yet, but the translation is finished

33 3.3. Creation & conversion tools. - Creating metadata (templates): Nordic DC metadata creator (including URN generator) (choice of controlled vocabularies, classification, date format, identifier). - Creation / change of templates: Reggie, Mantis (OCLC) HotMETA (search DC). - Automatic extraction / gathering from HTML (enter URL): DC-dot (results in DC, RDF, XHTML - additional corrections possible), Donor metatag generator (similar to Nordic DC)

34 3.3. Creation & conversion tools. - Automatic production: Klarity (automatically generates metadata based on concepts found in text), Scorpion (automatic classification to DDC). - Commercial software: TagGen Dublin Core edition (number of schemes and possibilities), Metabrowser (shows metadata and web pages simultaneously). http://dublincore.org/tools

35 3.3. Creation & conversion tools. DC-dot - (http://www.agr.hr/smotra)

36 3.3. Creation & conversion tools. Donor - (http://www.agr.hr/smotra)

37 3.3. Creation & conversion tools. Metabrowser – “Metabrowser is a web browser that catalogues web pages using schemas such as Dublin Core, GILS, AGLS. Metabrowser allows metadata to be added to web pages accessible from a local or network drive or sent to an external system such as a database or firewalled web server”

38 3.3. Creation & conversion tools. Conversion: - DC -> MARC (Dan, Fin, Is, Nor, Swe, US): Nordic Metadata Project, DC to MARC converter (www.bibsys.no/mete/d2m) - Crosswalks: DC, MARC, MARC21, EAD, GILS, ISAD, FGDC (www.ukoln.ac.uk/metadata/interoperability)

39 3.3. Creation & conversion tools. Nordic metadata project: DC to MARC converter. 008 010508s; 245 $a ACS-AGRICULTURAE CONSPECTUS SCIENTIFICUS; 260 $b Faculty of Agriculture University of Zagreb; 856 $u http://www.agr.hr/smotra
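
The correspondence visible in this sample output (title to 245 $a, publisher to 260 $b, identifier to 856 $u) can be written as a small lookup. The sketch below is an assumption-laden illustration covering only those three fields; it is not the Nordic converter's actual mapping or code, and the names DC_TO_MARC and dc_to_marc_lines are hypothetical.

```python
# Only the three correspondences visible on the slide; a real crosswalk covers more.
DC_TO_MARC = {
    "title": ("245", "a"),
    "publisher": ("260", "b"),
    "identifier": ("856", "u"),
}

def dc_to_marc_lines(record):
    """Turn a dict of Dublin Core elements into 'TAG $x value' lines (illustrative)."""
    lines = []
    for element, value in record.items():
        if element not in DC_TO_MARC:
            continue  # elements outside this tiny mapping are skipped
        tag, subfield = DC_TO_MARC[element]
        lines.append(f"{tag} ${subfield} {value}")
    return sorted(lines)  # MARC fields are conventionally listed in tag order

marc = dc_to_marc_lines({
    "identifier": "http://www.agr.hr/smotra",
    "title": "ACS-AGRICULTURAE CONSPECTUS SCIENTIFICUS",
    "publisher": "Faculty of Agriculture University of Zagreb",
    "language": "hr",  # hypothetical extra element, ignored by the mapping
})
print("\n".join(marc))
```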

40 3.3. Creation & conversion tools. Conversion MARC -> XML -> MARC (www.logos.com/marc) (www.culture.fr/BiblioML) - additional applications needed

41 3.4. Which model / scheme? - company / organization needs - connection and cooperation with other companies / organizations - budget - standardization - software and upgrading possibilities - exchange of data / records. Libraries, publishers, vendors: different needs and aims

42 3.4.1 Choose scheme and strategy - Croatia. Libraries: - bibliographic control, - up-to-date record collections (users benefit), - exchange. Publishers: - timely, accurate and full exposure of their products and services, - search and retrieval (benefits users and publisher), - standardized record in databases for possible exchange and profit. Cooperation!

43 3.4.1 Choose scheme and strategy - Croatia. Use knowledge and experience from foreign projects: Biblink, CORC (Cooperative online resources cataloguing), DONOR (Directory of Netherlands online resources). - Inform publishers of standards and possibilities (survey) - Point out the necessity of standardization and use of one primary (major) scheme (Dublin Core?) - Show them how to use free web-available tools

44 3.5 DC – RDF - XML. Dublin Core (qualified) is enough for basic description and serves our needs for the beginning. RDF (Resource Description Framework) is about to become a standard (semantic web). XML (eXtensible Markup Language) is already a growing standard (structure, exchange, e-business, internal control…)

45 3.5 DC – RDF - XML. RDF - development is still in process but… Many projects and tools exist (creation, conversion). Constant work, often non-commercial (learn & use). Croatia: - use the same metadata scheme (DC?), enriched with an internal metadata scheme if needed (for publishers' use) - embed it into HTML documents - convert to RDF-XML eventually
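
The last step, converting a Dublin Core record to RDF/XML, can be sketched with Python's standard xml.etree.ElementTree module. This is a minimal illustration under stated assumptions: the function name is hypothetical, the record reuses the ACS values quoted on slide 39, and only plain-literal properties are handled.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Use the conventional prefixes instead of auto-generated ns0/ns1.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)

def dc_to_rdf_xml(about, record):
    """Serialize a dict of Dublin Core elements as one rdf:Description (illustrative)."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": about}
    )
    for element, value in record.items():
        ET.SubElement(description, f"{{{DC_NS}}}{element}").text = value
    return ET.tostring(root, encoding="unicode")

xml_text = dc_to_rdf_xml(
    "http://www.agr.hr/smotra",
    {
        "title": "ACS-AGRICULTURAE CONSPECTUS SCIENTIFICUS",
        "publisher": "Faculty of Agriculture University of Zagreb",
    },
)
print(xml_text)
```

Because the input is the same flat record that would be embedded in HTML, a publisher could start with meta tags and produce RDF/XML later without re-describing the serial.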

46 3.6. Conclusion. Low use of any metadata scheme opens the possibility to adopt one primary scheme (DC?) and an emerging standard (RDF?). Concentrate on the start and strategy, use experience from others. Build an environment to help publishers (similar to Biblink). Cooperation among libraries and publishers is essential.

47 3.7 Links: http://dublincore.org, www.ifla.org, www.ukoln.ac.uk, www.w3c.org, www.editeur.org, www.xml.com, www.logos.com/marc, www.culture.fr/BiblioML

