Evaluation of Citation Enhanced Scholarly Databases INFOPRO 2005 Keynote address Dr. Peter Jacso Professor University of Hawaii, USA Tokyo, November, 2005.

1 Evaluation of Citation Enhanced Scholarly Databases INFOPRO 2005 Keynote address Dr. Peter Jacso Professor University of Hawaii, USA Tokyo, November, 2005 Jacso

2 Japan in Science & Technology

3 Japan in Science & Technology publication

4 The birth of an idea Eugene Garfield

5 Thesaurus-based and free-text searching
- Nice in library schools but not in practice
- Is it sciatica or ischialgia?
- Orthopedic or orthopaedic
- Center or centre
- Shiatsu or shiatzu
- Student or pupil
- Bad behavior or bad behaviour

6 What is a citation?
- Citation or reference
- Citation indexing, indexes or indices
- Citation analysis or analyzis
- Or is it analyses?


8 arXiv

9 Multi-disciplinary – discipline-focused
- WoS and Scopus: the largest multidisciplinary databases
- Google Scholar: it is free, and ….. ?
- Discipline-oriented databases
  - arXiv: primarily physics
  - NASA/ADS: astrophysics and some related fields
  - PsycINFO: psychology
  - CINAHL: nursing
  - CiteSeer: computer science
  - RePEc and its derivatives: economics
  - SMEAL: business

10 Citation collecting, parsing, indexing, matching, browsing, searching, sorting, ranking, outputting, linking

11 The purpose of database evaluation
- We did it with print reference sources for a long time
- Content AND Software
- Practical and financial implication$$$$
- Thermometers, pulsometers, blood pressure meters are not enough
- X-Ray, MRI, blood tests
- Quantitative and qualitative aspects
- Quantifiable, measurable vs. philosophical-ideal
- Can't do it in your lunch break
- Going beyond PR-info from database publishers

12 CONTENT MEASURES
- Database size
- Database dimensions
- Scope
- Composition
- Source coverage
- Journal base
- The special aspects of cited references as data elements

13 Database Size
- Guinness Book of World Records mentality
- Biggest, Greatest, Largest, Greenest
- Fastest, Strongest, Leanest, Meanest
- Where is quality?
- Aeroflot: the biggest airline …. and (one of) the worst before glasnost
- SPORTDiscus: little muscle, much flab

14 Database Size
- WoS 1980-2005: 25.5 million records
- WoS 1945-2005: 36 million records
- WoS Century of Science: 37 million records
- Scopus: 26 million records
- Google Scholar (GS): maybe 10 million records, but mixing fine jizake with cheap wine

15 Database Dimensions
- Absolute size is not everything
- Biggest is not always the best(est)
- How is database A bigger than database B?
- In what shape and form?
- How is the body of the database built?
- Different disciplines, different preferences

16 Bigger horizontally (wider) vs. vertically (taller)

17 Number of records with cited references Scopus

18 Dialog ISI subset & Scopus number of records with cited references

19 The fate of two databases PsycINFO MHA

20 Noticeable lack of currency

21 Google Scholar's innumeracy: more for 1965-2005 than for 1955-2005??

22 GS big in/on Japan: 86% of all of its 1955-2005 records? Bigger between 1965-2005 than 1955-2005?

23 Subject Scope
- Not static, may have evolved in the past X years
- Obvious subject dominance in Scopus at the journal level

24 Much more subject dominance at the article level

25 Apparent Presence of Arts & Humanities in WoS

26 Composition

27 Composition

28 Current Science

29 There are books & conference proceedings in Scopus (but not enhanced with cited refs)

30 Completeness of records
- All records: 26,731,691
- with keywords: 21,706,112
- with abstracts: 18,538,475
- with refs: 8,442,048
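The raw counts on this slide are easier to compare as shares of the total. The following is a minimal sketch of that arithmetic, using only the four figures quoted above; the dictionary keys and the function name `completeness` are illustrative choices, not anything from the presentation.

```python
# Record counts exactly as quoted on the slide.
counts = {
    "all records": 26_731_691,
    "with keywords": 21_706_112,
    "with abstracts": 18_538_475,
    "with cited references": 8_442_048,
}


def completeness(counts, total_key="all records"):
    """Return each field's share of the total, as a percentage."""
    total = counts[total_key]
    return {
        label: 100.0 * n / total
        for label, n in counts.items()
        if label != total_key
    }


shares = completeness(counts)
for label, pct in shares.items():
    print(f"{label}: {pct:.1f}% of all records")
```

Read this way, roughly four in five records carry keywords, while fewer than one in three carry cited references.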

31 Completeness of records Dialog ISI subset total items & items with cited references

32 Scopus number of total items & the number of records with cited references

33 The Scientist case

34 The D-Lib Magazine claim

35 The D-Lib Magazine's bold claim

36 JASIS - JASIS&T case

37 The Scientist case

38 The Scientist case

39 The Scientist case

40 The Scientist case

41 JASIS - JASIS&T case

42 JASIS - JASIS&T case

43 Evaluation of Citation Enhanced Scholarly Databases Part 2. Dr. Peter Jacso Professor University of Hawaii at Manoa, USA Tokyo, November, 2005

44 Software Issues
- Software capabilities can make or break a product
- Cited references represent a new and unusual data element
- New challenge; few (WoS, Scopus, CSA) can do it well
- For researchers, adding cited references to their paper is the bane of publishing
- No universal standard for cited reference formats
- Reference management programs support more than 700 citation style formats

45 ProCite

46 Many chances for messing up cited references in digitization
Who can mess them up?
- Authors, editors, copy editors at the publisher
- Data entry operators at A/I services
- Programmers at database aggregators
- Programmers when extracting data from publishers' archives
- Spoiled and careless programmers when doing anything

47 Selected references

48 Notes


50 Google Scholar
- Autonomous citation indexing is not perfect either
- Google Scholar mightily managed to mix up many metadata elements
- Is this an article published in 2006? Has it really been cited 98 times already in October 2005?

51 No, it's the page number, a Hungarian postal code, or any four-digit number
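One simple guard against this kind of mix-up is a plausibility check on any four-digit number taken as a publication year. The sketch below is an illustration only, not a description of how any of the databases discussed here work; the function name `plausible_years`, the `earliest` cutoff of 1600, and the sample reference string are all hypothetical.

```python
import re
from datetime import date


def plausible_years(text, as_of, earliest=1600):
    """Return the four-digit numbers in `text` that could be publication years.

    A number qualifies only if it is not in the future relative to `as_of`
    and not earlier than `earliest` (an arbitrary cutoff chosen for this
    sketch).
    """
    candidates = [int(m) for m in re.findall(r"\b\d{4}\b", text)]
    return [y for y in candidates if earliest <= y <= as_of.year]


# Hypothetical reference string whose page range looks like two years.
reference = "J Example Stud 12(3), 1998, pp. 2006-2019"
print(plausible_years(reference, as_of=date(2005, 10, 31)))
```

A check like this rejects a "2006" seen in October 2005, but it cannot tell a page number such as 1987 from a real year, so it narrows the problem rather than solving it.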

52 Careless data entry/OCR-ing can cripple the links

53 EBSCO … or 20th-century programming can serve references dead cold


55 as opposed to the native hot-linked WoS version

56 or the hot and spicy Scopus version with "cited by" (citedness) score

57 … as long as the cited references have no misspellings

58 Typos cripple the impressive "cited by" feature, the best of Scopus and CSA, which can't undo the misspelling shown earlier or the one made by PsycINFO in the author name here: it is Jacso not Jasco, thank you

59 In the cited references they are crippled in more ways than one, but we may feel warmed up by 6 "cited by" hot links to records which cite Moed's article in PsycINFO …

60 … that's why there are relatively few, as opposed to the 45 citedness score in Scopus shown earlier …

61 … and the 55 citedness score in WoS for the cited 1995 article of Moed HF. The citedness scores of WoS and Scopus often get close for articles published since the mid-1990s, but not for the earlier ones

62 Remember, seeing the citedness score is still a two-step process in WoS, but it will likely soon include the citedness score directly within the cited reference list, as in CSA and Scopus.

63 Browsing/Looking up citing/cited authors & journals
- Browsing of citing (source) and cited (target) author and journal names is a must.
- Still, only a few offer adequate browsing.
- Scopus only for source author and source title.

64 Jacso

65 Inconsistencies and inaccuracies are rampant in source journal names, as in PASCAL

66 WoS can spell it consistently nearly 20,000 times as source journal: quality control, order instead of disorder

67 In cited sources all hell breaks loose

68 Without browsing and defensive searching you would miss a lot
- In Dialog's version of the ISI subset, misspelled formats are not corrected
- Dialog only updates (adds new records); it does not RE-LOAD (to correct old ones)
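Defensive searching means looking for the likely misspellings as well as the correct form. As a rough sketch of the idea, the snippet below scans a browsable index for near-matches of a name using Python's standard `difflib.get_close_matches`. The index entries, the "Surname Initials" format, and the function name `name_variants` are hypothetical; the Jacso/Jasco pair is the misspelling mentioned in this talk.

```python
import difflib


def name_variants(name, index_terms, cutoff=0.6):
    """Return index terms that closely resemble `name`, ignoring case.

    `cutoff` is difflib's similarity threshold (0 to 1); 0.6 is the
    library default and is kept here as an assumption, not a tuned value.
    """
    by_lowercase = {}
    for term in index_terms:
        by_lowercase.setdefault(term.lower(), []).append(term)
    close = difflib.get_close_matches(
        name.lower(), list(by_lowercase), n=10, cutoff=cutoff
    )
    return [term for key in close for term in by_lowercase[key]]


# Hypothetical cited-author index entries.
index_terms = ["Jacso P", "Jasco P", "Moed HF", "Garfield E"]
print(name_variants("Jacso P", index_terms))
```

The variants found this way can then be ORed together in the actual search, which is what browsing the index by eye accomplishes manually.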

69 In CINAHL I have slim chances without browsing the author and cited author fields before searching.
- Browsing is like looking in the pool before diving to see if there is water, and how much there is
- Be savvy & browse, browse, browse if the software allows

70 And what browse options does Google Scholar offer? None. Zilch. Nada. Zero. Kotonashi.

71 SEARCHING
- Rather limited options for cited author, cited title, cited journal
- Menu driven in WoS
- SAME (sentence) option in WoS, but …
- … no searching in cited title in WoS
- Proximity and positional operators in Scopus
- Mostly command-driven in Advanced Mode in Scopus
- Useful but ugly prefixes in Scopus
- Good menus in EBSCO and Ovid

72 SEARCHING No truncation when searching in REFxxx?

73 Result display and sorting
Short result list for an at-a-glance impression of the sources, then sorting by citedness score!



76 Sorting & relative citedness score
- CSA could sort by citedness but does not offer this feature
- Google Scholar used to rank the results by citedness score
- No one offers an age-adjusted citedness score, even though that would be the fairest
- A 10-year-old and a 2-year-old article have had different chances of receiving citations
- My tests showed a big difference for some items in ranking by absolute vs. relative citedness score
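The slide does not define how an age-adjusted score would be calculated. One simple possibility, assumed here purely for illustration, is citations per year since publication. The sketch below shows how that can reverse a ranking; the two papers and their citation counts are hypothetical, not data from the tests mentioned above.

```python
def citations_per_year(citations, pub_year, as_of_year):
    """Citations divided by the article's age in years (minimum age of 1)."""
    age = max(as_of_year - pub_year, 1)
    return citations / age


# Hypothetical items: (label, publication year, citations received so far).
papers = [
    ("older paper", 1995, 50),
    ("recent paper", 2003, 20),
]

AS_OF = 2005  # the year of the talk, used as the reference point

by_absolute = sorted(papers, key=lambda p: p[2], reverse=True)
by_relative = sorted(
    papers,
    key=lambda p: citations_per_year(p[2], p[1], AS_OF),
    reverse=True,
)

print("absolute:", [label for label, _, _ in by_absolute])
print("relative:", [label for label, _, _ in by_relative])
```

The older paper leads on the absolute count, but the recent one leads once each score is divided by the years the article has had to collect citations.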


78 The many dimensions of citedness scores
- Citedness scores can be highly informative in estimating the usefulness & perceived importance of a paper by peers, in the form of citations (= links).
- Major differences arise because of the domains of citing sources:
  - In journal publishers' archives: gathered only from the publisher's digitized journals
  - At aggregators/facilitators: from all databases hosted (except ... PsycINFO, for its not-so-splendid isolation policy)
  - In self-published databases: gathered from the database itself
  - In Scopus: gathered from 1996 onward from >10,000++ journals
  - In WoS: gathered from 1900/1945/1980 forward from <10,000 journals

79 All the above assume correct identification, matching & calculation.
Enter Google Scholar, playing fast and loose with the numbers. Make it very fast and very loose.

80 A half-page quickie interview with the author in The Scientist cited 7,380 times?

81 You can scroll up and down in the purportedly citing Nucleic Acids Research article looking for the name of Kraulis, The Scientist, and the title; you will not find them. Any of them.


83 Two articles by Kraulis, but neither is the 1993 piece in The Scientist

84 But what do you expect from software that cannot even do the most basic Boolean OR operation correctly?

85 Indeed, citation data is subtle stuff and requires competence
