Published by Jennifer Hahn. Modified over 10 years ago.
1
Evaluation of Citation Enhanced Scholarly Databases. INFOPRO 2005 keynote address. Dr. Peter Jacso, Professor, University of Hawaii, USA. Tokyo, November 2005.
2
Japan in Science & Technology
3
Japan in Science & Technology publication
4
The birth of an idea: Eugene Garfield
5
Thesaurus-based and free-text searching: nice in library schools but not in practice. Is it sciatica or ischialgia? Orthopedic or orthopaedic? Center or centre? Shiatsu or shiatzu? Student or pupil? Bad behavior or bad behaviour?
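One way to picture the free-text problem on this slide is a small query expander that ORs each search term with its known variants. This is only a sketch: the word pairs come from the slide, while the `VARIANTS` table and the `expand_query` function are hypothetical names invented for illustration and do not describe any database's actual search syntax.

```python
# Hypothetical variant table: the word pairs are from the slide,
# the data structure is an assumption made for this sketch.
VARIANTS = {
    "sciatica": ["sciatica", "ischialgia"],
    "orthopedic": ["orthopedic", "orthopaedic"],
    "center": ["center", "centre"],
    "shiatsu": ["shiatsu", "shiatzu"],
    "behavior": ["behavior", "behaviour"],
}


def expand_query(terms):
    """Build a Boolean query that ORs every term with its known variants."""
    groups = []
    for term in terms:
        forms = VARIANTS.get(term.lower(), [term.lower()])
        if len(forms) > 1:
            groups.append("(" + " OR ".join(forms) + ")")
        else:
            groups.append(forms[0])
    return " AND ".join(groups)


print(expand_query(["orthopedic", "center"]))
```

A controlled thesaurus does this mapping on the indexing side; in free-text searching the burden falls on the searcher, which is the point the slide makes.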
6
What is a citation? Citation or reference? Citation indexing: indexes or indices? Citation analysis or analyzis? Or is it analyses?
8
arXiv
9
Multi-disciplinary vs. discipline-focused. WoS and Scopus: the largest multidisciplinary databases. Google Scholar: it is free, and ...? Discipline-oriented databases: arXiv (primarily physics); NASA/ADS (astrophysics and some related fields); PsycINFO (psychology); CINAHL (nursing); CiteSeer (computer science); RePEc and its derivatives (economics); SMEAL (business).
10
Citation collecting, parsing, indexing, matching, browsing, searching, sorting, ranking, outputting, linking
11
The purpose of database evaluation. We did it with print reference sources for a long time. Content AND software. Practical and financial implications ($$$$). Thermometers, pulsometers, and blood pressure meters are not enough; think X-ray, MRI, blood tests. Quantitative and qualitative aspects. Quantifiable, measurable vs. philosophical-ideal. Can't do it in your lunch break. Going beyond PR info from database publishers.
12
CONTENT MEASURES: database size; database dimensions; scope; composition; source coverage; journal base; the special aspects of cited references as data elements.
13
Database Size. Guinness Book of World Records mentality: biggest, greatest, largest, greenest, fastest, strongest, leanest, meanest. Where is quality? Aeroflot: the biggest airline ... and (one of) the worst before glasnost. Sports Discus: little muscle, much flab.
14
Database Size. WoS 1980-2005: 25.5 million records. WoS 1945-2005: 36 million records. WoS Century of Science: 37 million records. Scopus: 26 million records. Google Scholar (GS): maybe 10 million records, but mixing fine jizake with cheap wine.
15
Database Dimensions. Absolute size is not everything. Biggest is not always the best(est). How is database A bigger than database B? In what shape and form? How is the body of the database built? Different disciplines, different preferences.
16
Bigger horizontally (wider) vs. vertically (taller)
17
Number of records with cited references: Scopus
18
Dialog ISI subset & Scopus: number of records with cited references
19
The fate of two databases: PsycINFO, MHA
20
Noticeable lack of currency
21
Google Scholar's innumeracy: more records for 1965-2005 than for 1955-2005??
22
GS big in/on Japan: 86% of all of its 1955-2005 records? Bigger between 1965-2005 than between 1955-2005?
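The anomaly on these two slides can be stated as a simple invariant: a date range nested inside a wider one can never return more records than the wider range. Below is a minimal sketch of such a sanity check. The function name and the hit counts are invented placeholders for illustration, not figures from the slides.

```python
def nested_range_violations(hit_counts):
    """Find date ranges that report more hits than a wider range containing them.

    hit_counts maps (start_year, end_year) -> reported number of records.
    Returns a list of (narrow_range, wide_range) pairs that break the invariant.
    """
    violations = []
    for (s1, e1), n1 in hit_counts.items():
        for (s2, e2), n2 in hit_counts.items():
            is_nested = s2 <= s1 and e1 <= e2 and (s1, e1) != (s2, e2)
            if is_nested and n1 > n2:
                violations.append(((s1, e1), (s2, e2)))
    return violations


# Placeholder counts (NOT from the slides), chosen only to trigger the check.
reported = {(1955, 2005): 100, (1965, 2005): 120}
print(nested_range_violations(reported))
```

A check like this needs only the hit counts a search interface reports, so it can be run against any database without access to its internals.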
23
Subject Scope. Not static; may have evolved in the past X years. Obvious subject dominance in Scopus at the journal level.
24
Much more subject dominance at the article level
25
Apparent Presence of Arts & Humanities in WoS
26
Composition
27
Composition
28
Current Science
29
There are books & conference proceedings in Scopus (but not enhanced with cited refs)
30
Completeness of records. All records: 26,731,691; with keywords: 21,706,112; with abstracts: 18,538,475; with refs: 8,442,048.
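The raw counts on this slide are easier to compare as shares of the total. The sketch below does only that arithmetic, using the four numbers from the slide. The names `TOTAL`, `FIELD_COUNTS` and `completeness` are illustrative, not part of any database interface.

```python
# Counts copied from the slide.
TOTAL = 26_731_691
FIELD_COUNTS = {
    "keywords": 21_706_112,
    "abstracts": 18_538_475,
    "cited references": 8_442_048,
}


def completeness(total, field_counts):
    """Return the percentage of all records that carry each data element."""
    return {field: 100.0 * count / total for field, count in field_counts.items()}


shares = completeness(TOTAL, FIELD_COUNTS)
for field, pct in shares.items():
    print(f"with {field}: {pct:.1f}%")
```

Roughly 81% of the records have keywords, 69% have abstracts, and only about 32% have cited references, which is the dimension that matters most for a citation-enhanced database.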
31
Completeness of records. Dialog ISI subset: total items & items with cited references
32
Scopus: number of total items & number of records with cited references
33
The Scientist case
34
The D-Lib Magazine claim
35
The D-Lib Magazine's bold claim
36
JASIS - JASIS&T case
37
The Scientist case
38
The Scientist case
39
The Scientist case
40
The Scientist case
41
JASIS - JASIS&T case
42
JASIS - JASIS&T case
43
Evaluation of Citation Enhanced Scholarly Databases, Part 2. Dr. Peter Jacso, Professor, University of Hawaii at Manoa, USA. Tokyo, November 2005.
44
Software Issues. Software capabilities can make or break a product. Cited references represent a new and unusual data element. A new challenge; few (WoS, Scopus, CSA) can do it well. For researchers, adding cited references to their paper is the bane of publishing. There is no universal standard for cited reference formats. Reference management programs support more than 700 citation style formats.
45
ProCite
46
Many chances for messing up cited references in digitization. Who can mess them up? Authors, editors, and copy editors at the publisher; data entry operators at A/I services; programmers at database aggregators; programmers when extracting data from publishers' archives; spoiled and careless programmers when doing anything.
47
Selected references
48
Notes
50
Google Scholar. Autonomous citation indexing is not perfect either. Google Scholar mightily managed to mix up many metadata elements. Is this an article published in 2006? Has it really been cited 98 times already in October 2005?
51
No, it's the page number, a Hungarian postal code, or any 4-digit number
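The failure described here is what happens when a parser treats any four-digit number as a publication year. The sketch below contrasts that naive approach with a basic plausibility filter. It is not a description of how Google Scholar's parser works; the function names, the year bounds, and the sample reference string are all assumptions made for illustration.

```python
import re

FOUR_DIGITS = re.compile(r"\b\d{4}\b")


def naive_year(text):
    """Error-prone approach: take the first 4-digit number as the year."""
    match = FOUR_DIGITS.search(text)
    return int(match.group()) if match else None


def plausible_year(text, earliest=1665, latest=2005):
    """Return the first 4-digit number inside an assumed plausible year range.

    The bounds are assumptions: 'latest' is the year the data was harvested.
    """
    for token in FOUR_DIGITS.findall(text):
        year = int(token)
        if earliest <= year <= latest:
            return year
    return None


# Invented reference string whose page range starts with 2006.
reference = "Example A. Sample title. Journal X 12: 2006-2011, 1998"
print(naive_year(reference), plausible_year(reference))
```

A range filter only rejects impossible years. A page number or postal code that happens to fall inside the plausible range would still slip through, so field-aware parsing of the reference remains necessary.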
52
Careless data entry/OCR-ing can cripple the links
53
EBSCO … or 20th century programming can serve references dead cold
55
as opposed to the native hot-linked WoS version
56
or the hot and spicy Scopus version with "cited by" (citedness) score
57
… as long as cited references have no misspellings
58
Typos cripple the impressive "cited by" feature, the best of Scopus and CSA, which can't undo the misspelling shown earlier or the one made by PsycINFO in the author name here: it is Jacso, not Jasco, thank you.
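A transposed pair of letters, as in Jasco for Jacso, defeats exact matching but is easy to catch with approximate string matching. The sketch below uses Python's standard `difflib` module. The `match_author` helper, the 0.75 cutoff, and the short author list are assumptions for illustration, not a description of how Scopus, CSA, or PsycINFO match cited references.

```python
from difflib import get_close_matches


def match_author(cited_name, known_authors, cutoff=0.75):
    """Return the known author whose surname is closest to cited_name, or None.

    cutoff is an assumed similarity threshold between 0 and 1.
    """
    lookup = {name.lower(): name for name in known_authors}
    hits = get_close_matches(cited_name.lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[hits[0]] if hits else None


authors = ["Jacso", "Moed", "Garfield"]
print(match_author("Jasco", authors))
print(match_author("Smith", authors))
```

Fuzzy matching can also merge genuinely different authors with similar names, so a candidate match of this kind is best treated as a suggestion to review rather than an automatic correction.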
59
In the cited references they are crippled in more ways than one, but we may feel warmed up by 6 "cited by" hot links to records which cite Moed's article in PsycINFO …
60
… that's why there are relatively few, as opposed to the citedness score of 45 in Scopus shown earlier …
61
… and the citedness score of 55 in WoS for the cited 1995 article by Moed HF. The citedness scores of WoS and Scopus often get close for articles published since the mid-1990s, but not for earlier ones.
62
Remember, seeing the citedness score is still a two-step process in WoS, but WoS will likely soon include the citedness score directly within the cited reference list, as CSA and Scopus do.
63
Browsing/looking up citing/cited authors & journals. Browsing of citing (source) and cited (target) author and journal names is a must. Still, only a few offer adequate browsing. Scopus offers it only for source author and source title.
64
Jacso
65
Inconsistencies and inaccuracies are rampant in source journal names, as in PASCAL
66
WoS can spell it consistently nearly 20,000 times as a source journal: quality control, order instead of disorder
67
In cited sources, all hell breaks loose
68
Without browsing and defensive searching you would miss a lot. In Dialog's version of the ISI subset, misspelled formats are not corrected. Dialog only updates (adds new records); it does not RE-LOAD (to correct old ones).
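Part of defensive searching is knowing which variant forms of a journal name exist in an index before querying it. The sketch below groups raw cited-journal strings under a normalized key so that punctuation and capitalization variants surface together. The function names and sample strings are illustrative assumptions, and the approach deliberately ignores true misspellings, which would need approximate matching on top.

```python
import re
from collections import defaultdict


def normalize_title(raw):
    """Reduce a journal string to lowercase letters and digits separated by single spaces."""
    key = raw.lower().replace("&", " and ")
    key = re.sub(r"[^a-z0-9 ]", " ", key)
    return " ".join(key.split())


def group_variants(raw_titles):
    """Map each normalized key to the set of raw forms that share it."""
    groups = defaultdict(set)
    for raw in raw_titles:
        groups[normalize_title(raw)].add(raw)
    return dict(groups)


# Illustrative variant forms, as a browsable index might list them.
index_entries = ["J. Am. Soc. Inf. Sci.", "J AM SOC INF SCI", "J Am Soc Inf Sci"]
variants = group_variants(index_entries)
for key, forms in variants.items():
    print(key, "->", sorted(forms))
```

When a host only adds new records and never reloads old ones, as the slide says of Dialog, the searcher has to OR all such variant forms together by hand.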
69
In CINAHL I have slim chances without browsing the author and cited author fields before searching. Browsing is like looking in the pool before diving to see if there is water, and how much. Be savvy & browse, browse, browse if the software allows.
70
AND WHAT BROWSE OPTIONS does Google Scholar offer? None. Zilch. Nada. Zero. Kotonashi.
71
SEARCHING. Rather limited options for cited author, cited title, cited journal. Menu-driven in WoS. SAME (sentence) option in WoS, but … no searching in cited title in WoS. Proximity and positional operators in Scopus. Mostly command-driven in Advanced Mode in Scopus. Useful but ugly prefixes in Scopus. Good menus in EBSCO and Ovid.
72
SEARCHING. No truncation when searching in REFxxx?
73
Result display and sorting. Short result list for an at-a-glance impression of sources, then sorting by citedness score!
76
Sorting & relative citedness score. CSA could sort by citedness but does not offer this feature. Google Scholar used to rank results by citedness score. No one offers an age-adjusted citedness score, even though that would be the fairest: a 10-year-old and a 2-year-old article have had different chances of receiving citations. My tests showed a big difference for some items when ranking by absolute vs. relative citedness score.
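One simple way to age-adjust a citedness score is to divide the citation count by the number of years since publication. The slide does not say which formula the author's tests used, so the citations-per-year measure below is an assumption, and the two sample papers are invented to show how the ranking can flip.

```python
def citations_per_year(citations, pub_year, as_of_year=2005):
    """Age-adjusted score: citations divided by years since publication (minimum 1)."""
    age = max(as_of_year - pub_year, 1)
    return citations / age


def rank(papers, relative=False, as_of_year=2005):
    """Return paper ids sorted by absolute or age-adjusted citedness, highest first."""
    if relative:
        def score(p):
            return citations_per_year(p["citations"], p["year"], as_of_year)
    else:
        def score(p):
            return p["citations"]
    return [p["id"] for p in sorted(papers, key=score, reverse=True)]


# Invented sample data, for illustration only.
papers = [
    {"id": "A", "year": 1995, "citations": 40},  # 10 years old: 4 per year
    {"id": "B", "year": 2003, "citations": 20},  # 2 years old: 10 per year
]
print(rank(papers), rank(papers, relative=True))
```

Paper A leads on absolute count while paper B leads on the per-year score, which is the kind of difference between absolute and relative ranking the slide reports.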
78
The many dimensions of citedness scores. Citedness scores can be highly informative in estimating the usefulness and perceived importance of a paper to peers, expressed in the form of citations (= links). Major differences arise from the domains of citing sources. In a journal publisher's archive: gathered only from the publisher's digitized journals. At aggregators/facilitators: from all databases hosted (except ... PsycINFO, for its not-so-splendid isolation policy). In self-published databases: gathered from the database itself. In Scopus: gathered from 1996 onward from >10,000++ journals. In WoS: gathered from 1900/1945/1980 forward from <10,000 journals.
79
All of the above assumes correct identification, matching & calculation. Enter Google Scholar, playing fast and loose with the numbers. Make it very fast and very loose.
80
A half-page quickie interview with the author in The Scientist cited 7,380 times?
81
You can scroll up and down in the purportedly citing Nucleic Acids Research article looking for the name Kraulis, The Scientist, and the title; you will not find them. Any of them.
83
Two articles by Kraulis, but neither is the 1993 piece in The Scientist
84
But what do you expect from software that cannot even do the most basic Boolean OR operation correctly?
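A correct Boolean OR obeys a bound that can be checked from hit counts alone: the result for A OR B can be no smaller than the larger of the two individual results and no larger than their sum. The sketch below encodes that bound. The function name and the small sets used to exercise it are invented for illustration.

```python
def or_count_is_plausible(count_a, count_b, count_a_or_b):
    """Check that a reported OR count lies between max(A, B) and A + B."""
    return max(count_a, count_b) <= count_a_or_b <= count_a + count_b


# Invented record ids, only to show the bound holding for a true set union.
hits_a = {1, 2, 3, 4}
hits_b = {3, 4, 5}
union = hits_a | hits_b

print(or_count_is_plausible(len(hits_a), len(hits_b), len(union)))
print(or_count_is_plausible(len(hits_a), len(hits_b), 2))
```

An OR search that reports fewer hits than either of its terms alone fails this test regardless of what the underlying records are.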
85
Indeed, citation data is subtle stuff and requires competence