Presentation on theme: "PubSCIENCE A Post-mortem Analysis off. PubSCIENCE Jacsó."— Presentation transcript:
PubSCIENCE A Post-mortem Analysis off
Messy overlap among DOE databases
Design and organization problems
Scattered databases with much overlap
– PubSCIENCE: only journal article records; a mix of DOE-created and publisher-submitted ones
– Information Bridge: reports only, but in full-text image format (PDF)
– ECD: journal article records, some overlapping with publisher-submitted ones; records of DOE reports, haphazardly linked; patents, etc.
– GrayLIT: reports, including Information Bridge
The design “concept”
– Discombobulating users
– Forcing users to do database hopping
– Propaganda mechanism
– Lies, damned lies, and PubSCI claims
– “Selling” the same content multiple times
– Getting extra budget for NEW product
– Should be “old” and IMPROVED
The design “concept”
– Dicing, slicing, icing [on the cake]
– Look how much we have done
– We need more money
– Big promises + untrue claims:
  – “significant expansion anticipated”
  – “more publishers”
  – “over 1,300 journals”
  – “over 2 million citations”
Repetitio est mater studiorum but duplicates are excessive
The first official words from Walter L. Warnick, Executive Director
The ribbon cutting by Secretary Richardson reference Jacsó
Excerpt from budget justification and confabulation
Excerpt from 2002 budget request
Science regurgitates wishful thinking Jacsó
For a cool $500,000 a year what could You do?
The anatomy of the component databases
Content problems
– Database growth or is it decline?
– Composition change: DOE-created vs publisher-supplied records
– Drastic cost reduction by minimizing DOE A/I activities
– Ricochet effect on the ES&T “mother” database
– Sharp decline in quality A/I records
– The fleecing of users, and paying subscribers
NISC – ES&T: the largest commercial version of the ES&T database
ECD: open access subset of the ES&T database
InfoBridge: PDF collection of DOE reports
Entire PubSCI: subset of ECD + publisher-submitted records
PubSCI-DOE & PubSCI-Partners
Content problems again
– The plummeting of records with controlled descriptors
– No abstracts in most publisher-supplied records
– Remote vs local abstracts
– Idle promises of links to abstracts
– The farce of links
– Links: the good, the bad, the ugly, the dysfunctional and the non-existent
The first threat in 2001 as reported by LJ, watch for the budget
The rally cry in July, 2002 Jacsó
The poll of information professionals
Partner and journal problems
– Some good partners, many irrelevant
– Good partners but irrelevant journals
– The best energy journals are not included
– The best energy journal publishers are not partners
– Which are the best energy journals? Journal Citation Reports, Energy & Fuel section (66 titles)
– Which are the most widely held energy journals by libraries? OCLC WorldCat, wonderful features (see review)
How many publishers? From 20 to 41
Absurd journal and publisher claims Double dipping
The Best Publishers only 2 in partnership with PubSCIENCE
The Best Publishers
Phantom data in the January 2001 PubSCIENCE flyer
– Over 1,300 searchable journals? No, citations + abstracts at best.
– Over two million citations? No, fewer than 1 million unique.
Phantom partners in the January 2001 PubSCIENCE flyer
– Over 40 partner publishers? Many publishers appear only on the flyer, not in PubSCIENCE.
Where did you say Oxford University Press was? Not among the searchable publishers, but look, Marcel Dekker is there
Who is Marcel Dekker? Oh, just the publisher of Physics & Chemistry of Carbons, the #1 source by Impact Factor in the Energy section of the latest JCR*. Two of its other journals, In Situ, and Petroleum Science & Technology are also among the top 50 Energy journals, but not among the journals for which PubSCIENCE would get records. * (partly due to the questionable IF-algorithm)
The JCR ranking by IF
The moment of truth comes when the journals by publishers need to be listed. Nice to have Marcel Dekker, but why these and not its energy-related serials?
How many journals? From to 1400 as reported by OSTI people. Strange roller-coaster, and sudden surge. See rise from Oct speech to 35 publishers and 1,250 journals. Then again, it is a drop from the 1,400 reported on August 9, 2001
Number of journals good for PR, but you had better see the list, and whether they are indeed journals. Look at ZDNet’s offerings.
So here is the list, but records in PubSCI appear only from 2 sources, AnchorDesk and Enterprise Computing, the latter not even listed here
Maybe Marcel Dekker will impress us with a wealth of relevant articles from the 3 journals. One from each
Marcel Dekker: why not link to the items?
Some journals do not really fit the DOE scope of interest; no wonder that there were no records from these journals in the pre Archive section. Dumping into PubSCIENCE “whateva” they can to boost the database size
The relevance of circumcision for DOE is not immediately obvious, but maybe the 20+ other articles arguing for and against circumcision will illuminate us. And look, there is a good-looking link
The link at least works, though what for? Then again, some DOE libraries may indeed subscribe to urology journals and are entitled to the PDF. No abstract. No subject headings.
The PubMed record for the same article serves up at least some useful things
How many records? What the press release claims April 18, 2000
What did Mr. Warnick tell PITAC in September 2000?
Misleading not only you and me but also a presidential committee. Over 2 million articles and 1,400 journals?
August 2000 Energy Science News: big catch. That looks like 2.8 million, wow
OSTI enlightened users, or maybe bamboozled them with government talk. What is is? And how is ALL not all, and how is 10 years more like 13? See the next slide
May I explain?
– ALL means items from (roughly) 1990 onward
– Archive means (mostly) pre-1990
– In the pull-down menu, criteria of source and time are mixed
– The full-text limit restricts it to DOE Partners’ records
– Partners’ records are only in ALL (i.e. the current domain)
– When you search by publisher name, it is across time boundaries …
– … unless you use the Date range option, i.e.
The “ateis” test: a* OR t* OR e* OR i* OR s* in Entire Citation
Archive size query
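The idea behind the “ateis” test can be sketched in a few lines of Python. This is only an illustration, not how PubSCIENCE evaluated queries: the function names are hypothetical, the sample strings are made up, and it assumes a trailing * matches any word that starts with the given letter. Because almost every citation contains at least one word beginning with a, t, e, i or s, the hit count of the OR query approximates the size of the subset being searched.

```python
def build_wildcard_query(letters: str) -> str:
    """Join single-letter truncation terms with OR, e.g. 'a* OR t* ...'."""
    return " OR ".join(f"{letter}*" for letter in letters)


def record_matches(text: str, letters: str) -> bool:
    """True if any word in the text starts with one of the letters.

    Assumption: the search engine treats 'a*' as word-initial truncation.
    """
    prefixes = tuple(letters.lower())
    return any(word.startswith(prefixes) for word in text.lower().split())


query = build_wildcard_query("ateis")
print(query)

# Made-up citation snippets, for illustration only.
print(record_matches("Energy storage in salt caverns", "ateis"))
print(record_matches("Nuclear fuel cycle", "ateis"))
```

A short citation with no matching word slips through, so the count is a close lower bound rather than an exact size.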
Archive size result
ALL subset size
ALL-LINKED subset size (query confirmation omits FTL limit parameter, but trust me, I used the checkbox)
DOE subset size
Here is the skinny as of 09/28/02
Archive 563,505
ALL 763,944
Together 1,327,449
Of this:
DOE 958,699
Partners 368,750 (with links, ahem)
That’s gross (in both senses of the word)
Watch for the duplicates, triplicates, quadruplicates
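The tallies can be cross-checked with a few lines of Python. The four component counts are taken from the slide; the variable names are only illustrative. The breakdown by time (Archive + ALL) and the breakdown by source (DOE + Partners) should add up to the same gross figure, and both come to 1,327,449.

```python
# Component counts as reported on the slide (snapshot of 09/28/02).
archive = 563_505      # (mostly) pre-1990 records
all_current = 763_944  # "ALL": roughly 1990 onward
doe = 958_699          # DOE-created records
partners = 368_750     # publisher (partner) supplied records

by_time = archive + all_current
by_source = doe + partners

print(f"Archive + ALL  = {by_time:,}")
print(f"DOE + Partners = {by_source:,}")
print(f"Breakdowns agree: {by_time == by_source}")
print(f"Partner share of gross total: {partners / by_source:.1%}")
```

These are gross counts; duplicates, triplicates and quadruplicates are still included.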
The real picture from yours truly
Keystone cops at work
– Duplicates, triplicates & quadruplicates
– Reloading same records time and again
– An indicator of the care and competency of PubSCIENCE staff
And there is an enormous volume of duplicates and triplicates in PubSCIENCE. There are far fewer duplicates in the much larger, richer, smarter Energy Citations Database, which is also free. True, no links.
Even nicer triplets from PNAS 1996 issues alone (a little more difficult to spot), but the color gizmos guide your eyes
And a quadruplet: 4 copies of the same record?
Protein – Quadruple Results (record #1) Remember this unique identifier
Protein – Quadruple Results (record #2) Same ID as #1
Protein – Quadruple Results (record #3) Volume 15, issue 1: this is the same as in #4. Minor descriptor. Broader descriptor. * means major descriptors
Protein – Quadruple Results (record #4) Same error Same ID as in #3
Protein – Quadruple Results – not in ES&T at BiblioLine No duplicate
Protein – Quadruple Results – not in ES&T at Dialog No duplicates in Dialog version
Protein – full record
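A duplicate check of the kind these slides do by eye can be sketched in Python. This is a minimal sketch under stated assumptions, not the procedure used for the slides: the field names (id, title, journal, year, first_page) and the sample records are invented for illustration. Since the four copies carry two different identifiers, grouping by identifier alone would find only pairs, so the sketch groups on a normalized bibliographic key instead.

```python
import re
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation/whitespace so near-identical strings compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def find_duplicates(records):
    """Group record ids by a normalized bibliographic key; keep groups with more than one record."""
    groups = defaultdict(list)
    for rec in records:
        key = (normalize(rec["title"]), normalize(rec["journal"]),
               rec["year"], rec["first_page"])
        groups[key].append(rec["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Invented records: one article loaded four times under two identifiers.
records = [
    {"id": "A1", "title": "An Example Protein Study", "journal": "Journal X", "year": 1996, "first_page": 101},
    {"id": "A1", "title": "An example protein study.", "journal": "Journal X", "year": 1996, "first_page": 101},
    {"id": "B7", "title": "An Example Protein Study", "journal": "JOURNAL X", "year": 1996, "first_page": 101},
    {"id": "B7", "title": "An example protein study", "journal": "Journal X", "year": 1996, "first_page": 101},
    {"id": "C3", "title": "Another Article", "journal": "Journal Y", "year": 1996, "first_page": 55},
]

duplicates = find_duplicates(records)
for key, ids in duplicates.items():
    print(f"{len(ids)} copies of {key[0]!r}: ids {ids}")
```

A key built from title, journal, year and first page tolerates case and punctuation differences but not wrong volume or page data, so records sharing a keying error would still need a manual look.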
And now about those hyperlinks and cross-searchable claims: your dreams coming true, or are they? Where is that link or abstract or full text?
Take this record about functionalized xenon from PNAS
Look ma’ no link, no abstract
Beefier in-house record from ECD, bumped from PubSCIENCE in favor of publisher’s contribution
Another paltry PubSCI record. That’s what you search in “cross-searching”
This is what PubSCIENCE should have linked to
Look at the options in PNAS. Salivate.
PNAS even has modest indexing and begs to be linked to ITEM- level
Link fails. Of course, the digital edition is available only from 1998
Let’s go to the home page of PNAS
Search in HWP; note that the default is OR between words, so use quotation marks (“ ”)
The item at the publisher’s site. PNAS full documents are free from 1996 (after 6-month moratorium)
Here it comes in full glory look at the DOI, and all those options
Abstract then full text with jumpers to sections within article
… and the references from within the articles are also hotlinked to several A/I records, and even to free full text
This article from Science is free for anyone, anywhere, others may be free for subscribers of print who can be recognized via AUTOMATICALLY appended id (cookie pushing).
Highly marked-up text with enlargeable color images, tables and charts
and with hotlinked cross references to articles which are cited (not shown here) AND ones that cite this article (proudly shown here).
Are we in heaven yet?
UH Manoa does have access to the digital edition from 2001
So if you do this on campus or through a proxy server then….
The smart host software will recognize you as UH affiliate and present the full enchilada
and also the Supplementary materials available only in digital format
Conclusion
So, tell me again, are we in heaven yet?
– Not with PubSCIENCE, but soon with others, at least partially
– It depends on whom you are affiliated with, what disciplines you are in, and which services you are using.
Is this touting reverse lobbying? What about DOE’s own EnergyCitations database? Guess why Infotrieve is recommended?
And recommended again prominently
If you go to Infotrieve you will find a nice MEDLINE record, a less nice shipping charge, and an enigmatic statement about royalty. Guys, there is no royalty for PNAS. Period. Has Infotrieve overlooked something while piggy-backing on PubMed?
Maybe on the OSTI About page Energy Citations DB is mentioned. Keep hoping.
PubSCIENCE compared to PubMed, greedily. Poor PubMed had only…
… about 600 journal titles in 2000 – REALLY?
Warnick, Quayle, Bentsen
I knew PubMed. PubSCI, you are no PubMed. And Infotrieve, I want to have a word with you. How could you leave behind the links from the imported PubMed record to the free versions?
This is what you should have protested. Where did the $500K go?
Go and use the much larger, much more content-rich DOE alternatives (which do not brag about links)
Greener pastures
– Go and use also the publishers’ sites to find truly energy-related items
– Go to HighWire Press, Ingenta & CatchWord
– Go to PubMed if you need items about, say, (ne)urology
– Go to Northern Light Special Collection for abstracts
– Go to Scirus (yes, I say so) for abstracts and occasional freebies
– Go to FindArticles for full (but plain) text
– Go to my site for a polysearch utility of the above