Breaking and remaking peer review with the SPIRES databases: Our Experience. Travis Brooks, SPIRES Scientific Databases Manager (Trieste, May 2003)

1 Breaking and remaking peer review with the SPIRES databases: Our Experience
Travis Brooks, SPIRES Scientific Databases Manager, Stanford Linear Accelerator Center
Pat Kreitz, Director, Technical Information Services, Stanford Linear Accelerator Center
Trieste, May 2003
Thanks to Ann Redfield, Michael Peskin, Louise Addis, Heath O’Connell, and Georgia Row for useful input.

2 Topics
Part I
  – History and current situation of SPIRES, arXiv, and journals
Part II
  – Citation counting: our experiences and views
Part III
  – Speculation for the future

3 Part I: Some history, some current data, and some guesses

4 What is SPIRES?
Bibliographic records for over half a million papers
  – Entire literature of High-Energy Physics (HEP)
  – Many papers from related fields
Citations for e-prints and journal articles
Over 25,000 searches a day
Main site and personnel at SLAC
  – DESY, FNAL, Durham U., Kyoto U., IHEP (Moscow)

5 arXiv
Since 1991:
  – Makes full text available for download
  – Links to SPIRES citation lists
  – Allows revisions
  – Divides content into hep-th, hep-ph, hep-ex, and many other categories

6 hep-th vs. hep-ex
Sharp distinction between theory and experiment
  – Different from other disciplines
Difference between the publishing cultures of the HEP theorist and the HEP experimentalist

7 th vs. ex Publishing
Experiment:
  – Large collaborations (>500 authors)
  – Difficult to referee
  – Reporting results
Theory (my focus):
  – Small collaborations (<10 authors)
  – Self-contained papers
  – Conversational
  – hep-th and hep-ph similar

8 hep-th (Pr)eprints: A Timeline
Mid-1960s: preprints sent by authors to select groups
1969: SLAC library began the ppf (Preprints in Particles and Fields) list
  – Created demand for distribution
  – Legitimized preprints/preprint libraries
  – Led to anti-ppf list

9 hep-th (Pr)eprints: A Timeline
1974: SPIRES-HEP database indexed preprints
  – Allowed more general, worldwide distribution and retrieval of preprint titles
  – The papers themselves still had to arrive by mail
  – Preprints used conversationally
  – On WWW in 1991

10 hep-th (Pr)eprints: A Timeline
1991: arXiv.org allowed immediate and universal electronic access to the full text of preprints
  – Preprints became eprints
  – Demise of all HEP journals predicted

11 Preprints not new…
arXiv is a logical extension of the movement towards preprints, not a “bolt from the blue”
  – Preprints have a long history of use
  – Preprints are more easily distributed today

12 History of hep-th arXiv
arXiv is busy
  – Over 90% of papers published in Phys. Rev. D after 1995 were submitted to arXiv
But authors still publish!
  – 75% of hep-th papers (prior to 2002) have been published

13 When are eprints published?
Difference between Phys. Rev. D publication time and eprint appearance time
  – 6,000 articles from June 1997-2003
  – Mode at 5 months
  – 17 negative times not shown
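
The lag statistic on this slide can be sketched roughly as follows. This is a minimal illustration, not the analysis behind the talk: the date pairs are invented, and `lag_in_months` is a hypothetical helper that counts whole calendar months between eprint appearance and journal publication.

```python
from collections import Counter
from datetime import date


def lag_in_months(eprint: date, published: date) -> int:
    """Whole calendar months from eprint appearance to journal publication."""
    return (published.year - eprint.year) * 12 + (published.month - eprint.month)


# Hypothetical (eprint date, journal date) pairs, invented for illustration.
sample = [
    (date(1998, 1, 12), date(1998, 6, 15)),
    (date(1999, 3, 2), date(1999, 8, 1)),
    (date(2000, 5, 20), date(2000, 9, 15)),
    (date(2001, 7, 9), date(2001, 12, 15)),
    (date(2002, 2, 11), date(2002, 1, 15)),  # eprint posted after publication
]

lags = [lag_in_months(eprint, published) for eprint, published in sample]

# Histogram of non-negative lags; negative lags are counted separately,
# mirroring the slide's "negative times not shown".
histogram = Counter(lag for lag in lags if lag >= 0)
mode_lag = histogram.most_common(1)[0][0]
negative_count = sum(1 for lag in lags if lag < 0)

print(f"mode lag: {mode_lag} months, negative lags: {negative_count}")
```

A negative lag in this sketch simply means the eprint date is later than the journal date, which is one possible reading of the "negative times" on the slide.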

14 When are they published?
What caused the negative times?
Are the large delays from “testing the waters”?
Do researchers wait for peer review to determine if an article is worth reading?

15 When are papers read?
Q: When does most citing occur?
A: Plot the citations a published hep-th article receives after its arXiv submission
  – 8,000 published papers in sample
  – Includes citations from journal papers and arXiv papers (essentially the same set)

16 Eprints, not journals
Journal lag time: 5 months
Citation peak occurs after eprint release, not journal release
Inference: HEP theorists don’t wait for the journal.
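
One way to picture the comparison is to bin citations by the number of months since the eprint appeared and see where the peak falls relative to the journal lag. The sketch below assumes invented citation ages and hypothetical helper names (`citation_profile`, `peak_month`); it does not reproduce the plot from the talk.

```python
from collections import Counter

JOURNAL_LAG_MONTHS = 5  # typical journal lag quoted on the slide


def citation_profile(months_after_eprint):
    """Count citations per month elapsed since the eprint appeared."""
    return Counter(months_after_eprint)


def peak_month(profile):
    """Month with the most citations; ties go to the earliest month."""
    return min(profile, key=lambda month: (-profile[month], month))


# Hypothetical citation ages in months, invented for illustration.
ages = [1, 2, 2, 3, 3, 3, 4, 4, 6, 8, 12]

profile = citation_profile(ages)
peak = peak_month(profile)
cited_before_journal = sum(
    count for month, count in profile.items() if month < JOURNAL_LAG_MONTHS
)

print(f"peak at month {peak}; {cited_before_journal} of {len(ages)} "
      f"citations arrive before the typical journal date")
```

With real data, a peak well before `JOURNAL_LAG_MONTHS` would support the slide's inference that readers respond to the eprint rather than to the journal version.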

17 Current hep-th situation
Researchers read the arXiv to find out the latest scientific information
They base their work on what is in the arXiv
Scientific priority is given by arXiv time stamp, not journal submission date
They barely notice whether a paper is published

18 HEP theorist’s viewpoint
arXiv is for immediate communication
  – A running scientific conversation
Overheard about a paper not sent to hep-ph: “He didn’t publish it, he just sent it to Phys. Rev. D”

19 Journals Irrelevant?
75% of hep-th papers (prior to 2002) have been published
Correlation between large cite counts and publication
Journals are still very much alive

20 Why do authors publish? (4 guesses)
1-Inertia
  – There is no other system as developed or as trusted
  – Journals are ingrained in researchers’ psyches
  – But journals don’t appear to be going away (quickly)

21 Why do authors publish?
2-Feedback
  – Refereeing is useful for this paper and the next
  – The paper is already on arXiv while it is being refereed
  – But arXiv submissions generate comments and revisions as well

22 Why do authors publish?
3-Professional Advancement
  – Do tenured/secure faculty publish fewer of their eprints?
      Anecdotally: Witten has seven papers with 50+ cites that exist as eprints only
      In general: interesting question to think about…
  – If professional advancement is the sole purpose of peer review, could we not do better?
      Are we using the peer review process as a substitute for performance evaluation?

23 Why do authors publish?
4-Archival value
  – Do authors believe that arXiv is a good archive?
  – Will arXiv-only eprints still be around (readable, accessible) in 100 years?
      Perception, not reality, matters here
      E-only journals appear no different
      Centralization, not media, should be the concern

24 Part II: Cite counts and the future

25 Cite Counting
Cite counts present a data-driven picture of the hep-th eprint culture
Much work already (by many here today)
  – Cites to HEP eprints from journal articles are high and rising (Brown 2001, Youngen 1998, others)
  – arXiv impact factor is similar to journals (Fabbrichesi and Montolli, 2001)
  – Many other studies (often using SPIRES-HEP data)

26 Cite Counting
Cite counting for bibliometric purposes seems reasonable (perhaps)
Cite counting for peer review purposes?
  – Services like SPIRES (free) and ISI (fee) make cite counts available to other researchers, hiring committees, and tenure review boards.

27 Cite Counts = Peer Review?
Are citations the electronic answer to refereed journals?
Currently the only answer
  – Only one widely available
But not a very good answer
  – arXiv + SPIRES cite counts are not Phys. Rev. Lett.

28 Cites: Pros and Cons
SPIRES has been making citations available for over 25 years
  – We have noticed a few things about the process
      Some good
      Some bad
      Some merely interesting

29 Advantage-Dynamic
Cite counts change with the field
  – Classics
  – New papers
  – Newly discovered classics
Ex: Weinberg’s Standard Model paper
  – Few cites initially
  – Over 5,000 now
Ex: M. Peskin’s topcite reviews

30 Advantage-Fast
Cite counts begin immediately after appearance
Electronic publishing means peer review is the lag time
Lag time makes journals archivists rather than communicators
  – Led to the replacement of this function by arXiv/SPIRES/etc.

31 Advantage-Easy
SPIRES tracks citations with 4 staff members
  – Total staff is about 8
  – We are not that technically sophisticated
  – We are not even especially clever!
  – Still, it is non-trivial

32 Disadvantage-Accuracy
Speed and ease rely on electronic processing
  – Accuracy or speed?
Reference lists in a paper change over an article’s life
  – What counts as a cite?
  – Which version of the paper?

33 Disadvantage-Relevance
Theory: Citations are a measure of what scientists read
But does citing = reading?
  – Simkin & Roychowdhury (cond-mat/0212043 and cond-mat/0305150)
  – Students, general public

34 Disadvantage-Relevance
Theory: Cites are a mark of quality
What about brilliant papers out of the mainstream?
Are papers really even referenced for scientific reasons?
  – Or are they referenced for sociological reasons?
  – Or are references simply copied?

35 Disadvantage-Relevance
Tongue-in-cheek reasons for not citing prior work (humorous, but not far off…)
  – “If it’s old, foreign, or old and foreign”
  – “They don’t cite us either”
  – “Rain forest preservation through paper-saving”
  – “I figured if you’re smart enough to read this paper, you already knew that!”
(from The Scientist)

36 Interesting-Importance
People take it seriously
Funding, careers, reputations, etc. are perceived to depend in some way on SPIRES citation data

37 Interesting-Importance
We receive ~50 emails a day, most of them about incorrect, incomplete, or missing references
  – Usually from an author whose paper was cited but missed
  – Often marked “URGENT”
  – Occasionally with panicked explanations including the date that the review committee is meeting
  – Sometimes accusing SPIRES of sabotage, or otherwise expressing outrage at a missed citation

38 Importance is helpful…
Importance shows that cite counting is useful (or at least used!)
Users of the information are motivated to help maintain it
  – SPIRES is almost open source
  – We help eliminate authors’ typos, they help eliminate our errors

39 …helpful…
SPIRES can replace bad cites with the correct ones
  – Corrects our errors
  – Corrects author errors
  – Even helps limit propagation of errors
      Ex: a Witten article with 1,300 cites had 100 incorrect cites, all the same typo
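
A repeated typo of this kind can, in principle, be caught by matching unrecognized reference strings against known records. The sketch below is only an illustration under stated assumptions, not a description of how SPIRES works: the reference strings are invented, `resolve` is a hypothetical helper, and the fuzzy matching uses `difflib.get_close_matches` from the Python standard library with an arbitrarily chosen cutoff.

```python
from collections import Counter
from difflib import get_close_matches

# Hypothetical reference keys known to the database (invented for illustration).
known = ["J.Example.Phys.,B12,345", "J.Example.Phys.,B98,765"]

# Hypothetical reference strings extracted from citing papers:
# three correct, two sharing the same typo, one unrelated.
cited = (
    ["J.Example.Phys.,B12,345"] * 3
    + ["J.Example.Phys.,B12,354"] * 2
    + ["Other.J.,1,1"]
)


def resolve(ref, known_refs, cutoff=0.9):
    """Return the known reference that `ref` most likely means, or None."""
    if ref in known_refs:
        return ref
    matches = get_close_matches(ref, known_refs, n=1, cutoff=cutoff)
    return matches[0] if matches else None


resolved = [resolve(ref, known) for ref in cited]
counts = Counter(ref for ref in resolved if ref is not None)
unresolved = [ref for ref, hit in zip(cited, resolved) if hit is None]

print(counts)
print("needs human review:", unresolved)
```

In practice a fuzzy match would presumably be flagged for human confirmation rather than applied automatically, since two genuinely different references can differ by a single digit.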

40 …but also worrisome
Responsibility lies with the maintainers of the citation counts
  – Previously in the hands of referees and editors
Self-citation
  – Boosts counts artificially
Deception
  – We have had it happen
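
To see how self-citation could be separated out of a count, consider the sketch below. It assumes a deliberately simplistic definition (a citing paper is a self-citation if it shares at least one author name with the cited paper); the records and the `split_citations` helper are hypothetical.

```python
def split_citations(cited_authors, citing_papers):
    """Separate citing papers into self-citations and independent citations.

    Assumed definition: a citing paper is a self-citation if it shares at
    least one author name with the cited paper.
    """
    cited = set(cited_authors)
    self_cites, independent = [], []
    for paper in citing_papers:
        if cited & set(paper["authors"]):
            self_cites.append(paper["id"])
        else:
            independent.append(paper["id"])
    return self_cites, independent


# Hypothetical citing papers, invented for illustration.
citing = [
    {"id": "paper-1", "authors": ["A. Author", "B. Writer"]},
    {"id": "paper-2", "authors": ["C. Reader"]},
    {"id": "paper-3", "authors": ["D. Critic", "A. Author"]},
]

self_cites, independent = split_citations(["A. Author"], citing)
print(f"{len(self_cites)} self-citations, {len(independent)} independent")
```

Matching on name strings is fragile (name variants, common surnames), so a real system would need some form of author disambiguation before trusting such a split.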

41 Citation Counts: Summary
We do it, and it works
  – Fast, easy, and fluid
  – Valued by the community
It is more than imperfect
  – Relevance and accuracy
  – Does not yet replace traditional peer review

42 Part III: What would it take to truly change peer review?

43 To change peer review
Stakeholders in the peer review system
  – Editors
  – Referees
  – Authors
  – Readers
Fundamental differences between disciplines
  – hep-th and hep-ex are different in their adoption of eprints

44 To change peer review
Functions of peer review when divorced from communication
One must replace (or discard) all of these
  – Metrics for papers
  – Metrics for scientists
  – Metrics for truth?

45 Peer review = “good science”?
Peer review gives a seal of approval
  – Laypeople
      Medicine, Environmental Science, etc.
Refereeing process is filled with examples of weakness
  – Yet it feels fundamentally sound
Publishers have taken this role of “vetting” science

46 Truth is more complex
Community acceptance determines scientific truth
  – “Yesterday’s sensation, today’s calibration”
The “test of time” is longer than the 6-month lag time for journal articles
Immediacy is needed for communication and conversation
But deliberation is needed for context and community judgment

47 An Opportunity
Place an article in the context of the surrounding work
  – Reference linking is only a baby step
  – Degree to which a finding has been verified or contradicted by earlier or later work
Ex: M. Peskin’s Topcites reviews at SLAC
  – The numbers are amusing
  – Context is the real value

48 Context
Another example: Particle Data Group
  – Reports data from all HEP experiments
  – Sorts and combines data
  – References to comments on validity
  – References to interpretations of the data

49 PDG Example

50 Opportunities
Intense scrutiny not possible for journals
  – Context is important
Amazon and Google: personalized and dynamic
  – Citebase
  – Torii

51 A New System
Any new system would need to do (at least) the following
  – React to changes in the scientific world
      “You cannot read the same paper twice”
  – Provide context as well as content
  – Be fast and easy enough to keep up with scientific conversations taking place on arXiv(es)
  – Provide an imprimatur of quality both for the cognoscenti and the amateurs

52 Summary
SLAC-SPIRES and arXiv helped transform the hep-th publishing environment
  – Journals play no role in communication
  – Journals are still widely used
Citation counting played a part in this transition
  – Counting is not a complete solution to peer review
New models of peer review are farther away
  – Should be richer than any current example

53 Why HEP Theory?
No proprietary/patent issues
Papers can be verified by hand, by any knowledgeable reader
Work is like a continuing dialog, each paper sparking new, creative ideas

54 Same basic style
Note that the basic publication style has not really changed
  – HEP Theory has not moved away from papers written by a few authors to more complex technology-enabled collaborations

55 Other Fields
HEP experiment has had more radical changes in working style
  – World’s largest database (>600 TB)
  – Worldwide data processing grid
  – Close to 1000 authors on a paper
  – Technology used to push pre-paper scientific collaboration to new levels
Other fields might retain traditional journal roles while using unpublished research as additions and expansions rather than substitutes

56 Conclusions
HEP theorists have universally adopted eprints as the means of intra-field communication
Peer-reviewed journals are still heavily used, but for different purposes
The needs of HEP theorists were very close to the traditional publication model
