Sometimes, I just want to count things Peter Millington SHERPA Technical Development Officer CRC, University of Nottingham.


1 Sometimes, I just want to count things Peter Millington SHERPA Technical Development Officer CRC, University of Nottingham

2 Sometimes, I just want to count things
Actually, that's a lie
How difficult can it be?
It should be as easy as
1. One – Simplicity
2. Two – High performance
3. Three – Acceptable limits
Actions speak louder than words
Some datasets to play with

3 Actually, that's a lie
Just give me numbers for OpenDOAR
– No. of items in ~1,800 repositories
– Growth rates
– Number of full texts v metadata-only records
More generally (any database or resource)
– No. of records in the database
– No. of records by year, month, etc.
– No. of records by category

4 How difficult can it be?
Screen scraping? – Uh-uh-uh
OAI-PMH – counting identifiers
– BIG files – e.g. DSpace – Time out!
– Iterative chunks – e.g. EPrints – Yawn
– completeListSize attribute – If only…
ORE is no better – Whatever…
select count(*) from TABLE; – Duh!
So back to screen scraping – Sigh
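The "If only…" on this slide can be sketched as follows. OAI-PMH lets a repository put an optional completeListSize attribute on the resumptionToken element; when it is present, a single ListIdentifiers response is enough to get a count. The sample response and the helper name `complete_list_size` below are illustrative assumptions, not output from a real repository.

```python
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Hypothetical, heavily trimmed ListIdentifiers response (illustration only).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header><identifier>oai:example.org:1</identifier></header>
    <header><identifier>oai:example.org:2</identifier></header>
    <resumptionToken completeListSize="250" cursor="0">token-1</resumptionToken>
  </ListIdentifiers>
</OAI-PMH>
""".strip()


def complete_list_size(xml_text):
    """Return completeListSize as an int, or None if the repository omits it."""
    root = ET.fromstring(xml_text)
    token = root.find(f".//{OAI_NS}resumptionToken")
    if token is None:
        return None
    size = token.get("completeListSize")
    return int(size) if size is not None else None


size = complete_list_size(SAMPLE_RESPONSE)
```

Because the attribute is optional, a caller has to treat `None` as "fall back to iterating over the chunks", which is the slow path the slide complains about.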

5

6 It should be as easy as …one… Simplicity
Single SQL SELECT statement
– Anything more is too complex and so too slow
Single Call/File
– No iteration
Single simple schema
– XML (+ optional JSON, and other renditions)
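As a rough illustration of the "single SQL SELECT statement" rule, the sketch below counts records by year with one query and no iteration. The `records` table, its `date_added` column, and the sample rows are hypothetical; the slides do not describe any actual schema.

```python
import sqlite3

# Hypothetical schema: one row per record, with the date it was added.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, date_added TEXT)")
conn.executemany(
    "INSERT INTO records (date_added) VALUES (?)",
    [("2008-03-01",), ("2008-11-15",), ("2009-01-20",)],
)


def count_by_year(connection):
    """Run one SELECT that returns (year, count) pairs, ordered by year."""
    cursor = connection.execute(
        "SELECT strftime('%Y', date_added) AS year, COUNT(*) "
        "FROM records GROUP BY year ORDER BY year"
    )
    return cursor.fetchall()


counts = count_by_year(conn)
```

The grouping and counting happen entirely in the database, so the calling code only has to render the rows it gets back.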

7 …two… Target Performance - Rules of Two
<= 0.2 seconds – SQL execution
<= 2 seconds – Rendering the output file
<= 20 – Data points

8 …three Maximum limits - Rules of Twenty (?)
<= 2 seconds – SQL execution
<= 20 seconds – Rendering the output file
<= 200 – Data points
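The targets and maximum limits from the last two slides could be checked mechanically. The following is only a sketch: the `Budget` class, the `classify` function, and the labels it returns are hypothetical names, while the thresholds are the ones given on the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    sql_seconds: float      # time allowed for SQL execution
    render_seconds: float   # time allowed for rendering the output file
    data_points: int        # number of data points allowed in one response


TARGET = Budget(sql_seconds=0.2, render_seconds=2.0, data_points=20)
MAXIMUM = Budget(sql_seconds=2.0, render_seconds=20.0, data_points=200)


def within(budget, sql_seconds, render_seconds, data_points):
    """True if all three measurements fit inside the given budget."""
    return (
        sql_seconds <= budget.sql_seconds
        and render_seconds <= budget.render_seconds
        and data_points <= budget.data_points
    )


def classify(sql_seconds, render_seconds, data_points):
    """Label a request as meeting the target, merely acceptable, or over limit."""
    if within(TARGET, sql_seconds, render_seconds, data_points):
        return "target"
    if within(MAXIMUM, sql_seconds, render_seconds, data_points):
        return "acceptable"
    return "over limit"
```

For example, a query that takes 0.5 seconds of SQL time misses the target but still falls inside the maximum limits, so `classify(0.5, 1.5, 12)` returns `"acceptable"`.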

9 Actions speak louder than words
Protocol for Statistical Harvesting (PSH)
– Base URL + verb + optional arguments
Specification & Examples
– http://opendoar.org/demos/psh_prototype.php
Example Base URL:
– http://opendoar.org/demos/psh.php

10 Simplest case - [base url]?verb=Count T00:05:26Z 1860

11 Optional Count Arguments
&countType – units for counts
– e.g. records, repositories, groups, genera, etc.
&setType – some sort of category
– e.g. subject, region, social class, etc.
&dateUnit
– e.g. decade, year, month
&dateType
– e.g. date added, updated, performed, extinct, etc.
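A request in the "base URL + verb + optional arguments" pattern could be assembled as below. The argument names (countType, setType, dateUnit, dateType) and the example base URL come from the slides; the helper `count_url` is hypothetical, and the exact value strings a given service accepts (for instance `added` for dateType) are assumptions, since the slides only describe them loosely.

```python
from urllib.parse import urlencode

# Example base URL quoted in the slides.
BASE_URL = "http://opendoar.org/demos/psh.php"


def count_url(base_url, **arguments):
    """Build a Count request URL, skipping any argument left as None."""
    params = {"verb": "Count"}
    params.update({name: value for name, value in arguments.items() if value is not None})
    return f"{base_url}?{urlencode(params)}"


simplest = count_url(BASE_URL)
# Assumed values: "year" appears on the slide; "added" is a guess at the accepted form.
by_year = count_url(BASE_URL, dateUnit="year", dateType="added")
```

Valid values for each argument would normally be discovered from the service itself rather than hard-coded.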

12 Breakdown by year added T00:36:24Z

13 Other verbs
Verbs for listing available argument values
– ListSetTypes
– ListDateUnits
– ListDateTypes
– ListCountTypes
Help – Technical help
Identify – Information about the resource

14 Some datasets to play with
OpenDOAR – open access repositories
– http://opendoar.org/demos/psh.php
SHERPA/RoMEO – Publishers' policies
– http://www.sherpa.ac.uk/romeo/psh.php
Folk Play Scripts database
– http://mastermummers.org/scripts/psh.php
Folk Play Groups & Events
– http://mastermummers.org/groups/psh.php

15 How could this be improved?

