Tema 3 INEbase history Statistical books 1858-1997 available on the web Celia Santos


1 Tema 3 INEbase history Statistical books 1858-1997 available on the web Celia Santos celsanto@ine.es

2 INEbase history: background. 1996: the INE joins the Internet. 2000: INEbase is born, with all statistical production offered on the Internet. 2004: what shall we do with past information only available in printed format? Target: opening up to the public the historical collection of INE publications only available on paper.

3 INEbase history: a new section of INEbase. Different alternatives: tables in PC-Axis format; complete PDF versions of the books.

4 Project phases. Phase 1, 2nd half of 2004: What should be published? The most symbolic and representative volumes of public statistical activity: Statistical Yearbooks (1858-1997) and Population Censuses (1900-1970). Outsource the scanning (more than 100,000 pages). Outsource the software development. Phase 2, 1st half of 2005: Cataloguing starts. Software improvements suggested by use. 20 publications catalogued before publishing.

5 Project phases. Phase 3, July 2005: Internet launch takes place with 20 Yearbooks and 1 Census. Phase 4, October 2006: cataloguing and web publication of 78 Yearbooks and 9 Censuses (34 volumes).

6 Project phases. Phase 5, under development, 2007: incorporation of new publications in 2007. 1st six months: scan the Agrarian Census and VS statistics; programme adaptation. 2nd six months: cataloguing and publication.

7 The technical process in 3 steps. 1. Scanning and OCR. Scanning from the originals: unbinding (old, non-unique copies); guillotining (repeated, unimportant copies); microfiche (rare, old copies). TIFF files are obtained. An OCR programme generates txt files, which are used by the search engine. Once the PDF file is obtained, it is ready to be catalogued.

8 2. Cataloguing books into the system: the "cataloguer" role. 1st step: create an index with categories until we get to the final node, the statistical tables. 2nd step: associate one or more PDF documents with each node.
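The two cataloguing steps can be pictured as building a tree and then attaching files to its leaves. The following is only a minimal sketch under stated assumptions: the `Node` class, the `tables` helper, the chapter and table titles, and the PDF file names are all hypothetical and are not taken from the INE software.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Node:
    """One entry in the index: a book, a chapter, or a final table node."""
    title: str
    children: list[Node] = field(default_factory=list)
    pdfs: list[str] = field(default_factory=list)  # hypothetical file names

    def add(self, title: str) -> Node:
        """Step 1: extend the index with a child category or table."""
        child = Node(title)
        self.children.append(child)
        return child


def tables(node: Node, path: tuple[str, ...] = ()) -> Iterator[tuple[tuple[str, ...], list[str]]]:
    """Yield (path, pdfs) for every final node, i.e. every node without children."""
    here = path + (node.title,)
    if not node.children:
        yield here, node.pdfs
        return
    for child in node.children:
        yield from tables(child, here)


# Step 1: create the index down to the final node (titles are illustrative).
book = Node("Statistical Yearbook (example)")
chapter = book.add("Population")
table = chapter.add("Population by province")

# Step 2: associate one or more PDF documents with the final node.
table.pdfs.extend(["yearbook_example_p012.pdf", "yearbook_example_p013.pdf"])

catalogue = list(tables(book))
```

Walking the tree with `tables(book)` gives the full path to each table together with its PDF documents, which is roughly the information a reader needs to go from the index to the scanned pages.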

9 How is cataloguing done? A practical example: creation of a virtual book, Statistical Yearbook 2010. Node blocked.

10 Creation of the publication index: creating as many chapters as needed.

11 Creation of the tables and association with the corresponding PDF document.

12 Recreating the hierarchical tree. All the publication's documents appear associated with their corresponding table. Nodes unblocked. The cataloguer's work ends here.

13 3. Revision before publishing. Cataloguing should be revised before being published. Who revises? There is a specific role, the "proof-reader", but this role has not really been used; in reality another cataloguer does the revision. Once the proof-reading is finished, the book is ready for publication. The proof-reader's work ends here.

14 4. Publisher. Main task: to publish books; other tasks: user and transmission control, node translation. Node states shown: blocked node, published node, unblocked node. The book is ready to be shown on the Internet, and the translation process begins.
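The slides mention three node states (blocked, unblocked, published) and three roles (cataloguer, proof-reader, publisher), but they do not spell out the exact rules. The sketch below is therefore an assumption for illustration only: the `NodeState` enum, the `ALLOWED` table, and the `transition` function are hypothetical, and the real system may permit other transitions.

```python
from enum import Enum


class NodeState(Enum):
    BLOCKED = "blocked"        # being worked on by a cataloguer
    UNBLOCKED = "unblocked"    # cataloguing finished, awaiting publication
    PUBLISHED = "published"    # visible on the Internet


# Assumed rules: which role may move a node from one state to another.
ALLOWED = {
    (NodeState.BLOCKED, NodeState.UNBLOCKED): "cataloguer",
    (NodeState.UNBLOCKED, NodeState.BLOCKED): "cataloguer",
    (NodeState.UNBLOCKED, NodeState.PUBLISHED): "publisher",
}


def transition(state: NodeState, target: NodeState, role: str) -> NodeState:
    """Return the new state, or raise if the move or the role is not allowed."""
    required = ALLOWED.get((state, target))
    if required is None:
        raise ValueError(f"cannot go from {state.value} to {target.value}")
    if role != required:
        raise PermissionError(f"only a {required} may perform this step")
    return target


# A node is blocked while the book is created, unblocked when the
# cataloguer finishes, and published by the publisher.
state = NodeState.BLOCKED
state = transition(state, NodeState.UNBLOCKED, "cataloguer")
state = transition(state, NodeState.PUBLISHED, "publisher")
```

Keeping the rules in a single table makes it easy to adjust them, for example if a proof-reader step were enforced between unblocking and publishing.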

15 Transmission process: synchronization between the cataloguing server and the dissemination server. This step might not be needed.

16 5. Visualisation on the Internet

17 Yearbooks ordered by decades

18 The hierarchical tree, as shown on the dissemination server and in the cataloguing programme.

19 Just click on the required table, and a 9-page PDF document is shown.

20 Anything else to be taken into account? Search engine; change of language; number of tables; size of the PDF file.

21 The search engine: INEbase history has its own, with direct access to the PDF document.

22 The search engine is based on the table titles (sorry, only in Spanish) and the hierarchical tree (in English as well). Of course, you can also use INE's general search engine.
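A search over table titles and tree categories could look roughly like the sketch below. It is not the INEbase search engine: the `normalise` and `search` functions, the catalogue entries, the Spanish titles, and the PDF names are all made up for illustration. The one design assumption worth noting is that accents are stripped, so a query typed without accents still matches Spanish titles.

```python
import unicodedata


def normalise(text: str) -> str:
    """Lower-case the text and strip accents so 'Población' matches 'poblacion'."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_accents.casefold()


def search(catalogue: list[dict], query: str) -> list[str]:
    """Return the PDFs whose table title or tree path contains every query term."""
    terms = normalise(query).split()
    if not terms:
        return []
    hits = []
    for entry in catalogue:
        haystack = normalise(" ".join(entry["path"]) + " " + entry["title"])
        if all(term in haystack for term in terms):
            hits.append(entry["pdf"])
    return hits


# Hypothetical entries: tree path in English, table title in Spanish.
catalogue = [
    {"path": ("Yearbook (example)", "Population"),
     "title": "Población por provincias",
     "pdf": "example_t012.pdf"},
    {"path": ("Yearbook (example)", "Agriculture"),
     "title": "Producción de trigo",
     "pdf": "example_t105.pdf"},
]

by_title = search(catalogue, "poblacion provincias")  # matches the Spanish title
by_tree = search(catalogue, "agriculture")            # matches the English tree path
```

Each hit is a PDF name, mirroring the direct access to the PDF document that the slides describe.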

23 Population Censuses: everything above also applies.

24 Some interesting data… 1- Economic data: initial scanning stage: 12,000 Euros, 110,000 pages; external development: 90,000 Euros. 2- Deadlines: scanning + programme development: 6 months; cataloguing: 20 months. 3- Amount of scanned pages: Yearbook: 70,000 pages; Census: 30,000 pages; total: 100,000 pages.
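The figures on this slide support a quick back-of-the-envelope calculation. The sketch below uses only the numbers quoted above; the variable names are arbitrary. Note that the slide gives 110,000 pages for the scanning stage but a total of 100,000 scanned pages, so the cost per page is computed for both values rather than assuming which one is right.

```python
# Figures quoted on the slide (Euros and pages).
scanning_cost = 12_000
development_cost = 90_000
pages_quoted_for_scanning = 110_000
pages_yearbook = 70_000
pages_census = 30_000

# Total outsourced cost: scanning plus external development.
total_outsourced = scanning_cost + development_cost

# Page total as listed under "amount of scanned pages".
pages_total = pages_yearbook + pages_census

# Scanning cost per page under each of the two page counts on the slide.
cost_per_page_110k = round(scanning_cost / pages_quoted_for_scanning, 3)
cost_per_page_100k = round(scanning_cost / pages_total, 3)

print(f"Total outsourced cost: {total_outsourced} Euros")
print(f"Scanning cost per page: {cost_per_page_110k} to {cost_per_page_100k} Euros")
```

Either way, scanning comes to roughly 0.11 to 0.12 Euros per page, a small share of the outsourced budget compared with software development.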

25 Some interesting data… 4- Personnel used: cataloguing: 0-3 recording assistants; index translator: 1 trainee; publisher: 1-2 statisticians; IT support team. 5- How many people use INEbase History? Page views in October: 77,623 (1.2% of total).

26 IT data. IT infrastructure: a reasonably simple system. A cataloguing server houses a working copy of the database and the collection of PDF pages; multiple cataloguer PCs, each provided with a "client" application, connect to the server. One of the components of the family of web servers at www.ine.es houses the dissemination server (the software, plus a copy of the database and a copy of the collection of PDF pages). This is the system that serves files on the Internet. There are copy and safety mechanisms between one environment and the other. The environment is similar to a content management programme.
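The slides do not describe how the copy mechanism between the two environments works. As a purely illustrative sketch, a one-way copy of new or changed PDF files from a cataloguing directory to a dissemination directory might look like this; the `synchronise` function, the directory layout, and the file names are assumptions, and the real system also has to keep the two database copies in step, which is not covered here.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path


def synchronise(cataloguing_dir: Path, dissemination_dir: Path) -> list[str]:
    """Copy PDFs that are missing or different on the dissemination side.

    Returns the relative paths that were copied.
    """
    copied = []
    for pdf in sorted(cataloguing_dir.rglob("*.pdf")):
        relative = pdf.relative_to(cataloguing_dir)
        target = dissemination_dir / relative
        # Skip files whose contents are already identical on the target side.
        if target.exists() and filecmp.cmp(pdf, target, shallow=False):
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(pdf, target)
        copied.append(relative.as_posix())
    return copied


# Usage with throwaway directories standing in for the two servers.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "cataloguing"
    destination = Path(tmp) / "dissemination"
    (source / "yearbook").mkdir(parents=True)
    (source / "yearbook" / "table_001.pdf").write_bytes(b"%PDF-1.4 example")

    first_run = synchronise(source, destination)   # copies the new file
    second_run = synchronise(source, destination)  # nothing left to copy
```

Comparing file contents rather than timestamps keeps the second run idempotent, at the cost of reading every file on each pass.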

27 IT data. IT infrastructure: a reasonably simple system. Client programmes developed with Microsoft .NET. Server programme developed with Java. Catalogue and dissemination database: Oracle 9i. Programmes for working with PDF files obtained from a manufacturer specialised in this kind of software. Conceptual design, setting of requirements, and selection of platforms: National Statistics Institute. Scanning of originals: Proco S.A. Technological partner for development: Sopra Group.

28 Thank you for your attention! Any questions?

