Istituto di Linguistica Computazionale – Pisa

1 Istituto di Linguistica Computazionale – Pisa
Andrea Bozzi
Special applications for Digital Libraries: computer-aided philological and linguistic analysis of digital documents
NEH/CNR Meeting, Washington DC, October 5, 2007

2 Presentation contents
An EU-supported system for Greek papyrology;
A special application for browsing and searching demotic documents on ostraka;
A philological workstation for digital medieval manuscripts;
CHLT-LEMLAT (EC-NSF project) to perform lemmatization of Latin texts;
How to integrate all these modules in a web-based open source application.

4 The philological workstation: image and text transcription

5 Image segmentation and semi-automatic word linking

6 Annotations and critical apparatus

7 Wordforms list and specific indexes
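
A wordform list of this kind can be sketched in a few lines. The following is a minimal illustration, not the workstation's own code: the function name `build_wordform_index`, the whitespace-and-punctuation tokenization, and the sample text (the opening words of Sallust's De coniuratione Catilinae) are all assumptions made for the example.

```python
import re
from collections import defaultdict


def build_wordform_index(lines):
    """Map each lowercased wordform to the (line, position) pairs where it occurs.

    Hypothetical helper: a real transcription would need script-aware
    tokenization (abbreviations, lacunae, editorial signs).
    """
    index = defaultdict(list)
    for line_no, line in enumerate(lines, start=1):
        tokens = re.findall(r"\w+", line.lower())
        for position, token in enumerate(tokens, start=1):
            index[token].append((line_no, position))
    return dict(index)


# Sample transcription, chosen only for illustration.
transcription = [
    "Omnis homines qui sese student praestare ceteris animalibus",
    "summa ope niti decet",
]

index = build_wordform_index(transcription)

# Print an alphabetical wordform list with frequency and locations.
for form in sorted(index):
    print(form, len(index[form]), index[form])
```

Keeping the line and position of every occurrence is what lets a list entry link back to the place in the transcription where the form appears.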

8 The web philological workstation to manage documents of the Istituto Papirologico Vitelli in Florence (restricted use)

10 Special system for teaching and retrieving linguistic information from demotic texts on ostraka
OMM 1381: E. Bresciani, S. Pernigotti, M.C. Betrò, Ostraka demotici da Narmuti, Pisa, 1983, pp ;
OMM 300: Gallo P., Ostraca demotici e ieratici dall’archivio bilingue di Narmouthis, Pisa, 1997, pp ;
OMM 393: R. Pintaudi, P.J. Sijpesteijn, Ostraka greci da Narmuthis, Pisa, 1993, p. 40.

11 The digital image archive and the table of demotic signs

12 Search results: the areas highlighted in blue (arrow) show where the selected sign was found

14 Textual criticism for medieval manuscripts
Link to the list of collated sources

15 Evaluation of the variant reading in the collated source
Selection of the variant eixens

16 Recording of the variant eixens in the critical apparatus
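
One way to picture what recording a variant involves is a small data structure that ties a reading to the witnesses attesting it. This is only a sketch under stated assumptions: the class `ApparatusEntry`, the location `1.3`, the placeholder base reading `BASE`, and the sigla `B` and `C` are hypothetical; only the variant `eixens` comes from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class ApparatusEntry:
    """One critical-apparatus entry (hypothetical structure for illustration)."""

    location: str       # where in the text the entry applies
    base_reading: str   # reading adopted in the edited text
    variants: dict = field(default_factory=dict)  # reading -> list of sigla

    def record(self, reading, siglum):
        """Attach a witness siglum to a variant reading, ignoring repeats."""
        witnesses = self.variants.setdefault(reading, [])
        if siglum not in witnesses:
            witnesses.append(siglum)

    def render(self):
        """Format the entry in a conventional 'lemma] variant sigla' layout."""
        parts = [
            f"{reading} {' '.join(sigla)}"
            for reading, sigla in self.variants.items()
        ]
        return f"{self.location} {self.base_reading}] " + "; ".join(parts)


# Placeholder location, base reading, and sigla; 'eixens' is the variant
# shown on the slide.
entry = ApparatusEntry(location="1.3", base_reading="BASE")
entry.record("eixens", "B")
entry.record("eixens", "C")
entry.record("eixens", "B")  # recorded twice, stored once

print(entry.render())
```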

17 Variant search across different early printed editions of the same work
Link to the list of collated books

18 Image of the corresponding page

20 Lemmatization results (C. Sallustius Crispus, De coniuratione Catilinae, 1-2)

21 Lemmatization results of selected wordforms
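
To make the idea of lemmatizing selected wordforms concrete, here is a deliberately simplified lookup sketch. It is not how LEMLAT works internally (the slides do not describe that); it assumes a tiny hand-made table, `TOY_LEXICON`, and a hypothetical function `lemmatize`, and it only shows that a wordform may map to one lemma, to several candidates, or to none.

```python
# Toy form-to-lemma table, hand-made for this example. A real Latin
# lemmatizer analyses morphology instead of listing every inflected form.
TOY_LEXICON = {
    "homines": ["homo"],
    "student": ["studeo"],
    "animalibus": ["animal"],
    "summa": ["summus", "summa"],  # ambiguous: adjective or noun
}


def lemmatize(wordforms, lexicon):
    """Return the candidate lemmas for each selected wordform.

    Unknown forms map to an empty list so they can be flagged for
    manual review.
    """
    return {form: lexicon.get(form.lower(), []) for form in wordforms}


selected = ["Homines", "student", "summa", "decet"]
results = lemmatize(selected, TOY_LEXICON)

for form, lemmas in results.items():
    print(form, "->", ", ".join(lemmas) if lemmas else "(not in lexicon)")
```

Returning every candidate rather than a single guess leaves the choice between homographs to the scholar, which suits a computer-aided workflow.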

23 Pinakes 3.0 http://pinakes.imss.fi.it
Aim: web-based open source application to manage cultural heritage historical data in digital format.
Partners:
Fondazione Rinascimento Digitale, Florence;
Istituto e Museo della Storia della Scienza, Florence;
Ministero per i Beni Culturali, Rome;
CNR, Istituto di Linguistica Computazionale, Pisa.

24 Technology
Programming language: Java (JDK 1.5)
Servlet engine: Tomcat 5.5.x + Apache HTTP Connectors
Web server: Apache httpd server 2.2.x
Web application framework: Jakarta Struts
Web service framework: Apache Axis 1.4
Database engine: Postgres 8.1
Programming environment: NetBeans
Final development: Hibernate

25 Standards
DCMI (Dublin Core Metadata Initiative)
TEI (Text Encoding Initiative)
OWL (Web Ontology Language)
RDF/XML (Resource Description Framework, XML syntax)
SPARQL (query language for RDF)
UTF-8 (Unicode Transformation Format, 8-bit)
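
As a small illustration of the first of these standards, the sketch below builds a simple Dublin Core description of this presentation. It is an assumption-laden example, not Pinakes output: the wrapper element name `record` and the helper `dublin_core_record` are hypothetical, while the namespace URI is that of the Dublin Core Metadata Element Set 1.1 and the field values are taken from the title slide.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build an XML element holding simple Dublin Core name/value pairs.

    'record' is a hypothetical wrapper element chosen for this example.
    """
    record = ET.Element("record")
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


record = dublin_core_record([
    ("title", "Special applications for Digital Libraries"),
    ("creator", "Andrea Bozzi"),
    ("date", "2007-10-05"),
])

# Serialize to a Python string; encode it as UTF-8 when writing to disk.
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```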

