Computer-aided learning of transitive non-locative constructions with a concrete direct object in Modern Greek Kyriaki Ioannidou Eleni.


1 Computer-aided learning of transitive non-locative constructions with a concrete direct object in Modern Greek
Kyriaki Ioannidou kiroanni@auth.gr
Eleni Tziafa etziafa@lit.auth.gr
Rania Voskaki rvoskaki@hotmail.com
Laboratory of Translation and Language Processing, Aristotle University of Thessaloniki, GREECE
7th International Technology, Education & Development Conference, Valencia, 4–6/3/2013

2 Presentation Plan
▫Theoretical and methodological framework
▫Linguistic resources used
▫Linguistic resources created
▫Modern Greek CDO lexicon-grammar table
▫Parameterised finite state automaton (FSA)
▫FSA for Greek noun phrases
▫In the context of CLIL and CEFR

3 Research aim: Computer-aided learning of transitive non-locative constructions with a concrete direct object (CDO) for learners of Modern Greek
Theoretical framework adopted: Transformational Grammar as defined by Z. S. Harris (1968) [1]
Methodological model adopted: Lexicon-Grammar as developed by M. Gross (1975) [2]

4 Linguistic resources used in our research
A corpus of 7,000,000 words (journalistic and educational discourse):
▫3,000,000 words from the Makedonia newspaper
▫2,000,000 words from the TA NEA newspaper
▫2,000,000 words from school books by the Pedagogical Institute
A morphological dictionary of 2,000,000 inflected forms, belonging to the Laboratory of Translation and Language Processing of the Aristotle University of Thessaloniki

5 Linguistic resources created
▫a syntactic-semantic lexicon in the form of a table (Lexicon-Grammar Table comprising 300 verbs)
▫a parameterised FSA allowing the use of the above lexicon
▫a set of 382 chunking FSA for the description of noun phrase structure

6 CDO in Modern Greek
▫constructions are transitive
Η Μαίρη έσπασε το βάζο (Mary broke the vase)
▫the complement is single and obligatory
Ο Γιάννης επιδιορθώνει το ψυγείο (John is repairing the fridge)
* Ο Γιάννης επιδιορθώνει (* John is repairing)

7 CDO in Modern Greek
▫the complement is not prepositional
Η μητέρα της καθάρισε τα τζάμια (Her mother cleaned the windows)
Χ Πάει στο πάρκο (Χ He goes to the park)
▫the complement is a concrete direct object
Ο Ερρίκος σιδέρωσε το πουκάμισό του (Eric ironed his shirt)
Χ Η γραμματέας ενημέρωσε το αφεντικό της (Χ The secretary informed her boss)

8 Lexicon-Grammar Table constructed

9 Application to corpora
Use of a parameterised FSA
Parameterised FSA are meta-graphs that allow the automatic generation of a set of graphs on the basis of a lexicon-grammar table.
▫They refer to the columns of the lexicon-grammar table in the form of parameters or variables.
▫They allow the recognition of certain syntactic terms (direct objects in our study).
▫They describe all possible constructions of the verbs studied.
▫They have the format of an FSA (Hopcroft, Motwani & Ullman 2001) [3].
▫They are created via the Unitex platform (Paumier 2003) [4].
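The idea of generating one graph per table row can be illustrated with a minimal Python sketch. It is not the Unitex mechanism itself and not the authors' table: the column names (`N0_V_N1`, `passive`), the '+'/'-' cell values, and the path notation are all hypothetical placeholders with no linguistic claim attached. Each template path is guarded by a column, and a path survives for a verb only if that verb's cell is '+'.

```python
# Toy lexicon-grammar table: one row per verb, '+'/'-' cells per property.
# Column names and cell values are arbitrary placeholders, not real data.
TABLE = [
    {"verb": "σπάζω", "N0_V_N1": "+", "passive": "+"},
    {"verb": "επιδιορθώνω", "N0_V_N1": "+", "passive": "-"},
]

# Template ("meta-graph"): each path is guarded by a column name, which
# plays the role of the parameter referring to the table.
TEMPLATE = [
    ("N0_V_N1", "<NP0> <V:{verb}> <NP1>"),
    ("passive", "<NP1> <V:{verb}:passive> (από <NP0>)"),
]


def instantiate(row, template=TEMPLATE):
    """Keep only the paths whose guarding column is '+' for this row."""
    return [
        path.format(verb=row["verb"])
        for column, path in template
        if row[column] == "+"
    ]


def generate_graphs(table):
    """Produce one set of paths (one 'graph') per verb in the table."""
    return {row["verb"]: instantiate(row) for row in table}


if __name__ == "__main__":
    for verb, paths in generate_graphs(TABLE).items():
        print(verb, paths)
```

Keeping the template separate from the table means that adding a verb only requires a new row, which is the practical appeal of the parameterised approach described on this slide.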

10 Parameterised FSA

11 Parameterised FSA enrichment
Parameterised FSA focus on recognising:
▫the syntactic role of constituents
▫all possible transformations described in the lexicon-grammar table
Parameterised FSA do not recognise:
▫complex structures of syntactic terms
A detailed description of noun phrase structure is therefore needed.

12 Noun phrase description
Based on the approach of Ramshaw & Marcus (1995) [5]
▫Base noun phrases: non-recursive noun phrases, i.e. noun phrases that contain no nested noun phrases
Αποστείρωσε το μικρό μπουκάλι (He sterilised the small bottle)
▫Maximal-length noun phrases: base noun phrases modified by other base noun phrases
Πλένω την κούκλα και τα ρούχα της (I am washing the doll and its clothes)

13 Noun phrase description
Structures recognised:
▫all base noun phrases (nouns, pronouns, nominals)
Άρπαξε το ποτήρι (She grabbed the glass)
▫maximal-length noun phrases with the genitive case
Οι εργάτες έκαψαν το σπίτι του εργοδότη τους (Lit. transl.: The workers burnt the house of their employer) (The workers burnt their employer’s house)
▫maximal-length noun phrases with coordination
Πληκτρολόγησε το βιογραφικό της και τη συνοδευτική επιστολή (She typed her CV and the motivation letter)
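The three structures listed above can be approximated with a small finite-state pattern over part-of-speech tags. The Python sketch below is only an illustration, not the authors' 382 Unitex chunking graphs: the tag set (`DET`, `NOUN_GEN`, and so on) is hypothetical, the tags are assigned by hand, and a regular expression over one-letter tag codes stands in for the FSA.

```python
import re

# Hypothetical tag set; genitive forms get their own (upper-case) codes so
# the pattern can tell a genitive modifier from a new, unrelated phrase.
TAG_CODES = {
    "DET": "d", "ADJ": "a", "NOUN": "n", "PRON": "p",
    "DET_GEN": "D", "ADJ_GEN": "A", "NOUN_GEN": "N", "PRON_GEN": "P",
    "CONJ": "c", "VERB": "v",
}

BASE_NP = r"(?:d?a*n|p)"   # base NP: (det) (adj)* noun, or a pronoun
GEN_NP = r"(?:D?A*N|P)"    # genitive base NP or genitive pronoun
# Maximal-length NP: a base NP, optional genitive modifiers, then any
# number of coordinated base NPs, each with its own genitive modifiers.
MAX_NP = re.compile(rf"{BASE_NP}(?:{GEN_NP})*(?:c{BASE_NP}(?:{GEN_NP})*)*")


def chunk(tagged):
    """Return the maximal-length noun phrases found in (word, tag) pairs."""
    codes = "".join(TAG_CODES.get(tag, "x") for _, tag in tagged)
    return [
        " ".join(word for word, _ in tagged[m.start():m.end()])
        for m in MAX_NP.finditer(codes)
    ]


if __name__ == "__main__":
    sentence = [
        ("Πλένω", "VERB"), ("την", "DET"), ("κούκλα", "NOUN"),
        ("και", "CONJ"), ("τα", "DET"), ("ρούχα", "NOUN"), ("της", "PRON_GEN"),
    ]
    print(chunk(sentence))
```

Because one code letter corresponds to one token, match offsets in the code string map directly back to token positions. A realistic chunker would also need case agreement and disambiguation, which this sketch ignores.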

14 Noun phrase description
▫In NLP, noun phrase description can be treated as equivalent to noun chunking (Abney 1991 [6]; Voutilainen 1993 [7]; Tjong Kim Sang 2000 [8]; Bai, Li, Kim & Lee 2006 [9])
▫The description is made by means of FSA (chunking graphs) (Brill 1993 [10]; Roche 1993 [11]; Abney 1996 [12]; Blanc et al. 2007 [13]; Mokrane et al. 2008 [14])
▫The FSA were created via the Unitex platform (Paumier 2003 [4])

15 Chunking FSA

16 Sample concordances
[…] και αγγίξαμε [όλες τις επιφάνειες των σχημάτων]NP1
(Lit. transl.: […] we touched [all the surfaces of the shapes]NP1) ([…] we touched [all the shapes' surfaces]NP1)
Κάθε μέρα πήγαινε εκεί, άνοιγε [την κάνουλα]NP1 […]
(Every day he went there, opened [the faucet]NP1 […])
Έτρωγα [το πρωινό και το μεσημεριανό]NP1 που’φερνε […]
(Lit. transl.: I ate [the breakfast and the lunch]NP1 […]) (I ate [breakfast and lunch]NP1 […])

17 In the context of CLIL
▫The corpus used comprises the following thematic units: (i) economics, (ii) book presentation, (iii) visual arts, (iv) arts and culture, (v) biology, (vi) history, (vii) translated Ancient Greek texts, (viii) physics, (ix) chemistry, (x) economics theory, (xi) Modern Greek learning, (xii) religion, (xiii) mathematics.
▫It also comprises the following text types: (i) sports news, (ii) reportage, (iii) gastronomy, (iv) interview, (v) advertisements, (vi) science news, (vii) artistic reviews, (viii) biographies, (ix) curriculum vitae, (x) short stories, (xi) stories, (xii) forecast, (xiii) dialogs, (xiv) research, (xv) e-mail.

18 In the context of CEFR
The proposed method could serve as a complementary aid for learners at levels B2, C1 and C2, since:
▫at B2 level, occasional structural errors are tolerated,
▫at C1 level, complex structures are used with only a few structural errors,
▫at C2 level, the use of complex structures is strongly required.
In the sample tests and past papers (last five years) of the Certificate of Attainment in Modern Greek (Center for the Greek Language), activities requiring the correct order have been observed.

19 Perspectives
▫Enrichment of the existing corpora
▫Improvement of the FSA, so as to eliminate ambiguities
▫Syntactic tagging of part of the corpus, in order to evaluate the results obtained (by recall and precision)
▫Recognition of other types of direct objects (e.g. ‘human object’, ‘body part object’, etc.)
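The evaluation by recall and precision mentioned above could be computed along the following lines. This is a minimal sketch under stated assumptions: each noun phrase is represented as a hypothetical (start, end) token span, a recognised span counts as correct only if it matches a manually tagged span exactly, and the spans in the usage example are invented.

```python
def precision_recall(predicted, gold):
    """Exact-match precision and recall over sets of (start, end) spans."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of recognised spans that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of manually tagged spans that were recognised.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    gold_spans = {(1, 3), (5, 7), (9, 10)}   # invented manual annotation
    found_spans = {(1, 3), (5, 6)}           # invented FSA output
    print(precision_recall(found_spans, gold_spans))
```

Exact-span matching is the strictest choice. Partial-overlap scoring is a possible alternative, but it would need its own definition.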

20 References
[1] Harris, Z. S. (1968). Mathematical Structures of Language. New York: Wiley.
[2] Gross, M. (1975). Méthodes en syntaxe. Régime des constructions complétives. Paris: Hermann.
[3] Hopcroft, J. E., Motwani, R., & Ullman, J. D. (2006). Introduction to Automata Theory, Languages, and Computation. Addison-Wesley.
[4] Paumier, S. (2003). De la reconnaissance de formes linguistiques à l'analyse syntaxique. Thèse de doctorat, Paris, Université de Marne-la-Vallée.
[5] Ramshaw, L. A., & Marcus, M. P. (1995). "Text Chunking using Transformation-Based Learning". ACL Third Workshop on Very Large Corpora, pp. 82-94.
[6] Abney, S. (1991). Parsing by Chunks. In R. Berwick, S. Abney & C. Tenny (Eds.), Principle-Based Parsing. Dordrecht: Kluwer Academic Publishers.
[7] Voutilainen, A. (1993). NPTool, a detector of English noun phrases. Proceedings of the Workshop on Very Large Corpora, ACL, pp. 48-57.
[8] Tjong Kim Sang, E. F. (2000). "Noun Phrase Recognition by System Combination". Proceedings of the 1st North American chapter of the Association for Computational Linguistics Conference, pp. 50-55. Stroudsburg, PA: Association for Computational Linguistics.
[9] Bai, X.-M., Li, J.-J., Kim, D.-I., & Lee, J.-H. (2006). "Identification of Maximal-Length Noun Phrases Based on Expanded Chunks and Classified Punctuations in Chinese". 21st International Conference on the Computer Processing of Oriental Languages 2006, pp. 268-276. Berlin: Springer-Verlag.
[10] Brill, E. (1993). A Corpus-Based Approach to Language Learning. University of Pennsylvania.
[11] Roche, E. (1993). Analyse syntaxique transformationnelle du français par transducteurs et lexique-grammaire. Université Paris 7.
[12] Abney, S. (1996). Chunk stylebook. Technical report, SfS, University of Tübingen.
[13] Blanc, O., Constant, M., & Watrin, P. (2007). "Segmentation en super-chunks". Actes de TALN 2007. Toulouse: ATALA.
[14] Mokrane, A., Friburger, N., & Antoine, J.-Y. (2008). "Cascades de transducteurs pour le chunking de la parole conversationnelle : l'utilisation de la plateforme CasSys dans le projet EPAC". TALN 2008. Avignon.

