Example-Based Treebank Querying. Liesbeth Augustinus, Vincent Vandeghinste, Frank Van Eynde. CLARIN, Sofia, 2012-10-28.

1 Example-Based Treebank Querying. Liesbeth Augustinus, Vincent Vandeghinste, Frank Van Eynde. CLARIN, Sofia, 2012-10-28.

2 NEDERBOOMS: Exploitation of Dutch treebanks for research in linguistics. CLARIN-VL project. Centre for Computational Linguistics (CCL), Dutch Grammar and Language Use (NGTG). Goals: – User-friendly tools – Access to large data files

3 NEDERBOOMS How can we combine the data-oriented approach of treebank mining with the knowledge-oriented method of theoretical and descriptive linguistics?

4 QUERYING LASSY Existing search tools: dtsearch (Kloosterman 2007), Dact (de Kok 2010). Both are stand-alone tools. Query language: XPath, a standard query language for XML trees.

5 QUERYING LASSY Some examples: “look for all NP nodes in which the head noun is modified by the adjective ‘politiek’, e.g. politieke discussies” (XPath expression not preserved in this transcript)

6 QUERYING LASSY Some examples: “look for all NP nodes in which the head noun is modified by the adjective ‘politiek’, e.g. politieke discussies” (XPath expression not preserved in this transcript); “look for verb clusters in which a separable verb particle occurs between two verb forms, e.g. Hij zegt dat ze spoedig zullen kennis maken” (XPath expression not preserved in this transcript)
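
The first query on the slide can be approximated without the original XPath, which did not survive in this transcript. The sketch below uses only Python's standard library on a hand-written toy fragment; the attribute names (`cat`, `rel`, `pt`, `lemma`, `word`) are assumed to follow Alpino/LASSY conventions, and the helper names are hypothetical. Because `xml.etree.ElementTree` supports only a subset of XPath, the nested condition is split into two child lookups.

```python
import xml.etree.ElementTree as ET

# Hand-written toy fragment in the style of an Alpino/LASSY parse (not corpus data).
SAMPLE = """
<alpino_ds>
  <node cat="top" rel="top">
    <node cat="np" rel="su">
      <node rel="mod" pt="adj" lemma="politiek" word="politieke"/>
      <node rel="hd" pt="n" lemma="discussie" word="discussies"/>
    </node>
    <node cat="np" rel="obj1">
      <node rel="mod" pt="adj" lemma="lang" word="lange"/>
      <node rel="hd" pt="n" lemma="debat" word="debatten"/>
    </node>
  </node>
</alpino_ds>
"""


def find_np_with_adjective(root, lemma):
    """Return NP nodes that have a noun head and a modifying adjective with this lemma."""
    hits = []
    for node in root.iter("node"):
        if node.get("cat") != "np":
            continue
        # Direct children only: the head noun and the adjectival modifier.
        head = node.find("node[@rel='hd'][@pt='n']")
        modifier = node.find(f"node[@rel='mod'][@pt='adj'][@lemma='{lemma}']")
        if head is not None and modifier is not None:
            hits.append(node)
    return hits


def words(node):
    """Collect the word forms under a node in document order."""
    return [leaf.get("word") for leaf in node.iter("node") if leaf.get("word")]


root = ET.fromstring(SAMPLE)
matches = find_np_with_adjective(root, "politiek")
for np in matches:
    print(" ".join(words(np)))
```

Even for this simple pattern the user has to know the tag set and the dependency labels, which is the usability problem the following slides address.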

7 QUERYING LASSY XPath: – Not user-friendly – Knowledge of Alpino grammar necessary = problematic for non-technical linguists. Verifying theory through data with corpus or treebank examples: time-consuming, requires some effort.

8 QUERYING LASSY XPath: – Not user-friendly – Knowledge of Alpino grammar necessary = problematic for non-technical linguists. Verifying theory through data with corpus or treebank examples: time-consuming, requires some effort. How to make interaction between computational linguistics and theoretical linguistics possible?

9 GrETEL: Greedy Extraction of Trees for Empirical Linguistics. Search tool based on example sentences. Input = natural language. No explicit knowledge of a formal query language or of the Alpino grammar required. Bridges the gap between descriptive and computational linguistics. Available online (optimised for Mozilla Firefox): http://nederbooms.ccl.kuleuven.be

10 GrETEL - online

11

12 GrETEL - input Green versus red word order in Dutch: – green: past participle – auxiliary, e.g. De NAVO stelt dat ze er alles aan gedaan heeft; – red: auxiliary – past participle, e.g. De NAVO stelt dat ze er alles aan heeft gedaan. “NATO claims that it has done everything in its power” (deredactie.be)
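
The green/red distinction comes down to the relative position of two verb forms, which can be read off token offsets in a parse. The sketch below is an illustration only: the two fragments are hand-written and simplified, and the attributes `pt="ww"`, `wvorm="vd"` (past participle), `wvorm="pv"` (finite verb) and `begin` are assumed to follow LASSY conventions. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written verb clusters for the two example sentences.
# `begin` is the zero-based token position in the sentence.
GREEN = """<node cat="ssub">
  <node rel="vc" pt="ww" wvorm="vd" word="gedaan" begin="8"/>
  <node rel="hd" pt="ww" wvorm="pv" word="heeft" begin="9"/>
</node>"""

RED = """<node cat="ssub">
  <node rel="hd" pt="ww" wvorm="pv" word="heeft" begin="8"/>
  <node rel="vc" pt="ww" wvorm="vd" word="gedaan" begin="9"/>
</node>"""


def verb_cluster_order(cluster_xml):
    """Return 'green' if the participle precedes the auxiliary, else 'red'."""
    root = ET.fromstring(cluster_xml)
    verbs = [n for n in root.iter("node") if n.get("pt") == "ww"]
    participle = next(n for n in verbs if n.get("wvorm") == "vd")
    auxiliary = next(n for n in verbs if n.get("wvorm") == "pv")
    if int(participle.get("begin")) < int(auxiliary.get("begin")):
        return "green"
    return "red"


print(verb_cluster_order(GREEN))
print(verb_cluster_order(RED))
```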

13 GrETEL - input

14 >> parsed with Alpino

15 GrETEL - annotation

16 >> info added to Alpino parse

17 GrETEL – query info

18 input example

19 GrETEL – query info: input example >> Alpino parse

20 GrETEL – query info: input example >> Alpino parse >> query tree

21 GrETEL – query info: input example >> Alpino parse >> query tree >> XPath query

22 GrETEL – query info: input example >> Alpino parse >> query tree >> XPath query >> treebanks

23 GrETEL – query tree

24 >> subtree extraction
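
One way to picture subtree extraction is as pruning the parse down to the branches that dominate the words the user marked as relevant. The sketch below is not GrETEL's implementation; it is a minimal illustration on a hand-written, simplified parse, with hypothetical function names and assumed Alpino-style attributes (`cat`, `rel`, `pt`, `word`, `begin`).

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written parse of "... ze ... alles ... gedaan heeft".
PARSE = """<node cat="top" rel="top">
  <node cat="ssub" rel="body">
    <node rel="su" pt="vnw" word="ze" begin="4"/>
    <node cat="ppart" rel="vc">
      <node rel="obj1" pt="vnw" word="alles" begin="6"/>
      <node rel="hd" pt="ww" word="gedaan" begin="8"/>
    </node>
    <node rel="hd" pt="ww" word="heeft" begin="9"/>
  </node>
</node>"""


def prune(node, selected):
    """Copy `node`, keeping only branches that dominate a selected token; None if there are none."""
    if node.get("word") is not None:  # leaf node
        if int(node.get("begin")) in selected:
            return ET.Element("node", dict(node.attrib))
        return None
    kept = [c for c in (prune(child, selected) for child in node) if c is not None]
    if not kept:
        return None
    copy = ET.Element("node", dict(node.attrib))
    copy.extend(kept)
    return copy


def extract_subtree(root, selected):
    """Return the smallest pruned subtree that still covers all selected tokens."""
    tree = prune(root, selected)
    # Skip unary chains at the top so the result starts at the lowest common ancestor.
    while tree is not None and len(tree) == 1:
        tree = tree[0]
    return tree


query_tree = extract_subtree(ET.fromstring(PARSE), {8, 9})
print(ET.tostring(query_tree, encoding="unicode"))
```

Selecting the tokens at positions 8 and 9 (gedaan, heeft) yields a subtree rooted at the `ssub` node, with the participle still nested inside its `ppart` phrase.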

25 GrETEL – query tree query tree

26 GrETEL – query tree: query tree >> XPath generator >>

27 GrETEL – query tree: query tree >> XPath generator >> XPath expression (not preserved in this transcript)
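
The step from query tree to XPath can be sketched as a recursive walk that turns each node's attributes and children into conditions joined by `and`. This is an assumption about the general shape of such a generator, not the tool's actual code; the function names and the choice of retained attributes (`cat`, `rel`, `pt`) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Attributes copied from the query tree into the XPath expression (an assumption).
KEEP = ("cat", "rel", "pt")

QUERY_TREE = """<node cat="np">
  <node rel="mod" pt="adj"/>
  <node rel="hd" pt="n"/>
</node>"""


def node_test(node):
    """Build the XPath step for one node, with its children as nested predicates."""
    conditions = [f'@{name}="{node.get(name)}"' for name in KEEP if node.get(name) is not None]
    conditions += [node_test(child) for child in node]
    if not conditions:
        return "node"
    return f"node[{' and '.join(conditions)}]"


def to_xpath(query_tree):
    """Return an XPath expression that matches the query tree anywhere in a parse."""
    return "//" + node_test(query_tree)


xpath = to_xpath(ET.fromstring(QUERY_TREE))
print(xpath)
# //node[@cat="np" and node[@rel="mod" and @pt="adj"] and node[@rel="hd" and @pt="n"]]
```

The sketch ignores word order and does not escape attribute values that contain quotes; a real generator would need to handle both.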

28 GrETEL – treebanks

29 LASSY Small in PostgreSQL database >> no local installation required

30 GrETEL – results

31 >> (adapted) query

32 GrETEL – results >> (adapted) query >> quantitative information

33 GrETEL – results

34 >> tree viewer

35 GrETEL – results >> tree viewer >> list of results

36 Thanks for your attention! Questions?

