Visualizing Natural Language Resources Kristina Kocijan University of Zagreb, Faculty of Humanities and Social Sciences, Department of Information and.


1 Visualizing Natural Language Resources Kristina Kocijan University of Zagreb, Faculty of Humanities and Social Sciences, Department of Information and Communication Sciences Zagreb, Croatia krkocijan@ffzg.hr

2 Is it about beautiful pictures? Sooo, what is this presentation about?

3 “Beauty is in the eye of the beholder.” 3rd century BC, Greek saying. Baudelaire’s beauty: data is beautiful if it is the result of reason and calculation. Thoreau’s beauty: data is beautiful by its very plainness.

4 About beautiful pictures! Sooo, what is this presentation about?

5 New ways of presenting data? Sooo, that’s it – only beautiful pictures?

6 “The hope is that, in not too many years, human brains and computing machines will be coupled together very tightly and that the resulting partnership will think as no human brain has ever thought and process data in a way not approached by the information-handling machines we know today.” J.C.R. Licklider, in ‘Man-Computer Symbiosis’, March 1960.

7 Reading the same data In different forms

8 Reading the same data
Slowly, slowly, very slowly
Faster, alas lacking info

Nouns            Common   Collective   Proper
Fem.              8 344            1    3 177
Mas.              6 249            2    3 189
Neut.             5 520            3       66
No gender             0            0   36 362
Total per type   20 113            6   42 794
Total nouns      62 913
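To illustrate the step from a table to a picture, here is a minimal sketch that draws the slide 8 counts as text bars. The counts are read from the flattened table on this slide (an assumption about where the digit groups split, chosen so that rows and columns add up to the stated total of 62 913); the function name `bar_chart` and the bar width are hypothetical and not part of the presentation.

```python
# Noun counts per gender as (Common, Collective, Proper).
# Assumption: digit groups are split so the totals match slide 8 (62 913).
COUNTS = {
    "Fem.": (8344, 1, 3177),
    "Mas.": (6249, 2, 3189),
    "Neut.": (5520, 3, 66),
    "No gender": (0, 0, 36362),
}


def bar_chart(counts, width=40):
    """Return one text bar per gender, scaled to the largest row total."""
    totals = {gender: sum(row) for gender, row in counts.items()}
    largest = max(totals.values())
    lines = []
    for gender, total in totals.items():
        bar = "#" * round(width * total / largest)
        lines.append(f"{gender:<10}{bar} {total}")
    return lines


if __name__ == "__main__":
    print("\n".join(bar_chart(COUNTS)))
```

Even in this crude form, the dominance of nouns without gender is visible at a glance, which the table only reveals after reading every row.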

9 Reading the same data Speedy, and empowering

10 Reading the same data Speedy, and empowering

11 Reading the same data Speedy, and empowering

12 Presenting the same data

Statistics for the nouns in a dictionary
Nouns            Common     Proper
Fem.             39.84 %    3.61 %
Mas.             32.26 %    4.64 %
Neut.            15.36 %    0.12 %
No gender         0 %       4.17 %
Total per type   87.46 %   12.54 %
Total nouns      1 048 570

Statistics for the nouns in a corpus
Nouns            Common     Proper
Fem.             13.26 %    5.05 %
Mas.              9.93 %    5.07 %
Neut.             8.77 %    0.10 %
No gender         0 %      57.80 %
Total per type   31.97 %   68.03 %
Total nouns      62 907
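The corpus percentages on this slide appear to follow from the raw counts on slide 8 once the six collective nouns are left out (62 913 - 6 = 62 907). The sketch below checks that reading. The counts are taken from slide 8 under the assumption that its flattened digits split as shown in the code; the names `shares` and `column_shares` are hypothetical helpers, not anything from the presentation.

```python
# Corpus noun counts from slide 8, collective nouns omitted.
# Assumption: this is how the flattened slide 8 figures split into cells.
COUNTS = {
    "Fem.": {"Common": 8344, "Proper": 3177},
    "Mas.": {"Common": 6249, "Proper": 3189},
    "Neut.": {"Common": 5520, "Proper": 66},
    "No gender": {"Common": 0, "Proper": 36362},
}


def shares(counts):
    """Return the grand total and each cell as a percentage of it."""
    total = sum(n for row in counts.values() for n in row.values())
    table = {
        gender: {kind: round(100 * n / total, 2) for kind, n in row.items()}
        for gender, row in counts.items()
    }
    return total, table


def column_shares(counts):
    """Return each noun type's share of all nouns, in percent."""
    total = sum(n for row in counts.values() for n in row.values())
    kinds = {kind for row in counts.values() for kind in row}
    return {
        kind: round(100 * sum(row[kind] for row in counts.values()) / total, 2)
        for kind in kinds
    }


if __name__ == "__main__":
    total, table = shares(COUNTS)
    print("Total nouns:", total)
    for gender, row in table.items():
        print(gender, row)
    print("Total per type:", column_shares(COUNTS))
```

Under these assumptions the computed shares agree with the corpus table on this slide to two decimal places.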

13 Distribution of top 10 paradigms
In DIC: ALAT, ASTRONOM, BLAGOST, BRATIĆ, CRTANJE, DAVOR, FABIANA, GUSJENICA, LEPTIR, MEDO
In Corp: ALAT, BAT, BESKRAJ, BLAGO, BLAGOST, BRATIĆ, CRTANJE, GUSJENICA, MEDO, PROLAZNIK

14 Genitive+sg endings In DIC / In Corpus

15 Genitive+sg endings In Corpus

16 Genitive+sg endings - weighted In Corpus

17 Visual Story As told by Data

18 “Often the most effective way to describe, explore and summarize a set of numbers – even a very large set – is to look at pictures of those numbers.” Edward R. Tufte, in ‘The Visual Display of Quantitative Information’, 2001.

19 Story behind the NLR data Instrumental Genitive Vocative Dative Accusative Locative

20 Story behind the NLR data

21

22

23

24

25

26 Thank you! Visualizing Natural Language Resources Kristina Kocijan University of Zagreb, Faculty of Humanities and Social Sciences, Department of Information and Communication Sciences Zagreb, Croatia krkocijan@ffzg.hr Questions?

