Which Log for which Information? Gathering Multilinguality Data from Different Log File Types Maria Gäde, Vivien Petras, and Juliane Stiller Humboldt-Universität.


1 Which Log for which Information? Gathering Multilinguality Data from Different Log File Types
Maria Gäde, Vivien Petras, and Juliane Stiller
Humboldt-Universität zu Berlin
CLEF 2010, Padova, 21 September 2010

2 Premise
Assume you are building a multilingual digital library and could log every user action with particular consideration for multilingual activities.
- Which questions could one ask?
- (Which questions cannot be answered by logging?)
Outline:
- Europeana
- Log file types
- Logging multilingual information
- Europeana ClickStreamLogger

3 Europeana
- 1,000+ content providers
- Portal + APIs
- Services
September 2010: 7.8 million images, 4.6 million texts, 127,000 videos, 68,000 sounds
“A digital library that is a single, direct and multilingual access point to the European cultural heritage.” (European Parliament, 27 September 2007)

4 Multilingual Europeana
Interface: Search, Browse, Results

5 Multilingual Europeana

6 Log File Types
Example Apache web server log:
123.123.123.123 - - [11/Mar/2010:09:42:06 +0100] "GET /cache/image/?uri=http://images.scran.ac.uk/rb/images/thumb/0098/00980252.jpg&size=BRIEF_DOC&type=IMAGE HTTP/1.0" 200 2843 "http://www.europeana.eu/portal/brief-doc.html?start=1&view=table&query=italy" "Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.9.2) Gecko/20100115 Firefox/3.6 (.NET CLR 3.5.30729)"
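To make concrete what such a line does and does not contain, here is a minimal Python sketch that splits the sample entry into its fields. It assumes the standard Apache "combined" log layout; the function name `parse_line` and the heuristic for reading the browser locale out of the user-agent string are illustrative assumptions, not part of the presentation. The sample line is the one from the slide, with its URLs written contiguously.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Apache "combined" log format: host, identity, user, time, request,
# status, size, referrer, user agent.
COMBINED = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

SAMPLE = (
    '123.123.123.123 - - [11/Mar/2010:09:42:06 +0100] '
    '"GET /cache/image/?uri=http://images.scran.ac.uk/rb/images/thumb/0098/00980252.jpg'
    '&size=BRIEF_DOC&type=IMAGE HTTP/1.0" 200 2843 '
    '"http://www.europeana.eu/portal/brief-doc.html?start=1&view=table&query=italy" '
    '"Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.9.2) Gecko/20100115 '
    'Firefox/3.6 (.NET CLR 3.5.30729)"'
)


def parse_line(line):
    """Return the fields of one combined-format log line, or None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    record = match.groupdict()

    # The search query is only visible indirectly, through the referrer URL.
    referrer_params = parse_qs(urlsplit(record["referrer"]).query)
    record["search_query"] = referrer_params.get("query", [None])[0]

    # Heuristic (assumption): older Firefox user agents carry a locale token
    # such as "; it;". This is the browser/system language, NOT the
    # interface language chosen in the portal.
    locale = re.search(r';\s*([a-z]{2}(?:-[A-Za-z]{2})?)\s*;', record["agent"])
    record["browser_language"] = locale.group(1) if locale else None
    return record


if __name__ == "__main__":
    for key, value in parse_line(SAMPLE).items():
        print(f"{key}: {value}")
```

The query term and the browser locale can be recovered from this entry, but nothing in it records which interface language the user had selected.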

7 Log File Types
Example Google Analytics:
- Map overlay (IP address)
- Languages (system language)

8 Log File Types – Missing Information
Web server log (Apache):
- Interface language missing
- Certain actions cannot be distinguished (browse = search)
- Ajax / Flash actions (saved searches, tags, filter) are not recorded
- Sessions have to be reconstructed
Search engine log (Solr):
- Only queries
Google Analytics:
- Queries missing
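Session reconstruction from a web server log can be sketched as follows. The presentation does not say how sessions were rebuilt; this example assumes a common heuristic, namely grouping requests by IP address plus user agent and starting a new session after 30 minutes of inactivity. The names `reconstruct_sessions`, `parse_apache_time`, and the timeout value are illustrative assumptions.

```python
from datetime import datetime, timedelta

# Assumption: 30 minutes of inactivity ends a session (a widely used
# heuristic, not a value taken from the presentation).
SESSION_TIMEOUT = timedelta(minutes=30)


def parse_apache_time(text):
    """Parse an Apache timestamp such as '11/Mar/2010:09:42:06 +0100'."""
    return datetime.strptime(text, "%d/%b/%Y:%H:%M:%S %z")


def reconstruct_sessions(records, timeout=SESSION_TIMEOUT):
    """Group (ip, agent, timestamp) records into sessions.

    Requests with the same IP address and user agent belong to the same
    session as long as the gap between consecutive requests does not
    exceed `timeout`. Returns a list of sessions, each a list of records.
    """
    sessions = []
    last_seen = {}  # (ip, agent) -> (last timestamp, session list)
    for record in sorted(records, key=lambda r: r[2]):
        ip, agent, timestamp = record
        key = (ip, agent)
        previous = last_seen.get(key)
        if previous is None or timestamp - previous[0] > timeout:
            current = []  # start a new session for this visitor
            sessions.append(current)
        else:
            current = previous[1]
        current.append(record)
        last_seen[key] = (timestamp, current)
    return sessions
```

Such a heuristic is only approximate: users behind a shared proxy are merged into one session, and a user whose IP address changes is split into several.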

9 Logging Multilingual Information
Stages of the interaction and the information to log at each stage:
- Approaching the system / background information: user background, interface language
- Launching queries / browsing: query language, query type, query content, query translation
- Viewing results: search results, result set views, result translation
- Interacting with the results (filter, save, tag, repeat): query reformulation, user-generated content, saved searches / docs

10 Logging Multilingual Information - Background
- User background information: country of access, system language, referrer site
- Interface language: change -> stronger intervention

11 Logging Multilingual Information - Query
- Query language: query processing, adapting languages to system
- Query type: simple, advanced, fielded (e.g. language restriction); pre-selected categories for browsing
- Query content: named entities, dates, numbers (language ambiguous)
- Query translation

12 Logging Multilingual Information - Results
- Search results: document languages
- Result set views: detailed view, external click -> stronger intervention
- Result translation

13 Logging Multilingual Information – User Activities
- Query reformulation / refinement: language switch; filtering (language), related-item search
- User-generated content: language of tags, language of documents being tagged
- Saved searches / documents: ???

14 Europeana ClickStreamLogger
- Interface language: state + change for every activity
- Search: result numbers, distribution of results by language / country; filtering and related searches
- Browse: browsing activities + starting points
- Navigation: move outside Europeana
- Ajax: save / remove searches / tags
- User management: account creation etc.
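As a rough illustration of the kind of record listed above, the following Python sketch models one click-stream event that carries the interface language state with every action. It is not the actual Europeana ClickStreamLogger format; the class `ClickEvent`, all field names, and the JSON-lines serialization are hypothetical choices made for this example.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Dict, Optional


@dataclass
class ClickEvent:
    """One logged user action (hypothetical schema)."""
    action: str                      # e.g. "search", "browse", "save_search"
    interface_language: str          # interface language at the time of the action
    previous_interface_language: Optional[str] = None
    query: Optional[str] = None
    result_count: Optional[int] = None
    # Distribution of results by document language, e.g. {"it": 2, "de": 1}
    results_by_language: Dict[str, int] = field(default_factory=dict)

    @property
    def language_changed(self) -> bool:
        """True if the user switched the interface language before this action."""
        return (
            self.previous_interface_language is not None
            and self.previous_interface_language != self.interface_language
        )


def to_log_line(event: ClickEvent) -> str:
    """Serialize an event as one JSON line, including the derived change flag."""
    record = asdict(event)
    record["language_changed"] = event.language_changed
    return json.dumps(record, ensure_ascii=False, sort_keys=True)


if __name__ == "__main__":
    event = ClickEvent(
        action="search",
        interface_language="de",
        previous_interface_language="en",
        query="italy",
        result_count=3,
        results_by_language={"it": 2, "de": 1},
    )
    print(to_log_line(event))
```

Recording the language state on every event, rather than only when it changes, means each log line can be interpreted without reconstructing the session first.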

15 What happens now…
- Soft roll-outs of new releases change site
- Analysis of log data
- Interpretation
- Re-iteration of “useful information” categories
- Re-design user interaction?

16 www.europeana.eu
www.europeanaconnect.eu

