
The Role of Automated Categorization in E-Government Information Retrieval
Tanja Svarre & Marianne Lykke, Aalborg University, DK
ISKO conference, 8th of July, 2013

Agenda
- Background of the study
- Theoretical framework
- Research methods
- Results
- Summary and closing remarks

Background to the search test
- Initiated and partially co-financed by the Danish National IT and Telecom Agency
- Purpose: to investigate how automatic assignment of metadata can contribute to the goal of increased efficiency and effectiveness in (Danish) e-government

Building on indexing/categorization: Early Cranfield tests
Categorization is helpful:
- when the query is vague, broad, general, or ambiguous
- when result rankings are deficient (Käki, 2005)
- in supporting exploratory searches
- in understanding large search sets (Kules & Shneiderman, 2004; 2005)

Research methods
- Case study in the Danish Tax Authorities
- Search test:
  - Controlled lab test
  - Comparison test
  - Professional users
  - Domain specific search tasks
  - Pre test questionnaire
  - Log data
  - Post search interview

Data: Search test
System characteristics:
- Prototype of the corporate intranet
- www.skat.dk content and internal information
2 search systems:
- Free text indexing (SYSTEM A)
- Categorization (SYSTEM B)
32 test persons
3 controlled and 1 natural search task per session, 2 tasks per system

Search test: General findings

Variables                                              System A    System B
Sessions (N)                                           64          64
Queries (N)                                            229         335
Number of terms in queries (averages)                  2.25        2.43
Search filter 'document type' applied (percentages)    43.2        31.6
Number of sessions with reformulations (percentages)   65.6        82.8
Number of reformulations in sessions (averages)        2.58        4.23
Query success (percentages)                            30.6        21.5
Session success (percentages)                          89.1        84.4

Success at task level

Counts, with percentages in parentheses.

Task    System    Sessions succeeded    Queries succeeded
Sim1    A         15 (93.8)             18 (58.1)
Sim1    B         16 (100.0)            23 (33.3)
Sim2    A         15 (93.8)             17 (30.4)
Sim2    B         9 (56.3)              11 (9.7)
Sim3    A         16 (100.0)            20 (27.8)
Sim3    B         16 (100.0)            22 (25.6)
NWT     A         11 (68.8)             15 (21.4)
NWT     B         13 (81.3)             16 (23.9)
Total   A         57 (89.1)             70 (30.6)
Total   B         54 (84.4)             72 (21.5)

At task level, the success of the two systems differs.
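The query success percentages can be checked against the raw counts. The following is a minimal sketch, assuming that the denominators are the per-task query totals given as n values in the task level results (for example n=31 for System A on Sim1); the variable and function names are illustrative only and are not part of the study.

```python
# (successful queries, total queries) per task and system.
# The totals are assumed to be the n values from the task level results.
query_counts = {
    ("Sim1", "A"): (18, 31),
    ("Sim1", "B"): (23, 69),
    ("Sim2", "A"): (17, 56),
    ("Sim2", "B"): (11, 113),
    ("Sim3", "A"): (20, 72),
    ("Sim3", "B"): (22, 86),
    ("NWT", "A"): (15, 70),
    ("NWT", "B"): (16, 67),
    ("Total", "A"): (70, 229),
    ("Total", "B"): (72, 335),
}


def success_rate(successes: int, total: int) -> float:
    """Return the share of successful queries as a percentage, one decimal."""
    return round(100 * successes / total, 1)


rates = {key: success_rate(*counts) for key, counts in query_counts.items()}

for (task, system), rate in rates.items():
    print(f"{task:5s} System {system}: {rate:5.1f} %")
```

Under this assumption the recomputed rates agree with the percentages on the slide, which suggests that query success was measured as successful queries divided by all queries issued for that task and system.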

Task level results

Number of queries per session (averages)

            Sim1           Sim2           Sim3           NWT            Total
System A    1.94 (n=16)    3.50 (n=16)    4.50 (n=16)    4.38 (n=16)    3.58 (n=64)
System B    4.31 (n=16)    7.06 (n=16)    5.38 (n=16)    4.19 (n=16)    5.23 (n=64)
Total       3.13 (n=32)    5.28 (n=32)    4.94 (n=32)    4.28 (n=32)    4.41 (n=128)

Number of terms in queries (averages)

            Sim1           Sim2           Sim3           NWT            Total
System A    2.32 (n=31)    2.39 (n=56)    2.42 (n=72)    1.94 (n=70)    2.25 (N=229)
System B    2.54 (n=69)    2.88 (n=113)   1.79 (n=86)    2.39 (n=67)    2.43 (N=335)
Total       2.47 (n=100)   2.72 (n=169)   2.08 (n=158)   2.16 (n=137)   2.36 (N=564)
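The session-level averages can be related to the totals reported under the general findings. The sketch below assumes that every session starts with one initial query and that each further query counts as one reformulation; the slides do not state this definition, so it is an interpretation, and all names are illustrative.

```python
# Totals reported in the general findings.
sessions = {"A": 64, "B": 64}
queries = {"A": 229, "B": 335}

# Average number of queries issued per session.
queries_per_session = {s: queries[s] / sessions[s] for s in sessions}

# Assumption: the first query of a session is not a reformulation,
# so reformulations per session = queries per session - 1.
reformulations_per_session = {
    s: queries_per_session[s] - 1 for s in sessions
}

for s in sessions:
    print(
        f"System {s}: {queries_per_session[s]:.2f} queries per session, "
        f"{reformulations_per_session[s]:.2f} reformulations per session"
    )
```

With these inputs the sketch yields 3.58 and 5.23 queries per session, and 2.58 and 4.23 reformulations per session, for System A and System B respectively, matching the reported figures.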

Reformulations

Counts, with percentages in parentheses.

                          System A      System B
No reformulations         69 (30.1)     62 (18.5)
Category                  -             114 (34.0)
Query terms               97 (42.4)     47 (14.0)
Document type             28 (12.2)     8 (2.4)
Search operators          8 (3.5)       5 (1.5)
>1 types simultaneously   27 (11.8)     99 (29.6)
Total                     229 (100)     335 (100)

System B (cat.) omissions

                             Number of sessions    Number of successful
                             in system B           sessions system B
System B                     26 (40.6)             22 (40.7)
Combined system B sessions   38 (59.4)             32 (59.3)
Total                        64 (100.0)            54 (100.0)

System B (cat.) omissions
- Highly relevant documents are discovered before a category has been selected
- Relevant documents are located while waiting for B (cat.) to categorize search results
- Categorization is not relevant when few documents are retrieved

Summary
Categorization is useful:
- When employees do not possess extensive knowledge about the task at hand
- In offering new perspectives on the composition of a query
- In understanding facets of queries
When task knowledge is present, categorization is used to support the assumptions of a correct search

Summary
Categorization is omitted when:
- Search results are limited
- Relevant documents are ranked at the top of the results

National IT & Telecom Agency: Findings
- The participants start out with free text indexing and supplement with the other when necessary
- The indexing methods compared are complementary
- To meet the variety of information needs, several indexing methods should be represented simultaneously

Thank you for your attention! Questions?
