The Role of Automated Categorization in E-Government Information Retrieval Tanja Svarre & Marianne Lykke, Aalborg University, DK ISKO conference, 8th of.

1 The Role of Automated Categorization in E-Government Information Retrieval Tanja Svarre & Marianne Lykke, Aalborg University, DK ISKO conference, 8th of July, 2013.

2 Agenda
- Background of the study
- Theoretical framework
- Research methods
- Results
- Summary and closing remarks

3 Background to the search test
- Initiated and partially co-financed by the Danish National IT and Telecom Agency
- Purpose: To investigate how automatic assignment of metadata can contribute to the intention of increased efficiency and effectiveness in (Danish) e-government

4 Building on indexing/categorization: Early Cranfield tests
Categorization is helpful:
- when the query is vague, broad, general, or ambiguous
- when result rankings are deficient (Käki, 2005)
- in supporting exploratory searches
- in understanding large search sets (Kules & Shneiderman, 2004; 2005)

5 Research methods
Case study in the Danish Tax Authorities
Search test:
- Controlled lab test
- Comparison test
- Professional users
- Domain-specific search tasks
- Pre-test questionnaire
- Log data
- Post-search interview

6 Data: Search test
System characteristics:
- Prototype of the corporate intranet
- www.skat.dk content and internal information
2 search systems:
- Free text indexing (SYSTEM A)
- Categorization (SYSTEM B)
32 test persons
3 controlled and 1 natural search task per session, 2 tasks per system

7 Search test: General findings
Variables                                               System A   System B
Sessions (N)                                            64         64
Queries (N)                                             229        335
Number of terms in queries (averages)                   2.25       2.43
Search filter 'document type' applied (percentages)     43.2       31.6
Number of sessions with reformulations (percentages)    65.6       82.8
Number of reformulations in sessions (averages)         2.58       4.23
Query success (percentages)                             30.6       21.5
Session success (percentages)                           89.1       84.4
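
The averages on this slide appear to be linked: dividing each system's query total by its 64 sessions gives the queries per session, and subtracting the first query of each session gives the reported reformulation averages. A minimal Python sketch of that arithmetic follows. It assumes every session starts with exactly one initial query, which the slide does not state, and the function and variable names are illustrative only.

```python
# Counts taken from the slide. The "one initial query per session"
# reading is an assumption used only for this consistency check.
SESSIONS = {"A": 64, "B": 64}
QUERIES = {"A": 229, "B": 335}


def queries_per_session(system: str) -> float:
    """Average number of queries issued per session."""
    return QUERIES[system] / SESSIONS[system]


def reformulations_per_session(system: str) -> float:
    """Average queries per session minus the assumed initial query."""
    return queries_per_session(system) - 1


for system in ("A", "B"):
    print(
        f"System {system}: "
        f"{queries_per_session(system):.2f} queries per session, "
        f"{reformulations_per_session(system):.2f} reformulations per session"
    )
```

Under that assumption the results round to 2.58 and 4.23, matching the reformulation averages reported for System A and System B.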

8 Success at task level
         Session succeeded          Query succeeded
Task     SysA         SysB          SysA         SysB
Sim1     15 (93.8)    16 (100.0)    18 (58.1)    23 (33.3)
Sim2     15 (93.8)    9 (56.3)      17 (30.4)    11 (9.7)
Sim3     16 (100.0)   16 (100.0)    20 (27.8)    22 (25.6)
NWT      11 (68.8)    13 (81.3)     15 (21.4)    16 (23.9)
Total    57 (89.1)    54 (84.4)     70 (30.6)    72 (21.5)
At task level the success of the two systems differs
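
The query success percentages on this slide can be reproduced by dividing the successful queries by the queries issued for each task and system. The sketch below is a hedged illustration in Python: it assumes the denominators are the per-task query counts (the n values) reported on the following slide, and the names `SUCCESSFUL`, `ISSUED`, `success_rate` and `total_success_rate` are hypothetical.

```python
# Successful-query counts from this slide. The query totals per task and
# system are assumed to be the n values reported on the following slide.
SUCCESSFUL = {
    "Sim1": {"A": 18, "B": 23},
    "Sim2": {"A": 17, "B": 11},
    "Sim3": {"A": 20, "B": 22},
    "NWT": {"A": 15, "B": 16},
}
ISSUED = {
    "Sim1": {"A": 31, "B": 69},
    "Sim2": {"A": 56, "B": 113},
    "Sim3": {"A": 72, "B": 86},
    "NWT": {"A": 70, "B": 67},
}


def success_rate(task: str, system: str) -> float:
    """Percentage of successful queries for one task and system."""
    return round(100 * SUCCESSFUL[task][system] / ISSUED[task][system], 1)


def total_success_rate(system: str) -> float:
    """Percentage of successful queries across all four tasks."""
    succeeded = sum(counts[system] for counts in SUCCESSFUL.values())
    issued = sum(counts[system] for counts in ISSUED.values())
    return round(100 * succeeded / issued, 1)


for task in SUCCESSFUL:
    print(task, success_rate(task, "A"), success_rate(task, "B"))
print("Total", total_success_rate("A"), total_success_rate("B"))
```

The largest gap is in Sim2, where System B needed 113 queries for 11 successes, which is what pulls its overall query success below that of System A.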

9 Task level results
Queries per session (averages; n = sessions):
           Sim1           Sim2           Sim3           NWT            Total
System A   1.94 (n=16)    3.50 (n=16)    4.50 (n=16)    4.38 (n=16)    3.58 (n=64)
System B   4.31 (n=16)    7.06 (n=16)    5.38 (n=16)    4.19 (n=16)    5.23 (n=64)
Total      3.13 (n=32)    5.28 (n=32)    4.94 (n=32)    4.28 (n=32)    4.41 (n=128)
Number of terms in queries (averages; n = queries):
           Sim1           Sim2           Sim3           NWT            Total
System A   2.32 (n=31)    2.39 (n=56)    2.42 (n=72)    1.94 (n=70)    2.25 (N=229)
System B   2.54 (n=69)    2.88 (n=113)   1.79 (n=86)    2.39 (n=67)    2.43 (N=335)
Total      2.47 (n=100)   2.72 (n=169)   2.08 (n=158)   2.16 (n=137)   2.36 (N=564)

10 Reformulations
Type of reformulation      SysA          SysB
No reformulations          69 (30.1)     62 (18.5)
Category                   -             114 (34.0)
Query terms                97 (42.4)     47 (14.0)
Document type              28 (12.2)     8 (2.4)
Search operators           8 (3.5)       5 (1.5)
>1 types simultaneously    27 (11.8)     99 (29.6)
Total                      229 (100)     335 (100)
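
The percentages in parentheses are each reformulation type's share of all queries issued in that system. As a rough sketch in Python, with the counts copied from the slide and the names `REFORMULATIONS` and `shares` chosen purely for illustration:

```python
# Query counts per reformulation type, copied from the slide.
# System A has no "Category" row because it offers no categorization.
REFORMULATIONS = {
    "A": {
        "No reformulations": 69,
        "Query terms": 97,
        "Document type": 28,
        "Search operators": 8,
        ">1 types simultaneously": 27,
    },
    "B": {
        "No reformulations": 62,
        "Category": 114,
        "Query terms": 47,
        "Document type": 8,
        "Search operators": 5,
        ">1 types simultaneously": 99,
    },
}


def shares(system: str) -> dict[str, float]:
    """Share of each reformulation type as a percentage of all queries."""
    counts = REFORMULATIONS[system]
    total = sum(counts.values())
    return {kind: round(100 * n / total, 1) for kind, n in counts.items()}


for system in ("A", "B"):
    print(f"System {system} ({sum(REFORMULATIONS[system].values())} queries)")
    for kind, share in shares(system).items():
        print(f"  {kind}: {share}%")
```

Read this way, category selection alone accounts for about a third of the System B queries, while pure query-term reformulation drops from 42.4% in System A to 14.0% in System B.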

11 System B (cat.) omissions
                              Number of sessions in system B   Number of successful sessions system B
System B                      26 (40.6)                        22 (40.7)
Combined system B sessions    38 (59.4)                        32 (59.3)
Total                         64 (100.0)                       54 (100.0)

12 System B (cat.) omissions
- Highly relevant documents are discovered before a category has been selected
- Relevant documents are located while waiting for B (cat.) to categorize search results
- Categorization is not relevant when few documents are retrieved

13 Summary
Categorization is useful:
- When employees do not possess extensive knowledge about the task at hand
- In offering new perspectives on the composition of a query
- In understanding facets of queries
When task knowledge is present, categorization is used to support the assumptions of a correct search

14 Summary
Categorization is omitted when:
- Search results are limited
- Relevant documents are ranked at the top of the results

15 National IT & Telecom Agency: Findings
- The participants start out with free text indexing and supplement it with categorization when necessary
- The indexing methods compared are complementary
- To meet the variety of information needs, several indexing methods should be represented simultaneously

16 Thank you for your attention!

