2 Motivation
- Underexplored subsets of Web content
  - Limited scope and richness of indexed content, which may not include relevant components of the deep web (temporary pages, pages behind forms, etc.)
  - Basic search interfaces, where there is little collaboration or history beyond independent keyword search
- Complex, task-based, dynamic search
  - Temporal dependency
  - Rich interactions
  - Complex, evolving information needs
  - Professional users
  - A wide range of search strategies
4 Why “Dynamic Domain”?
- Domain-specific search
- Deep Web
- Underexplored data
- Professional users
- Complex information needs
5 Dynamic Information Retrieval
- Dynamic Documents: temporal change of documents, Deep Web, emerging topics
- Dynamic Users: users change behavior over time, user history
- Dynamic Queries: rich user-system interaction through queries
- Dynamic Relevance: time, geolocation and other contextual change, change in user-perceived relevance
- Dynamic Information Needs: knowledge evolves over time
- Domain-specific SE
6 Our Goal
The TREC Dynamic Domain Track envisions a new paradigm, where one can quickly and thoroughly search and organize a subset of the Internet relevant to one's interests.
- We aim to encourage new research and new systems that
  - provide fast, flexible, and efficient access to domain-specific content
  - provide valuable insight into a domain that previously remained unexplored
  - address shortcomings of centralized Web search
- We develop evaluation methodologies for
  - systems that discover, organize, and present domain-relevant content
  - technologies for cross-domain adaptation
8 Domains
- Counterfeit Pharmaceuticals (Pharma)
  - Corpus: 30k forum posts from 5-10 forums (total ~300k posts)
  - Question: Which users are working together to sell illicit goods?
- Ebola
  - Corpus: one million tweets; 300k docs from in-country web sites (mostly official sites)
  - Question: Who is doing what and where?
- Local Politics
  - Corpus: 300k docs from local political groups in the Pacific Northwest and British Columbia
  - Question: Who is campaigning for what and why?
9 Domain I: Counterfeit pharmaceuticals
- Sell ineffective or deadly medications
- Sell addictive drugs
- Indirectly fund botnets and hackers
10 Online Pharmaceutical Value Chain
In the pharma arena, that chain is awfully complex. Some of it we can observe with tools like those being developed here. Some we cannot, and so we need to draw on other sources to inform our interpretation of what we see here.
11 Underground Forum Ads
- Learn about major affiliation programs
  - Handles of employees and connections
  - Activities
12 Domain II – Ebola (Crisis IR)
- Ongoing crisis
- 3.3 million tweets over five days for GPS-tagged conversations about Ebola around the globe
- 300k docs from in-country web sites (mostly official sites)
- A set of questions:
  - Where (counties / country) are personalities organizing support of Ebola Viral Disease (EVD) success or perceived failure?
  - What is causing the population to report or not report cases of flu-like symptoms within current or future Ebola Treatment Unit (ETU) sites?
  - How will the local population conduct EVD awareness based on religious, ethnic and tribal education?
  - Where will individuals attempt to garner support and build trust within Liberia?
13 Domain III – Local Politics
- Public personas
  - Elected officials
  - School boards
  - First Nation activism
- KBA StreamCorpus
  - 19 months of timestamped news, blogs, forums
  - >500M tagged by quality NER (BBN Serif)
- Investigating re-using the KBA query entities
  - Part of ground truthing is already complete
  - Subtopic truthing still required
  - 86 online personas (people) from the Seattle – Vancouver area
15 Task
Interactive search over multiple runs
- Starting point: the system is given a search query
- Iterate:
  - System returns a ranked list of 5 documents
  - API returns relevance judgments
  - Go to the next iteration of retrieval
  - until done (the system decides when to stop)
- The goal of the system is to find relevant information for each topic as soon as possible
- One-shot ad-hoc search is included as a special case: the system decides to stop after iteration one
- Open question: the number of documents returned per iteration (1, 5, or 10)
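The iteration on the Task slide can be sketched as a small driver loop. This is a minimal sketch under stated assumptions, not the track's actual interface: `rank`, `judge`, and `should_stop` are hypothetical callables standing in for a participant's ranker, the judgment API, and the system's own stopping rule, and the toy corpus and grades are invented for illustration.

```python
from typing import Callable, Dict, List, Tuple


def interactive_search(
    query: str,
    rank: Callable[[str, Dict[str, int], List[str]], List[str]],
    judge: Callable[[List[str]], Dict[str, int]],
    should_stop: Callable[[Dict[str, int]], bool],
    batch_size: int = 5,
    max_iterations: int = 100,
) -> Tuple[List[str], Dict[str, int]]:
    """Run the retrieve/judge loop until the system decides to stop."""
    seen: List[str] = []            # documents already returned
    feedback: Dict[str, int] = {}   # doc id -> relevance grade received so far
    for _ in range(max_iterations):
        # The ranker may use all feedback gathered so far.
        batch = rank(query, feedback, seen)[:batch_size]
        if not batch:
            break
        feedback.update(judge(batch))
        seen.extend(batch)
        if should_stop(feedback):
            break
    return seen, feedback


# Toy setup (hypothetical corpus and grades, for illustration only).
corpus = [f"d{i}" for i in range(1, 13)]
truth = {"d2": 3, "d7": 2}


def toy_rank(query, feedback, seen):
    # Return unseen documents in corpus order; a real ranker would re-rank.
    return [d for d in corpus if d not in seen]


def toy_judge(batch):
    return {d: truth.get(d, 0) for d in batch}


def toy_stop(feedback):
    # Stop once the accumulated grade reaches an arbitrary threshold.
    return sum(feedback.values()) >= 5


seen, feedback = interactive_search("example topic", toy_rank, toy_judge, toy_stop)
print(len(seen), feedback["d7"])
```

Setting `max_iterations=1` reduces the loop to the one-shot ad-hoc case mentioned on the slide, and `batch_size` is the parameter under debate (1, 5, or 10).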
16 Topics
- Assessors know topic descriptions
- Topics contain multiple subtopics
  - Example topic: Chief Sean Atlio
    - S1: Who did he meet with?
    - S2: Issues he is pushing
    - S3: What crises are affecting his tribe?
- The systems are given the topic/query to start the search
  - Not the subtopics
- Open question: can the importance of each subtopic be weighted?
17 Multiple Runs of Relevance Judgments
- Graded relevance judgments: 0, 1, 2, 3
- Multiple runs of relevance judgments
- Suppose a topic with 3 subtopics
  - Run 1: System returns d1, d2, d3, d4, d5
    - Relevance judgments:
      - d1: s1 4, s2 2, s3 0
      - d2: s1 1, s2 0, s3 0
      - d3: s1 0, s2 0, s3 0
      - d4: s1 0, s2 0, s3 2
      - d5: s1 0, s2 0, s3 3
  - Run 2: System returns another set of d1, d2, d3, d4, d5
    - Another set of relevance judgments
  - …
  - Run N
18 Outline
- Introduction
- Domains
- Task
- Example Topics
- Evaluation
- Timeline
- Discussion
19 Pharma
Nick Danger, aka HellRaiser
- Who is he selling to?
- What is he selling?
- What are other aliases in other forums?
- Tools and techniques
- Motivations?
20 Ebola
Where are untrained health professionals going to provide care?
- Find health care locations
- Figure out how to tell an untrained health professional from a trained one
- Identify individuals
- Track them
21 Local politics
Chief Sean Atlio (continue from KBA)
- Who did he meet with?
- Issues he is pushing
- What crises are affecting his tribe?
- Background knowledge (childhood, etc.)
- Protests or events being planned
23 Evaluation metrics
- Find as much relevant information as possible, as fast as possible
- The system decides when to stop
- Metrics handle relevance, novelty, time/effort, and task completion
- Multi-dimensional evaluation
- Candidate evaluation metrics:
  - Cube Test (Luo et al., CIKM 2013)
  - u-ERR: cascades as user gathers results
  - Session nDCG (Kanoulas et al., SIGIR 2011)
24 Evaluation - Cube Test
Task Cube [Luo et al. CIKM 2013]
[Figure: an empty task cube for a search task with 6 subtopics]
25 Evaluation - Cube Test [Luo et al. CIKM 2013]
- An empty task cube for a search task with multiple subtopics
- A stream of “document water” fills the task cube
- A newly arriving relevant document raises the water level in all of its relevant subtopics
- The total height of the water in one cuboid represents the accumulated relevance gain for a subtopic
- There is a cap on the gain
- The total volume in the task cube is the total gain
- Cube Test (CT) measures the rate at which a search system fills up the task cube
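The water-filling idea can be illustrated with a few lines of Python. This is a deliberately simplified sketch, not the formula published by Luo et al.: it assumes every subtopic cuboid has unit base area, applies no discount for repeated coverage, measures time as the number of documents examined, and uses an assumed cap of 4 per subtopic. The grades are the Run 1 example judgments from the relevance-judgment slide.

```python
from typing import Dict, List


def simplified_cube_test(
    judged_docs: List[Dict[str, float]],
    subtopics: List[str],
    cap: float,
) -> float:
    """Rate at which judged documents fill a task cube (simplified)."""
    heights = {s: 0.0 for s in subtopics}  # water level per subtopic cuboid
    for doc in judged_docs:
        for subtopic, gain in doc.items():
            # Each relevant document raises the level, but never above the cap.
            heights[subtopic] = min(cap, heights[subtopic] + gain)
    volume = sum(heights.values())  # unit base area, so volume == total height
    time_spent = len(judged_docs)   # assumption: one time unit per document
    return volume / time_spent if time_spent else 0.0


# Run 1 judgments from the slides: one dict per returned document.
run1 = [
    {"s1": 4, "s2": 2, "s3": 0},  # d1
    {"s1": 1, "s2": 0, "s3": 0},  # d2
    {"s1": 0, "s2": 0, "s3": 0},  # d3
    {"s1": 0, "s2": 0, "s3": 2},  # d4
    {"s1": 0, "s2": 0, "s3": 3},  # d5
]

score = simplified_cube_test(run1, ["s1", "s2", "s3"], cap=4.0)
print(score)  # 2.0
```

Here s1 and s3 both hit the cap (4), s2 reaches 2, so the filled volume is 10 after 5 documents. A system that returned d1 and d5 first would fill the same volume sooner, which is the behavior the metric is meant to reward.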
26 Unexpected Expected Reciprocal Rank (u-ERR)
Variant of ERR for multiple search iterations with feedback:
1. Submit query to search engine
2. Receive ranked list of results
3. Start reading through the list
4. User examines position n
5. If user finds new knowledge:
     Update profile
     Go to 1 with updated topic as query
   else:
     n += 1
     Go to 4
u-ERR = 1 / (expected list position of surprise)
Figure of merit: depth in the list where the user discovers new knowledge
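One possible reading of the figure of merit is sketched below. The slide does not say how the expectation is computed, so this sketch makes its own assumptions: each ranked list is represented as a list of booleans marking whether the document at that position gives the user new knowledge, the expected position is estimated as the mean depth of the first surprise across lists, and lists with no surprise at all are skipped. The function names are hypothetical.

```python
from typing import List, Optional


def first_surprise_position(novel_flags: List[bool]) -> Optional[int]:
    """Read down the list (steps 3-5) and return the 1-based depth of the
    first document that provides new knowledge, or None if there is none."""
    for n, is_new in enumerate(novel_flags, start=1):
        if is_new:
            return n
    return None


def u_err_estimate(ranked_lists: List[List[bool]]) -> float:
    """Reciprocal of the mean first-surprise depth over all iterations."""
    positions = [
        p for p in map(first_surprise_position, ranked_lists) if p is not None
    ]
    if not positions:
        return 0.0  # assumption: no surprise anywhere scores zero
    expected_position = sum(positions) / len(positions)
    return 1.0 / expected_position


# Three search iterations; True marks a document with new knowledge.
iterations = [
    [False, True, False],   # surprise at depth 2
    [True],                 # surprise at depth 1
    [False, False, True],   # surprise at depth 3
]
print(u_err_estimate(iterations))  # 0.5
```

Under these assumptions a system scores higher when new knowledge appears near the top of each list, which matches the "depth in the list" intuition on the slide.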
27 Timeline
- TREC Call for Participation: January 2015
- Data available: March
- Detailed guidelines: April/May
- Topics, tasks available: June
- Systems do their thing: June-July
- Evaluation: August
- Results to participants: September
- Conference: November 2015
28 Why you should participate
- Unique, underexplored research direction
- Good for academics
- New research
- Great funding opportunities
- Easy and exciting!
29 Familiar, Easy / Hard = Exciting
- Unit of retrieval = document
- Corpus tiny: 1-2 M docs
- Specific domains with rich, interesting content features
- Content is cleansed, deduplicated, UTF-8, NER tagged, sentence parsed
- Iterative, explicit relevance judgment (feedback) from user (API)
- Three different domains
- Systems submit ranked lists in small batches of five at a time
- Relevance judgment consists of:
  - On topic: True or False
  - Passage(s):
    - Char offsets
    - Subtopic ID
    - Graded relevance judgment
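The judgment fields listed on this slide could be modeled roughly as follows. This is an illustrative sketch only: the class and field names are hypothetical, and the nesting (each passage carrying its own offsets, subtopic ID, and grade) is an assumed reading of the slide's indentation, not a documented wire format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PassageJudgment:
    start: int          # character offset where the passage begins
    end: int            # character offset where the passage ends
    subtopic_id: str    # subtopic the passage is relevant to
    grade: int          # graded relevance judgment for this passage


@dataclass
class DocumentJudgment:
    doc_id: str
    on_topic: bool
    passages: List[PassageJudgment] = field(default_factory=list)

    def best_grade_per_subtopic(self) -> Dict[str, int]:
        """Highest passage grade seen for each subtopic in this document."""
        best: Dict[str, int] = {}
        for p in self.passages:
            best[p.subtopic_id] = max(best.get(p.subtopic_id, 0), p.grade)
        return best


# Invented example values.
example = DocumentJudgment(
    doc_id="doc-001",
    on_topic=True,
    passages=[
        PassageJudgment(120, 310, "S1", 3),
        PassageJudgment(900, 1020, "S1", 1),
        PassageJudgment(1500, 1580, "S3", 2),
    ],
)
print(example.best_grade_per_subtopic())  # {'S1': 3, 'S3': 2}
```

Collapsing passages to one grade per subtopic is just one convenient view for a system that tracks which subtopics it has already covered.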
31 References
- Jiyun Luo, Christopher Wing, Hui Yang, and Marti Hearst. The Water Filling Model and The Cube Test: Multi-Dimensional Evaluation for Professional Search. CIKM 2013.
- Evangelos Kanoulas, Ben Carterette, Paul D. Clough, and Mark Sanderson. Evaluating Multi-Query Sessions. SIGIR 2011.
32 Thank you
TREC Dynamic Domain Website:
Google group: https://groups.google.com/forum/#!forum/trec-dd/
33 Domain I: Counterfeit pharmaceuticals
- Simple product space (though various dosages)
  - Viagra
  - Cialis
  - Vicodin
  - Percocet
- Complex online advertising space
  - Thousands of online pharmacy storefronts
  - Spam advertising
34 Domain-specific Search
Web Search:
- everyday users
- one-shot query
- large user query logs
- relevance at document level
- a single, straightforward information need
- keyword search
Domain-specific Search:
- professional searchers
- a sequence of queries or actions (e.g. click a node to browse)
- rich interaction data within the session
- stricter requirements for relevance: evidence
- multiple, complex and task-based information needs
- a wide range of search strategies
35 An Exploratory Process
Find what city and state Dulles airport is in, what shuttles, ride-sharing vans, and taxi cabs connect the airport to other cities, what hotels are close to the airport, where to find cheap off-airport parking, and what metro stops are close to the Dulles airport.
[Figure: information need, user, search engine]
37 Data
- Gathered Aug 1 – Oct 31, 2010
- 7 URL/spam + 5 botnet feeds
  - 968M URLs
  - 17M domains
- Crawled domains for 98% of URLs with 1000s of Firefox instances
  - Significant IP diversity (to overcome blacklisting)
- ~200 purchases from all major programs
- What was required to do this at the scale you did?
- What would be required to do this on an ongoing basis as part of Memex?
- Why purchasing cannot be scaled: the security measures they put in place (they call, identity backstops, etc.)
38 Search Engines and Pharma
But the real problem is even worse...
- Ephemeral websites: multiple URLs all link to one site
- Compromised websites
  - Hacked sites redirect to pharmacy stores
  - Need to ID underlying sites and hacking patterns
- Crawler evasion
  - Cloaking to only show the site to customers
  - Simple crawlers won’t get to sales sites
39 Online Pharmaceutical Economy (Customer)
As in many settings, this means tracking the path from user to underlying entity: mapping what the user does in cyberspace to something that happens in the real world.