Data Frame Augmentation of Free Form Queries for Constraint Based Document Filtering Andrew Zitzelberger.


1 Data Frame Augmentation of Free Form Queries for Constraint Based Document Filtering Andrew Zitzelberger

2 Problem

3 Constraint Based Queries

4 Queries
Test Queries
1) Find me a Wii game.
2) Find me a Honda for under 15 thousand dollars.
3) Roller Coaster more than 150 feet high
4) mountains at least 15K feet
5) games under $25
6) mountains less than 4 km
7) ps games < $40
8) coasters longer than 1000 feet
9) car for under 5 grand newer than 1990 with less than 115K miles
10) more than 15K miles under 5 grand newer than 2004

5 Keywords + Semantics
Semantic queries are computationally expensive
Keyword queries are fast and simple
  o People are used to keyword queries
Synergistic solution:
  o extract numerical constraints from the query
  o use keywords to quickly narrow the search space
  o use constraints as a filter

6 Data Frames
Price
  internal representation: Double
  external representation: \$[1-9]\d{0,2}(,\d{3})*|......
  right units: (K)?\s*(cents|dollars|[Gg]rand|...)
  canonicalization method: toUSDollars
  comparison methods:
    LessThan(p1: Price, p2: Price) returns (Boolean)
      external representation: (less than|<|under|...)\s*{p2}|......
end
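
The Price data frame above can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the slide elides most of its patterns ("..."), so the regular expression below is a simplified stand-in, and the names PRICE_RE, to_us_dollars, find_prices, and less_than are hypothetical rather than taken from the presentation.

```python
import re

# Simplified stand-in for the slide's elided recognizers. It covers forms such
# as "$6,000", "$25", "6 grand", "15K dollars", and "15 thousand dollars".
PRICE_RE = re.compile(
    r"\$\s*(?P<dollars>\d{1,3}(?:,\d{3})+|\d+(?:\.\d+)?)"
    r"|(?P<num>\d+(?:\.\d+)?)\s*(?P<mult>K|thousand)?\s*"
    r"(?P<unit>dollars|grand|cents)\b",
    re.IGNORECASE,
)

# Scale factors used by the canonicalization step (assumed values).
UNIT_SCALE = {"dollars": 1.0, "grand": 1000.0, "cents": 0.01}


def to_us_dollars(match):
    """Canonicalize one recognized price to a float amount in US dollars."""
    if match.group("dollars") is not None:
        return float(match.group("dollars").replace(",", ""))
    value = float(match.group("num"))
    if match.group("mult"):
        value *= 1000.0
    return value * UNIT_SCALE[match.group("unit").lower()]


def find_prices(text):
    """Return every price recognized in the text, in canonical form."""
    return [to_us_dollars(m) for m in PRICE_RE.finditer(text)]


def less_than(p1, p2):
    """Comparison method corresponding to LessThan(p1, p2) on the slide."""
    return p1 < p2
```

Keeping the recognizer, the canonicalization, and the comparison together mirrors the slide's layout: a value found in a document and a bound found in a query pass through the same conversion before they are compared.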

7 Data Frame Library

8 Free Form Query
Car under 6 grand newer than 1990 with less than 115K miles

9 Step 1: Condition Extraction
Car under 6 grand newer than 1990 with less than 115K miles
Extracted Conditions
  o (Price < 6000)
  o (Year > 1990)
  o (Distance < 115000)
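
A minimal sketch of the condition extraction step, assuming a small hand-written rule table in place of the full data frame library. The rule patterns, the Condition tuple, and extract_conditions are hypothetical names introduced for illustration, and only the Price, Year, and Distance forms that appear in the example query are handled.

```python
import re
from collections import namedtuple

Condition = namedtuple("Condition", "kind op value")

# Building blocks for the assumed comparison phrases and numeric values.
NUM = r"(?P<num>\d+(?:\.\d+)?)\s*(?P<k>K\b)?"
LESS = r"(?:less than|under|<)\s*"
MORE = r"(?:more than|over|>)\s*"

# Each rule: (data frame name, operator, recognizer for "operator + value").
RULES = [
    ("Price", "<", re.compile(LESS + NUM + r"\s*(?P<unit>grand|dollars)\b", re.I)),
    ("Year", ">", re.compile(r"newer than\s*(?P<num>\d{4})\b", re.I)),
    ("Distance", "<", re.compile(LESS + NUM + r"\s*(?P<unit>miles)\b", re.I)),
    ("Distance", ">", re.compile(MORE + NUM + r"\s*(?P<unit>miles)\b", re.I)),
]

# Units that scale the number during canonicalization.
SCALE = {"grand": 1000.0}


def extract_conditions(query):
    """Return the conditions found in a free-form query, in rule order."""
    found = []
    for kind, op, pattern in RULES:
        for m in pattern.finditer(query):
            groups = m.groupdict()
            value = float(groups["num"])
            if groups.get("k"):  # "115K" means 115 thousand
                value *= 1000.0
            value *= SCALE.get((groups.get("unit") or "").lower(), 1.0)
            found.append(Condition(kind, op, value))
    return found


for cond in extract_conditions(
    "Car under 6 grand newer than 1990 with less than 115K miles"
):
    print(f"({cond.kind} {cond.op} {cond.value:g})")
```

Requiring a unit after each number is what keeps "less than 115K miles" from being read as a price in this sketch.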

10 Step 2: Remove Condition Values
Car under newer than with less than

11 Step 3: Remove Stopwords
Car
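
Steps 2 and 3 can be sketched as two small text transformations. The value pattern and the stopword list below are assumptions chosen to reproduce the example on the slides; the presentation does not say which stopword list was used, and strip_values and remove_stopwords are hypothetical names.

```python
import re

# Assumed pattern for a condition value: a number, an optional "K" multiplier,
# and an optional unit. The real system would reuse the data frame recognizers.
VALUE_RE = re.compile(
    r"\$?\d[\d,.]*(?:\s*K\b)?(?:\s*(?:grand|dollars|miles|feet|km)\b)?",
    re.IGNORECASE,
)

# Hypothetical stopword list. Comparison words count as stopwords here because
# the slides show them being dropped in Step 3.
STOPWORDS = {
    "a", "at", "find", "for", "least", "less", "me", "more",
    "newer", "than", "the", "under", "with",
}


def strip_values(query):
    """Step 2: remove the condition values, keeping the surrounding words."""
    return " ".join(VALUE_RE.sub(" ", query).split())


def remove_stopwords(text):
    """Step 3: drop stopwords, leaving the terms for the keyword search."""
    return " ".join(w for w in text.split() if w.lower() not in STOPWORDS)


query = "Car under 6 grand newer than 1990 with less than 115K miles"
without_values = strip_values(query)
keywords = remove_stopwords(without_values)
print(without_values)
print(keywords)
```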

12 Step 4: Perform Keyword Search

13 Step 5: Filter Document on Constraints
Keep page if every constraint is satisfied by at least one extracted value
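
The filtering rule can be written as an "all constraints, any value" check. This sketch assumes the constraints are (kind, operator, bound) triples and that the values extracted from a page are already canonicalized and grouped by data frame name; the keep_page function and both data layouts are hypothetical.

```python
import operator

# Comparison operators that a constraint may use.
OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def keep_page(constraints, extracted):
    """Keep a page only if every constraint is met by at least one value.

    constraints: list of (kind, op, bound), e.g. ("Price", "<", 6000.0)
    extracted:   dict mapping kind to the values found on the page
    """
    return all(
        any(OPS[op](value, bound) for value in extracted.get(kind, []))
        for kind, op, bound in constraints
    )


constraints = [
    ("Price", "<", 6000.0),
    ("Year", ">", 1990.0),
    ("Distance", "<", 115000.0),
]
page = {"Price": [4500.0], "Year": [1998.0], "Distance": [98000.0, 250000.0]}
print(keep_page(constraints, page))
```

Because one satisfying value per constraint is enough, an unrelated number on the page that happens to fall within a bound will also let the page through.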

14 Experimental Setup
300 web documents
  o 100 car+trucks pages from http://provo.craigslist.org
  o 100 video gaming pages from http://provo.craigslist.org
  o 50 mountain pages from http://en.wikipedia.org
  o 50 roller coaster pages from http://en.wikipedia.org
10 queries
  o 8 with usable conditions
2 data sets
  o test-development
  o blind test

15 Results Summary
Precision increase for 56% of queries
  o 75% for test-dev, 50% for blind-test
Precision never worse than keyword query
Most effective for short, focused documents

16 Discussion
Issues:
1. inadequate narrowing or ranking of search space
2. noise caused by other numbers
Distance < 115000

17 Future Work
Scalability
  o Indexing data frame extracted terms
Precision vs Recall trade-offs
Pay-as-you-go search construction

18 Related Work
Question-Answering Systems
Keyword search over databases and semantic stores

19 Questions?

20 Results (Test-Dev Set)

21 Results (Blind Test Set)

