Query Segmentation and Structured Annotation via NLP Rifat Reza Joye Panagiotis Papadimitriou.


1 Query Segmentation and Structured Annotation via NLP
Rifat Reza Joye, Panagiotis Papadimitriou

2 Problem
Caloricious.com:
– Semantic search engine for food items
Free-text queries over structured data
– Query: gluten free high protein bars
– Data: each food item is a database record with attributes name, brand, category, nutrients, allergens, ...
Query segmentation and structured annotation
– gluten free → ALLERGEN
– high protein → NUTRIENT
– bars → CATEGORY

3 1st Approach: MEMM with Synthetic Training Data
Seems like an instance of NER
Problem: no labeled queries to train the MEMM
Solution: generate synthetic labeled queries
– Query study of 100 queries
  96% of queries contain 1–3 segments
  In 98% of queries, one of the segments refers to Name, Category, or Brand
– Algorithm
  Pick a food item at random
  Pick 1–3 attributes and generate a query
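The generation algorithm above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy records, the names FOOD_ITEMS, synthetic_query and to_token_labels, the uniform choice of segment count, and the decision to always include one of Name/Category/Brand are all assumptions made for the example.

```python
import random

# Hypothetical toy records; attribute names follow the slides, values are made up.
FOOD_ITEMS = [
    {"name": "peanut butter crunch bar", "brand": "acme", "category": "bars",
     "nutrient": "high protein", "allergen": "gluten free"},
    {"name": "vanilla almond milk", "brand": "nutco", "category": "milk",
     "nutrient": "low fat", "allergen": "dairy free"},
]

CORE = ("name", "category", "brand")   # the query study found one of these in 98% of queries
OTHER = ("nutrient", "allergen")


def synthetic_query(item, rng):
    """Return (segment_text, LABEL) pairs for one synthetic query about `item`."""
    n_segments = rng.randint(1, 3)            # assumption: uniform over 1..3 segments
    core = rng.choice(CORE)                   # simplification: always include a core attribute
    pool = [a for a in CORE + OTHER if a != core]
    attrs = [core] + rng.sample(pool, n_segments - 1)
    rng.shuffle(attrs)                        # segment order in the query is randomized
    return [(item[a], a.upper()) for a in attrs]


def to_token_labels(segments):
    """Flatten labeled segments into per-token (token, LABEL) pairs for a sequence tagger."""
    return [(tok, label) for text, label in segments for tok in text.split()]


if __name__ == "__main__":
    rng = random.Random(0)
    item = rng.choice(FOOD_ITEMS)             # "pick a food item at random"
    segments = synthetic_query(item, rng)
    print(" ".join(text for text, _ in segments))
    print(to_token_labels(segments))
```

The per-token pairs are the kind of input a sequence model such as an MEMM would be trained on; a fuller version would sample the segment count and attribute mix from the distribution observed in the query study.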

4 2nd Approach: Segmentation & MaxEnt Classification
Query Segmentation
– Train a language model on the text of the structured data
– Use the model to find segment probabilities
– Find the maximum-likelihood (ML) segmentation through dynamic programming (DP)
Segment Annotation
– Annotate each segment with an attribute using a MaxEnt classifier
– Training: for each attribute, training examples come from the corresponding entries of the database products
Example: gluten free high protein bars → gluten free | high protein | bars
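The segmentation step can be illustrated with a small dynamic program. The slides do not say which language model was used, so this sketch swaps in a simple phrase-frequency model built from attribute values; the function names, the max_len limit, the floor penalty for unseen segments, and the toy data are all assumptions. The MaxEnt annotation step is not covered here.

```python
import math
from collections import Counter


def train_segment_model(field_values, max_len=3):
    """Count every n-gram (up to max_len tokens) occurring in the attribute values."""
    counts = Counter()
    for value in field_values:
        toks = value.lower().split()
        for i in range(len(toks)):
            for j in range(i + 1, min(i + max_len, len(toks)) + 1):
                counts[tuple(toks[i:j])] += 1
    return counts, sum(counts.values())


def segment_logprob(segment, counts, total, floor=1e-6):
    """Log-probability of a segment; unseen segments pay a per-token floor penalty."""
    c = counts.get(tuple(segment), 0)
    if c:
        return math.log(c / total)
    return len(segment) * math.log(floor)


def segment_query(query, counts, total, max_len=3):
    """Return the highest-scoring segmentation of `query` under the model above."""
    toks = query.lower().split()
    n = len(toks)
    best = [0.0] + [-math.inf] * n   # best[i]: best log-score for the first i tokens
    back = [0] * (n + 1)             # back[i]: start index of the last segment ending at i
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            score = best[start] + segment_logprob(toks[start:end], counts, total)
            if score > best[end]:
                best[end], back[end] = score, start
    segments, end = [], n
    while end > 0:                   # follow back-pointers from the end of the query
        segments.append(" ".join(toks[back[end]:end]))
        end = back[end]
    return segments[::-1]


if __name__ == "__main__":
    # Hypothetical attribute values standing in for the product database text.
    values = ["gluten free", "gluten free", "dairy free", "high protein",
              "high protein", "low fat", "bars", "granola bars", "milk"]
    counts, total = train_segment_model(values)
    print(segment_query("gluten free high protein bars", counts, total))
    # -> ['gluten free', 'high protein', 'bars']
```

Because each segment contributes a log-probability below zero, the search prefers few segments when the longer phrases were seen in the data, and falls back to shorter pieces when they were not.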

5 Results

6 Conclusions – Future Work
– The combination of language model, dynamic programming, and MaxEnt classification provides very good accuracy without labeled data
– It would be interesting to compare with NER on a big labeled set
– We also plan to compare with the state-of-the-art algorithm in the context of a research submission

7 More Results… Evangelos March 12, 2011 @ 9.14am 19.5 inches 6lbs 11oz

