Published by Micheal Strait. Modified over 10 years ago.
1
Foodie
A study in collecting and parsing recipes…
Marc Greenberg – mgreenberg@cs.usfca.edu
2
I'm Hungry!...
Enter ingredients at your disposal
Foodie lists recipe options
Rate recipes
It learns what you like, and your eating habits… (that's another presentation)
3
But We Need To Populate The Device
Food and recipe database needed
Collect and parse recipes instead of manual entry
Recipe collection from different sources
–Predictable vs. non-predictable URLs
–Regular vs. irregular recipe format
4
Collecting Recipes
Two types of crawlers (written in Python)
–URL Substitution: Epicurious.com
  http://www.epicurious.com/recipes/recipe_views/printer_friendly/11311
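The URL substitution idea can be sketched as follows. This is a minimal illustration, not the crawler from the talk: it assumes that only the trailing numeric ID in the Epicurious printer-friendly URL varies, and the names `URL_TEMPLATE` and `candidate_urls` are hypothetical. It only builds the URLs; fetching them is left out.

```python
from typing import Iterator

# Pattern copied from the slide's example URL. Assumption: only the
# trailing numeric ID changes from one recipe to the next.
URL_TEMPLATE = (
    "http://www.epicurious.com/recipes/recipe_views/printer_friendly/{recipe_id}"
)


def candidate_urls(start_id: int, stop_id: int) -> Iterator[str]:
    """Yield one printer-friendly URL per ID in the range [start_id, stop_id)."""
    for recipe_id in range(start_id, stop_id):
        yield URL_TEMPLATE.format(recipe_id=recipe_id)


# Build three consecutive candidate URLs starting at the ID shown on the slide.
urls = list(candidate_urls(11311, 11314))
for url in urls:
    print(url)
```

Not every ID necessarily maps to a recipe, so a real crawler would still need to check each response before parsing it.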
5
Collecting Recipes
Two types of crawlers (written in Python)
–URL Substitution: Epicurious.com
  http://www.epicurious.com/recipes/recipe_views/printer_friendly/11311
–Link Crawler:
  RecipeSource.com (serving, title, minute, hour, .6)
  http://www.recipesource.com/fgv/rice/03/rec0362.html
  FoodNetwork.com (recipe, serving, yield, time, print, minute, .8)
  http://www.foodnetwork.com/food/recipes/recipe/0,,FOOD_9936_17273,00.html
Need to identify good and bad pages
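One way to separate good pages from bad ones is a keyword score. The sketch below rests on an assumption the slide does not confirm: that the words in parentheses are per-site keywords and the trailing number (.6, .8) is the minimum fraction of them a page must contain. The functions `keyword_score` and `is_recipe_page` and the sample texts are hypothetical.

```python
from typing import List


def keyword_score(page_text: str, keywords: List[str]) -> float:
    """Return the fraction of keywords found in the page (case-insensitive substring match)."""
    text = page_text.lower()
    hits = sum(1 for keyword in keywords if keyword in text)
    return hits / len(keywords)


def is_recipe_page(page_text: str, keywords: List[str], threshold: float) -> bool:
    """Treat a page as a recipe when its keyword score reaches the threshold."""
    return keyword_score(page_text, keywords) >= threshold


# Keyword list from the slide; the 0.6 threshold is the assumed reading of ".6".
RECIPESOURCE_KEYWORDS = ["serving", "title", "minute", "hour"]

# Made-up page texts, for illustration only.
good_page = "Title: Fried Rice\nServing Size: 4\nPreparation Time: 20 minutes"
bad_page = "Index of rice recipes. Click a title to continue."

print(is_recipe_page(good_page, RECIPESOURCE_KEYWORDS, 0.6))
print(is_recipe_page(bad_page, RECIPESOURCE_KEYWORDS, 0.6))
```

Substring matching is used so that "minutes" counts as a hit for "minute"; a stricter tokenizer would need stemming to get the same effect.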
6
Finding the Ingredients
Induction wrappers
Layout
Character and grammar structure
7
Parsing
Recipe metadata
–Title, summary, serving size, prep time, etc.
Ingredient list
–Amount, unit, food item
Directions
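Splitting an ingredient line into amount, unit, and food item might look like the sketch below. It is an illustration under stated assumptions, not the parser from the talk: the `UNITS` set is a small hypothetical sample, amounts are assumed to be integers, decimals, or simple fractions at the start of the line, and `Ingredient` and `parse_ingredient` are made-up names.

```python
import re
from fractions import Fraction
from typing import NamedTuple, Optional


class Ingredient(NamedTuple):
    amount: Optional[Fraction]
    unit: Optional[str]
    food: str


# Hypothetical sample of units; a real list would be much longer.
UNITS = {"cup", "cups", "tablespoon", "tablespoons", "teaspoon", "teaspoons",
         "ounce", "ounces", "pound", "pounds", "clove", "cloves"}

# Leading amount: mixed number ("1 1/2"), fraction ("1/2"), or integer/decimal.
AMOUNT_RE = re.compile(r"^\s*(\d+\s+\d+/\d+|\d+/\d+|\d+(?:\.\d+)?)\s*(.*)$")


def parse_ingredient(line: str) -> Ingredient:
    """Split one ingredient line into (amount, unit, food item)."""
    match = AMOUNT_RE.match(line)
    if not match:
        # No leading number: keep the whole line as the food item.
        return Ingredient(None, None, line.strip())
    # Sum the whitespace-separated parts so "1 1/2" becomes 3/2.
    amount = sum((Fraction(part) for part in match.group(1).split()), Fraction(0))
    rest = match.group(2).split(maxsplit=1)
    if len(rest) == 2 and rest[0].lower() in UNITS:
        return Ingredient(amount, rest[0].lower(), rest[1].strip())
    return Ingredient(amount, None, match.group(2).strip())


print(parse_ingredient("1 1/2 cups long-grain rice"))
print(parse_ingredient("2 eggs"))
print(parse_ingredient("salt to taste"))
```

Using `Fraction` keeps amounts such as 1/3 exact, which matters if recipes are later scaled to a different serving size.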
8
Existing Software
MasterCook™, leading software product
Manual import features
Slow full-text search
Starting database has just over 8000 recipes
9
Questions?