Craigslist++: Sean Anastasi, Joseph Chen, Tatiana Gershanovich, Andreas Sekine (CSE454)


1 craigslist++
sean anastasi, joseph chen, tatiana gershanovich, andreas sekine
cse454 craigslist++

2 our goal
to enhance craigslist’s interface
– show related items also being sold at craigslist
– show related items from other third-party sites

3 how we do it
main components
– crawler (heritrix)
– clusterer (carrot2)
– relevance sorting
– user interface (greasemonkey)
– other stuff

4 specific crawling needs
– volatile data
– questionable legalities
heritrix
– only crawling one domain
– problematic setup
our setup
– 2 crawlers for new posts, 1 cleaner crawler
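The slide does not say what the cleaner crawler does. One plausible reading, given the "volatile data" bullet, is that it revisits stored posts and drops the ones that have disappeared. The sketch below illustrates only that reading; `Post`, `clean_index`, the `is_live` callback, and the example URLs are all hypothetical names, not part of the project or of heritrix.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Post:
    url: str
    title: str


def clean_index(index: Dict[str, Post], is_live: Callable[[str], bool]) -> List[str]:
    """Remove posts whose listing no longer exists; return the removed URLs."""
    # Collect first, then delete, so the dict is not mutated while iterating.
    removed = [url for url in index if not is_live(url)]
    for url in removed:
        del index[url]
    return removed


# Stand-in liveness check (assumption): a real cleaner would re-fetch the page.
index = {
    "http://example.org/post/1": Post("http://example.org/post/1", "bike"),
    "http://example.org/post/2": Post("http://example.org/post/2", "couch"),
}
live_urls = {"http://example.org/post/1"}
removed = clean_index(index, lambda url: url in live_urls)
```

Passing the liveness check in as a callback keeps the pruning logic separate from whatever actually fetches the pages.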

5 clusterer
Carrot2
– what to cluster (title, body or title + body)?
– need for reclustering and combination
WordNet
– combination of synonym clusters
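The "combination of synonym clusters" step could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's code: the hard-coded `SYNONYMS` table stands in for a WordNet lookup, and `canonical`, `merge_synonym_clusters`, and the sample labels are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical synonym groups standing in for a WordNet lookup (assumption).
SYNONYMS: List[Set[str]] = [
    {"couch", "sofa"},
    {"bike", "bicycle"},
]


def canonical(label: str) -> str:
    """Map a cluster label to one representative of its synonym group."""
    key = label.lower()
    for group in SYNONYMS:
        if key in group:
            return min(group)  # alphabetically first member represents the group
    return key


def merge_synonym_clusters(clusters: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Combine clusters whose labels are synonyms, keeping each post once."""
    merged: Dict[str, List[str]] = {}
    for label, posts in clusters.items():
        target = merged.setdefault(canonical(label), [])
        for post in posts:
            if post not in target:
                target.append(post)
    return merged


clusters = {
    "Couch": ["post-1", "post-2"],
    "sofa": ["post-2", "post-3"],
    "bicycle": ["post-4"],
}
merged = merge_synonym_clusters(clusters)
```

Here the "Couch" and "sofa" clusters collapse into one, and a post that appeared in both is listed only once.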

6 relevance sorting

7 relevance sorting (cont.)

8 user interface
greasemonkey
– show related posts (grouped by clusters)
– show which items have data
jquery
– folding item lists
– mouseover details/images
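Showing related posts "grouped by clusters" implies a grouping step before rendering. The sketch below shows one way to do it. The actual script is a Greasemonkey user script, so this Python version only illustrates the data shaping; `group_by_cluster`, the sample data, and the largest-cluster-first ordering are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_by_cluster(related: List[Tuple[str, str]]) -> List[Tuple[str, List[str]]]:
    """Group (cluster_label, post_title) pairs; largest clusters first."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for label, title in related:
        groups[label].append(title)
    # Sort by descending size, then by label so the order is deterministic.
    return sorted(groups.items(), key=lambda item: (-len(item[1]), item[0]))


related = [
    ("sofa", "leather sofa"),
    ("table", "oak dining table"),
    ("sofa", "sectional sofa"),
]
grouped = group_by_cluster(related)
```

Each resulting group maps naturally onto one foldable item list in the page.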

9 other
amazon product advertising api
yahoo term extraction
botnet

10 demo
greasemonkey plugin – https://addons.mozilla.org/en-US/firefox/addon/748
craigslist++ script – http://cubist.cs.washington.edu/~lidor7/craigslistpp.user.js
craigslist – http://seattle.craigslist.org/

