How to download prices and track price changes — competitive price monitoring and price matching.


1 How to download prices and track price changes: competitive price monitoring and price matching guerrillahub.com

2 Let’s pretend this is a valid intro where I tell you why price matching and price monitoring are important and get to the point ______________

3 Today you will learn about: ● Crawling ● Fetching data ● Parsing the right elements ● Storing and analyzing data And, more importantly, you will learn how to download price lists from your competitors’ websites.

4 Required tools ● Netpeak Spider: a desktop website crawler we’ll need to fetch data from target websites. It costs $14/mo and there’s a two-week free trial. ● Google Sheets or Excel: I’m using Google Sheets because I need to share my projects, but Excel is more capable. ● Formulas: depending on your competitor’s website architecture, you may need to remove duplicates, unnecessary data from cells, etc.

5 How to download a price list from any website: 1. Inspect elements of the page where target data is stored 2. Analyze code to learn how this data is provided across all pages 3. Set up crawling to fetch information from identical code on other pages 4. Test run 5. Crawl the entire website and fetch data 6. Create a spreadsheet, remove duplicates and unnecessary info 7. Save the list of remaining URLs to repeat crawling with the same settings and track changes

6 Step 1 Inspect elements of the page where target data is stored

7 Open a product page, highlight the price, right-click it and click Inspect

8 The browser’s developer console will open with this element highlighted in the page code

9 Step 2 Analyze code to learn how this data is provided across all pages

10 You need to tell the crawler which elements to parse in order to fetch the data. The extraction method can be: ● XPath ● CSS Selector ● HTML I’ll show you how to get data with XPath, which works for most stores, plus one example of a store that assigns unique IDs to products, which makes the process more complicated.

11 XPath The best way to test whether fetching data by XPath will work is to copy the price element’s XPath from two different product pages and compare the results. They should be identical: ● //*[@id="u-skip-anchor"]/span/span[1] If the XPath contains a unique ID, this method won’t work: ● //*[@id="new-price-465333"]/span/span[1] ● //*[@id="new-price-244103"]/span/span[1]
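The comparison above can be automated. Below is a minimal Python sketch, not part of the original deck: the helper names `is_reusable` and `generalize` are hypothetical, and the rewrite relies on `starts-with()`, a standard XPath 1.0 function. Whether your crawler accepts such an expression is an assumption you would need to verify in a test run.

```python
import re

# Matches an id attribute test whose value ends in digits, e.g. @id="new-price-465333".
ID_WITH_NUMBER = re.compile(r'@id="([^"]*?)(\d+)"')


def is_reusable(xpath_a: str, xpath_b: str) -> bool:
    """Two product pages share one extraction rule only if their XPaths are identical."""
    return xpath_a == xpath_b


def generalize(xpath: str) -> str:
    """Replace a numbered id test with a prefix test (hypothetical workaround)."""
    return ID_WITH_NUMBER.sub(
        lambda m: f'starts-with(@id, "{m.group(1)}")', xpath
    )


page_a = '//*[@id="new-price-465333"]/span/span[1]'
page_b = '//*[@id="new-price-244103"]/span/span[1]'

print(is_reusable(page_a, page_b))  # the raw XPaths differ
print(generalize(page_a))           # both pages reduce to the same prefix-based XPath
```

If the generalized XPaths of the two pages match, the prefix-based expression is a candidate for the crawler's extraction field.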

12 CSS Selector Fetching data by CSS selector works on all websites, but sometimes you’ll get a lot of unnecessary information along with what you’re looking for. In this case you’d fetch price, discount, how much you save, VAT and shipping. All of these can be removed in Excel or Google Sheets with formulas.

13 CSS Selector Fetching data by CSS selector works similarly to fetching it by XPath, except that after opening the console you’ll have to hover over the div that contains the necessary information.

14 Step 3 Set up crawling to fetch information from identical code on other pages

15 Netpeak Spider Download and install: https://netpeaksoftware.com/spider

16 Disable all parameters to speed up crawling Crawling settings → Parameters → Uncheck all boxes

17 Enable Custom Search Crawling settings → Custom Search → Use Custom Search

18 Custom Search Settings After you find out how product names and prices are housed on product pages, you can set up extraction from the corresponding elements. Select the extraction method that fits your requirements (XPath, CSS Selector, HTML).
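To make the idea of one extraction rule per field concrete, here is a rough Python sketch of what such a configuration does to a single page. It is an illustration only, not how Netpeak Spider is implemented. The sample markup, the `product-name` id and the `extract_fields` helper are made up for the example, and it uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath and requires well-formed markup (real pages often need a dedicated HTML parser).

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed product page used only for this example.
SAMPLE_PAGE = """
<html><body>
  <h1 id="product-name">Example Widget</h1>
  <div id="u-skip-anchor"><span><span>19.99</span><span>USD</span></span></div>
</body></html>
"""

# One extraction rule per field, mirroring one custom search field each.
# ElementTree needs relative paths, hence the leading "." before "//".
FIELDS = {
    "name": ".//*[@id='product-name']",
    "price": ".//*[@id='u-skip-anchor']/span/span[1]",
}


def extract_fields(page_source: str, fields: dict) -> dict:
    """Return the text of the first element matching each rule, or None."""
    root = ET.fromstring(page_source)
    result = {}
    for label, path in fields.items():
        node = root.find(path)
        has_text = node is not None and node.text
        result[label] = node.text.strip() if has_text else None
    return result


print(extract_fields(SAMPLE_PAGE, FIELDS))
```

A field that comes back as `None` corresponds to a page where the element was not found, for example a category page.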

19 Custom Search Settings Add another custom search field by clicking the green button and repeat the process for any other element on the page that you are interested in

20 Step 4 Test run

21 Analyze a few product pages from your target website with these parameters

22 Step 5 Crawl entire website and fetch data

23 If the test run was successful, start crawling the entire website

24 Track progress in the Search tab ● Found shows how many pages contained prices and product names in the corresponding elements ● Not found shows the number of pages where prices and names were either absent (a contacts page, for example) or housed in different elements (category pages and lists)
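If you export the results, the same Found / Not found split can be recomputed outside the crawler. The sketch below assumes each exported row is a dict with `url` and `price` keys and that an empty or missing price means "not found". Both are assumptions about your export, not a description of the tool's file format.

```python
def summarize(results: list, field: str = "price") -> dict:
    """Count rows where the given field was extracted versus left empty."""
    found = sum(1 for row in results if row.get(field))
    return {"found": found, "not_found": len(results) - found}


# Hypothetical crawl results: one product page, one contacts page, one category page.
crawl_results = [
    {"url": "https://shop.example/p/1", "price": "19.99"},
    {"url": "https://shop.example/contacts", "price": None},
    {"url": "https://shop.example/category/widgets", "price": ""},
]

print(summarize(crawl_results))
```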

25 Track progress in the Search tab It’s not unusual for the crawler to find no results on the first few hundred pages, since product pages are usually not the ones closest to the main page

26 Export data

27 Step 6 Create a spreadsheet, remove duplicates and unnecessary info

28 Removing duplicates Duplicates appear when the crawler visits the same page twice, which can happen for a number of reasons. The best way to get rid of duplicates is to delete them from the URL list. That way you’ll only have one instance of each product page on your list. This Google Chrome add-on is great for removing duplicates from Google Sheets.
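Deduplicating by URL is also easy to script if you prefer not to rely on an add-on. This is a minimal sketch under two assumptions: each exported row is a dict with a `url` key, and URLs that differ only by a trailing slash or a `#fragment` point to the same page. The helper name `dedupe_by_url` is hypothetical.

```python
from urllib.parse import urldefrag


def dedupe_by_url(rows: list) -> list:
    """Keep the first row seen for each URL, preserving the original order."""
    seen = set()
    unique = []
    for row in rows:
        # Normalize: drop the #fragment, then any trailing slash (an assumption).
        key = urldefrag(row["url"]).url.rstrip("/")
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


rows = [
    {"url": "https://shop.example/p/1", "price": "19.99"},
    {"url": "https://shop.example/p/1/", "price": "19.99"},
    {"url": "https://shop.example/p/2#reviews", "price": "5.00"},
]

print(dedupe_by_url(rows))
```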

29 Removing unnecessary info Some websites have a complex structure, which means the only way to download prices is to select a larger element with the CSS selector. The crawler will then fetch everything inside that element along with the price.

30 Removing unnecessary info To remove everything except the price, you need to trim the data in your cell. Here are two formulas you can use to remove everything from the cell before or after a certain character or symbol: ● =TRIM(LEFT(A1,FIND("character",A1))) keeps everything in cell A1 up to and including the chosen character and removes what follows (use FIND("character",A1)-1 to drop the character too) ● =TRIM(RIGHT(A1,LEN(A1)-FIND("character",A1))) removes everything in cell A1 up to and including the chosen character and keeps what follows. RIGHT takes a number of characters counted from the end of the cell, so the position returned by FIND has to be subtracted from the cell length. Create a separate column next to the initial one and apply the formula to it.
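For anyone cleaning the export in a script rather than a spreadsheet, the same two trims can be sketched in Python with `str.partition`. The helper names and the sample cell text are made up for illustration. Unlike the LEFT formula, `keep_before` drops the marker itself, and both helpers accept a whole word as the marker.

```python
def keep_before(text: str, marker: str) -> str:
    """Return the text before the first occurrence of marker, trimmed.

    If the marker is absent, the whole text is returned (trimmed).
    """
    head, _sep, _tail = text.partition(marker)
    return head.strip()


def keep_after(text: str, marker: str) -> str:
    """Return the text after the first occurrence of marker, trimmed.

    If the marker is absent, the whole text is returned (trimmed).
    """
    _head, sep, tail = text.partition(marker)
    return tail.strip() if sep else text.strip()


cell = "Price: 19.99 USD incl. VAT"   # hypothetical fetched cell content
without_label = keep_after(cell, "Price:")
price_only = keep_before(without_label, "USD")
print(price_only)
```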

31 Step 7 Save the list of remaining URLs to repeat crawling with the same settings and track changes

32 After you’ve gone through previous steps, you will have a table that looks like this:

33 Copy the list of URLs from the first column and set Netpeak Spider to crawl these URLs only

34 Recrawl these URLs with the same parameters whenever you want to get an update. Feel free to copy my spreadsheet with its formatting settings: LINK
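Tracking changes between two crawls comes down to comparing the price stored for each URL. The sketch below assumes you have loaded each crawl into a dict that maps URL to price. The `diff_prices` name and the sample data are hypothetical, and prices are held as `Decimal` values to avoid floating-point rounding surprises.

```python
from decimal import Decimal


def diff_prices(old: dict, new: dict) -> list:
    """List (url, old_price, new_price) for URLs whose price changed.

    URLs that appear in only one of the two crawls are skipped.
    """
    changes = []
    for url, new_price in new.items():
        old_price = old.get(url)
        if old_price is not None and old_price != new_price:
            changes.append((url, old_price, new_price))
    return changes


last_week = {
    "https://shop.example/p/1": Decimal("19.99"),
    "https://shop.example/p/2": Decimal("5.00"),
}
today = {
    "https://shop.example/p/1": Decimal("17.49"),
    "https://shop.example/p/2": Decimal("5.00"),
    "https://shop.example/p/3": Decimal("8.25"),
}

for url, before, after in diff_prices(last_week, today):
    print(f"{url}: {before} -> {after}")
```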

35 That’s it. Thank you for your attention and good luck with your projects.

