
Slide 1: Contents
- 1 – Introduction to the subject of web mining and techniques
- 2 – Overview of research conducted (both theory and practical)
- 3 – Software applications on which to test web mining techniques
- 4 – Demonstration (Digital Solutions and Repairs)
- 5 – Evaluating results (suitability and practicality)

Student Name: Colin Hopson
Student Number: 0482647
Course Title: MSc Computer Science (Internet Engineering)
7ET023 – MSc Dissertation
Research Question: What is the most suitable web mining technique for a specified business and mobile application case study?

Slide 2: 1 – Introduction to the subject of web mining and techniques
- Sequential research of techniques for an empirical study
- Initial research into data mining (databases)
- Previous knowledge of web services (RSS, REST, etc.)
- Research into theory of web mining
- Web usage mining – logs to examine navigation patterns
- Web structure mining – examine link hierarchy
- Web content mining – “the discovery of useful information from the Web by examining the data that is contained in the Web site” (Pendharkar, 2003, p. 243)
- Data extraction from HTML (machine learning algorithms)
- Wrapper Induction
- Semi-Automatic Extraction

* Pendharkar, P.C. (2003) Managing data mining technologies in organizations: techniques and applications, Idea Group Pub, Hershey.
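The "web usage mining" bullet on slide 2 can be illustrated with a small sketch. The slides do not show how logs were analysed, so everything here is an assumption: the log records are invented, already reduced to (visitor, path) pairs in time order, and the function name is hypothetical. The sketch counts page-to-page transitions per visitor, which is one simple way to surface navigation patterns.

```python
from collections import Counter, defaultdict

# Hypothetical access-log records: (visitor id, requested path), in time order.
SAMPLE_LOG = [
    ("v1", "/"), ("v1", "/repairs"), ("v1", "/contact"),
    ("v2", "/"), ("v2", "/repairs"), ("v2", "/shop"),
    ("v3", "/"), ("v3", "/shop"),
]


def navigation_transitions(records):
    """Count page-to-page transitions within each visitor's ordered requests."""
    paths = defaultdict(list)
    for visitor, path in records:
        paths[visitor].append(path)

    transitions = Counter()
    for pages in paths.values():
        # Pair each page with the page requested immediately after it.
        transitions.update(zip(pages, pages[1:]))
    return transitions


transitions = navigation_transitions(SAMPLE_LOG)
most_common = transitions.most_common(1)[0]
print(most_common)
```

A real server log would first need parsing and session splitting (for example by IP address and time gap); those steps are left out to keep the sketch short.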

Slide 3: 2 – Overview of research conducted (both theory and practical)
- Researching Theory of Data and Web Mining
  - Empirical research method to acquire knowledge
  - Research into data mining, web mining, data extraction algorithms, etc.
  - Sequential investigation of applicable techniques
- Artefact Design and Development
  - E-commerce prototype website (Digital Solutions and Repairs)
  - Mobile application (Mobile Shopper)
- Practical Research to Implement Techniques
  - Resolution of web services (Amazon APIs)
  - HTML extraction technique using XML, DOM, XPath and PHP arrays
  - Consuming Google API with REST, DOM, XPath and PHP arrays
  - Third-party software (Newprosoft and Automation Anywhere)
  - Functionality of XSLT
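The HTML extraction item on slide 3 names DOM, XPath and PHP arrays. The sketch below shows the same idea in Python rather than PHP, using the standard library's `xml.etree.ElementTree` (which supports only a subset of XPath) and a list of dictionaries in place of PHP arrays. The page markup, class names and function name are invented for illustration and are not taken from the dissertation.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed product listing. Real-world HTML is often not
# well-formed XML and would need an HTML-tolerant parser instead.
SAMPLE_PAGE = """
<html><body>
  <div class="product"><h2>Laptop battery</h2><span class="price">29.99</span></div>
  <div class="product"><h2>USB hub</h2><span class="price">12.50</span></div>
  <div class="advert"><h2>Not a product</h2></div>
</body></html>
"""


def extract_products(page):
    """Select product blocks with an XPath-style query and collect their fields."""
    root = ET.fromstring(page.strip())
    products = []
    for node in root.findall(".//div[@class='product']"):
        products.append({
            "name": node.findtext("h2"),
            "price": float(node.findtext("span[@class='price']")),
        })
    return products


products = extract_products(SAMPLE_PAGE)
print(products)
```

Because the queries are tied to one site's markup, a layout change on that site breaks the extractor.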

Slide 4: 3 – Software applications on which to test web mining techniques

Slide 5: 4 – Demonstration (Digital Solutions and Repairs)
- Web Mining Technique 1: Amazon API (coded class/methods)
- Web Mining Technique 2: HTML Extraction (DOMDocument, XPath and PHP arrays)
- Web Mining Technique 3: Google API (REST, DOMDocument, XPath and PHP arrays)
- Web Mining Technique 4: Third-Party Software (Automation Anywhere and Newprosoft)
- Web Mining Technique 5: None implemented, but XSLT investigated

Website Demonstration >>>
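Technique 3 on slide 5 combines a REST request with DOM/XPath parsing of the response. The following Python sketch illustrates that general pattern only. The endpoint, query parameters, key and response format are all hypothetical and do not represent the Google or Amazon APIs. No network call is made; a canned XML response stands in for the service reply.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint and parameter names, not a real service.
BASE_URL = "https://api.example.com/products"


def build_query_url(keyword, api_key):
    """Encode the search keyword and key into a REST-style GET URL."""
    return BASE_URL + "?" + urlencode({"q": keyword, "key": api_key})


# Canned response standing in for what such a service might return.
SAMPLE_RESPONSE = """<results>
  <item id="a1"><title>USB hub</title><seller>Shop A</seller><price>12.50</price></item>
  <item id="b7"><title>USB hub 4-port</title><seller>Shop B</seller><price>9.99</price></item>
</results>"""


def parse_items(xml_text):
    """Turn each <item> element into a dictionary of its fields."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": item.get("id"),
            "title": item.findtext("title"),
            "seller": item.findtext("seller"),
            "price": float(item.findtext("price")),
        }
        for item in root.findall("item")
    ]


url = build_query_url("usb hub", "DEMO_KEY")
items = parse_items(SAMPLE_RESPONSE)
cheapest = min(items, key=lambda item: item["price"])
print(url)
print(cheapest["seller"], cheapest["price"])
```

Unlike site-specific HTML extraction, a single query of this kind can return products from several sellers, as the canned response illustrates.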

Slide 6: 5 – Evaluating results (suitability and practicality)
- Web Mining Technique 1: Amazon API
  - Requires registration and associate keys
  - Product Advertising API has most requirements (plus more)
  - ASINs assist administration system
  - Top quality delivery and discounts
  - Regular updates although lengthy documentation
- Web Mining Technique 2: HTML Extraction
  - No cost, but requires programming knowledge
  - Bespoke algorithm specific for HTML format
  - Limited to one online organisation
- Web Mining Technique 3: Google API
  - Requires registration and associate keys
  - Searches products from many online organisations
  - GoogleId does not assist administration system
  - Web service retrieves limited product information
  - Top security measures, but lengthy documentation
- Web Mining Technique 4: Third-Party Software
  - Limited free trial with subscription costs
  - Possible difficulty with integration with administration system
- Web Mining Technique 5: XSLT investigated
  - Limited free trial with subscription costs
  - Integration difficulties with administration system

Slide 7: Summary
- Study of web mining and some of its techniques
  - Empirical study, data mining, web services, web content mining, data extraction algorithms
- Sequential research conducted (theory and practical)
  - Web services (APIs), HTML extraction, third-party software, XSLT
- E-commerce prototype website and mobile application
  - ‘Digital Solutions and Repairs’ and ‘Mobile Shopper’
- Demonstration of web mining techniques
  - DSR computer repairs administration system
- Evaluation of web mining techniques investigated
  - Comparison between APIs, HTML extraction, third-party software and XSLT

Questions?

