
Contents:
- 1 – Introduction to the subject of web mining and techniques
- 2 – Overview of research conducted (both theory and practical)
- 3 – Software applications on which to test web mining techniques
- 4 – Demonstration (Digital Solutions and Repairs)
- 5 – Evaluating results (suitability and practicality)

Student Name: Colin Hopson
Student Number:
Course Title: MSc Computer Science (Internet Engineering)
7ET023 – MSc Dissertation
Research Question: What is the most suitable web mining technique for a specified business and mobile application case study?

1 – Introduction to the subject of web mining and techniques
- Sequential research of techniques for an empirical study
  - Initial research into data mining (databases)
  - Previous knowledge of web services (RSS, REST, etc.)
- Research into theory of web mining
  - Web usage mining: logs to examine navigation patterns
  - Web structure mining: examine link hierarchy
  - Web content mining: "the discovery of useful information from the Web by examining the data that is contained in the Web site" (Pendharkar, 2003, p. 243)
- Data extraction from HTML (machine learning algorithms)
  - Wrapper induction
  - Semi-automatic extraction

* Pendharkar, P.C. (2003) Managing data mining technologies in organizations: techniques and applications. Hershey: Idea Group Pub.
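As an illustration of web usage mining, the sketch below counts page-to-page moves within each visitor session to surface the most common navigation step. It is a minimal example, not the dissertation's method: the log records, session ids and paths are all hypothetical, and a real study would first parse them out of server log files.

```python
from collections import Counter

# Hypothetical access-log records: (session id, requested path), already in time order.
LOG = [
    ("s1", "/home"), ("s1", "/repairs"), ("s1", "/checkout"),
    ("s2", "/home"), ("s2", "/repairs"),
    ("s3", "/home"), ("s3", "/products"),
]

def count_transitions(records):
    """Count how often visitors move from one page to the next within a session."""
    last_page = {}            # session id -> most recent path seen for that session
    transitions = Counter()   # (from_path, to_path) -> number of occurrences
    for session, path in records:
        if session in last_page:
            transitions[(last_page[session], path)] += 1
        last_page[session] = path
    return transitions

transitions = count_transitions(LOG)
most_common_move, hits = transitions.most_common(1)[0]
print(most_common_move, hits)
```

Counting pairwise transitions is the simplest form of navigation-pattern analysis; longer paths could be mined by keeping a window of recent pages per session instead of only the last one.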

2 – Overview of research conducted (both theory and practical)
- Researching theory of data and web mining
  - Empirical research method to acquire knowledge
  - Research into data mining, web mining, data extraction algorithms, etc.
  - Sequential investigation of applicable techniques
- Artefact design and development
  - E-commerce prototype website (Digital Solutions and Repairs)
  - Mobile application (Mobile Shopper)
- Practical research to implement techniques
  - Resolution of web services (Amazon APIs)
  - HTML extraction technique using XML, DOM, XPath and PHP arrays
  - Consuming Google API with REST, DOM, XPath and PHP arrays
  - Third-party software (Newprosoft and Automation Anywhere)
  - Functionality of XSLT
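The HTML extraction technique listed above (DOM, XPath, arrays) can be sketched roughly as follows. The slides name PHP's DOMDocument; this example swaps in Python's standard `xml.etree.ElementTree`, which supports a limited XPath subset, purely as a stand-in. The page markup, class names and field names are hypothetical, and the snippet assumes well-formed markup, whereas real pages usually need an HTML-tolerant parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed product listing used only for illustration.
PAGE = """
<html><body>
  <div class="product"><h2>Laptop battery</h2><span class="price">29.99</span></div>
  <div class="product"><h2>USB cable</h2><span class="price">4.50</span></div>
  <div class="advert"><h2>Ignore me</h2></div>
</body></html>
"""

def extract_products(markup):
    """Select product nodes with an XPath expression and copy their fields into dicts."""
    root = ET.fromstring(markup.strip())
    products = []
    # Only <div class="product"> elements match; the advert block is skipped.
    for node in root.findall(".//div[@class='product']"):
        products.append({
            "name": node.findtext("h2"),
            "price": float(node.findtext("span[@class='price']")),
        })
    return products

products = extract_products(PAGE)
print(products)
```

Because the XPath expressions are tied to one page layout, an extractor like this has to be rewritten whenever the target site changes its markup.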

3 – Software applications on which to test web mining techniques

4 – Demonstration (Digital Solutions and Repairs)
- Web Mining Technique 1: Amazon API (coded class/methods)
- Web Mining Technique 2: HTML Extraction (DOMDocument, XPath and PHP arrays)
- Web Mining Technique 3: Google API (REST, DOMDocument, XPath and PHP arrays)
- Web Mining Technique 4: Third-Party Software (Automation Anywhere and Newprosoft)
- Web Mining Technique 5: None implemented, but XSLT investigated

Website Demonstration >>>
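Consuming a REST web service and walking the XML response with a DOM, as in Technique 3, might look like the sketch below. It does not reproduce the Google or Amazon APIs: the endpoint, parameter names, API key and response format are all hypothetical, and a canned response string stands in for the network call. Python's `xml.dom.minidom` is used in place of PHP's DOMDocument.

```python
from urllib.parse import urlencode
from xml.dom import minidom

# Hypothetical endpoint and parameter names, for illustration only.
BASE_URL = "https://api.example.com/products"

def build_request_url(query, api_key):
    """Encode the search terms and key into a REST-style GET URL."""
    return BASE_URL + "?" + urlencode({"q": query, "key": api_key})

# Canned response standing in for what such a service might return.
RESPONSE = """<feed>
  <entry><id>p-1</id><title>Laptop charger</title><price>19.99</price></entry>
  <entry><id>p-2</id><title>Phone screen</title><price>45.00</price></entry>
</feed>"""

def text_of(entry, tag):
    """Return the text content of the first child element named `tag`."""
    return entry.getElementsByTagName(tag)[0].firstChild.data

def parse_entries(xml_text):
    """Walk the DOM and copy each <entry> into a plain dict."""
    doc = minidom.parseString(xml_text)
    return [
        {
            "id": text_of(entry, "id"),
            "title": text_of(entry, "title"),
            "price": float(text_of(entry, "price")),
        }
        for entry in doc.getElementsByTagName("entry")
    ]

url = build_request_url("laptop charger", "YOUR_KEY")
entries = parse_entries(RESPONSE)
print(url)
print(entries)
```

In a live integration the URL would be fetched over HTTPS and the response body passed to `parse_entries`; keeping request building and parsing separate makes each part testable without network access.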

5 – Evaluating results (suitability and practicality)
- Web Mining Technique 1: Amazon API
  - Requires registration and associate keys
  - Product Advertising API has most requirements (plus more)
  - ASINs assist administration system
  - Top quality delivery and discounts
  - Regular updates, although lengthy documentation
- Web Mining Technique 2: HTML Extraction
  - No cost, but requires programming knowledge
  - Bespoke algorithm specific to the HTML format
  - Limited to one online organisation
- Web Mining Technique 3: Google API
  - Requires registration and associate keys
  - Searches products from many online organisations
  - GoogleId does not assist administration system
  - Web service retrieves limited product information
  - Top security measures, but lengthy documentation
- Web Mining Technique 4: Third-Party Software
  - Limited free trial with subscription costs
  - Possible difficulty with integration with administration system
- Web Mining Technique 5: XSLT investigated
  - Limited free trial with subscription costs
  - Integration difficulties with administration system

SUMMARY
- Study of web mining and some of its techniques
  - Empirical study, data mining, web services, web content mining, data extraction algorithms
- Sequential research conducted (theory and practical)
  - Web services (APIs), HTML extraction, third-party software, XSLT
- E-commerce prototype website and mobile application
  - 'Digital Solutions and Repairs' and 'Mobile Shopper'
- Demonstration of web mining techniques
  - DSR computer repairs administration system
- Evaluation of web mining techniques investigated
  - Comparison between APIs, HTML extraction, third-party software and XSLT

Questions?