CS7280: Special Topics in Data Mining Information/Social Networks

Slides:



Advertisements
Similar presentations
Advanced Data Mining: Introduction
Advertisements

2015/6/1Course Introduction1 Welcome! MSCIT 521: Knowledge Discovery and Data Mining Qiang Yang Hong Kong University of Science and Technology
SFU, CMPT 741, Fall 2009, Martin Ester 418 Outlook Outline Trends in KDD research Graph mining and social network analysis Recommender systems Information.
CSE 574 – Artificial Intelligence II Statistical Relational Learning Instructor: Pedro Domingos.
© Prentice Hall1 DATA MINING TECHNIQUES Introductory and Advanced Topics Eamonn Keogh (some slides adapted from) Margaret Dunham Dr. M.H.Dunham, Data Mining,
1 Data Mining Techniques Instructor: Ruoming Jin Fall 2006.
An Overview of Our Course:
CMPT 884, SFU, Martin Ester, Special Topics in Database Systems Martin Ester Simon Fraser University School of Computing Science CMPT 884 Spring.
Data Mining – Intro.
Advanced Database Applications Database Indexing and Data Mining CS591-G1 -- Fall 2001 George Kollios Boston University.
CS 5941 CS583 – Data Mining and Text Mining Course Web Page 05/cs583.html.
OLAM and Data Mining: Concepts and Techniques. Introduction Data explosion problem: –Automated data collection tools and mature database technology lead.
Intelligent Systems Lecture 23 Introduction to Intelligent Data Analysis (IDA). Example of system for Data Analyzing based on neural networks.
1 EEL 6935: Embedded Systems Seminar. 2 General Information Instructor: Ann Gordon-Ross Office: Benton Office Hours – By appointment.
CS6501 Information Retrieval Course Policy Hongning Wang
CS523 INFORMATION RETRIEVAL COURSE INTRODUCTION YÜCEL SAYGIN SABANCI UNIVERSITY.
J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, 2.
CS598CXZ (CS510) Advanced Topics in Information Retrieval (Fall 2014) Instructor: ChengXiang (“Cheng”) Zhai 1 Teaching Assistants: Xueqing Liu, Yinan Zhang.
Chapter 1 Introduction to Data Mining
CS525 DATA MINING COURSE INTRODUCTION YÜCEL SAYGIN SABANCI UNIVERSITY.
Social Networks in Most Visible Form. Social Networking Techniques in Business Several social networking techniques can help us in reaching maximum number.
CS 6961: Structured Prediction Fall 2014 Course Information.
Overview of CS Class Jiawei Han Department of Computer Science
Syllabus CS479(7118) / 679(7112): Introduction to Data Mining Spring-2008 course web site:
Data Warehousing/Mining 1 Data Warehousing/Mining Comp 150DW Course Overview Instructor: Dan Hebert.
Data Mining – Intro. Course Overview Spatial Databases Temporal and Spatio-Temporal Databases Multimedia Databases Data Mining.
General Information 439 – Data Mining Assist.Prof.Dr. Derya BİRANT.
ITIS 4510/5510 Web Mining Spring Overview Class hour 5:00 – 6:15pm, Tuesday & Thursday, Woodward Hall 135 Office hour 3:00 – 5:00pm, Tuesday, Woodward.
January 17, 2016Data Mining: Concepts and Techniques 1 What Is Data Mining? Data mining (knowledge discovery from data) Extraction of interesting ( non-trivial,
Tallahassee, Florida, 2016 CIS4930 Introduction to Data Mining Introduction Peixiang Zhao.
CSCE 5073 Section 001: Data Mining Spring Overview Class hour 12:30 – 1:45pm, Tuesday & Thur, JBHT 239 Office hour 2:00 – 4:00pm, Tuesday & Thur,
February 13, 2016 Data Mining: Concepts and Techniques 1 1 Data Mining: Concepts and Techniques These slides have been adapted from Han, J., Kamber, M.,
1 SBM411 資料探勘 陳春賢. 2 Lecture I Class Introduction.
CS570: Data Mining Spring 2010, TT 1 – 2:15pm Li Xiong.
July 7, 2016 Data Mining: Concepts and Techniques 1 1.
DATA MINING and VISUALIZATION Instructor: Dr. Matthew Iklé, Adams State University Remote Instructor: Dr. Hong Liu, Embry-Riddle Aeronautical University.
CS583 – Data Mining and Text Mining
Term Project Proposal By J. H. Wang Apr. 7, 2017.
Data Mining: Concepts and Techniques (3rd ed.) — Chapter 1 —
Data Mining – Intro.
CS6501 Advanced Topics in Information Retrieval Course Policy
CS583 – Data Mining and Text Mining
CS583 – Data Mining and Text Mining
Data warehouse & Data Mining: Concepts and Techniques
CS598CXZ (CS510) Advanced Topics in Information Retrieval (Fall 2016)
E-Commerce Theories & Practices
CS410: Text Information Systems (Spring 2018)
Jiawei Han Department of Computer Science
Special Topics in Data Mining Applications Focus on: Text Mining
Data Mining: Concepts and Techniques Course Outline
Integrating Meta-Path Selection With User-Guided Object Clustering in Heterogeneous Information Networks Yizhou Sun†, Brandon Norick†, Jiawei Han†, Xifeng.
CS583 – Data Mining and Text Mining
CS510 (Fall 2018) Advanced Topics in Information Retrieval
CSE591: Data Mining by H. Liu
Data Warehousing and Data Mining
Smart Portal To Protect Child Online
Data Mining: Concepts and Techniques
CS583 – Data Mining and Text Mining
Data Mining: Concepts and Techniques
Jiawei Han Department of Computer Science
Data Mining Techniques As Tools for Analysis of Customer Behavior
CS583 – Data Mining and Text Mining
Lecture 1a- Introduction
Data Mining: Concepts and Techniques
Welcome! Knowledge Discovery and Data Mining
CSCE 4143 Section 001: Data Mining Spring 2019.
CS583 – Data Mining and Text Mining
Statistical Relational AI
Lecture 1a- Introduction
CSE591: Data Mining by H. Liu
Presentation transcript:

CS7280: Special Topics in Data Mining Information/Social Networks 1: Introduction Instructor: Yizhou Sun yzsun@ccs.neu.edu November 15, 2018

Course Information Course homepage: http://www.ccs.neu.edu/home/yzsun/classes/2014Spring_CS7280/index.html Class schedule Slides Announcement Paper presentation sign-up … Piazza: https://piazza.com/northeastern/spring2014/cs7280/home

Meeting Time and Location When Tuesdays 11:45-1:25pm, Thursdays 2:50-4:30pm Change? Where Snell Library 115

Instructor Information Instructor: Yizhou Sun Homepage: http://www.ccs.neu.edu/home/yzsun/ Email: yzsun@ccs.neu.edu Office: 320 WVH Office hour: Tuesdays 2:30-4:30pm

Goal of the Course The goal of the course is to learn the most cutting-edge topics, models and algorithms in information and social network mining, and to solve real problems on real-world large-scale information/social network data using these techniques. The students are expected to read and present research papers, and work on a research project related to this topic.

Prerequisites No official prerequisites However, this is a research-driven seminar course The students are expected to have knowledge in data structures, algorithms, basic linear algebra, and basic statistics. It will be a plus if you have already had some background in data mining, machine learning, and related courses.

Grading Paper presentation: 40% Research project: 50% Participation: 10%

Grading: Paper Presentation Everyone is asked to register 2 research topics Each research topic has 2-3 slots (2-3 papers) The students in charge of the research topic need to read all the papers and discuss with each other Make presentations of all the papers in that topic (e.g., each one leads one paper) Answer questions from the audience Lead the discussion The papers are given, but you can choose other papers with my consent two weeks before your presentation

Grading: Research Project Group project (2 people for one group) We now have 9 PhD students and 5 MS students It is a research project A new problem? A new method? Improvement of an existing method? You need to Form group (By Jan 16.) Proposal submission (By Feb. 6) Midterm report (By Mar. 13) Presentation (April 17/24) Final report (April 24) (hopefully it can be turned to a conference paper submission)

Grading: Participation This is a seminar course, so everyone needs to read the papers in advance and ask questions in class You can also raise and answer questions online (e.g., Piazza) Mendeley

CS 6220 Data Mining Course Review By data types: matrix data set data sequence data time series graph and network By functions: Classification Clustering Frequent pattern mining Prediction Similarity search Ranking

Multi-Dimensional View of Data Mining Data to be mined Database data (extended-relational, object-oriented, heterogeneous, legacy), data warehouse, transactional data, stream, spatiotemporal, time-series, sequence, text and web, multi-media, graphs & social and information networks Knowledge to be mined (or: Data mining functions) Characterization, discrimination, association, classification, clustering, trend/deviation, outlier analysis, etc. Descriptive vs. predictive data mining Multiple/integrated functions and mining at multiple levels Techniques utilized Data-intensive, data warehouse (OLAP), machine learning, statistics, pattern recognition, visualization, high-performance, etc. Applications adapted Retail, telecommunication, banking, fraud analysis, bio-data mining, stock market analysis, text mining, Web mining, etc.

Matrix Data Sources:http://www.analytictech.com/networks/kindsofmatrices.htm

Set Data

Sequence Data Source: http://www.dpgp.org/syntenic_assembly/

Time Series Source: http://swampland.time.com/2011/04/25/obamas-pump-problem-why-gas-prices-could-swing-a-close-election/

Graph / Network Source: http://www.thenetworkthinkers.com/2008/06/birds-of-feather-grow-fat-together.html

Course Overview Overview Ranking for infonet Clustering / community detection Matrix factorization and Blockmodel Classification / label propagation / node or link profiling Probabilistic models for infonets Similarity search Diffusion / Influence maximization Recommendation Link / relationship prediction Trustworthy analysis Large graph computation

Information Networks Are Everywhere Social Networking Websites Biological Network: Protein Interaction Research Collaboration Network Product Recommendation Network via Emails

DBLP Bibliographic Network The IMDB Movie Network The Facebook Network Venue Paper Author DBLP Bibliographic Network The IMDB Movie Network Actor Movie Director Studio The Facebook Network

Some Concepts Graph Social Network Information Network Homogeneous information network Heterogeneous information network

Ranking for Infonet PageRank HITS

Clustering SCAN Spectral Clustering Matrix Factorization Blockmodel

Classification Collective classification Label propagation Link sign prediction

Probabilistic models Markov Random Field Conditional Random Field Factor graph

Similarity Search SimRank P-Rank PathSim

Diffusion / Influence maximization Information diffusion through blogspace Influence maximization

Recommendation M. Jamali and M. Ester. A matrix factorization technique with trust propagation for recommendation in social networks. (KDD’10) Personalized Entity Recommendation: A Heterogeneous Information Network Approach (WSDM‘14)

Link / relationship prediction Link prediction Citation prediction Relationship prediction

Trustworthy analysis X. Yin and W. Tan, Semi-Supervised Truth Discovery, WWW’11 B. Zhao, B. Rubinstein, J. Gemmell, and J. Han, A Bayesian approach to discovering truth from conflicting sources for data integration, VLDB’12

Large Graph Computation GraphChi PEGASUS: Peta-Scale Graph Mining System