Introduction to Data Science Kamal Al Nasr, Matthew Hayes and Jean-Claude Pedjeu Computer Science and Mathematical Sciences College of Engineering Tennessee.

Slides:



Advertisements
Similar presentations
Nokia Technology Institute Natural Partner for Innovation.
Advertisements

Business Analytics for the 21 st Century TRENDS AND HOT TOPICS.
Big Data Management and Analytics Introduction Spring 2015 Dr. Latifur Khan 1.
Big Data and Predictive Analytics in Health Care Presented by: Mehadi Sayed President and CEO, Clinisys EMR Inc.
Educational Programs in Bioinformatics at UNO Hesham H. Ali Department of Computer Science College of Info Science and Technology University of Nebraska.
MS DB Proposal Scott Canaan B. Thomas Golisano College of Computing & Information Sciences.
Chapter 14 The Second Component: The Database.
Data Analytics Program at Drake Brad C. Meyer, Chair Information Management and Business Analytics.
Summary of “New Ways to Exploit Raw Data May Bring Surge of Innovation, a Study Says” Steve Lohr, New York Times, May 13th, 2011 Presented by: Zhe Jiang.
Information systems: creating tomorrow’s business innovations
DASHBOARDS Dashboard provides the managers with exactly the information they need in the correct format at the correct time. BI systems are the foundation.
Computational Social Science Ariel Brio July 23, 2009.
© Heikki Topi Computing and University Education in Analytics ACM Education Council San Francisco, CA November 2, 2013 Heikki Topi, Bentley University.
Big Data Course Plans at Purdue Ananth Iyer. Big Data/Analytics Coursera course on Big Data by Bill Howe claims that Big Data involves issues of
Basic Concepts in Big Data
OLAM and Data Mining: Concepts and Techniques. Introduction Data explosion problem: –Automated data collection tools and mature database technology lead.
Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization.
Data Mining. 2 Models Created by Data Mining Linear Equations Rules Clusters Graphs Tree Structures Recurrent Patterns.
Chapter 11 Databases.
CSI315CSI315 Web Development Technologies Continued.
Last Words COSC Big Data (frameworks and environments to analyze big datasets) has become a hot topic; it is a mixture of data analysis, data mining,
Tennessee Technological University1 The Scientific Importance of Big Data Xia Li Tennessee Technological University.
If BIG DATA is the answer, then what was the question?
CIS 9002 Kannan Mohan Department of CIS Zicklin School of Business, Baruch College.
Charles Tappert Seidenberg School of CSIS, Pace University
Chapter 6: Foundations of Business Intelligence - Databases and Information Management Dr. Andrew P. Ciganek, Ph.D.
1 1 Slide Introduction to Data Mining and Business Intelligence.
Data Mining Chapter 1 Introduction -- Basic Data Mining Tasks -- Related Concepts -- Data Mining Techniques.
Vision + Focus + Execution Meiliu Lu, RVR 5016, For CSc 209 Spring 2003, 5/6/03.
Big Data Analytics Large-Scale Data Management Big Data Analytics Data Science and Analytics How to manage very large amounts of data and extract value.
Last Words DM 1. Mining Data Steams / Incremental Data Mining / Mining sensor data (e.g. modify a decision tree assuming that new examples arrive continuously,
Mining real world data Web data. World Wide Web Hypertext documents –Text –Links Web –billions of documents –authored by millions of diverse people –edited.
Keuze semester big data
BOARD OF ADVOCATES October 9, UPDATES ABET Accredited until 2021 Passed with only 1 concern Must be addressed by next visit “[Students will have]
Mrs Drudeisha Madhub Data Protection Commissioner Big Data Framework: Data Privacy & Data Protection Guidelines for Development.
Big Data – Big Opportunity Mohammad Khansari ITRC President Jan 2015 ITRC, Tehran, Iran.
SUPPLY CHAIN OF BIG DATA. WHAT IS BIG DATA?  A lot of data  Too much data for traditional methods  The 3Vs  Volume  Velocity  Variety.
IoT Meets Big Data Standardization Considerations
LECTURE 2: DATA MINING. WHAT IS DATA MINING? 2 D ATA M INING AND D ATA W AREHOUSES ? It evolved in to being as the science of databases evolved Database.
D ATA S CIENTISTS Who are they and what do they do?
Machine Learning. Definition Machine learning is a subfield of computer science that evolved from the study of pattern recognition and computational.
Big Data Yuan Xue CS 292 Special topics on.
Statistics The Hottest Career of the 21 st Century.
1 Seattle University Master’s of Science in Business Analytics Key skills, learning outcomes, and a sample of jobs to apply for, or aim to qualify for,
Chapter 8: Web Analytics, Web Mining, and Social Analytics
CS 784: Advanced Topics in Data Management This semester’s focus: Data Science AnHai Doan.
Data Science Interview Questions 1.What do you mean by word Data Science? Data Science is the extraction of knowledge from large.
Chapter 3 Building Business Intelligence Chapter 3 DATABASES AND DATA WAREHOUSES Building Business Intelligence 6/22/2016 1Management Information Systems.
CS570: Data Mining Spring 2010, TT 1 – 2:15pm Li Xiong.
Book web site:
Business & Computer Science Education Department
Data Analytics 1 - THE HISTORY AND CONCEPTS OF DATA ANALYTICS
Business Intelligence Minor
From BI to Big DatA.
Data Mining Generally, (Sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it.
Data Warehousing CIS 4301 Lecture Notes 4/20/2006.
Information technology skills for the 21st century
Business analytics Lessons from an undergraduate introductory course
School of Information Management Nanjing University China
Introduction Data Mining for Business Analytics.
Experiences with Business Analytics Curriculum Implementation
Data Warehousing and Data Mining
INNOvation in TRAINING BUSINESS ANALYSTS HAO HElEN Zhang UniVERSITY of ARIZONA
The Hottest Career of the 21st Century
Course Introduction CSC 576: Data Mining.
Data Mining: Introduction
Mathematics in the Data Science Movement
Big DATA.
Welcome! Knowledge Discovery and Data Mining
Data Warehousing & DATA MINING (SE-409) Lecture-1 Introduction and Background Huma Ayub Software Engineering department University of Engineering and Technology,
Presentation transcript:

Introduction to Data Science Kamal Al Nasr, Matthew Hayes and Jean-Claude Pedjeu Computer Science and Mathematical Sciences College of Engineering Tennessee State University 1 st Annual Workshop on Data Sciences

Outline Data, Big Data and Challenges Data Science Introduction Why Data Science Data Scientists What do they do? Major/Concentration in Data Science What courses to take.

Data All Around Lots of data is being collected and warehoused Web data, e-commerce Financial transactions, bank/credit transactions Online trading and purchasing Social Network

How Much Data Do We have? Google processes 20 PB a day (2008) Facebook has 60 TB of daily logs eBay has 6.5 PB of user data + 50 TB/day (5/2009) 1000 genomes project: 200 TB Cost of 1 TB of disk: $35 Time to read 1 TB disk: 3 hrs (100 MB/s)

Big Data Big Data is any data that is expensive to manage and hard to extract value from Volume The size of the data Velocity The latency of data processing relative to the growing demand for interactivity Variety and Complexity the diversity of sources, formats, quality, structures.

Big Data

Types of Data We Have Relational Data (Tables/Transaction/Legacy Data) Text Data (Web) Semi-structured Data (XML) Graph Data Social Network, Semantic Web (RDF), … Streaming Data You can afford to scan the data once

What To Do With These Data? Aggregation and Statistics Data warehousing and OLAP Indexing, Searching, and Querying Keyword based search Pattern matching (XML/RDF) Knowledge discovery Data Mining Statistical Modeling

Big Data and Data Science “… the sexy job in the next 10 years will be statisticians,” Hal Varian, Google Chief Economist The U.S. will need 140, ,000 predictive analysts and 1.5 million managers/analysts by McKinsey Global Institute’s June 2011 New Data Science institutes being created or repurposed – NYU, Columbia, Washington, UCB,... New degree programs, courses, boot-camps: e.g., at Berkeley: Stats, I-School, CS, Astronomy… One proposal (elsewhere) for an MS in “Big Data Science”

What is Data Science? An area that manages, manipulates, extracts, and interprets knowledge from tremendous amount of data Data science (DS) is a multidisciplinary field of study with goal to address the challenges in big data Data science principles apply to all data – big and small

What is Data Science? Theories and techniques from many fields and disciplines are used to investigate and analyze a large amount of data to help decision makers in many industries such as science, engineering, economics, politics, finance, and education Computer Science Pattern recognition, visualization, data warehousing, High performance computing, Databases, AI Mathematics Mathematical Modeling Statistics Statistical and Stochastic modeling, Probability.

Why is it sexy? Gartner’s 2014 Hype Cycle

Data Science

Real Life Examples Companies learn your secrets, shopping patterns, and preferences For example, can we know if a woman is pregnant, even if she doesn’t want us to know? Target case studyTarget case study Data Science and election (2008, 2012) 1 million people installed the Obama Facebook app that gave access to info on “friends”

Data Scientists Data Scientist The Sexiest Job of the 21 st Century They find stories, extract knowledge. They are not reporters

Data Scientists Data scientists are the key to realizing the opportunities presented by big data. They bring structure to it, find compelling patterns in it, and advise executives on the implications for products, processes, and decisions

What do Data Scientists do? National Security Cyber Security Business Analytics Engineering Healthcare And more ….

Concentration in Data Science Mathematics and Applied Mathematics Applied Statistics/Data Analysis Solid Programming Skills (R, Python, Julia, SQL) Data Mining Data Base Storage and Management Machine Learning and discovery