Big Data Young Lee BUS 550
Big data
Big Data: Explosion of information; IoT; Analytics; Not just SQL (Structured Query Language) but also unstructured data; Transformation from entity-based data to transactional databases
Industry: https://www.youtube.com/watch?v=eVSfJhssXUA; Billion-dollar industry; Corporate investments
Who: IBM; Google; Sears; Amazon; Social media applications: Facebook
Why: Insights; Predictions; Customer value; Efficiency; Cost savings; Product development; Makes AI possible; Analytics
How: Cloud computing; Cognitive computing; Artificial intelligence; Software implementations
Who needs big data: Insurance companies; Airlines; Retail; Hospitals; Traffic; Manufacturers
Concept of big data
Data: Data is like a dam; Gartner; Security; High volume, high velocity, and high variety (unstructured); Veracity (trustworthy); Security
IBM take on Big data
Tools of big data: MapReduce; Hadoop; Bigtable; Kaggle
MapReduce: Tool designed by Google to process large amounts of data
Hadoop: Runs MapReduce on large clusters
Bigtable: Distributed storage system developed by Google
Kaggle
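To make the MapReduce idea concrete, here is a minimal single-machine sketch of a word count, the usual introductory example. It is not Hadoop's or Google's API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample documents are assumptions made up for illustration. On a real cluster, the map and reduce steps would run in parallel on many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on one machine."""
    pairs = []
    for doc in documents:  # on a cluster, each document could be mapped on a different machine
        pairs.extend(map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data is big", "data is everywhere"]  # hypothetical sample input
    print(word_count(docs))
    # prints {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

The point of splitting the work this way is that each map call and each reduce call depends only on its own input, so the calls can be spread across many machines without coordination.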
Hadoop: The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures. https://hadoop.apache.org/
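The idea of handling failures at the application layer, rather than relying on hardware, can be sketched in a few lines. This is only a toy simulation, not how Hadoop is implemented: the names `WorkerFailure`, `make_worker`, and `run_with_retry`, and the node names, are all hypothetical. It shows software detecting a failed machine and rerunning the task on another one.

```python
class WorkerFailure(Exception):
    """Raised by a simulated worker that has crashed."""


def make_worker(name, healthy=True):
    """Build a simulated worker; an unhealthy one fails on every task."""
    def run(task):
        if not healthy:
            raise WorkerFailure(f"{name} is down")
        return f"{task} done on {name}"
    return run


def run_with_retry(task, workers):
    """Try each worker in turn and return the first successful result."""
    for worker in workers:
        try:
            return worker(task)
        except WorkerFailure:
            continue  # failure detected in software; reschedule on the next machine
    raise RuntimeError(f"no healthy worker available for {task!r}")


if __name__ == "__main__":
    workers = [make_worker("node-1", healthy=False), make_worker("node-2")]
    print(run_with_retry("count-words", workers))
    # prints count-words done on node-2
```

Because the retry logic lives in software, the cluster can be built from ordinary machines that are each allowed to fail.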
Kaggle: Kaggle is an online community of data scientists and machine learners, owned by Google, Inc. Kaggle allows users to find and publish data sets, explore and build models in a web-based data-science environment, work with other data scientists and machine learning engineers, and enter competitions to solve data science challenges.
Issues: Security; Personal information; Constant monitoring; Safety; Storage; Errors (https://www.ft.com/content/21a6e7d8-b479-11e3-a09a-00144feabdc0); Scary (Forbes)
Questions: Who made MapReduce?