Denis Reznik Data Architect, Intapp, Inc. Microsoft Data Platform MVP


2 Data Driven Future
Denis Reznik, Data Architect, Intapp, Inc., Microsoft Data Platform MVP

3 About Me
Denis Reznik, Kyiv, Ukraine
Data Architect at Intapp, Inc.
Microsoft Data Platform MVP
PASS Regional Mentor, CEE
Ukrainian Data Community Kyiv Co-Founder
Co-author of “SQL Server MVP Deep Dives vol. 2”
Organizer of SQLSaturday Kyiv Conference

4 Agenda
Data is a New Oil (c)
Data and Science
Data in Big Companies
Data and Application Development
Data-Driven Future

5 Data is a New Oil
“Data is the new oil. It’s valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals, etc. to create a valuable entity that drives profitable activity; so must data be broken down, analyzed for it to have value.” (c) Clive Humby, UK Mathematician

6 Data and Science
Thousands of years: Empirical
Few hundreds of years: Theoretical
Last fifty years: Computational (“Query the world”)
Last twenty years: eScience (Data Science) (“Download the world”)

7 Data Science is a new term
Data Science is a new term. But in the same sense as Columbus discovered a NEW continent 1000 years ago. (c) Hector Garcia-Molina, Professor in the Departments of Computer Science and Electrical Engineering at Stanford University


9 Machine Learning
Supervised Learning: Classification, Regression
Unsupervised Learning

10 Linear Regression
Training Data -> Learning Algorithm -> h (h - Hypothesis)
Inputs: Ocean Temperature, Oil Derricks in Area, Distance from the Continent
Output: Whales Population
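
The linear regression slide can be illustrated with a minimal sketch. It is not the demo from the talk: it assumes a single input (ocean temperature) instead of the three on the slide, uses the closed-form least-squares fit as the "learning algorithm", and runs on made-up numbers. The names `fit_line` and `hypothesis` are hypothetical.

```python
def fit_line(xs, ys):
    """Learning algorithm: ordinary least squares for h(x) = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x and co-variation of x with y around their means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def hypothesis(theta, x):
    """The hypothesis h: predicts the output for a new input x."""
    intercept, slope = theta
    return intercept + slope * x


# Hypothetical training data (NOT from the talk): ocean temperature -> whale population.
temperatures = [10.0, 12.0, 14.0, 16.0]
whale_population = [120.0, 110.0, 100.0, 90.0]

theta = fit_line(temperatures, whale_population)
prediction = hypothesis(theta, 18.0)
print(theta, prediction)
```

With more than one input, the same idea applies, but the fit is usually done with matrix algebra or gradient descent rather than this one-variable formula.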

11 DEMO Linear Regression

12 Data in Big Companies

13 source: http://www. visualcapitalist

14 source: http://www. visualcapitalist

15 source: http://www. visualcapitalist

16 source: http://www. visualcapitalist

17 source: http://www. visualcapitalist

18 Parallel Processing
Q: How many times was the temperature above the norm during the last week?
Temperature Sensor Datasets (n Items)
A: 5
Time: 2 hours
Algorithmic Complexity: O(n)

19 Parallel Processing
Q: How many times was the temperature above the norm during the last week?
Temperature Sensor Datasets (k Items in each one)
A: 1 A: 0 A: 3 A: 4
Time: 0.5 hour
Algorithmic Complexity: O(n/k)

20 Map-Reduce
Map -> COUNT(*) WHERE Value > 40
A: 1 A: 0 A: 3 A: 4
Reduce -> COUNT(*)
A: 5
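
A rough sketch of the map and reduce steps described on the slide, assuming the sensor readings are already split into partitions. The readings, the partitioning, and the function names are made up for illustration. In this sketch the reduce step adds up the per-partition counts, and the map calls run one after another, whereas a real cluster would run them on separate workers.

```python
from functools import reduce

THRESHOLD = 40.0  # "the norm" from the slide's WHERE Value > 40


def map_count_above(readings, threshold=THRESHOLD):
    """Map step: COUNT(*) WHERE Value > threshold, for one partition."""
    return sum(1 for value in readings if value > threshold)


def reduce_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


# Hypothetical temperature readings, one list per partition (NOT the talk's data).
partitions = [
    [38.5, 41.2, 39.0],
    [36.0, 37.5],
    [42.1, 40.5, 44.0, 39.9],
    [40.0, 45.3],
]

partial_counts = list(map(map_count_above, partitions))
total = reduce(reduce_counts, partial_counts, 0)
print(partial_counts, total)
```

Because each map call touches only its own partition, the partitions can be processed independently, which is where the speed-up on the previous slide comes from.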

21 DEMO Map-Reduce

22 Database History
Timeline: 1960s, 1970s, 1980s, 1990s, 2000s, Nowadays
Milestones: IMS, CODASYL, E.F. Codd’s Paper, System R, Ingres, SQL, RDBMS, RDBMS Commercial Success, Object Databases, Google BigTable Paper, Amazon Dynamo Paper, NoSQL (Johan Oskarsson), NewSQL (?)

23 NoSQL SQL

24 Databases
Key-Value, Relational, Column-Family, Graph, Document

25 Index (B-Tree) - Seek
SELECT * FROM Users WHERE Id = 523
[Diagram: B-tree over Id. Root: 1 .. 1M. Intermediate nodes: 1 .. 2K, …, 1M-2K .. 1M. Leaf ranges: …, 801..1,5K, 1,5K+1..2K]

26 Index (B-Tree) - Scan
SELECT * FROM Users
[Diagram: B-tree over Id. Root: 1 .. 1M. Intermediate nodes: …, 1M-2K .. 1M. Leaf ranges: …, 801..1,5K, 1,5K+1..2K]
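
The difference between a seek and a scan can be sketched with a heavily simplified, two-level stand-in for a B-tree: sorted leaf pages plus one list of separator keys. This is an assumption-laden illustration, not how SQL Server stores an index. `PAGE_SIZE`, `build_index`, `seek`, and `scan` are hypothetical names.

```python
from bisect import bisect_left, bisect_right

PAGE_SIZE = 100  # hypothetical number of Ids per leaf page


def build_index(ids):
    """Split sorted Ids into leaf pages; keep the first key of each page as a separator."""
    ordered = sorted(ids)
    leaves = [ordered[i:i + PAGE_SIZE] for i in range(0, len(ordered), PAGE_SIZE)]
    separators = [leaf[0] for leaf in leaves]
    return separators, leaves


def seek(index, target):
    """Seek: pick the one leaf that can hold target. Returns (found Id or None, leaves read)."""
    separators, leaves = index
    pos = bisect_right(separators, target) - 1
    if pos < 0:
        return None, 0  # target is smaller than every key in the index
    leaf = leaves[pos]
    i = bisect_left(leaf, target)
    found = i < len(leaf) and leaf[i] == target
    return (target if found else None), 1


def scan(index):
    """Scan: read every leaf page in order. Returns (all Ids, leaves read)."""
    _, leaves = index
    rows = []
    for leaf in leaves:
        rows.extend(leaf)
    return rows, len(leaves)


index = build_index(range(1, 1001))  # hypothetical Users table with Ids 1..1000
seek_result = seek(index, 523)       # like: SELECT * FROM Users WHERE Id = 523
scan_rows, scan_pages = scan(index)  # like: SELECT * FROM Users
print(seek_result, len(scan_rows), scan_pages)
```

The seek reads a single leaf page no matter how large the table is, while the scan reads all of them.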

27 Hashtable
[Diagram: the keys John Snow, Jim Beam, and Peter Parker are passed through a Hash Function and placed into buckets 1 to 4]
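
A minimal sketch of the hashtable idea, assuming four buckets and a deliberately simple hash function (sum of character codes modulo the bucket count). Collisions are handled by keeping a short list per bucket. The class, its methods, and the stored values are hypothetical and not taken from the talk.

```python
class HashTable:
    """Toy hash table with separate chaining."""

    def __init__(self, bucket_count=4):
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket_index(self, key):
        # Hash function: deterministic, but far too weak for real use.
        return sum(ord(ch) for ch in key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._bucket_index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        # Only one bucket is inspected, whatever the total number of keys.
        bucket = self.buckets[self._bucket_index(key)]
        for existing_key, value in bucket:
            if existing_key == key:
                return value
        return default


table = HashTable()
for user_id, name in enumerate(["John Snow", "Jim Beam", "Peter Parker"], start=1):
    table.put(name, user_id)

print(table.get("Jim Beam"), table.get("Nobody"))
```

A lookup by exact key is cheap, but unlike the B-tree there is no key order, so range queries need a full pass over all buckets.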

28 Q&A Web Site (StackOverflow)

29 Domain Model
Questions, Answers, Users, Comments, Votes
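
One possible way to picture this domain model in code is as nested aggregates, the way a document database might store a question. Only the five entity names come from the slide. Every field, relationship, and the `score` helper below are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class User:
    id: int
    name: str


@dataclass
class Comment:
    author_id: int  # refers to User.id (assumed relationship)
    text: str


@dataclass
class Vote:
    user_id: int    # refers to User.id (assumed relationship)
    value: int      # +1 for an upvote, -1 for a downvote (assumed encoding)


@dataclass
class Answer:
    author_id: int
    body: str
    comments: List[Comment] = field(default_factory=list)
    votes: List[Vote] = field(default_factory=list)


@dataclass
class Question:
    author_id: int
    title: str
    answers: List[Answer] = field(default_factory=list)
    comments: List[Comment] = field(default_factory=list)
    votes: List[Vote] = field(default_factory=list)


def score(post):
    """Net vote count of a question or an answer."""
    return sum(vote.value for vote in post.votes)


asker = User(id=1, name="John Snow")
helper = User(id=2, name="Peter Parker")
question = Question(author_id=asker.id, title="Seek or scan?")
question.votes.append(Vote(user_id=helper.id, value=1))
answer = Answer(author_id=helper.id, body="It depends on the predicate.")
answer.votes.extend([Vote(user_id=asker.id, value=1), Vote(user_id=3, value=-1)])
question.answers.append(answer)
print(score(question), score(answer))
```

In a relational design the same entities would typically become separate tables joined by keys, which is the trade-off a relational vs. NoSQL comparison explores.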

30 StackOverflow Architecture
source:

31 DEMO Relational vs. NoSQL

32 Data-Driven Future
The amount of data is growing, and this is cool
More and more decisions are based on data
More and more applications are being developed
It is exciting to be a Software Engineer now!

33 Thank you! Denis Reznik Blog: (rus) Facebook: LinkedIn:


