Hadoop, Hive, JSON, and Data! Oh, my!! TJay Belt


1 Hadoop, Hive, JSON, and Data! Oh, my!! TJay Belt

2 Database Administrator at Imagine Learning
 Email me: TJayBelt@yahoo.com
 Read me: http://tjaybelt.blogspot.com
 Follow me: @tjaybelt

3 Thanks to our Sponsors! Yearly Partners Gold Sponsors

4 Overview
 Big Data ecosystem
 30,000-foot view of our ecosystem
 Issues found along the way

5 JSON (JavaScript Object Notation)
 Lightweight data-interchange format.
 Easy for humans to read and write.
 Easy for machines to parse and generate.
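A minimal sketch of the "parse and generate" point, using Python's standard `json` module. The record is a hypothetical cut-down document whose field names are borrowed from the sample shown in this deck; it is not the real data.

```python
import json

# Hypothetical record: a cut-down document using field names from the deck's sample.
record = {
    "Revision": 12,
    "ModelData": {"GradeLevel": "Kindergarten", "FirstLanguage": "English"},
}

# Generate: Python dict -> JSON text (indent=2 keeps it readable for humans).
text = json.dumps(record, indent=2)
print(text)

# Parse: JSON text -> Python dict.
parsed = json.loads(text)
print(parsed["ModelData"]["GradeLevel"])
```

The round trip is lossless here because the record only uses types that JSON supports natively (objects, strings, numbers).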

6 JSON (JavaScript Object Notation)
{
  "_id": "00000000-0000-0000-0000-000000000000",
  "Revision": 12,
  "ModelData": {
    "GradeLevel": "Kindergarten",
    "FirstLanguage": "English"
  },
  "SetTheStageData": {
    "LastSetTheStageLibraryWords": 1,
    "LastSetTheStageTakeATest": 0
  }

7 JSON (JavaScript Object Notation)
"TestInstances": [{
  "Product": "ILE",
  "Lesson": "30698aac-5a3d-4464-935c-16de4ba9db70",
  "LessonBranch": "Main",
  "TestType": "PlacementTest",
  "TimeStarted": "2015-11-13T15:16:51.8757165+00:00",
  "TimeCompleted": "2015-11-13T15:26:29.9646995+00:00",
  "TestInstanceId": "1",
  "TestSectionInstances": [{
    "TestSection": "Letter Recognition",
    "TestQuestionInstances": [{
      "TestQuestion": "q43",
      "TimeStarted": "2015-11-13T15:17:24.965+00:00",
      "TimeCompleted": "2015-11-13T15:17:33.432+00:00",
      "TestOptionInstances": [
        { "ClickCount": 1, "IsSelected": false, "ResponseLatency": 0, "TestOption": "opt256" },
        { "ClickCount": 1, "IsSelected": false, "ResponseLatency": 0, "TestOption": "opt258" },
        { "ClickCount": 1, "IsSelected": false, "ResponseLatency": 0, "TestOption": "opt257" },
        { "ClickCount": 1, "IsSelected": true, "ResponseLatency": -8467, "TestOption": "opt253" },
        { "ClickCount": 1, "IsSelected": false, "ResponseLatency": 0, "TestOption": "opt255" },
        { "ClickCount": 1, "IsSelected": false, "ResponseLatency": 0, "TestOption": "opt254" }
      ]
    },
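The four levels of nesting on this slide (test, section, question, option) are what make the documents awkward to query as tables. As a rough sketch, the structure can be walked and flattened into rows in Python. `SAMPLE` is a trimmed, closed-off copy of the slide's fragment (the slide itself is cut off mid-document), and `selected_options` is a hypothetical helper name, not something from the talk.

```python
import json

# Trimmed copy of the slide's fragment, closed off so that it parses.
SAMPLE = """
{
  "TestInstances": [{
    "TestType": "PlacementTest",
    "TestSectionInstances": [{
      "TestSection": "Letter Recognition",
      "TestQuestionInstances": [{
        "TestQuestion": "q43",
        "TestOptionInstances": [
          {"ClickCount": 1, "IsSelected": false, "ResponseLatency": 0, "TestOption": "opt256"},
          {"ClickCount": 1, "IsSelected": true, "ResponseLatency": -8467, "TestOption": "opt253"}
        ]
      }]
    }]
  }]
}
"""


def selected_options(doc):
    """Yield one flat (test type, section, question, option) row per selected option."""
    for test in doc.get("TestInstances", []):
        for section in test.get("TestSectionInstances", []):
            for question in section.get("TestQuestionInstances", []):
                for option in question.get("TestOptionInstances", []):
                    if option.get("IsSelected"):
                        yield (
                            test.get("TestType"),
                            section.get("TestSection"),
                            question.get("TestQuestion"),
                            option.get("TestOption"),
                        )


rows = list(selected_options(json.loads(SAMPLE)))
print(rows)
```

Using `.get(..., [])` at each level lets documents that lack a given array pass through without raising, which matters when schemas vary from document to document.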

8 Blob Storage  Reliable, cost-effective cloud storage for large amounts of unstructured data  Microsoft Azure Cloud

9 MongoDB
 MongoDB (from humongous) is a cross-platform document-oriented database.
 Classified as a NoSQL database, it eschews the traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas, making the integration of data in certain types of applications easier and faster.

10 Hadoop  is a Java-based programming framework that supports the processing of large data sets in a distributed computing environment.

11 MapReduce
 is a programming model and an associated implementation for processing and generating large data sets with a parallel, distributed algorithm on a cluster.
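To make the programming model concrete, here is a single-process Python sketch of the three stages (map, shuffle, reduce). It only imitates the shape of the model; a real Hadoop job distributes these stages across a cluster. The function names and the input records are hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit one (key, 1) pair per input record."""
    yield (record["GradeLevel"], 1)


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse each key's values into a single result."""
    return key, sum(values)


# Hypothetical input records, not taken from the talk's data.
records = [
    {"GradeLevel": "Kindergarten"},
    {"GradeLevel": "First"},
    {"GradeLevel": "Kindergarten"},
]

mapped = [pair for record in records for pair in map_phase(record)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

Because each map call looks at one record and each reduce call looks at one key, both stages can in principle run in parallel on different machines, which is the point of the model.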

12 Hive
 Apache Hive is a data warehouse infrastructure built on top of Hadoop for providing data summarization, query, and analysis.
 It supports queries expressed in HiveQL, a SQL-like language; Hive automatically translates these queries into MapReduce jobs executed on Hadoop.
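As a rough illustration of what a HiveQL query over JSON documents might look like, the Python sketch below assembles one as a string. `get_json_object` is a built-in Hive function that extracts a value from a JSON string by path. The table name `student_docs`, the column name `doc`, and the helper `count_by_json_path` are assumptions made for this example; the talk does not show its actual schema or queries.

```python
def count_by_json_path(table, column, json_path):
    """Build a HiveQL query that counts rows grouped by one value inside a JSON string column.

    Illustration only: the arguments are pasted into the query text as-is,
    so this should not be fed untrusted input.
    """
    extract = f"get_json_object({column}, '{json_path}')"
    return (
        f"SELECT {extract} AS json_value, COUNT(*) AS row_count\n"
        f"FROM {table}\n"
        f"GROUP BY {extract}"
    )


# Hypothetical table and column names.
query = count_by_json_path("student_docs", "doc", "$.ModelData.GradeLevel")
print(query)
```

Submitted to Hive, a grouped count like this would be compiled into a MapReduce job, so the author of the query never writes map or reduce functions directly.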

13 What do we have?

14 Things we tried
 SQL Server JSON procs
 SlamData
 PowerQuery
 DocumentDB
 MongoDirector
 SQL Azure

15

16 Issues I encountered

17

18 Issues I encountered

19 Issues I encountered

20 Thank You! TJay Belt
 Cell: (801) 735-9439
 Email: TJayBelt@Yahoo.com
 Blog: http://tjaybelt.blogspot.com
 LinkedIn: www.linkedin.com/in/tjaybelt
 Twitter: @tjaybelt
 Skype: tjaybelt
 Google+: link

21 Thanks to our Sponsors! Yearly Partners Gold Sponsors

