Hardening Hadoop for the Enterprise: Managing Diverse Workloads, Securing and Governing your Big Data Platform


1 Hardening Hadoop for the Enterprise: Managing Diverse Workloads, Securing and Governing your Big Data Platform. How does IT balance the tension between “one glorious cluster that serves them all” and “one cluster, one purpose: dedicated to a particular task and not to be interfered with by anything”? To contain cluster sprawl, teams need help allocating a mixed workload across a shared cluster (beyond the JobTracker assigning map and reduce slots), and they want to be sure the cluster is as secure as it can be. Kerberos, cgroups and YARN to the rescue! This talk describes current practices and speculates on how things get better under YARN.

2 Agenda 1. Basics 2. Cluster Evolution: Vanilla Cluster, Foreign Workload Introduced, Node Specialization, Cluster Specialization, Datacenter Integration 3. YARN 4. Security

3 Hadoop and her 2 beautiful things: (1) I will spread your data out over many servers to keep it safe. (2) I will facilitate a new idea: you should send the work to the data, not the other way around. [Diagram: Data]

4 Copyright © 2013, SAS Institute Inc. All rights reserved. WHY DO THIS? BECAUSE IT GETS THE ANSWERS SOOOO MUCH FASTER. [Diagram: NameNode, Client]
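The two ideas on slides 3 and 4 (spread replicated data across servers, then send the work to a server that already holds the data) can be illustrated with a toy sketch. This is not how HDFS or the JobTracker are implemented: real HDFS placement is rack-aware, and the function names, node names and round-robin policy below are assumptions made purely for illustration.

```python
from collections import defaultdict


def place_blocks(blocks, nodes, replication=3):
    """Toy placement: put each block on `replication` distinct nodes, round-robin.

    Hypothetical helper; real HDFS uses a rack-aware placement policy.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds node count")
    placement = {}
    for start, block in enumerate(blocks):
        placement[block] = [
            nodes[(start + i) % len(nodes)] for i in range(replication)
        ]
    return placement


def schedule_local(placement):
    """Send each task to the least-loaded node that already holds its block."""
    load = defaultdict(int)
    assignment = {}
    for block, holders in placement.items():
        target = min(holders, key=lambda node: load[node])
        load[target] += 1
        assignment[block] = target
    return assignment


nodes = ["n1", "n2", "n3", "n4"]            # hypothetical DataNodes
blocks = ["b0", "b1", "b2", "b3"]           # hypothetical block ids
placement = place_blocks(blocks, nodes)
assignment = schedule_local(placement)
```

Every task in `assignment` runs on a node listed in `placement` for its block, so no block has to cross the network before work can start.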

5 WOW, that’s awesome. Can we join your cluster?

6 We’ll be very very good. Really.

7 Agenda 1. Basics 2. Cluster Evolution: Vanilla Cluster, Foreign Workload Introduced, Node Specialization, Cluster Specialization, Datacenter Integration 3. YARN 4. Security

8 2012 :: Have  Want

9 Vanilla Cluster [Diagram: NameNode, DataNode, Secondary NameNode, DataNode]

10 Vanilla Cluster (with foreign workload) [Diagram: NameNode, DataNode, Secondary NameNode, DataNode]

11 Foreign != MapReduce, and not only SAS: SAS High Performance Analytics; SAS Visual Analytics; Impala; BDAS Spark; Giraph; Solr .. HBase

12 Vanilla Cluster (with foreign workload) [Diagram: NameNode, DataNode, Secondary NameNode, DataNode] 1. Add work across entire cluster 2. Add memory to accommodate 3. Derate MapReduce to accommodate 4. Time slice? 5. No extra copy of data
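"Derate MapReduce to accommodate" comes down to simple memory arithmetic per node. The sketch below is only an illustration: the node size, the reserves and the per-slot heap are made-up numbers, and `derated_slots` is a hypothetical helper, not part of Hadoop. In MRv1 the resulting count would typically be split between the TaskTracker map and reduce slot maximums (`mapred.tasktracker.map.tasks.maximum` and `mapred.tasktracker.reduce.tasks.maximum`).

```python
def derated_slots(node_memory_gb, os_reserve_gb, foreign_reserve_gb, slot_heap_gb):
    """Return how many MapReduce slots fit after reserving memory for other work.

    All inputs are illustrative assumptions, expressed in whole gigabytes.
    """
    available = node_memory_gb - os_reserve_gb - foreign_reserve_gb
    if available < 0:
        raise ValueError("reserves exceed the memory on the node")
    return available // slot_heap_gb


# Hypothetical 96 GB node with 8 GB held back for the OS and daemons.
slots_before = derated_slots(96, 8, 0, 2)    # MapReduce has the node to itself
slots_after = derated_slots(96, 8, 48, 2)    # 48 GB set aside for the foreign workload
```

With these assumed numbers the node drops from 44 slots to 20, which is the price of hosting the foreign workload without adding memory.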

13

14 Node Specialization (for foreign workload) [Diagram: NameNode, DataNode, Secondary NameNode, DataNode]

15 Node Specialization (for foreign workload) [Diagram: NameNode, DataNode, Secondary NameNode, DataNode] 1. Add workload to some … “SASnodes” 2. Add memory to SASnodes 3. Derate MapReduce on SASnodes? 4. Cgroups to make 'em play nice 5. Still no extra copy of data 6. SAS writes data to SASnodes only (balancer)

16 Node Specialization (for foreign workload) [Diagram: NameNode, DataNode, Secondary NameNode, DataNode] 1. Add workload to some … “SASnodes” 2. Add memory to SASnodes 3. Derate MapReduce on SASnodes? 4. Cgroups to make 'em play nice 5. Still no extra copy of data 6. SAS writes data to SASnodes only (balancer). CDH4 Best Practice
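The slides do not say which cgroup controllers or values to use on a SASnode, so the following is only a sketch under stated assumptions: cgroups v1, a relative CPU weight via `cpu.shares`, and a hard memory cap via `memory.limit_in_bytes`. The group names `sas` and `hadoop`, the 50/50 split and the helper `cgroup_plan` are hypothetical. The function only computes the values; it does not write to the cgroup filesystem.

```python
GIB = 2 ** 30


def cgroup_plan(node_memory_gb, os_reserve_gb, foreign_fraction, total_shares=1024):
    """Split CPU weight and memory between a foreign workload and Hadoop.

    Returns a dict of cgroup v1 settings per (hypothetical) group name.
    """
    if not 0.0 <= foreign_fraction <= 1.0:
        raise ValueError("foreign_fraction must be between 0 and 1")
    usable_bytes = (node_memory_gb - os_reserve_gb) * GIB
    foreign_bytes = int(usable_bytes * foreign_fraction)
    foreign_shares = round(total_shares * foreign_fraction)
    return {
        "sas": {
            "cpu.shares": foreign_shares,
            "memory.limit_in_bytes": foreign_bytes,
        },
        "hadoop": {
            "cpu.shares": total_shares - foreign_shares,
            "memory.limit_in_bytes": usable_bytes - foreign_bytes,
        },
    }


# Hypothetical SASnode: 96 GB of RAM, 8 GB reserved, half of the rest for SAS.
plan = cgroup_plan(96, 8, 0.5)
```

Because `cpu.shares` is a relative weight, either group can still use idle CPU; only the memory limit is a hard ceiling in this sketch.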

17 [Diagram: NameNode, DataNode, Secondary NameNode, DataNode] Specialty Cluster [Diagram: NameNode, DataNode]

18 [Diagram: NameNode, DataNode, Secondary NameNode, DataNode] Specialty Cluster [Diagram: NameNode, DataNode] 1. Create new “Odd Shape” cluster 2. Optimize hardware to fit task 3. Oops! Extra copy of data 4. Easier to contain variation

19 EXAMPLE: ASYMMETRIC AS AN OPTION [Diagram: NameNode, Client, Controller]

20 Datacenter Integration [Diagram: Teradata, Client, Oracle, Hadoop]

21 Agenda 1. Basics 2. Cluster Evolution: Vanilla Cluster, Foreign Workload Introduced, Node Specialization, Cluster Specialization, Datacenter Integration 3. YARN 4. Security

22 2013q4? 2014?

23

24

25

26

27

28 Node Specialization (for foreign workload) [Diagram: NameNode, DataNode, Secondary NameNode, DataNode]

29 Agenda 1. Basics 2. Cluster Evolution: Vanilla Cluster, Foreign Workload Introduced, Node Specialization, Cluster Specialization, Datacenter Integration 3. YARN 4. Security

30 Security is hard. Better start right away. Add Kerberos to your environment ASAP, right after the first POC. Integrate with the identity management on site: - Don't add Unix users to the cluster by hand! - Automate. - Engage SAS Technical Resources. - Security settings can be hard to get right. Error messages get obfuscated, and tracking down the true source is difficult. - It is easier to start with a small working system and add projects. Resist “Oh, we will add the security later”. Your users will have gotten so used to no security that they'll scream!
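"Don't add Unix users to the cluster by hand, automate" could start with something as small as a reconciliation step between the site's identity management and the accounts on the cluster. The sketch below assumes the two user lists have already been fetched by some other means; `plan_user_sync`, the example user names and the protected service accounts are hypothetical and not taken from the talk.

```python
def plan_user_sync(directory_members, cluster_users,
                   protected=frozenset({"hdfs", "mapred"})):
    """Compare directory membership with cluster accounts.

    Returns (to_add, to_remove) as sorted lists. Accounts in `protected`
    (assumed service accounts) are never scheduled for removal.
    """
    directory_members = set(directory_members)
    cluster_users = set(cluster_users)
    to_add = sorted(directory_members - cluster_users)
    to_remove = sorted(cluster_users - directory_members - protected)
    return to_add, to_remove


# Hypothetical inputs: group members from the directory, accounts on the cluster.
to_add, to_remove = plan_user_sync(
    {"alice", "bob"},
    {"bob", "carol", "hdfs"},
)
```

Running a plan like this on a schedule, and applying it through the site's provisioning tooling, keeps the cluster's accounts in step with the directory instead of relying on manual edits.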

31 Thank You! paulmkent

