Spark Integration Into an Enterprise Stack: Open Source Successes & Challenges


1 Spark Integration Into an Enterprise Stack: Open Source Successes & Challenges

2 About the presenter
Konstantin (Cos) Boudnik
• Zen-Empiricist
• Director of WANdisco Bigdata Engineering
  – In charge of delivering company’s enterprise grade NonStop Hadoop solution
• ASF Hadoop, MRUnit committer
• ASF Bigtop’s co-author
• Spark/Shark contributor
• Apply with caution: highly abrasive (according to most - now former - managers)
Shark Integration: Challenges and Lessons Learnt

3 Open-source is a force of natural evolution
Anarchy: ἀν + ἀρχός (an + arkhos), without a ruler
• Most apparent characteristics:
  – Fail-fast on your own dime
  – Hard or impossible to control by authority (!)
  – Resistant to political correctness bias (aka political bulls#$t)
  – Creates huge competitive advantage
• Resulting in:
  – Highly successful projects
  – Innovations up to the limit
  – Technologically disruptive
  – Rules the world (once matured)
• Empirical evidence:
  – Everything on the planet is “Powered by Linux”
  – “Bad” news: Android market share will never double again
  – Firefox is THE web-browser of the world
• I ran out of slide space and my time slot is limited...

4 What “open source” oftentimes is
I am not bashing open source: it is my bread & butter
• Open => anyone can do what they’re most interested in doing
• Innovative => creates formats & standards as it goes; abandons them in passing
• Stable => we’ll fix it in the next release
• Backward compatible => we might break it, but we’ll fix it
• Fault tolerant and, at least, highly available => if you configure the hell out of it
• Configuration management => shell scripts or Python to generate configuration
• Deployment management (packages and Puppet) => here’s your tarball
• Supported (there’s a throat to choke) => “Gone fishing!”
• Secure => a million eyeballs will find all your bugs in no time

5 What “enterprise grade” really is
Let’s call a spade a spade
• Compatible with standards, scalable
• Stable: feature set, release schedules, bug fixes, upgrades
• Backward compatible with itself
• Fault tolerant and, at least, highly available
• Configuration management (you know your environment)
• Deployment management (packages and Puppet)
• Supported (there’s a throat to choke)
• Secure
• … and more

6 The devil is in the details
The goals are aligned. How about semantics?

Characteristic | Open Source | Enterprise
Open | Agile | Compatible with standards
Stable | Bugs get fixed; “works for me” | RHEL: not a single change since 1867
Innovative | We have all cool features | NaN
Backward compatible | Easy upgrade to next release; fixed on “trunk” | Year 2013: we have to run on JDK1.3
Fault tolerant & HA | Let’s restart damn thing | $100m/min in downtime costs
Configuration Mgmt | A script, or sketchy docs | Change of control, puppet, etc.
Deployment Mgmt | A tarball | Staging environments, long upgrade paths
throat to choke

7 Case study: major telecom SI
What we have built
• OpenJDK 7
  – Guess what? Not everybody is in love with Larry Ellison
• Hive 0.11’ish
  – It is 3 light years ahead of Hive 0.9 and 5 light years behind enterprise grade
• Spark 0.8
  – Hello Apache Incubation!
• Shark 0.8’ish

8 What does the stack look like?
What it implies for development and customers alike

9 Fixes that span multiple components
Memory leaks: JobConf held by ThreadLocal
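
The slide names only the symptom. As a minimal sketch of the general pattern, and not the actual fix applied in Hive or Shark, the Java below shows how an object cached in a ThreadLocal stays reachable on a long-lived (for example, pooled) thread until it is explicitly removed. FakeJobConf, runTask, and hasCachedConf are hypothetical names introduced for illustration; FakeJobConf is only a stand-in for Hadoop's JobConf.

```java
import java.util.HashMap;
import java.util.Map;

public class ThreadLocalLeakSketch {
    // Hypothetical stand-in for a heavyweight configuration object such as JobConf.
    static final class FakeJobConf {
        final Map<String, String> props = new HashMap<>();
    }

    // One cached instance per thread. On a thread that never dies,
    // the value stays reachable until remove() is called.
    private static final ThreadLocal<FakeJobConf> CONF = new ThreadLocal<>();

    // True if the current thread still holds a cached configuration.
    public static boolean hasCachedConf() {
        return CONF.get() != null;
    }

    // Runs a pretend task; with cleanUp == false the configuration is left behind.
    public static String runTask(String query, boolean cleanUp) {
        FakeJobConf conf = new FakeJobConf();
        conf.props.put("query", query);
        CONF.set(conf);
        try {
            return "ran: " + CONF.get().props.get("query");
        } finally {
            if (cleanUp) {
                CONF.remove(); // releases the per-thread reference
            }
        }
    }

    public static void main(String[] args) {
        runTask("select 1", false);
        System.out.println("after task without cleanup: " + hasCachedConf());
        runTask("select 2", true);
        System.out.println("after task with cleanup: " + hasCachedConf());
    }
}
```

When the ThreadLocal lives in one component and the long-lived thread belongs to another, the cleanup call may have to be added in a different project than the one that shows the leak, which is one way such a fix ends up spanning multiple components.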

10 What does it mean?
それが何を意味している (“What does that mean?”)
Semantic and toolset barriers between JVM languages

11 Unsynchronized release trains
Upstream components oftentimes live their own lives

12 Impatient customers
I want everything on the menu! NOW!

13 What else can possibly go wrong?
“Hold my beer!” (Famous last words)

14 Lessons learnt & principles applied
“What to do, what to do?” (r. Bender)
• Proper system integration
  – Git & a well-thought-out branching model
  – ASF Bigtop as the integration point
• Close collaboration with the open source community
  – All fixes and features are offered to the appropriate projects; most are accepted
• Tireless and careful back-porting
• Continuous Integration and Delivery
• Simplifying development where possible
  – Switch from “org.apache.hive” to “edu.berkeley.cs.shark”
  – Keep your version control system open
• Education and expectations management
  – “released” in open source does not always mean “usable in the datacenter”

15 Thank you
Contact: Samantha Leggat | t: | WANdisco, Bishop Ranch 8, 5000 Executive Pkwy, Suite 270, San Ramon, CA

