What is it and why it matters? Hadoop. What Is Hadoop? Hadoop is an open-source software framework for storing data and running applications on clusters.

Presentation transcript:

1 Hadoop: What is it and why does it matter?

2 What Is Hadoop? Hadoop is an open-source software framework for storing data and running applications on clusters of commodity hardware. It supports the processing of large data sets in a distributed computing environment. Skype Id: info.vibloo Email: info@Vibloo.com USA: +1-248-809-1418 IND: +91-40-3296-5222

3 Hadoop architecture
• Hadoop Common - The libraries and utilities used by other Hadoop modules
• Hadoop Distributed File System (HDFS) - The Java-based scalable system that stores data across multiple machines without prior organization
• MapReduce - A software programming model for processing large sets of data in parallel
• Yet Another Resource Negotiator (YARN) - The resource management framework for scheduling and handling resource requests from distributed applications
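The MapReduce model named on the architecture slide can be illustrated with a word count. The sketch below is a single-process simulation in plain Python, not Hadoop's own API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. On a real cluster the map and reduce steps would run in parallel on many machines, and the framework would handle the grouping step between them.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(line: str) -> Iterator[tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle step: group all emitted values by key.

    In this sketch the grouping happens in memory on one machine.
    """
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce step: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["Hadoop stores data", "Hadoop processes data"]))
```

Because each call to `map_phase` depends only on its own line, and each call to `reduce_phase` only on its own key, both steps can be spread across machines without coordination. That independence is what makes the model suitable for parallel processing.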

4 Data access projects
• Pig
• Avro
• Hive
• Spark
• Flume
• Sqoop
• HCatalog
• HBase

5 What are the benefits of Hadoop?
• Computing power
• Flexibility
• Fault tolerance
• Low cost
• Scalability

6 What is Hadoop used for?
• Low-cost storage and active data archive
• Sandbox for discovery and analysis
• Staging area for a data warehouse and analytics store
• Recommendation systems
• Data lake

7 Challenges of using Hadoop
• MapReduce programming is not a good match for all problems
• There's a widely acknowledged talent gap
• Data security
• Full-fledged data management and governance

8 Ready To Learn More About Hadoop? Complete training and tutorials on Hadoop administration and development at Vibloo.com

