Thursday 7 July 2016

What is Hadoop? – Simplified!

Scenario 1: A global bank today may have more than 100 million customers making billions of transactions every month.
Scenario 2: Social networking and e-commerce websites track customer behaviour on the site and then serve relevant information or products.
Traditional systems find it difficult to cope with this scale at the required pace in a cost-efficient manner.
This is where big data platforms come to help. In this article, we introduce you to the world of Hadoop. Hadoop comes in handy when we deal with enormous amounts of data. It may not make the process itself faster, but it lets us use parallel processing to handle big data. In short, Hadoop gives us the capability to deal with the complexities of high volume, velocity and variety of data (popularly known as the 3Vs).
Please note that apart from Hadoop, there are other big data technologies, e.g. NoSQL databases (MongoDB being one of the most popular). We will take a look at them at a later point.
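To make the parallel-processing idea above concrete, here is a minimal sketch in plain Python. It is not Hadoop code: the sample transactions, the function names and the use of threads in place of cluster nodes are all assumptions made for illustration. It only shows the general shape of the approach, which is to split the data, process each piece independently, and then combine the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample data: (customer_id, transaction_amount) pairs.
TRANSACTIONS = [
    ("alice", 120), ("bob", 75), ("alice", 30),
    ("carol", 200), ("bob", 25), ("carol", 50),
]


def split(records, n_chunks):
    """Divide the records into n_chunks roughly equal slices."""
    return [records[i::n_chunks] for i in range(n_chunks)]


def map_chunk(chunk):
    """Process one slice on its own: total the amounts per customer."""
    totals = Counter()
    for customer, amount in chunk:
        totals[customer] += amount
    return totals


def reduce_totals(partials):
    """Combine the per-slice totals into a single result."""
    combined = Counter()
    for partial in partials:
        combined.update(partial)  # Counter.update adds the amounts together
    return dict(combined)


def total_per_customer(records, n_workers=3):
    """Split, process the slices in parallel, then combine."""
    chunks = split(records, n_workers)
    # Threads stand in for the worker machines of a real cluster (assumption).
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce_totals(partials)


if __name__ == "__main__":
    print(total_per_customer(TRANSACTIONS))
```

Each slice is processed without knowing about the others, which is why the work can be spread across many inexpensive machines instead of one large server.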

Introduction to Hadoop

Hadoop is a complete ecosystem of open source projects that provides us with a framework to deal with big data. Let’s start by brainstorming the possible challenges of dealing with big data (on traditional systems) and then look at what the Hadoop solution offers.
Following are the challenges I can think of in dealing with big data:
1. High capital investment in procuring a server with high processing capacity.
2. Enormous time taken to process the data.
3. With a long-running query, imagine an error occurs at the last step. You waste a great deal of time repeating these iterations.
4. Difficulty in building program queries.
Here is how Hadoop solves all of these issues:

