Five Myths about Hadoop

Apache™ Hadoop helps businesses solve one of their toughest challenges—profiting from massive volumes of data. Its popularity stems from its ability to enable organizations to gain value from big, diverse data types. As the Forrester Research, Inc. report The Forrester Wave™: Big Data Hadoop Solutions, Q1 2014, notes, “Hadoop is unstoppable as its open-source roots grow wildly and deeply into enterprises. Its refreshingly unique approach to data management is transforming how companies store, process, analyze and share big data.”

Evolving Technology

The technology has justifiably received accolades for the benefits it delivers, yet it has also been dogged by misinformation and overpromising about what it can actually offer. Having the wrong expectations, or believing misconceptions, when implementing Hadoop can result in wasted time, inflated expenses and lacklustre performance.

Understanding what Hadoop can and can’t do, and then planning the installation accordingly, will help the implementation reach its full capability. To be successful, learn the truth about the technology and avoid these common myths:

Myth 1

Hadoop Can Replace a Data Warehouse

Truth: Hadoop is not a complete data or analytics solution by itself. It is a framework, or platform, and it cannot serve as or replace a data warehouse. What it does offer is a cost-effective big data platform that can share its information with other databases, making it an ideal complement to a data warehouse. This gives organizations new ways to use and exploit large, diverse data volumes.

Myth 2

The Technology is a Passing Trend

Truth: Hadoop is popular and its momentum seems unstoppable, so don’t expect it to go away. The Forrester Wave™: Big Data Hadoop Solutions, Q1 2014, describes Hadoop as a “must-have data platform for large enterprises, forming the cornerstone of any flexible, future data management platform.” To take advantage of it, next-generation data warehouses are supporting deeper Hadoop integration to manage larger and more complex data sets.

Myth 3

Hadoop is Free

Truth: Sure, Hadoop is open-source software that anyone can download for free, but using the technology is neither free nor cheap. It takes highly trained specialists to use it effectively, and storing the data long term can be expensive. In fact, a data warehouse can cost less than Hadoop once analytics and multiple users are taken into account. Beyond the open-source technologies, vendors also sell applications with various features that support and extend Hadoop to make it more beneficial to businesses.

Myth 4

The Solution is a Data Integration Tool

Truth: The technology is actually a distributed file system designed for specific data types and workloads. It lacks data integration capabilities. If the solution is not integrated with a larger data management ecosystem, it is likely to become another data silo that isolates information. But once it’s part of a data warehouse environment, information from both the warehouse and Hadoop can be used for queries.

Myth 5

Hadoop is a Single Open-Source Product

Truth: It is a library of products and technologies, including the Hadoop Distributed File System, MapReduce, Pig, Hive, Falcon, Knox and others. Hadoop products are available from a variety of vendors that add differentiating features, such as the Hortonworks® Data Platform that lets organizations capture, process and share data in any format at any scale. Some Hadoop products are open source—others are not. The demand for the products has created what Forrester calls a “cutthroat” market for vendors seeking to capitalize on selling unique options.

Unlock the Full Potential

Hadoop delivers a proven solution for storing and processing large data sets, enabling businesses to leverage the big, diverse data that was previously too expensive or complex to use effectively. Despite these advantages, the technology is not a replacement for a data warehouse or data integration tools. Instead, the value of Hadoop can be increased by integrating it with other data or analytics solutions.

Source: The opinions above are a personal perspective based on information provided by Forbes and contributor Brett Martin.


