Hadoop and the Future of Big Data

Some thoughts on Apache Hadoop

Apache Hadoop, the Java software framework for distributed storage and processing of huge sets of data, has skyrocketed in popularity since its 1.0 release at the end of 2011. Is the hype surrounding Hadoop for real, or will the next five years show that this new technology was only a flash in the pan?

Big Data is Just Getting Started

Only time will tell, but it is clear that companies' adoption of novel big data techniques is still in its early stages. A 2014 InformationWeek survey found that only 13 percent of responding organisations had adopted Hadoop, compared with 75 percent using Microsoft SQL Server and 47 percent using Oracle. Hadoop's popularity is due in part to its open-source license, which inspired large early investments and contributions from powerhouse companies like Yahoo. Since the Hadoop ecosystem currently lacks a viable proprietary competitor, it has grown largely unfettered until now. Some estimates claim that the Hadoop-related hardware and software industry will grow to over $50 billion annually by 2020.

Hadoop Is Not a Cure-All

Some overzealous adopters of Hadoop have treated it as a magical elixir for any possible problem in data management and analytics. The truth is much less exciting. While Hadoop is great at working with certain types of enormous datasets, it is not suited to situations such as real-time analysis, graph analysis or supercomputing. The recent rise of another Apache project, Spark, as well as of other new open-source technologies like Ceph and Kafka, shows that Hadoop has real drawbacks. For instance, the Spark project claims that in certain situations it can run up to 100 times faster than Hadoop MapReduce, which would represent a stunning jump in performance.
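For readers unfamiliar with the MapReduce model named above, here is a minimal single-process sketch of its three stages (map, shuffle, reduce) in plain Python. It is only an illustration of the programming model, not Hadoop's actual API: the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are hypothetical, and a real Hadoop job would distribute each stage across many machines.

```python
from collections import defaultdict


def map_phase(records):
    """Map: emit a (word, 1) pair for every word in every input record."""
    for record in records:
        for word in record.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(records):
    """Run the three stages in sequence (hypothetical helper, not a Hadoop call)."""
    return reduce_phase(shuffle_phase(map_phase(records)))


if __name__ == "__main__":
    # Illustrative input only; a real job would read from distributed storage.
    sample = ["big data big hype", "big data tools"]
    print(word_count(sample))
```

Even in this toy form, the reduce stage cannot start until every record has passed through map and shuffle, which gives some intuition for why the model fits large batch jobs better than real-time analysis.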

Hadoop Is One of Many Options

Spark has been called the "next big thing" in big data, a title that Hadoop held not too long ago.

And although Spark can run within the Hadoop ecosystem, there's nothing in particular requiring it to do so. Ceph is an open-source distributed storage system with its own POSIX-compatible file system. Inktank, the company behind Ceph, was acquired in April 2014 by Red Hat, which may look to incorporate Ceph into its own Linux storage offerings.

The question now is perhaps whether Hadoop enthusiasts can bring these recent open-source technologies into the Hadoop ecosystem, or whether Hadoop will be eclipsed over time as its newer competitors catch up and eventually surpass it.
