The Application Of The Apache Hadoop Open Source Database Program

This paper outlines and describes three vendors that provide Hadoop, the open-source program often described as a NoSQL database, to enterprises. Each of these companies sees itself as unique and positions itself accordingly in a marketplace that has become highly competitive in the “Big Data” age. I begin with a brief description of Hadoop, then discuss each company against the evaluation categories, and conclude with options for whether to move forward with any of these vendors. By the conclusion, the reader should have enough information about these vendors, and enough confidence, to decide which choice would be best for the business.

Apache Hadoop is an open-source program that provides a layer of software that can turn a grid of servers into a single unit. It is commonly grouped with NoSQL, or Not Only SQL, technologies, although it is a distributed storage and processing framework rather than a database in the strict sense. It provides parallel storage and processing on which to run MapReduce, a programming model that runs in two phases: first mapping the data, then reducing (aggregating or transforming) the mapped results. Using HDFS, the Hadoop Distributed File System, huge data sets are spread across servers, which lets programmers write application modules and run them in parallel on the same cluster that stores the data, drawing on the power and capacity of all the CPUs and hard drives. The advantage of Hadoop is that it processes large amounts of data in seconds, one
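The two MapReduce phases described above can be illustrated with a small word-count sketch. This is only an assumption-laden illustration in plain Python running in a single process: it does not use Hadoop's own API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical names chosen for this example. On a real cluster, the map and reduce steps would run in parallel on the servers that hold the data.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, the step a framework performs between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input standing in for files stored on a cluster.
    sample = ["big data big clusters", "data on clusters"]
    print(word_count(sample))
```

Each input line could be mapped independently of the others, which is what would allow a framework to spread the work across many machines before combining the per-word totals in the reduce step.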