Industrial And Technical Review For Hadoop

This chapter presents an industrial and technical review of the Hadoop framework and of the other technologies used alongside Hadoop to process big data. The Hadoop project was originally built and supervised by the Apache community. In addition to Apache, many companies whose businesses run on Hadoop are adding features to it, and some have announced their own Hadoop distributions that rely on the original core distribution released by Apache.

2.1 Industry Feedback

At the last Hadoop Summit [19], Mike Gualtieri, principal analyst at Forrester, gave a keynote on what today’s market expects from big data analysis solutions. He presented the following results:
“Data-related projects are at the forefront of the minds of …

Chris Twogood from Teradata [50] and Timothy Mallalieu from Microsoft talked about the importance of integrating Hadoop with traditional data technologies to get the most out of the new opportunities.

Arun Murthy, a co-founder of Hortonworks [51], gave an overview of the key recent advances in Hadoop 2.0, including the benefits of the YARN ResourceManager [18] and the use of Apache Storm [73] to process streaming data with Hadoop.

The list of companies and institutions that are using or planning to switch to Hadoop grows longer every day. Adobe, Amazon, IBM, Pivotal, Google, Facebook, Twitter, Cloudera, MapR, Hortonworks, Dell, Intel, HSBC, and Deutsche Telekom are just a few examples of the many organizations that have switched, or have announced that they are switching, to Hadoop.

2.2 Previous Work and Critique

2.2.1 Data Warehouse Systems (DWHs)

Data Warehouse Systems (DWHs) are platforms used to report on and analyse data by integrating data from disparate sources into a central repository, the data warehouse. They store both current and historical data so that management or technical reports can be generated from pre-defined configurations. Nearly every recent discussion of big data analysis begins with a debate over the best way to perform the analysis and over whether DWHs will be replaced by newer frameworks such as Hadoop. Some went very far, thinking relational databases are going to