Big Data Has Taken The Business World By Storm

Big Data has taken the business world by storm. The amount of digital information in existence is expected to grow from 3.2 zettabytes in 2014 to 40 zettabytes by 2020. Companies are doing all they can to capture this digital information and turn it into actionable insights. Currently, the total amount of data being captured and stored by industry is doubling every 1.2 years. Companies must therefore find increasingly efficient ways to store and analyze this volume of data. One solution that continues to grow in popularity is Apache Hadoop. Hadoop, as it is known, is an open-source software library that allows for the distributed processing of large data sets across clusters of computers using simple…
In 2014, AWS captured 28% of the worldwide market, almost triple the 10% share of its closest competitor, Microsoft. One of the main reasons for this large lead is the breadth of AWS’ offerings, which have extremely wide business application. Its web services include core cloud infrastructure services such as virtual servers; object, block, and archive storage; and virtual private cloud. In addition, it has several services dedicated to mobile, enterprise, and analytics. Listed first under the analytics section of its website is Hadoop. Amazon Elastic MapReduce (Amazon EMR) is the web service that provides a managed Hadoop framework. Amazon EMR is able to handle most big data use cases, including log analysis, web indexing, data warehousing, machine learning, financial analysis, scientific simulation, and bioinformatics. Another aspect that puts Amazon at the front of the IaaS industry is that it continually updates its products. For example, this June it announced that users could now run parallel Hadoop jobs on their Amazon EMR cluster using AWS Data Pipeline. Running Hadoop jobs in parallel allows users to significantly increase the utilization of their cluster. For other cluster management issues, there is Amazon ECS. Amazon ECS focuses on the management of followers, dispatching of sub-tasks to the proper location, and state inspection of the cluster are all