Is Hadoop A Great Data Storage Choice And Hadoop Distributed File System?

1769 Words8 Pages
Hadoop is a great data storage choice and Hadoop Distributed File System (HDFS) or Hive is often used to store transactional data in its raw state. The map-reduce processing supported by these Hadoop frameworks can deliver great performance, but it does not support the same specialized query optimization that mature relational database technologies do. Improving query performance, at this time, requires acquiring query accelerators or writing code. Every company who chose to use Hadoop needs to optimize their architecture in a way compatible to Hadoop. For example using Hadoop in the architecture would be able process large data sets and if the query performance is not optimized or if the query is not able to accept the data given, the…show more content…
Hadoop excels with managing and processing file-based data, especially when the data is voluminous in the extreme and the data would not benefit from transformation and loading into a DBMS. In fact, for the kinds of discovery analytics involved with Hadoop, it’s best to keep the data in its raw, source form. This is why Hadoop has such a well-deserved reputation with big data analytics. Using the right combination of Hadoop products and the other platforms can be sensational in terms of analytics because it has the capacity to analyze analysis of petabytes of Web log data in large Internet firms, and now is being applied to similar analytic applications involving call detail records in telecommunications, XML documents in supply chain industries (retail and manufacturing), unstructured claims documents in insurance, sessionized spatial data in logistics, and a wide variety of log data from machines and sensors. Hadoop-enabled analytics are sometimes deployed in silos, but the
Open Document