Facing the Database Conundrum: SQL, HBase, Hive, and the Saturation of the Environment

FACING THE DATABASE CONUNDRUM
SQL, HBASE, HIVE, OR SPARK: WITH THE SATURATION OF THE ENVIRONMENT WHICH DO YOU CHOOSE
SAMANTHA MOHR
UNIVERSITY OF MARYLAND UNIVERSITY COLLEGE
SPRING 2015

ABSTRACT
There is currently a conundrum facing experts in the field of Big Data. The struggle lies between the need to perform large-scale data analysis and the impracticality of using relational database processing languages to handle the information that is collected and processed. Specifically, the sheer volume of data that must be stored in databases, processed by cloud analytics, and queried by applications has led to a growth in the data capacity that needs to be handled. Unfortunately, this exponential growth has exceeded the hardware and…
If you require support for dynamic columns, you should choose Cassandra. If you do batch analytic modeling on your data, Hadoop may be the choice for you. For live streaming analytic modeling, Apache Spark is a much better choice. If you want to work with your data as if it were SQL, you should try Hive. This paper will provide detailed knowledge of how, by choosing the correct database processing and query language, you can mitigate the processing capacity problems that come with the recent vast growth of data. It will show that while there may be no one-size-fits-all answer, there is a fit for the problem at hand based on the storage, processing, and query needs that are to be met.

INTRODUCTION

BACKGROUND
As a result of the appearance of big data in our world, conventional data warehousing and data analysis methods no longer have the processing power needed. What is Big Data, you may ask, and why is it such a big deal? NIST defines big data as anywhere “[…] data volume, acquisition velocity, or data representation limits the ability to perform effective analysis using traditional relational approaches […]” (Mell & Cooper, n.d.).

1 (Gong, 2012, p.15)

Today’s analyst is inundated by an ever-growing amount of data being created by social media, mobile phones, climate sensors, digital pictures, etc. The volume being generated is staggering (2.7 zettabytes of data in the digital universe). While