INSY 5337 Data Warehousing – Term Paper
NoSQL Databases: An Introduction and Comparison between Dynamo,
MongoDB and Cassandra
Authored By: Nitin Shewale, Aditya Kashyap, Akshay Vadnere, Vivek Adithya, Aditya Trilok
Abstract
Data volumes have been growing exponentially in recent years, and this increase in data across all business domains has played a significant part in the analysis and structuring of data. NoSQL databases are becoming popular as more organizations consider them a feasible option because of their schema-less structure and their capability to handle Big Data. In this paper, we discuss the various types of NoSQL databases from an implementation perspective, such as key-value, columnar and document-oriented stores. This
Soft State – Since the data is distributed across nodes, the state of the system may change over time, and there is no assurance of consistency at any given moment.
Eventually Consistent – The data would be consistent eventually, even if it’s not at a given point in time [5][6][10].
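The soft-state and eventual-consistency properties above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `Replica` class, the `anti_entropy` function and the last-write-wins rule based on a numeric timestamp are hypothetical choices made for illustration, not the mechanism of any particular NoSQL product.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One node holding its own copy of the data (hypothetical name)."""
    name: str
    store: dict = field(default_factory=dict)  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        # Last-write-wins: keep the value with the newest timestamp.
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]


def anti_entropy(replicas):
    """Push every entry to every other replica; the newest timestamp wins."""
    for source in replicas:
        for target in replicas:
            if source is not target:
                for key, (ts, value) in list(source.store.items()):
                    target.write(key, value, ts)


replicas = [Replica("a"), Replica("b"), Replica("c")]
replicas[0].write("cart", ["book"], timestamp=1)         # accepted by node a only
replicas[1].write("cart", ["book", "pen"], timestamp=2)  # accepted by node b only

before = [r.read("cart") for r in replicas]  # replicas disagree (soft state)
anti_entropy(replicas)
after = [r.read("cart") for r in replicas]   # replicas agree (eventual consistency)
```

Before the synchronization round the three replicas return three different answers for the same key; afterwards they all return the most recent write.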
3. Features of NoSQL
Flexible Data Models – NoSQL allows horizontal data partitioning across different distributed systems or processors. The relational model, in contrast, has a fixed schema. Applications based on NoSQL have data models explicitly designed and optimized for them.
Partial Record Updates – Data models that use NoSQL emphasize column-based processing, which enables data aggregation over more than one attribute and entity.
Optimized MapReduce Processing – MapReduce, a native functionality for data movement and mapping, is a part of NoSQL.
Horizontal Scalability – It allows on-the-fly addition of processors, each with its own resources. Each node is fed a subset of the data to process, which increases the efficiency of the application. Horizontal scalability is more readily achievable in the NoSQL data model than in an RDBMS [1].
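The partitioning and MapReduce features listed above can be sketched together in a few lines. The example below is an illustration under assumptions: the function names (`shard_for`, `partition`, `map_phase`, `reduce_phase`), the CRC32 hash and the word-count task are all hypothetical choices, and the "nodes" are simply lists in one process rather than separate machines.

```python
import zlib
from collections import Counter, defaultdict


def shard_for(key, node_count):
    # Stable hash so the same key always lands on the same node.
    return zlib.crc32(key.encode("utf-8")) % node_count


def partition(records, node_count):
    # Horizontal partitioning: each node receives only a subset of the records.
    shards = defaultdict(list)
    for key, value in records:
        shards[shard_for(key, node_count)].append((key, value))
    return shards


def map_phase(shard):
    # Each node counts words in its own subset only.
    counts = Counter()
    for _key, text in shard:
        counts.update(text.split())
    return counts


def reduce_phase(partials):
    # Merge the per-node partial counts into one result.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


records = [
    ("doc1", "nosql scales out"),
    ("doc2", "rdbms scales up"),
    ("doc3", "nosql is schema free"),
]
shards = partition(records, node_count=3)
word_counts = reduce_phase(map_phase(s) for s in shards.values())
```

Simple modulo hashing like this reassigns most keys whenever the number of nodes changes, so it is shown here only because it is short; systems such as Dynamo and Cassandra use consistent hashing to limit that reshuffling.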
4. Types of NoSQL Databases
Key Value – Key-value data stores reference the data using a unique key. The unique key acts as a link to the data, which is stored randomly and independently on the disk.
Addition of new data values can be
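The idea of a unique key acting as a link to independently stored data can be sketched as follows. This is a toy illustration, not the design of any real store: the class name `TinyKeyValueStore`, the in-memory `bytearray` standing in for a data file, and the append-on-write behaviour are all assumptions made for the example.

```python
class TinyKeyValueStore:
    """Hypothetical toy store: an index maps each key to where its value lives."""

    def __init__(self):
        self._log = bytearray()  # stands in for the data file on disk
        self._index = {}         # key -> (offset, length)

    def put(self, key, value):
        data = value.encode("utf-8")
        # The key now points at the newly appended bytes.
        self._index[key] = (len(self._log), len(data))
        self._log += data        # older versions are left behind in the log

    def get(self, key, default=None):
        location = self._index.get(key)
        if location is None:
            return default
        offset, length = location
        return self._log[offset:offset + length].decode("utf-8")

    def delete(self, key):
        # Dropping the index entry makes the value unreachable.
        self._index.pop(key, None)


store = TinyKeyValueStore()
store.put("user:42", '{"name": "Ada"}')
store.put("user:43", '{"name": "Lin"}')
store.put("user:42", '{"name": "Ada L."}')  # overwrite by appending a new version
```

Each value is opaque to the store and is reached only through its key, which is why the interface stays as small as `put`, `get` and `delete`.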
Because you can count the number of keys and this is an integer value, this is discrete data.
Key-value stores provide users with a simple yet powerful interface to data storage and are often used in complicated systems. [2] LMDB is a framework that provides high-performance key-value storage
STRUCTURE OF DATA: The data structure of a relational database consists of tables. Every table is identified by a unique name or label. A table is described as a collection of rows and columns. Each row of the table is known as a record and each column is known as a field of that table. All the data sets are well organized and logically linked to each other through definite and unique relationships. A table can therefore also be defined as a “structured collection of relationships”. The fundamental aim of developing NoSQL database systems is to handle vast quantities of data or information easily and effectively in advanced web-scale applications. To achieve this purpose, NoSQL systems are designed as schema-free database systems. There are different ways to define NoSQL databases, which typically depend on the requirements of the data that has to be managed. The data model for a key-value store NoSQL database is
Modern RDBMS technologies are not capable of supporting unstructured data with optimal space requirements. The design becomes complex and is hence difficult for developers. Managing unstructured data is particularly troublesome with conventional RDBMS solutions (Big data in financial services industry: Market trends, challenges, and prospects 2013 - 2018). Moreover, an RDBMS turns out to be an expensive solution for developing agile web applications with simple data analysis requirements. NoSQL is emerging as an efficient alternative in this situation, one that addresses the issues associated with RDBMS technology. The market growth can be attributed to innovative launches of NoSQL solutions and to collaborative efforts by NoSQL vendors and users. The efforts of organizations to enhance their market offerings are creating demand for NoSQL as back-end support (Big data in financial services industry: Market trends, challenges, and prospects 2013 - 2018). The emergence of agile software development is also creating demand for NoSQL (Big data in financial services industry: Market trends, challenges, and prospects 2013 - 2018). NoSQL databases offer users many more avenues to accept data in many different forms. NoSQL is as adaptable as SQL but offers many more uses that can apply to many organizations.
Lately, with the development of distributed computing, issues with services that use the web and require enormous amounts of data have come to the forefront. For organizations like Facebook and Google, the web has developed as a vast, distributed data repository for which handling by a conventional DBMS is not sufficient. Rather than extending hardware capabilities, a more realistic approach has been adopted. Technically, it is an instance of scaling by dynamically adding servers in response to an increase in either the volume of information in the repository or the number of its users. In this scenario, the big data problem is frequently examined and also explained at the technological level of web databases.
Data has always been analyzed within companies and used to benefit the future of businesses. However, the way data is stored, combined, analyzed and used to predict the patterns and tendencies of consumers has evolved as technology has seen numerous advancements throughout the past century. In the 1900s databases began as “computer hard disks” and in 1965, after many other discoveries including voice recognition, “the US Government plans the world’s first data center to store 742 million tax returns and 175 million sets of fingerprints on magnetic tape.” The evolution of data into large databases continued in 1991, when the internet began to emerge and “digital storage became more cost effective than paper.” With the constant increase in the data supplied digitally, Hadoop was created in 2005, and from that point forward “14.7 Exabytes of new information are produced this year” and this number is rapidly increasing with the many mobile devices people have today (Marr). The evolution of the internet and the subsequent expansion in the number of mobile devices have led data to evolve, and companies now need large central database management systems in order to run an efficient and successful business.
NoSQL databases were created to solve the Big Data problem by using a distributed system to deliver excellent performance in data storage and retrieval at very large scale. At this scale, pieces of the system often fail, and NoSQL is designed to handle these failures (Chow, 2013) (Ron, Shulman-Peleg, & Bronshtein, 2015). Various companies have adopted different sorts of non-relational databases, commonly referred to as
"The ever-increasing appetite of businesses to embrace emerging big data-related software and infrastructure technologies while keeping the implementation costs low has led to the creation of a rich ecosystem of new and incumbent suppliers," said Ashish Nadkarni, IDC program director, enterprise servers and storage, in a prepared statement. Nadkarni co-authored the IDC report with Dan Vesset, program VP, business analytics & big data. "At the same time, the
Currently, there are two major types of database management systems used to deal with data. The first is the Relational Database Management System (RDBMS), the traditional relational database, which deals with structured data and has been popular for decades, since 1970. The second is the Not only Structured Query Language (NoSQL) database, which deals with semi-structured and unstructured data. The term NoSQL was introduced for the first time in 1998 by Carlo Strozzi, and Eric Evans reintroduced it in early 2009; NoSQL databases are now gaining popularity with the development of the internet and social media. NoSQL databases are intended to overcome the drawbacks of RDBMS, such as fixed schemas, JOIN operations and scalability problems. With the appearance of Big Data,
Nowadays, there are two major types of database management systems used to deal with data. The first is the Relational Database Management System (RDBMS), the traditional relational database, which deals with structured data and has been popular for decades, since 1970. The second is the Not only Structured Query Language (NoSQL) database, which deals with semi-structured and unstructured data; NoSQL databases have been gaining popularity with the development of the internet and social media since April 2009. NoSQL databases are intended to overcome the drawbacks of RDBMS, such as fixed
It is used to describe the type of information that is to be stored in a
As data volumes rose, managing and storing these huge volumes of data became a cause of concern for most organizations. It was during this period that Not only SQL, more popularly known as NoSQL, was introduced to process these large amounts of data efficiently and effectively. For this purpose, various data store categories were developed, based on different data models. Some of the categories are:
Data is critical for any organization. Every year an organization creates massive amounts of data, and how fast the business reacts to that important information determines whether it succeeds or fails. The big problem is how to efficiently handle the 3 V’s of Big Data.
In comparison to relational databases, NoSQL databases are better at providing superb performance while handling data of large scale and variable structures
Increase Volume: Provide capacity to process a greater amount of activity over a large volume of data.