Data Analysis in the Cloud

In this section we discuss the properties expected of a system designed for data analysis in a cloud environment, and how parallel database systems and MapReduce-based systems achieve these properties.
Expected properties of a system designed for performing data analysis in the cloud:

• Performance
Performance is the primary characteristic used to select the best database system for a given workload. High performance is tied to the quality, amount, and depth of analysis that is possible. High performance also helps to reduce cost: upgrading to a faster software product allows an organization to avoid adding further nodes as the application continues to scale.
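
As a rough illustration of the cost argument above, the following sketch estimates how many nodes a workload needs at a given per-node throughput. The function name `nodes_needed` and all of the numbers are hypothetical and chosen only to make the arithmetic easy to follow; they are not measurements of any real system.

```python
import math


def nodes_needed(workload_gb_per_hour, per_node_gb_per_hour):
    """Smallest whole number of nodes that can keep up with the workload.

    Assumes (hypothetically) that throughput scales linearly with node count.
    """
    return math.ceil(workload_gb_per_hour / per_node_gb_per_hour)


# Hypothetical workload of 1000 GB/hour.
slow_software = nodes_needed(1000, 50)   # each node processes 50 GB/hour
fast_software = nodes_needed(1000, 100)  # faster software doubles per-node throughput

print(slow_software, fast_software)
```

Under these assumptions, doubling per-node throughput halves the number of nodes required, which is the sense in which faster software lets an organization postpone adding nodes.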

• Fault Tolerance
In transactional workloads, fault tolerance means that the DBMS can recover from a failure without losing any data. In distributed databases, fault tolerance means that the system can successfully commit transactions and make progress even when worker nodes fail. For read-only queries in analytical workloads, fault tolerance means that a query does not have to be restarted if one of the nodes processing it fails. Failure rates in the cloud are high, so a single node may well fail during the processing of a long query.

• Ability to run in a heterogeneous environment
Because of hardware failures, nodes in the cloud do not behave homogeneously. When the work is divided equally among all nodes, the time to complete the task is equal to the time the slowest node needs to complete its portion of the work. Because its
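
The effect of one slow node on an equally divided task can be sketched with a small calculation. The function `completion_time`, the work units, and the node speeds below are hypothetical and serve only to illustrate the point about the slowest node; they do not describe any particular system.

```python
def completion_time(total_work, node_speeds):
    """Time to finish a task that is split equally among all nodes.

    Each node receives total_work / n units and processes them at its own
    speed (units per hour). The task finishes when the slowest node does.
    """
    share = total_work / len(node_speeds)
    return max(share / speed for speed in node_speeds)


# Hypothetical cluster of four nodes and 120 units of work.
homogeneous = completion_time(120, [10, 10, 10, 10])
one_degraded = completion_time(120, [10, 10, 10, 5])  # one node runs at half speed

print(homogeneous, one_degraded)
```

With these assumed numbers, a single node running at half speed doubles the completion time of the whole task, even though the other three nodes finish their shares on time.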