Analysis Of Joins And Semi Joins

Analysis of Joins and Semi-joins in Centralized and Distributed Database Queries
Komala Deepthi Sapireddy 10989274

Abstract: Centralized and distributed database management systems are two different approaches to storing and managing databases. Whichever approach is used, retrieving data from the repository is a main challenge, especially when multiple tables are involved. Primitive operations such as joins and semi-joins are used to retrieve the necessary information from multiple tables. This paper briefly analyzes the performance of these joins and semi-joins in both centralized and distributed database systems using metrics…
Joins and Semi Joins: Join and semi-join operations are used to extract data from multiple tables. The different kinds of joins include self-join, inner join, outer join, and equi-join, of which equi-join is the most frequently used. Semi-join is more significant in relational theory. A semi-join reduces the size of a relation, but it increases the processing cost and the number of messages. In distributed databases, semi-join is used to reduce data transmission.

Experimental Analysis: The amount of data transmitted is vital in the distributed approach, unlike in the centralized approach. The paper uses the tables EMP and DEPT to analyze the performance of joins and semi-joins in both the distributed and the centralized approach. DEPT and EMP are placed at the same location when analyzing the centralized database system, and on different sites for the distributed approach. The following assumptions are also made: the relations are not fragmented in either case, the query is requested from a different site, EMP has 14 tuples of 51 bytes each, and DEPT has 4 tuples of 29 bytes each.

Query processing joins in centralized DB: Because the centralized approach has no redundancy, it improves security and helps in managing concurrent transactions. The emphasis here is on evaluating different metrics based on bytes accessed, cost of
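
The size assumptions stated in the experimental setup (EMP: 14 tuples of 51 bytes, DEPT: 4 tuples of 29 bytes) can be turned into a rough transmission estimate. The sketch below is illustrative only and is not the paper's measurement method: the function names, the join-key width `key_bytes`, and the number of matching EMP tuples are hypothetical parameters, and the shipping strategy (send DEPT's join keys to the EMP site, then send the reduced EMP and DEPT to the query site) is an assumed textbook arrangement.

```python
# Relation sizes taken from the assumptions stated in the text.
EMP_TUPLES, EMP_TUPLE_BYTES = 14, 51
DEPT_TUPLES, DEPT_TUPLE_BYTES = 4, 29


def relation_bytes(tuples, tuple_bytes):
    """Total size of a relation in bytes."""
    return tuples * tuple_bytes


def ship_both_bytes():
    """Plain join: ship both whole relations to the query site."""
    return (relation_bytes(EMP_TUPLES, EMP_TUPLE_BYTES)
            + relation_bytes(DEPT_TUPLES, DEPT_TUPLE_BYTES))


def semi_join_bytes(key_bytes, matching_emp_tuples):
    """Semi-join (assumed strategy): ship DEPT's join keys to the EMP site,
    then ship only the matching EMP tuples, plus DEPT, to the query site.
    key_bytes and matching_emp_tuples are hypothetical inputs."""
    keys = DEPT_TUPLES * key_bytes
    reduced_emp = relation_bytes(matching_emp_tuples, EMP_TUPLE_BYTES)
    dept = relation_bytes(DEPT_TUPLES, DEPT_TUPLE_BYTES)
    return keys + reduced_emp + dept


def semi_join(left, right, key):
    """Return the rows of `left` that have a matching `key` value in `right`."""
    right_keys = {row[key] for row in right}
    return [row for row in left if row[key] in right_keys]


if __name__ == "__main__":
    print("ship both relations:", ship_both_bytes(), "bytes")
    # Hypothetical: 2-byte join key, 10 of the 14 EMP tuples match.
    print("semi-join strategy: ", semi_join_bytes(2, 10), "bytes")
```

With these inputs, the semi-join only pays off when it filters out enough EMP tuples to offset the extra key message: if all 14 tuples match, the semi-join strategy transmits more bytes than shipping both relations, which is consistent with the remark that semi-joins add processing cost and messages.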