A Survey Of Techniques Of Software Repository

A Survey of Techniques in Software Repository Mining

Naveen Sahu
Indian Institute of Technology Guwahati

Abstract

Software repositories record the history of the files in a project: what was modified, by whom and when, the extent of the modification, and so on. Mining the data in these repositories can give insight into the development process of a system. For example: whether the development documentation is in sync with the implementation, what the bug resolution rate is, whether requested features get implemented, how the project has evolved, how the developers collaborate and what they contribute, what the milestones in the development of the project were, what the design of the software looks like, and how to find the dependencies between the parts…
This paper sheds some light on current approaches and recent developments in the mining of software repositories. The topics include:

• Clone detection
• Frequent pattern mining
• Classification with supervised learning
• Information retrieval methods

1 Clone Detection

When software is written, code is often simply copied and pasted with small changes. This is called cloning, and investigating these copy-pasted sections of source code is called clone detection. It has been found that about 10-15% of the code in a software system is copy-pasted [1]. Clone detection techniques try to find this cloned code so that it can be refactored, or at least noted.

Techniques for detecting clones range from programming-language-specific parsing to simple text-based analysis. Parsing-based techniques have the advantage of being semantically aware, which allows them to detect similarities between the structures of different code segments even when the naming of variables and parameters differs substantially. Text-based techniques, on the other hand, can be applied to a broad variety of material because they are not language-specific.

NiCad [2], a clone detection tool, uses a hybrid clone detection method that combines features of both language-sensitive parsing and text-based analysis. It works in three stages: parsing, normalization, and comparison. The first stage involves parsing the input source to extract all fragments of a given granularity, such as
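As a rough illustration of the normalization and comparison stages mentioned above, the following Python sketch detects clones among code fragments that differ only in naming and comments. It is not NiCad's implementation. The function names (`normalize`, `similarity`, `find_clones`), the `ID` placeholder, the small keyword list, and the 0.8 threshold are all assumptions made for this example. Fragment extraction is assumed to have been done already, so the input is simply a mapping from fragment name to source text.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical, deliberately small keyword list; a real tool would use
# the full grammar of the target language.
KEYWORDS = {"def", "return", "for", "in", "if", "else", "while"}


def normalize(fragment: str) -> list[str]:
    """Return the fragment as a list of normalized lines.

    Comments and blank lines are dropped, whitespace is collapsed, and
    every identifier that is not a keyword is replaced by 'ID'.
    """
    lines = []
    for raw in fragment.splitlines():
        # Naive comment stripping: does not handle '#' inside strings.
        line = re.sub(r"#.*", "", raw)
        line = re.sub(
            r"[A-Za-z_]\w*",
            lambda m: m.group(0) if m.group(0) in KEYWORDS else "ID",
            line,
        )
        line = " ".join(line.split())
        if line:
            lines.append(line)
    return lines


def similarity(a: list[str], b: list[str]) -> float:
    """Line-level similarity in [0, 1] between two normalized fragments."""
    return SequenceMatcher(None, a, b).ratio()


def find_clones(fragments: dict[str, str], threshold: float = 0.8):
    """Return (name1, name2, score) for each pair scoring at least threshold."""
    normalized = {name: normalize(src) for name, src in fragments.items()}
    pairs = []
    for (n1, l1), (n2, l2) in combinations(normalized.items(), 2):
        score = similarity(l1, l2)
        if score >= threshold:
            pairs.append((n1, n2, score))
    return pairs
```

Calling `find_clones` on a dictionary of function bodies returns the pairs whose normalized lines are nearly identical. Two summation loops that use different variable names would be reported as a clone pair, while an unrelated function would not. Replacing identifiers with a placeholder is what lets a purely textual comparison tolerate renaming, at the cost of some false positives.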