Abstract
This paper reports on a tool for fine-grained analysis of structural changes made between revisions of programs. The tool, called Diff/TS, calculates, visualizes, and classifies the edit operations, including moves, that change one revision into another by means of detailed tree structural analysis of source code. Such analysis tends to be time-consuming and inflexible. We have extended a general tree comparison algorithm with heuristics-driven control that is configurable for multiple programming languages, and have achieved both the processing speed and the analysis precision needed for investigating large-scale software projects. The tool is capable of processing Python, Java, C, and C++ projects. We present several applications, including software archaeology on a widely known open source software project and automated phylogenetic malware classification based on control flows. These examples suggest that tree differencing is useful for measuring distance or dissimilarity between tree-structured artifacts, and offer good precision tests of the method.
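To give a rough feel for what "distance between tree-structured artifacts" can mean, here is a minimal sketch that compares two Python snippets by their ASTs. It is not the Diff/TS algorithm: it uses a simple top-down (Selkow-style) edit distance with unit costs, it does not detect moves, and it makes no attempt at the speed needed for large projects. The helper names `to_tree`, `size`, and `distance` are hypothetical and chosen only for this illustration.

```python
import ast


def to_tree(node):
    """Convert an ast node into a nested (label, children) tuple."""
    label = type(node).__name__
    # Keep identifiers and literal values in the label so that
    # renames and changed constants show up as relabel operations.
    if isinstance(node, ast.Name):
        label += ":" + node.id
    elif isinstance(node, ast.Constant):
        label += ":" + repr(node.value)
    children = tuple(to_tree(child) for child in ast.iter_child_nodes(node))
    return (label, children)


def size(tree):
    """Number of nodes in a (label, children) tree."""
    return 1 + sum(size(child) for child in tree[1])


def distance(a, b):
    """Top-down edit distance between two trees.

    Relabeling a node costs 1; inserting or deleting a child costs
    the size of the whole subtree rooted at that child.
    """
    relabel = 0 if a[0] == b[0] else 1
    xs, ys = a[1], b[1]
    m, n = len(xs), len(ys)
    # d[i][j] = cheapest way to turn the first i children of a
    # into the first j children of b (sequence edit distance).
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + size(xs[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + size(ys[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + size(xs[i - 1]),                  # delete subtree
                d[i][j - 1] + size(ys[j - 1]),                  # insert subtree
                d[i - 1][j - 1] + distance(xs[i - 1], ys[j - 1]),  # match children
            )
    return relabel + d[m][n]


if __name__ == "__main__":
    old = to_tree(ast.parse("def f(x):\n    return x + 1\n"))
    new = to_tree(ast.parse("def f(x):\n    return x + 2\n"))
    print(distance(old, new))
```

Dividing the result by `size(a) + size(b)` would give a crude normalized dissimilarity. Because this sketch has no notion of a move, a relocated subtree is counted as one deletion plus one insertion, which is exactly the kind of imprecision that move-aware differencing is meant to avoid.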
Reference
Masatomo Hashimoto and Akira Mori. Diff/TS: A Tool for Fine-Grained Structural Change Analysis. In Proceedings of the 15th Working Conference on Reverse Engineering (WCRE ’08). IEEE Computer Society, Washington, DC, USA, pp. 279-288, 2008. DOI: 10.1109/WCRE.2008.44
Implementation
An implementation specialized for comparing ASTs (abstract syntax trees) is available here.
Demo
You can find several output samples of Diff/TS here.