Design and evaluation for identification, mapping and profiling of software systems
This thesis focuses on leveraging Natural Language Processing methods to gather metadata about the software development lifecycle from repositories. The aim of the thesis is to provide a lightweight approach that allows these Natural Language Processing techniques to be applied to software development projects. This is achieved by supplying a methodology that is easily transferable between projects without further customization. The approach provides visualization and reporting capabilities that form the basis for further analyses on topics such as the social aspects of software development, security, and software quality, among others.
Version control systems play a crucial role in software engineering. These systems record the changes to the software’s source code. However, current version control systems such as Git and SVN have the following limitations: they rely only on textual differences between snapshots; they are not aware of programming languages or the software engineering toolset; and their data is not directly accessible.
In light of these limitations, this thesis hypothesizes about the characteristics of the next generation of version control systems in software engineering. It investigates how developers benefit from the hypothesized improvements to version control. Combining continuous code changes with data from the version control system and the software engineering toolset will provide research opportunities well beyond the studies in this thesis.