author: René Boschma
title: An algebra of repositories
topics: Algorithms and Data Structures, Graphs, Software Technology
committee: Arend Rensink
started: April 2017
end: July 2017
type: Research Project


Versioning systems allow the user to branch, merge, commit and update in a very flexible manner. As a result, it is easy to lose track of what has happened in the past and how versions depend on one another. In this assignment, you are asked to:

  1. Create a conceptual (meta)model of versioning concepts, abstracting away from particular versioning technologies (git, hg, svn).
  2. Describe the effect of repository operations (merging, branching, etc.) on that conceptual level.
  3. Design a data structure for repository metadata based on step 1 that also reflects the operations identified in step 2.
  4. To validate the previous steps, implement at least one prototype tool for a given versioning technology that automatically extracts all metadata from an arbitrary repository and creates an instance of the data structure designed in step 3.

Depending on the amount of work involved in the steps above, the assignment can be restricted or extended.
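
To make the four steps more concrete, the following is a minimal sketch of one possible shape for such a data structure. It is not a prescribed design. It assumes that history can be modelled as a directed acyclic graph of commits with named branch heads, and all names (`Commit`, `Repository`, `from_parent_lines`) are hypothetical. The parser assumes input lines of the form "commit id followed by parent ids", which is what `git log --all --format='%H %P'` is expected to print for a git repository.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Commit:
    """One version in the history; `parents` is empty for a root commit."""
    id: str
    parents: tuple[str, ...] = ()


@dataclass
class Repository:
    """Technology-neutral metadata: a commit graph plus named branch heads."""
    commits: dict[str, Commit] = field(default_factory=dict)
    branches: dict[str, str] = field(default_factory=dict)  # branch name -> head commit id

    def commit(self, branch: str, commit_id: str) -> Commit:
        """Append a new version to `branch`, creating the branch if needed."""
        head = self.branches.get(branch)
        new = Commit(commit_id, (head,) if head is not None else ())
        self.commits[commit_id] = new
        self.branches[branch] = commit_id
        return new

    def branch(self, source: str, name: str) -> None:
        """Branching adds a second name for the same head; no commit is created."""
        self.branches[name] = self.branches[source]

    def merge(self, target: str, source: str, commit_id: str) -> Commit:
        """Merging creates a commit with two parents and advances `target`."""
        new = Commit(commit_id, (self.branches[target], self.branches[source]))
        self.commits[commit_id] = new
        self.branches[target] = commit_id
        return new

    def ancestors(self, commit_id: str) -> set[str]:
        """All versions that `commit_id` depends on, directly or indirectly."""
        seen: set[str] = set()
        stack = list(self.commits[commit_id].parents)
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.commits[current].parents)
        return seen


def from_parent_lines(text: str) -> Repository:
    """Build the commit graph from lines of the form '<id> <parent id>...'."""
    repo = Repository()
    for line in text.splitlines():
        ids = line.split()
        if ids:
            repo.commits[ids[0]] = Commit(ids[0], tuple(ids[1:]))
    return repo
```

In this sketch, committing adds a node with one parent, branching only adds a name, and merging adds a node with two parents, so the question "how do versions depend on one another" becomes a reachability query (`ancestors`). Branch names are not recovered by `from_parent_lines`; a real extraction tool would need additional input for that.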


  1. A survey and taxonomy of approaches for mining software repositories in the context of software evolution
  2. Software Intelligence: The Future of Mining Software Engineering Data
  3. Findings from GitHub: Methods, Datasets and Limitations