6 ECTS credits
150 h study time
Offer 1 with catalog number 4013052ENR for all students in the 1st and 2nd semester at a (E) Master - advanced level.
The goal of this course is to learn how to discover interesting and actionable information about software projects by analyzing the large amounts of data stored in their repositories using data mining and machine learning algorithms. This is an advanced course about selected topics from the state of the art in mining software repositories. As such, the exact content of the course can vary each year.
The initial lectures typically cover the following topics:
1. Software repositories and associated data: version control repositories, issue trackers, Q&A platforms
2. Software data analytics and inference: empirical software engineering methods, mining for idioms in snapshots, mining for change patterns in commits
3. A selection of recent success stories
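As an illustration of the second topic, mining for change patterns in commits can be sketched as counting which files are frequently changed together. The commit data, the function names (`co_change_counts`, `frequent_pairs`) and the `min_support` threshold below are hypothetical and are not taken from the course material; a real analysis would extract the file sets from a version control repository.

```python
from collections import Counter
from itertools import combinations

# Hypothetical commit history: each commit is the set of files it touched.
COMMITS = [
    {"parser.py", "lexer.py"},
    {"parser.py", "lexer.py", "README.md"},
    {"ui.py"},
    {"parser.py", "lexer.py"},
    {"ui.py", "README.md"},
]


def co_change_counts(commits):
    """Count how often each unordered pair of files appears in the same commit."""
    counts = Counter()
    for files in commits:
        # Sorting gives every pair a canonical order, so (a, b) and (b, a) coincide.
        for pair in combinations(sorted(files), 2):
            counts[pair] += 1
    return counts


def frequent_pairs(commits, min_support=2):
    """Keep only the pairs that co-change in at least `min_support` commits."""
    return {
        pair: n
        for pair, n in co_change_counts(commits).items()
        if n >= min_support
    }


pairs = frequent_pairs(COMMITS)
print(pairs)
```

Pairs that pass the support threshold are candidates for a change pattern, for example two files that may need to be updated together.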
In the final lectures, we study recent research to understand how the mining of software repositories is evolving. For these lectures, each student prepares a graded presentation about recent research results. Students are also graded on three assignments in which they apply and extend data mining algorithms to real-world project data.
Goals and competences
The goals of this course are:
- Students obtain knowledge about the analysis of large amounts of software engineering data using data mining techniques.
- Students become skilled at uncovering interesting and actionable information about software systems and projects to improve software quality.
The corresponding learning outcomes are:
* w.r.t. knowledge:
- The student can describe the process needed to build a machine learning model able to predict defective components.
- The student can illustrate and discuss the strengths and weaknesses of the features to extract for training the model.
- The student can describe how to choose a classifier, and outline the differences between white-box and black-box techniques.
- The student can illustrate how to effectively and efficiently tune a machine learning model using search-based techniques.
* w.r.t. applying knowledge:
- The student can independently build a machine learning model to predict defective software components.
- The student can independently tune a machine learning model using search-based techniques.
* w.r.t. analysing:
- The student can recognise which features should be extracted from a source code repository.
- The student can recognise whether a prediction model is effective.
- The student can recognise whether tuning a prediction model is effective and efficient.
* w.r.t. evaluating:
- The student can compare machine learning techniques and decide which one to apply.
- The student can evaluate the applicability of different search-based techniques in order to tune a prediction model.
* w.r.t. creating:
- The student can generate alternative prediction models and choose among them.
- The student can report on the choices they made when building a model and the rationale behind them.
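To make the outcomes on building and tuning a defect prediction model concrete, the following is a minimal sketch under stated assumptions. The toy data, the two features (lines changed and past bug fixes), the linear scoring rule and the parameter ranges are all hypothetical, and plain random search stands in for the search-based techniques mentioned above; the course may use different features, classifiers and search algorithms.

```python
import random

# Hypothetical toy data: (lines_changed, past_bug_fixes, is_defective).
COMPONENTS = [
    (120, 5, True), (15, 0, False), (300, 8, True), (40, 1, False),
    (220, 2, True), (10, 0, False), (90, 4, True), (60, 0, False),
]


def predict(component, w_churn, w_fixes, threshold):
    """Flag a component as defective when its weighted score reaches the threshold."""
    churn, fixes, _ = component
    return w_churn * churn + w_fixes * fixes >= threshold


def f1_score(params, data):
    """F1 of the classifier defined by params = (w_churn, w_fixes, threshold)."""
    tp = fp = fn = 0
    for component in data:
        predicted, actual = predict(component, *params), component[2]
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif actual and not predicted:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def random_search(data, iterations=500, seed=0):
    """Sample parameter settings at random and keep the one with the best F1."""
    rng = random.Random(seed)
    best_params, best_score = None, -1.0
    for _ in range(iterations):
        # Assumed search ranges, chosen only to fit the toy data above.
        params = (rng.uniform(0, 1), rng.uniform(0, 50), rng.uniform(0, 200))
        score = f1_score(params, data)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(COMPONENTS)
print(best_params, best_score)
```

The sketch scores the model on the same data used for tuning, which overstates its effectiveness. Judging whether a model is effective would require held-out data or cross-validation.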
The final grade is composed of the following categories:
Oral Exam determines 40% of the final mark.
Practical Exam determines 60% of the final mark.
Students are evaluated on three programming assignments, and on an oral presentation that synthesizes one recent publication in the domain.
The assignments are mandatory and the deadlines are strict.
Failing to hand in an assignment results in an "absent" mark for the course.
This offer is part of the following study plans:
Master in Applied Sciences and Engineering: Computer Science: Artificial Intelligence (only offered in Dutch)
Master in Applied Sciences and Engineering: Computer Science: Multimedia (only offered in Dutch)
Master in Applied Sciences and Engineering: Computer Science: Software Languages and Software Engineering (only offered in Dutch)
Master in Applied Sciences and Engineering: Computer Science: Data Management and Analytics (only offered in Dutch)
Master of Applied Sciences and Engineering: Computer Science: Artificial Intelligence
Master of Applied Sciences and Engineering: Computer Science: Multimedia
Master of Applied Sciences and Engineering: Computer Science: Software Languages and Software Engineering
Master of Applied Sciences and Engineering: Computer Science: Data Management and Analytics