6 ECTS credits
150 h study time

Offer 1 with catalog number 4013052ENR for all students in the 1st and 2nd semester at a (E) Master - advanced level.

Semester
1st and 2nd semester
Enrollment based on exam contract
Impossible
Grading method
Grading (scale from 0 to 20)
Can retake in second session
Yes
Taught in
English
Faculty
Faculty of Sciences and Bioengineering Sciences
Department
Computer Science
Educational team
Coen De Roover (course titular)
Activities and contact hours
12 contact hours Lecture
32 contact hours Seminar, Exercises or Practicals
80 contact hours Independent or External Form of Study
Course Content

The goal of this course is to learn how to discover interesting and actionable information about software projects by analyzing the large amounts of data stored in their repositories using data mining and machine learning algorithms. This is an advanced course about selected topics from the state of the art in mining software repositories. As such, the exact content of the course can vary each year. 

The initial lectures typically cover the following topics:
1. Software repositories and associated data: version control repositories, issue trackers, Q&A platforms

2. Software data analytics and inference: empirical software engineering methods, mining for idioms in snapshots, mining for change patterns in commits

3. A selection of recent success stories
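
To illustrate what "mining for change patterns in commits" (topic 2) can mean in its simplest form, the sketch below counts how often pairs of files are modified in the same commit. The commit data, file names, and function names are hypothetical and chosen only for illustration; the techniques covered in the lectures may differ.

```python
from collections import Counter
from itertools import combinations

# Hypothetical commit history: each entry is the set of files touched by one commit.
COMMITS = [
    {"parser.py", "lexer.py"},
    {"parser.py", "lexer.py", "README.md"},
    {"cli.py"},
    {"parser.py", "lexer.py", "cli.py"},
]

def co_change_counts(commits):
    """Count, for every pair of files, the number of commits that touch both."""
    counts = Counter()
    for files in commits:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(files), 2):
            counts[pair] += 1
    return counts

def frequent_pairs(commits, min_support=2):
    """Keep only the pairs that changed together in at least min_support commits."""
    return {
        pair: n
        for pair, n in co_change_counts(commits).items()
        if n >= min_support
    }
```

On this toy history, `frequent_pairs(COMMITS)` keeps only the pair `("lexer.py", "parser.py")`, which changes together in three commits.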

In the final lectures, we study recent research to understand how the mining of software repositories is evolving. For these lectures, students prepare a graded presentation on recent research results. Students are also graded on three assignments in which they apply and extend data mining algorithms to real-world project data.

Course material
Digital course material (Required): Digital course material on the learning platform, Canvas
Additional info

not applicable

Learning Outcomes

General competences

Goals and competences

The goals of this course are:
- Students obtain knowledge about the analysis of large amounts of software engineering data using data mining techniques.
- Students become skilled at uncovering interesting and actionable information about software systems and projects to improve software quality.

The corresponding learning results are:

* w.r.t. knowledge:
- The student can describe the process needed to build a machine learning model able to predict defective components.
- The student can illustrate and discuss the strengths and weaknesses of the features to extract for training the model.
- The student can describe how to choose a classifier, and outline the differences between white-box and black-box techniques.
- The student can illustrate how to effectively and efficiently tune a machine learning model using search-based techniques.

* w.r.t. applying knowledge:
- The student can independently build a machine learning model to predict defective software components.
- The student can independently tune a machine learning model using search-based techniques.
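
As a minimal illustration of these two outcomes, the sketch below uses a one-feature threshold rule (a simple white-box model) that flags a component as defect-prone when its churn exceeds a threshold, and tunes that threshold with random search, one of the simplest search-based techniques. The data, the choice of feature, and all function names are invented for illustration and do not come from the course assignments. For brevity the rule is scored on the same data it is tuned on; a real study would evaluate on held-out data.

```python
import random

# Hypothetical data: (lines changed in past commits, component was defective?)
COMPONENTS = [
    (12, False), (30, False), (45, False), (60, True),
    (75, False), (90, True), (120, True), (200, True),
]

def f1_score(threshold, data):
    """F1 score of the rule 'churn > threshold means defective' on data."""
    tp = sum(1 for churn, bad in data if churn > threshold and bad)
    fp = sum(1 for churn, bad in data if churn > threshold and not bad)
    fn = sum(1 for churn, bad in data if churn <= threshold and bad)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def random_search(data, trials=200, seed=0):
    """Sample candidate thresholds uniformly and keep the best-scoring one."""
    rng = random.Random(seed)
    low = min(churn for churn, _ in data)
    high = max(churn for churn, _ in data)
    best_threshold, best_score = low, f1_score(low, data)
    for _ in range(trials):
        candidate = rng.uniform(low, high)
        score = f1_score(candidate, data)
        if score > best_score:
            best_threshold, best_score = candidate, score
    return best_threshold, best_score
```

Calling `random_search(COMPONENTS)` returns the best threshold found together with its F1 score. Comparing that score against the untuned starting threshold is one simple way to judge whether tuning was effective.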

* w.r.t. analysing:
- The student can recognise which features should be extracted from a source code repository.
- The student can recognise whether a prediction model is effective.
- The student can recognise whether tuning a prediction model is effective and efficient.

* w.r.t. evaluating:
- The student can compare machine learning techniques and decide which one to apply.
- The student can evaluate the applicability of different search-based techniques in order to tune a prediction model.

* w.r.t. creating:
- The student can generate alternative prediction models and choose among them.
- The student can report on the choices they made when building a model and the rationale behind them.

 

Grading

The final grade is composed of the following categories:
Oral Exam determines 40% of the final mark.
Practical Exam determines 60% of the final mark.

Within the Oral Exam category, the following assignments need to be completed:

  • Oral exam with a relative weight of 1 which comprises 40% of the final mark.

    Note: 1 oral presentation that synthesizes one recent publication in the domain (40%)

Within the Practical Exam category, the following assignments need to be completed:

  • Practical exam with a relative weight of 1 which comprises 60% of the final mark.

    Note: 3 written assignments in which students apply techniques for mining software repositories to real software systems (20% each)

Additional info regarding evaluation

Students are evaluated on three programming assignments, and on an oral presentation that synthesizes one recent publication in the domain.
The assignments are mandatory and the deadlines are strict.
Failing to hand in an assignment results in an "absent" mark for the course.

Allowed unsatisfactory mark
The supplementary Teaching and Examination Regulations of your faculty stipulate whether an allowed unsatisfactory mark for this programme unit is permitted.

Academic context

This offer is part of the following study plans:
Master in Applied Sciences and Engineering: Computer Science: Artificial Intelligence (only offered in Dutch)
Master in Applied Sciences and Engineering: Computer Science: Multimedia (only offered in Dutch)
Master in Applied Sciences and Engineering: Computer Science: Software Languages and Software Engineering (only offered in Dutch)
Master in Applied Sciences and Engineering: Computer Science: Data Management and Analytics (only offered in Dutch)
Master of Applied Sciences and Engineering: Computer Science: Artificial Intelligence
Master of Applied Sciences and Engineering: Computer Science: Multimedia
Master of Applied Sciences and Engineering: Computer Science: Software Languages and Software Engineering
Master of Applied Sciences and Engineering: Computer Science: Data Management and Analytics