6 ECTS credits
150 hours of study time
Offering 1 with catalogue number 4013052ENR for all students in the 1st and 2nd semester at an advanced master's level.
The goal of this course is to learn how to discover interesting and actionable information about software projects by analyzing the large amounts of data stored in their repositories using data mining and machine learning algorithms. This is an advanced course about selected topics from the state of the art in mining software repositories. As such, the exact content of the course can vary each year.
The initial lectures typically cover the following topics:
1. Software repositories and associated data: version control repositories, issue trackers, Q&A platforms
2. Software ecosystems and evolution
3. Software data analytics and inference: empirical software engineering methods, mining for idioms in snapshots, mining for change patterns in commits
4. A selection of recent success stories
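To give a flavour of what mining a version control repository can involve, here is a minimal sketch that computes per-file churn (lines added plus lines deleted) from text shaped like `git log --numstat` output. The sample log, the file paths, and the function name `churn_per_file` are hypothetical and only serve as illustration; they are not course material.

```python
from collections import Counter

# Hypothetical excerpt shaped like `git log --numstat --format=%H` output:
# a commit hash line, then "added<TAB>deleted<TAB>path" for each changed file.
SAMPLE_LOG = (
    "a1b2c3\n"
    "10\t2\tsrc/parser.py\n"
    "3\t1\tsrc/lexer.py\n"
    "\n"
    "d4e5f6\n"
    "5\t5\tsrc/parser.py\n"
    "-\t-\tdocs/logo.png\n"
)


def churn_per_file(log_text):
    """Sum added + deleted lines per path; binary files ('-') count as 0."""
    churn = Counter()
    for line in log_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # commit hash line or blank separator
        added, deleted, path = parts
        if added == "-" or deleted == "-":
            churn[path] += 0  # git reports '-' for binary files
            continue
        churn[path] += int(added) + int(deleted)
    return churn


churn = churn_per_file(SAMPLE_LOG)
for path, lines in churn.most_common():
    print(path, lines)
```

Files that change often and heavily are a common starting point when looking for change patterns, although churn alone is a coarse signal.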
In the final lectures, we study recent research to understand how the mining of software repositories is evolving. For these lectures, students prepare a graded presentation on recent research results. Students are also graded on three assignments in which they apply and extend data mining algorithms to real-world project data.
not applicable
Goals and competences
The goals of this course are:
- Students obtain knowledge about the analysis of large amounts of software engineering data coming from different ecosystems using data mining techniques.
- Students become skilled at uncovering interesting and actionable information about software systems and projects to improve software quality.
The corresponding learning results are:
* w.r.t. knowledge:
- The student can describe the process needed to build a machine learning model able to predict defective components.
- The student can illustrate and discuss the strengths and weaknesses of the features to extract for training the model.
- The student can describe how to choose a classifier, and outline the differences between white-box and black-box techniques.
- The student can illustrate how to effectively and efficiently tune a machine learning model using search-based techniques.
* w.r.t. applying knowledge:
- The student can independently build a machine learning model to predict defective software components.
- The student can independently tune a machine learning model using search-based techniques.
* w.r.t. analysing:
- The student can recognise which features should be extracted from a source code repository.
- The student can recognise whether a prediction model is effective.
- The student can recognise whether tuning a prediction model is effective and efficient.
* w.r.t. evaluating:
- The student can compare machine learning techniques and decide which one to apply.
- The student can evaluate the applicability of different search-based techniques in order to tune a prediction model.
* w.r.t. creating:
- The student can generate alternative prediction models and choose among them.
- The student is able to report on the choices made when building a model and the rationale behind them.
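The learning results above revolve around building, tuning, and evaluating a defect prediction model. The following is a minimal sketch of that workflow, assuming a single feature (churn), a one-rule threshold classifier, and plain random search as a stand-in for a search-based tuning technique. The data set is made up for illustration, and the names `f1_score` and `random_search` are hypothetical; the course itself may use different features, classifiers, and search strategies.

```python
import random

# Toy, made-up data: (lines changed in recent commits, 1 if defective else 0).
COMPONENTS = [
    (5, 0), (12, 0), (40, 1), (8, 0), (55, 1),
    (30, 0), (70, 1), (25, 1), (3, 0), (60, 1),
]


def f1_score(threshold, data):
    """F1 of the rule 'predict defective when churn >= threshold'."""
    tp = sum(1 for churn, label in data if churn >= threshold and label == 1)
    fp = sum(1 for churn, label in data if churn >= threshold and label == 0)
    fn = sum(1 for churn, label in data if churn < threshold and label == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def random_search(data, trials=200, seed=0):
    """Sample candidate thresholds at random and keep the best-scoring one."""
    rng = random.Random(seed)
    best_threshold, best_score = None, -1.0
    for _ in range(trials):
        candidate = rng.uniform(0, 100)
        score = f1_score(candidate, data)
        if score > best_score:
            best_threshold, best_score = candidate, score
    return best_threshold, best_score


# Tune on one part of the data, then check the model on unseen components.
train, test = COMPONENTS[:7], COMPONENTS[7:]
threshold, train_f1 = random_search(train)
test_f1 = f1_score(threshold, test)
print(f"threshold={threshold:.1f} train F1={train_f1:.2f} test F1={test_f1:.2f}")
```

Comparing the score on the training split with the score on the held-out split is one simple way to judge whether a tuned model is actually effective rather than merely fitted to the data it was tuned on.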
The assessment consists of the following assignment categories:
Oral Exam determines 40% of the final mark
Practical Exam determines 60% of the final mark
Within the Oral Exam category, the following assignments need to be completed:
Within the Practical Exam category, the following assignments need to be completed:
Students are evaluated on three programming assignments, and on an oral presentation that synthesizes one recent publication in the domain.
The assignments are mandatory and the deadlines are strict.
Failing to hand in an assignment results in an "absent" mark for the course.
This offering is part of the following study plans:
Master in de ingenieurswetenschappen: computerwetenschappen: afstudeerrichting Artificiële Intelligentie
Master in de ingenieurswetenschappen: computerwetenschappen: afstudeerrichting Multimedia
Master in de ingenieurswetenschappen: computerwetenschappen: afstudeerrichting Software Languages and Software Engineering
Master in de ingenieurswetenschappen: computerwetenschappen: afstudeerrichting Data Management en Analytics
Master in Applied Sciences and Engineering: Computer Science: Artificial Intelligence (only offered in English)
Master in Applied Sciences and Engineering: Computer Science: Multimedia (only offered in English)
Master in Applied Sciences and Engineering: Computer Science: Software Languages and Software Engineering (only offered in English)
Master in Applied Sciences and Engineering: Computer Science: Data Management and Analytics (only offered in English)