5 ECTS credits
125 hours of study time
Offering 1 with course catalogue number 4021495ENR, for all students in the 2nd semester, at an advanced master's level.
This course introduces you to concepts and practical skills in relation to i) gathering, ii) organising, iii) integrating and iv) analysing biological data, in other words ‘data mining’. This is particularly relevant for molecular biology: the availability of biological data is increasing enormously, and knowing how to use and analyse these data will help you improve the quality of your future scientific studies. The course contains an introduction to programming in Python (for data handling) and R (for statistical analysis and plot generation). It focuses on statistical concepts, such as p-values, applied to genome data in relation to human health.
The content of the course is, in more detail:
1. Using protein sequence databases (e.g. UniProt) to gather sequence information and create multiple sequence alignments (MSA) with these sequences.
2. Gathering data on the effect of amino acid variants (mutations) on the organism phenotype (e.g. gnomAD for human health).
3. Curating and organising the gathered data.
4. Extracting derived information from the protein sequences (e.g. secondary structure or solvent accessibility predictions).
5. Integrating the (derived) data into a single data structure.
6. Analysing the data to find patterns and significant differences. This includes generating plots to visualise distributions, correlations, ...
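As an illustration of steps 5 and 6 above, the following is a minimal Python sketch and not course material. It assumes two already gathered and curated sources: per-residue solvent accessibility predictions and variant annotations. It joins them on residue position and uses a permutation test to estimate a p-value for the difference in mean accessibility between two variant classes. All values, labels and function names (`integrate`, `permutation_p_value`) are hypothetical and chosen for illustration only.

```python
import random
from statistics import mean

# Hypothetical toy data: residue position -> predicted relative solvent accessibility (0 to 1).
accessibility = {1: 0.82, 2: 0.10, 3: 0.05, 4: 0.67, 5: 0.91, 6: 0.12, 7: 0.08, 8: 0.74}

# Hypothetical variant annotations: (residue position, effect label).
variants = [
    (1, "benign"), (2, "pathogenic"), (3, "pathogenic"), (4, "benign"),
    (5, "benign"), (6, "pathogenic"), (7, "pathogenic"), (8, "benign"),
    (9, "benign"),  # no accessibility prediction for position 9
]


def integrate(variants, accessibility):
    """Join both sources on residue position; skip variants without a prediction."""
    return [
        {"position": pos, "label": label, "accessibility": accessibility[pos]}
        for pos, label in variants
        if pos in accessibility
    ]


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the absolute difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign values to the two groups
        if abs(mean(pooled[:n_a]) - mean(pooled[n_a:])) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


table = integrate(variants, accessibility)
benign = [row["accessibility"] for row in table if row["label"] == "benign"]
pathogenic = [row["accessibility"] for row in table if row["label"] == "pathogenic"]
p_value = permutation_p_value(benign, pathogenic)
print(f"{len(table)} integrated rows, estimated p-value: {p_value:.3f}")
```

A permutation test is used here only because it needs no libraries beyond the standard library and makes the meaning of a p-value explicit: the fraction of random relabellings that give a difference at least as large as the observed one. With real data, issues such as data quality, bias and overlap between sources would need to be checked before interpreting the result.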
None
1. Awareness
Recognition of key terms in relation to data handling and analysis, as well as programming.
2. Understanding
Understanding of programming and of statistical approaches relevant for solving biological or medical questions with large data sets.
3. Communication
Ability to communicate constructively with peers in a joint project.
4. Application
Implementation of a Python/R script to gather data from external sources (databases), organise these data, and integrate them with other, related data.
5. Analysis
Analysis of (integrated) data on a large scale using statistical approaches and data visualisation using graphs.
6. Capacity to evaluate
Based on what was learned during the course, evaluate proposed analyses of biological data, with awareness of factors such as data quality, bias or overlap in the data used.
The assessment consists of the following assignment categories:
Exam (Other) determines 100% of the final grade
Within the category Exam (Other), the following assignments must be completed:
You will be evaluated on the basis of a data analysis project, performed in pairs during the course. The project is written up in a report (50%) and presented in a final presentation (50%), in which you present your work and are evaluated on your data analysis knowledge and understanding.
This offering is part of the following study programmes:
Master of Molecular Biology: Standard track (offered in English only)
Master of Biology: Molecular and Cellular Life sciences (offered in English only)