5 ECTS credits
125 h study time
Offer 1 with catalog number 4021495ENR for all students in the 2nd semester at a (E) Master - advanced level.
This course introduces the concepts and practical skills needed for i) gathering, ii) organising, iii) integrating and iv) analysing biological data, in other words ‘data mining’. This is particularly relevant for molecular biology: the availability of biological data is increasing enormously, and knowing how to use and analyse these data will help you improve the quality of your future scientific studies. The course contains an introduction to programming in Python (for data handling) and R (for statistical analysis and plot generation). It focusses on statistical concepts, such as p-values, applied to genome data in relation to human health.
The content of the course is, in more detail:
1. Using protein sequence databases (e.g. UniProt) to gather sequence information and create multiple sequence alignments (MSA) with these sequences.
2. Gathering data on the effect of amino acid variants (mutations) on the organism phenotype (e.g. gnomAD for human health).
3. Curating and organising the gathered data.
4. Extracting derived information from the protein sequences (e.g. secondary structure or solvent accessibility predictions).
5. Integrating the (derived) data into a single data structure.
6. Analysing the data to find patterns and significant differences. This includes generating plots to visualise distributions, correlations and similar relationships.
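As an illustration of steps 5 and 6, the following minimal Python sketch joins two per-residue data sets into a single structure and then estimates a p-value with a permutation test. It is not course material: all values, the names `accessibility`, `variant_effect`, `integrate` and `permutation_p_value`, and the choice of a permutation test are assumptions made for this example only.

```python
import random
from statistics import mean

# Hypothetical toy data (invented for illustration, not real measurements):
# predicted relative solvent accessibility per residue position (0 = buried, 1 = exposed).
accessibility = {1: 0.82, 2: 0.10, 3: 0.05, 4: 0.77, 5: 0.12,
                 6: 0.91, 7: 0.08, 8: 0.66, 9: 0.15, 10: 0.72}
# Hypothetical variant annotations per residue position.
variant_effect = {1: "benign", 2: "pathogenic", 3: "pathogenic", 4: "benign",
                  5: "pathogenic", 6: "benign", 7: "pathogenic", 8: "benign",
                  9: "pathogenic", 10: "benign"}


def integrate(accessibility, variant_effect):
    """Join both data sets on residue position, keeping positions present in both."""
    shared = sorted(accessibility.keys() & variant_effect.keys())
    return [
        {"position": pos,
         "accessibility": accessibility[pos],
         "effect": variant_effect[pos]}
        for pos in shared
    ]


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign values to the two groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for rounding
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


table = integrate(accessibility, variant_effect)
pathogenic = [row["accessibility"] for row in table if row["effect"] == "pathogenic"]
benign = [row["accessibility"] for row in table if row["effect"] == "benign"]
p_value = permutation_p_value(pathogenic, benign)
print(f"{len(table)} residues, p = {p_value:.4f}")
```

A permutation test is used here only because it needs nothing beyond the Python standard library; a real project might instead use R or a statistics package, and would need far more data than this toy example.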
The lectures and practicals will be available online, but physical presence is preferred as this is an interactive course.
1. Awareness
Recognition of key terms in relation to data handling and analysis, as well as programming.
2. Understanding
Understanding of programming and of statistical approaches relevant for solving biological or medical questions with large data sets.
3. Communication
Ability to communicate constructively with peers in a joint project.
4. Application
Implementation of a Python/R script to gather data from external sources (databases), organise these data, and integrate them with other related data.
5. Analysis
Analysis of (integrated) data on a large scale using statistical approaches and data visualisation using graphs.
6. Capacity to evaluate
Based on what was learned during the course, evaluate proposed analyses of biological data, with awareness of factors such as data quality, bias or overlap in the data used.
The final grade is based on the following categories:
Other Exam determines 100% of the final mark.
Within the Other Exam category, the following assignments need to be completed:
You will be evaluated on a data analysis project, carried out in pairs during the course. The project is written up in a report (50%) and presented in a final presentation (50%), during which you present your work and are evaluated on your knowledge and understanding of data analysis.
This offer is part of the following study plans:
Master of Molecular Biology: Standaard traject