6 ECTS credits
165 h study time

Offer 1 with catalog number 4021259FNR for all students in the 1st semester at a (F) Master - specialised level.

Semester
1st semester
Enrollment based on exam contract
Impossible
Grading method
Grading (scale from 0 to 20)
Can retake in second session
Yes
Enrollment Requirements
Students who have already studied the internals of compilers or the interpretation of computer programs will have an advantage.
Taught in
English
Faculty
Faculty of Science and Bio-engineering Sciences
Department
Computer Science
Educational team
Jennifer SARTOR (course titular)
Activities and contact hours
26 contact hours Lecture
26 contact hours Seminar, Exercises or Practicals
Course Content

As the computer industry has grown over the last 50 years, the focal point has often been performance. Machines have become faster and smaller, while software has become much larger and more complex than it was 40 years ago. This course demystifies how to evaluate and analyze the performance of computer programs, particularly those written in managed languages. We will look at the software stack and discuss how each layer (the application, the compiler, the runtime environment if there is one, the operating system, the processor, and the memory system) affects the end performance of a program. We will look at profiling and performance monitoring tools that are relatively easy to use and that shed light on why a program has the execution time it does, for example by analyzing its memory behavior or identifying the methods in which the bottlenecks lie. We will talk about metrics and statistics: once you have run a program or a suite of programs, how do you accurately summarize and present results about their performance? Finally, we will touch on how the multicore era has influenced the industry, and talk about the opportunities and limitations that parallel programming brings to performance. We will also discuss heterogeneous hardware and how some computer architects have modeled these machines to help design future processors.

The following list of topics will be discussed:

  • The software stack
  • Superscalar and out-of-order processors, at a high level
  • The memory subsystem and how memory is laid out, caches
  • Performance tools such as perf to obtain hardware performance counters
  • Details of managed language runtime environments (like the Java virtual machine) and how they impact performance
  • Garbage collection algorithms
  • Dynamic compilation and profiling
  • Statistics and how to present results: which means to use and how to compute confidence intervals
  • Parallelism: Amdahl's Law, Gustafson's Law, speedup and scalability
  • Analytical model of processors, superscalar pipeline, simulators
  • Multicores: chip multiprocessors (CMPs) and heterogeneous machines
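
As a small illustration of the "which means to use" topic in the list above, the sketch below computes three common means with Python's standard `statistics` module. The numbers are invented for illustration only, and the pairing of each mean with a kind of measurement (arithmetic for times, harmonic for rates, geometric for normalized ratios) is a common rule of thumb rather than the course's prescribed method.

```python
import statistics

# Hypothetical measurements, invented for illustration only.
times_s = [2.0, 4.0, 8.0]         # execution times in seconds
rates = [100.0, 50.0, 25.0]       # throughputs, e.g. operations per second
speedups = [2.0, 0.5, 1.0]        # ratios normalized to a baseline

# Arithmetic mean: a usual choice for quantities that add up, such as times.
arith = statistics.fmean(times_s)

# Harmonic mean: a usual choice for rates measured over equal amounts of work.
harm = statistics.harmonic_mean(rates)

# Geometric mean: a usual choice for normalized ratios such as speedups.
geo = statistics.geometric_mean(speedups)

print(f"arithmetic mean of times: {arith:.3f} s")
print(f"harmonic mean of rates:   {harm:.3f} ops/s")
print(f"geometric mean of ratios: {geo:.3f}")
```

Note that a speedup of 2.0 and a slowdown of 0.5 cancel out under the geometric mean, which is one reason it is often preferred for normalized results.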

Of the layers of the software/system stack, we will focus more on the runtime environment and the memory system than on the other layers. We will touch on details of the adaptive or dynamic compilers used in managed runtime environments. We will also discuss processors and how they execute instructions; understanding how they work at a high level helps explain why an application achieves the performance it does. The memory system is closely linked to the processor and has a huge impact on overall performance, so we will focus on how caches are laid out and used, and on how a runtime environment lays out memory and reclaims dead memory. To gather information about the hardware's behavior, we will explore hardware performance counters and use tools that read them and summarize their results.

Two important aspects of analyzing performance are 1) how to properly set up experiments to measure and evaluate what you want to show, and 2) how to present the results of those experiments. We will discuss proper experimental methodology (for Java applications), such as how to eliminate non-determinism and how to explore the space-time tradeoff of garbage collection. I will also teach good practices for repeating experiments, taking averages over a set of results, and presenting confidence intervals in graphs.
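
To make the idea of averaging repeated runs and reporting a confidence interval concrete, here is a minimal sketch in Python. The function name `mean_confidence_interval` and the timing values are hypothetical, and the normal approximation is an assumption of this sketch: with only a handful of runs, a Student's t quantile would be the more appropriate (and wider) choice.

```python
import math
import statistics


def mean_confidence_interval(samples, confidence=0.95):
    """Return (mean, half_width) of a confidence interval for the mean.

    Assumption: uses the normal approximation, which is reasonable for
    roughly 30 or more runs; for fewer runs a Student's t quantile
    should replace z below.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean = statistics.fmean(samples)
    stdev = statistics.stdev(samples)  # sample standard deviation (n - 1)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev / math.sqrt(n)
    return mean, half_width


# Hypothetical execution times in seconds, invented for illustration only.
times = [1.02, 0.98, 1.05, 1.00, 0.97, 1.03, 1.01, 0.99, 1.04, 0.96]
mean, hw = mean_confidence_interval(times)
print(f"{mean:.3f} s +/- {hw:.3f} s")
```

In a graph, `mean` would be the height of the bar or point and `hw` the length of the error bar on each side; if the intervals of two configurations overlap, the measurements do not support a claim that one is faster.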

Finally, we will explore how parallel programming affects performance, assuming a shared-memory model. We will discuss speedup, Amdahl's Law, and Gustafson's Law. We'll talk a little about cache coherence and false sharing, and about the operating system's job of scheduling tasks. If time allows, we will also talk briefly about how hardware reorders instructions and about memory consistency. We will also touch on power as a prime concern, which led to the rise of heterogeneous machines.
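
The two laws mentioned above can be written as short formulas. The sketch below uses the standard textbook forms, with hypothetical function names: for a parallelizable fraction p and n processors, Amdahl's Law gives a speedup of 1 / ((1 - p) + p / n) for a fixed problem size, while Gustafson's Law gives a scaled speedup of (1 - p) + p * n when the problem grows with the machine (with p measured on the parallel run).

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speedup for a fixed problem size (Amdahl's Law)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def gustafson_speedup(parallel_fraction, processors):
    """Scaled speedup when the problem size grows with the machine
    (Gustafson's Law); the fraction is measured on the parallel run."""
    return (1.0 - parallel_fraction) + parallel_fraction * processors


# Illustrative parameter: 90% of the work is assumed to be parallelizable.
p = 0.9
for n in (2, 4, 8, 16):
    print(f"n={n:2d}  Amdahl: {amdahl_speedup(p, n):6.2f}  "
          f"Gustafson: {gustafson_speedup(p, n):6.2f}")
```

Under Amdahl's Law the speedup is bounded by 1 / (1 - p) no matter how many processors are added, which is why even a small serial portion limits scalability.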

Additional info

I expect advanced programming skills. In particular, I assume you have basic knowledge of object-oriented programming in a mainstream OO language such as C++, Java, or C#. We will focus on Java in this course. I will also use code examples in C, and I assume you can understand that code and write your own basic C programs.

It is highly recommended that you have already taken a compilers course prior to this course. An interest in system-level performance details and efficiency is useful.

Learning Outcomes

General competences

Knowledge and insight:  This course will give students the opportunity to gain deep insight into a computer program’s performance – why it might be slow, how to analyze bottlenecks, how to break down each layer of the software/system stack and understand how it contributes to the program’s end performance.    Furthermore, students will understand specifically why managed languages that run on top of a runtime environment are more complicated to analyze and evaluate than native languages.  Students will have a deep understanding of how the memory subsystem contributes to performance.  Students will also know the practical usage of computer architecture simulators.

Application of knowledge and insight: After successful completion of this course, students will know how to design their own performance experiments, and how to summarize, aggregate, and present results.  Students will also be able to make changes in their program’s memory access pattern to obtain improved memory efficiency.  Furthermore, students will be able to analyze the sources of inefficient code using hardware performance counters, and analyze the sources of limited scalability in parallel programs. 

Development of judgment: After successful completion of this course, students will be able to evaluate how a program accesses memory, whether efficiently or not, and will be able to evaluate the scalability of parallel applications.

Communication skills: This course will develop students’ ability to present meaningful and comprehensive experimental results in the form of graphs, and to analyze the meaning of those results.

Learning skills: This course gives students a system-level view of the computer, across the software stack. Thus, students will have a deep understanding of how a program gets executed and which layers affect its overall performance. This will enable students to write efficient computer programs with more insight, taking the memory system into account and understanding the ramifications of choosing a managed language. Furthermore, students will have learned how to analyze a program’s behavior in detail in order to inform optimization.

Grading

The final grade is composed of the following categories:
Oral Exam determines 40% of the final mark.
PRAC Practical Assignment determines 60% of the final mark.

Within the Oral Exam category, the following assignments need to be completed:

  • Oral Exam - new scenario with a relative weight of 8 which comprises 40% of the final mark.

Within the PRAC Practical Assignment category, the following assignments need to be completed:

  • Programming projects with a relative weight of 12 which comprises 60% of the final mark.

Additional info regarding evaluation

The grade for this course is divided as follows. You will complete 4 projects during the semester, each worth 15% of your grade, for a total of 60%. These projects must be completed individually. You are required to turn in all 4 projects to pass this class! You are also required to take the oral examination at the end of the semester to pass. There will be a written preparation part that you have to complete within the exam slot and bring with you to the oral exam. The oral exam can cover concepts from the entire semester and is worth 40% of your total grade. To pass the entire course, you must pass both the project part (60%) and the exam part (40%).

Academic context

This offer is part of the following study plans:
Master in Applied Sciences and Engineering: Computer Science: Profile Artificial Intelligence (only offered in Dutch)
Master in Applied Sciences and Engineering: Computer Science: Profile Multimedia (only offered in Dutch)
Master in Applied Sciences and Engineering: Computer Science: Profile Software Languages and Software Engineering (only offered in Dutch)
Master in Applied Sciences and Engineering: Computer Science: Profile Web & Information Systems (only offered in Dutch)
Master of Applied Sciences and Engineering: Computer Science: Profile Artificial Intelligence
Master of Applied Sciences and Engineering: Computer Science: Profile Multimedia
Master of Applied Sciences and Engineering: Computer Science: Profile Software Languages and Software Engineering
Master of Applied Sciences and Engineering: Computer Science: Profile Web & Information Systems