3 ECTS credits
75 hours of study time

Offering 1 with course catalogue number 4022633FNR for all students in the 2nd semester, at a specialised master's level.

Semester
2nd semester
Enrolment under exam contract
Not possible
Grading scale
Grade (0 to 20)
Second exam session possible
Yes
Language of instruction
English
Faculty
Faculty of Engineering
Responsible department
Industrial Engineering Sciences
Teaching team
Jan Lemeire (course titular)
Components and contact hours
12 contact hours Lecture
18 contact hours Seminars, practicals and exercises
Content

This is an advanced course on programming GPUs effectively and efficiently. Level by level, we unravel the programming paradigm of modern GPUs, down to the level of warps and vector instructions. In parallel, the hardware architecture is demonstrated and analysed. Once students have learned how to program GPUs, we show how to identify performance bottlenecks and discuss possible remedies.

The programming paradigm and architecture of modern GPUs are unravelled level by level, each level digging deeper into the complexities of the hardware and software. A sequence of examples and exercises has been set up so that the student follows an effective path towards acquiring the necessary GPU computing skills. We focus on the standard OpenCL language. Since Nvidia’s proprietary CUDA language is based on the same paradigm, a one-to-one mapping of the basic concepts exists. We will show which additional functions Nvidia provides on top of the standard.
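
As a rough illustration of the one-to-one mapping between the basic OpenCL and CUDA concepts, the following plain-Python sketch emulates how a one-dimensional launch assigns an index to each work-item. It is not OpenCL or CUDA code and needs no GPU; the function names `global_id` and `vector_add` are hypothetical and introduced only for this example. It assumes a zero global offset, in which case OpenCL's `get_global_id(0)` equals `get_group_id(0) * get_local_size(0) + get_local_id(0)`, the counterpart of CUDA's `blockIdx.x * blockDim.x + threadIdx.x`.

```python
# Plain-Python emulation of a 1-D GPU launch (illustration only, no GPU needed).
# All function names here are hypothetical and not part of OpenCL or CUDA.

def global_id(group_id, local_size, local_id):
    """Global index of one work-item (OpenCL) or thread (CUDA), 1-D, zero offset.

    OpenCL: get_group_id(0) * get_local_size(0) + get_local_id(0)
    CUDA:   blockIdx.x * blockDim.x + threadIdx.x
    """
    return group_id * local_size + local_id


def vector_add(a, b, local_size=4):
    """Add two equal-length lists the way a simple GPU kernel would."""
    n = len(a)
    # Round the number of work-groups (CUDA: blocks) up so all n elements are covered.
    num_groups = (n + local_size - 1) // local_size
    out = [0] * n
    for group in range(num_groups):          # runs in parallel on a GPU
        for local in range(local_size):      # runs in parallel on a GPU
            i = global_id(group, local_size, local)
            if i < n:                        # guard for the padded last group
                out[i] = a[i] + b[i]
    return out


print(vector_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50]))
```

On a real device the two loops disappear: each work-item executes the loop body once, and the guard `i < n` covers the case where the number of launched work-items has been rounded up to a multiple of the work-group size.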

The course consists of 5 theoretical lectures, a lab and a programming project.
All information can be found at http://parallel.vub.ac.be/education/gpu

1. The power of GPUs
2. Programming GPUs
3. The GPU architecture and strategy
4. The pipeline performance model
5. Performance limiters
Insight will be given into peak performance and into the issues that make performance degrade. The relation with algorithmic and implementation aspects will be discussed, so that the student gets a first insight into which algorithms and implementations are well suited to GPUs and which are not.
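
To make the gap between peak and actual performance concrete, here is a minimal back-of-the-envelope estimate in Python. It uses a roofline-style bound, which is a common simplification and not necessarily the pipeline performance model taught in this course. The function name `attainable_gflops` and the hardware figures (10 000 GFLOP/s peak compute, 500 GB/s memory bandwidth) are hypothetical and chosen only for illustration.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Upper bound on throughput: limited either by compute or by memory traffic."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


# Hypothetical device figures, not measurements of any real GPU.
PEAK_GFLOPS = 10_000.0
BANDWIDTH_GB_S = 500.0

# Single-precision vector addition: 1 addition per element and
# 12 bytes of memory traffic (two 4-byte reads, one 4-byte write).
vector_add_intensity = 1 / 12

bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GB_S, vector_add_intensity)
print(f"vector addition bound: {bound:.1f} GFLOP/s")
```

Under these assumed figures, vector addition is bounded by memory bandwidth at less than 1% of the assumed peak, which illustrates why an algorithm with little computation per byte transferred is a poor fit for a GPU regardless of how well it is coded.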

Each student will run a set of related benchmarks and analyse the resulting performance. In a report, the student will demonstrate an understanding of peak and actual performance and of the reasons for the performance degradation.
After that, the student will port a sequential algorithm to OpenCL in order to reduce its run time. The programming project is carried out individually. The student will demonstrate the results and, based on feedback from the professor and the assistant, improve them into a final version.

 

Additional information

Prerequisites

  • good knowledge of computer systems (processor and network)
  • good (!) programming experience
  • knowledge of and experience in multithreading.

Additional questions can be directed to jan.lemeire@vub.be


Learning outcomes

General Learning Outcomes

The student will come to understand how to implement algorithms on GPUs and which aspects to consider for efficient implementations. We will put GPUs into perspective: compare them with other technologies and discuss their weaknesses and the challenges of putting the technology into practice.
The student will acquire a thorough understanding of writing GPU kernels, launching kernels, data transfer, kernel synchronization, vector operations, debugging, the available tools, understanding and optimizing memory access, and related topics. The student will be able to apply this understanding of low-level thread and hardware characteristics to devise high-performance, scalable solutions.
Through the practicals and the project, the student will have demonstrated the ability to make good judgments about complex situations and to communicate the conclusions. Specific or complex parallel solutions are possible, but these are difficult to maintain and less generic. Only simple, clever solutions are feasible. The student will be able to participate in discussions about exploiting parallelism and the proper use of modern GPU technology.

Assessment information

The assessment consists of the following assignment categories:
Other Exam determines 100% of the final grade

Within the Other Exam category, the following assignments must be completed:

  • Lab Report with a weighting factor of 25, thus 25% of the total final grade.
  • Programming Project with a weighting factor of 75, thus 75% of the total final grade.

Additional information on evaluation

The course is evaluated by a lab report (25%) and a programming project (75%).

Allowed unsatisfactory mark
Check the supplementary Teaching and Examination Regulations (OER) of your faculty to see whether an allowed unsatisfactory mark is possible for this course.

Academic context

This offering is part of the following study plans:
Master of Applied Sciences and Engineering: Applied Computer Science: Standard track (only offered in English)
Master in Applied Sciences and Engineering: Computer Science: Artificial Intelligence (only offered in English)
Master in Applied Sciences and Engineering: Computer Science: Multimedia (only offered in English)
Master in Applied Sciences and Engineering: Computer Science: Software Languages and Software Engineering (only offered in English)
Master in Applied Sciences and Engineering: Computer Science: Data Management and Analytics (only offered in English)