Course manual 2020/2021

Course content

The course provides a comprehensive introduction to state-of-the-art programming models for concurrent computing systems, from multi-core processors in everyday laptops to large-scale server systems and high-end accelerators.

We start with multithreaded programming models for shared address space systems, where we look both into OpenMP compiler directives and into the lower-level POSIX threads. We continue with general-purpose graphics accelerators (GPGPUs) using NVIDIA's programming model CUDA and end with advanced topics such as directive-based GPU programming, Intel Xeon Phi programming and cross-architecture programming using the standard-driven OpenCL programming model.

Lectures on the practice of programming multi-core and many-core systems alternate with lectures on the principles, limitations and pitfalls of parallel program organization and on the assessment and visualisation of parallel program performance. The lectures are complemented by labs, where participants gain first-hand experience with the various programming models, and by group discussion sessions (werkcolleges), where participants present their work and discuss their achievements with each other as well as with the lecturers and lab assistants.

This course is complementary to the VU courses Programming Large-scale Parallel Systems and Parallel Programming Project in that it looks into node-level concurrency, whereas the VU courses focus on systems that are made up of many nodes.

Study materials

Literature

  • The course does not follow any specific textbook. The following list summarises a number of interesting textbooks and online resources on various aspects of the course.

     

    Parallel Programming in General:

    • Kevin Dowd, Charles Severance: High Performance Computing, O'Reilly, 1998.
    • Ananth Grama, Anshul Gupta, George Karypis, Vipin Kumar: Introduction to Parallel Computing, Addison-Wesley, 2003.
    • Gregory R. Andrews: Foundations of Multithreaded, Parallel, and Distributed Programming, Addison Wesley, 2000.
    • Michael J. Quinn: Parallel Programming in C with MPI and OpenMP, McGraw-Hill, 2003.

    OpenMP:

    • Rohit Chandra, Leo Dagum, Dave Kohr, Dror Maydan, Jeff McDonald, Ramesh Menon: Parallel Programming in OpenMP, Morgan Kaufmann, 2000.
    • Barbara Chapman, Gabriele Jost, Ruud van der Pas: Using OpenMP, MIT Press, 2007.
    • www.openmp.org

    OpenACC:

    • www.openacc.org

    CUDA:

    • developer.nvidia.com/cuda-zone

Syllabus

Practical training material

  • Lecture slides and tutorials are provided via Canvas.

Software

  • We use the DAS-5 compute cluster of the Advanced School for Computing and Imaging (ASCI) for the practical training. All necessary software is pre-installed on that system.

    All relevant software is also available free of charge for download from the internet and can be installed on private machines, but this is a matter of convenience rather than a necessity.

Objectives

  • To develop profound programming skills for a variety of contemporary multi-core and many-core computing systems.
  • To develop an understanding of the opportunities, challenges and limits of parallel computing and to gain practical familiarity with state-of-the-art programming models for contemporary multi-core and many-core computing systems.
  • To understand the essentials of scientific experimentation and the use of high-performance computing installations.

Teaching methods

  • Lecture
  • Laptop seminar
  • Presentation/symposium
  • Self-study
  • Seminar
  • Discussion workshops

Learning activities

Activity      Number of hours
Lectures      28
Labs          24
Workshops     8
Exam          2
Self-study    106

Attendance

Programme's requirements concerning attendance (TER-B):

  • In the case of a practical training, the student must attend at least 100% of the practical sessions. Should the student attend less than 100%, the student must repeat the practical training, or the Examinations Board may have one or more supplementary assignments issued.
  • In the case of a tutorial, the student must attend at least 100% of the tutorial sessions. Should the student attend less than 100%, the student must repeat the tutorial, or the Examinations Board may have one or more supplementary assignments issued.

Additional requirements for this course:

Attendance at lectures and labs is highly recommended but, strictly speaking, neither legally nor technically enforced.

This is your responsibility!

Attendance at the bi-weekly workshops (werkcolleges) is compulsory.

Assessment

Final grade

Item            Weight       Details
Exam            0.2 (20%)    Must be ≥ 5
Assignment 1    0.2 (20%)
Assignment 2    0.2 (20%)
Assignment 3    0.2 (20%)
Assignment 4    0.2 (20%)
Presentation                 Must be ≥ pass

Each of the four bi-weekly assignments receives one final grade. Exceptional circumstances apart, the grade is the same for both partners and accounts for 20% of the final grade. The individual exam makes up the remaining 20% of the final grade.

Each student must give a presentation during one of the bi-weekly workshops. The quality of the presentation can affect the final grade either way with a bonus or malus of up to 0.5. Failure to give a presentation results in failing the course. 

The exam is a digital open-book exam conducted via TestVision. To pass the course overall, the exam grade must be at least 5.0. Should the exam be graded below 5.0, the exam grade becomes the course grade, independent of the assignments and the presentation.
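The grading rules above can be illustrated with a minimal sketch in C. The function name `course_grade` and the constants are hypothetical, and the sketch rests on several assumptions that the manual does not state: grades are on a 1 to 10 scale, the presentation bonus or malus is added after the weighted sum, no rounding is applied, and the student has given a presentation (without one, the course is failed regardless of the computed number).

```c
#include <assert.h>

#define NUM_ASSIGNMENTS 4
#define WEIGHT          0.2   /* exam and each assignment count for 20% */
#define EXAM_CUTOFF     5.0   /* minimum exam grade to pass the course */
#define MAX_ADJUST      0.5   /* bound on the presentation bonus/malus */

/*
 * Hypothetical helper: course grade from the exam grade, the four
 * assignment grades and the presentation bonus/malus.
 * Assumption: the adjustment is added after the weighted sum and the
 * result is not rounded; the manual does not specify either detail.
 */
double course_grade(double exam, const double assignments[NUM_ASSIGNMENTS],
                    double adjust)
{
    /* The bonus/malus is stated to be at most 0.5 in either direction. */
    assert(adjust >= -MAX_ADJUST && adjust <= MAX_ADJUST);

    /* Below the cut-off, the exam grade becomes the course grade. */
    if (exam < EXAM_CUTOFF)
        return exam;

    double sum = WEIGHT * exam;
    for (int i = 0; i < NUM_ASSIGNMENTS; i++)
        sum += WEIGHT * assignments[i];

    return sum + adjust;
}
```

For example, an exam grade of 7.0, assignment grades of 8.0, 7.0, 6.5 and 7.5, and a bonus of 0.25 give 0.2 × 36.0 + 0.25 = 7.45 under these assumptions. With an exam grade of 4.5 the same inputs yield 4.5.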

Should we have reason to believe that workload and contributions in a group are not reasonably balanced, we reserve the right to conduct individual interviews, split groups, etc.

Inspection of assessed work

General feedback regarding the assignments is provided during the bi-weekly workshops. Individual feedback is provided for each assignment in written form. If needed, additional feedback can be obtained on an individual basis from the TAs during the laptop seminars.

Assignments

Assignment 1: Scientific Programming and Vectorisation

Assignment 2: Application-oriented Multi-core Programming with OpenMP Compiler Directives

Assignment 3: Machine-oriented Multi-core Programming with Posix Threads

Assignment 4: Many-core and Heterogeneous Programming with CUDA and OpenACC

All assignments are to be submitted in fixed groups of 2 students. Each assignment is graded and accounts for 20% of the final course grade. Each group of participants must present its work during one of the bi-weekly workshops. The quality of the presentation may affect the final grade via bonus/malus points of up to 0.5 for presentations that stand out either positively or negatively.

Fraud and plagiarism

The 'Regulations governing fraud and plagiarism for UvA students' apply to this course. Compliance will be monitored carefully. Upon suspicion of fraud or plagiarism, the Examinations Board of the programme will be informed. For the 'Regulations governing fraud and plagiarism for UvA students' see: www.student.uva.nl

Course structure

Week number Topics Study material
1
  • Introduction and Motivation
  • General aspects of parallel programming
  • Vectorisation and SIMD programming
 
2
  • Application-oriented Multi-core Programming with OpenMP Compiler Directives
 
3
  • Application-oriented Multi-core Programming with OpenMP Compiler Directives
  • Common Pitfalls in Shared Memory Parallel Programming
 
4
  • Machine-oriented Multi-core Programming with Posix Threads
 
5
  • Performance Metrics and Models
  • GPU programming with CUDA
 
6
  • GPU programming with OpenCL and OpenACC
  • Heterogeneous Computing
 
7
  • Guest lectures 
 
8
  • Exam
  • Final workshop
 

Timetable

The schedule for this course is published on DataNose.

Additional information

Prior knowledge: The practical parts of the course require a solid background in system-oriented programming in C, as well as basic knowledge of computer architecture.

Contact information

Coordinator

  • dr. C.U. Grelck

Staff

  • dr. ir. Ana Varbanescu 
  • dr. Clemens Grelck 
  • Misha Mesarcik, MSc 
  • Julius Roeder, MSc