6 EC
Semester 2, period 4
5284DAPR6Y
| Owner | Master Computer Science (joint degree) |
| Coordinator | dr. ing. S. Schelter |
| Part of | Master Computer Science (joint degree), track Big Data Engineering, |
This highly technical course focuses on the preparation and life-cycle management of data for production machine learning deployments. The course starts by recapping fundamentals of relational data processing and dataflow systems. Subsequently, students learn about encoding, storing and managing vectorised feature representations of heterogeneous input data sources for machine learning applications, and about the architecture of current state-of-the-art systems for this task, such as Google’s TensorFlow Extended (TFX) platform. Concurrently, students will be exposed to foundational theory for this problem space, such as incremental view maintenance for relational data, fine-grained data provenance tracking via provenance semirings, and differential computation.
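As a rough illustration of what incremental view maintenance means, the sketch below maintains a "row count per key" view by applying only the inserted and deleted rows, rather than recomputing the view from the full base data. It is a minimal example under stated assumptions, not course material: the function names, the `(key, value)` row layout and the sample data are all hypothetical.

```python
from collections import Counter


def full_recompute(rows):
    """Recompute the view 'number of rows per key' from scratch."""
    return Counter(key for key, _ in rows)


def apply_delta(view, inserted=(), deleted=()):
    """Update the view in place using only the changed rows."""
    for key, _ in inserted:
        view[key] += 1
    for key, _ in deleted:
        view[key] -= 1
        if view[key] == 0:
            del view[key]  # drop keys that no longer occur in the base data
    return view


# Hypothetical base relation of (key, value) rows.
base = [("nl", 1), ("de", 2), ("nl", 3)]
view = full_recompute(base)

# A batch of changes to the base relation.
inserted = [("fr", 4), ("nl", 5)]
deleted = [("de", 2)]
apply_delta(view, inserted, deleted)

# The incrementally maintained view matches a full recomputation.
updated_base = [row for row in base if row not in deleted] + inserted
assert view == full_recompute(updated_base)
```

The cost of `apply_delta` depends on the size of the change batch, not on the size of the base data, which is the main appeal of the incremental approach.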
In addition, students will learn to identify, quantify and address common quality issues with respect to the completeness and consistency of the data. Furthermore, they will learn about technical challenges in complying with regulations for private data, such as the “right to be forgotten” from the GDPR. Finally, students will be exposed to ongoing research efforts in this space, such as ML pipeline debugging and error detection techniques from data-centric AI. They will also have the opportunity to discuss the practical implications of the covered technologies with invited industry experts.
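To make "quantifying completeness and consistency" concrete, the following sketch computes the fraction of non-missing values in a column and lists the records that violate a simple rule. It is only an illustration: the helper names, the sample records and the age range of 0 to 120 are assumptions made for this example.

```python
def completeness(records, column):
    """Fraction of records with a non-missing value in `column`."""
    if not records:
        return 1.0
    present = sum(1 for r in records if r.get(column) not in (None, ""))
    return present / len(records)


def violations(records, rule):
    """Records for which the consistency rule does not hold."""
    return [r for r in records if not rule(r)]


# Hypothetical input data with a missing age, a missing country
# and an implausible (negative) age.
records = [
    {"id": 1, "age": 34, "country": "NL"},
    {"id": 2, "age": None, "country": "DE"},
    {"id": 3, "age": -5, "country": "NL"},
    {"id": 4, "age": 51, "country": ""},
]

age_completeness = completeness(records, "age")
country_completeness = completeness(records, "country")

# Assumed consistency rule: a known age must lie between 0 and 120.
bad_age = violations(
    records, lambda r: r["age"] is None or 0 <= r["age"] <= 120
)
```

Missing values are deliberately not counted as consistency violations here, so the two metrics stay independent of each other.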
Scientific papers
Book chapters
Presentation slides
Detailed information about the course and grading will be discussed in the first lecture.
Programming assignments
Example code
Java and Python-based open source software
| Activity | Hours |
| Lecture | 28 |
| Laptop seminar | 14 |
| Presentations | 6 |
| Self-Study | 120 |
Programme's requirements concerning attendance (TER-B):
Additional requirements for this course:
Participation will be measured. Attendance in the lab sessions is needed in order to attain the programming skills and background required for the assignments and the project.
| Item and weight | Details | Remarks |
| Final grade | To pass the course, all parts should be passed. Failure to hand in an assignment or the project will result in failure of the course. | |
| 0.1 (10%) Programming Assignment 1 | Mandatory | |
| 0.1 (10%) Programming Assignment 2 | Mandatory | |
| 0.1 (10%) Programming Assignment 3 | Mandatory | |
| 0.1 (10%) Participation | | |
| 0.2 (20%) Project Presentations | Mandatory | |
| 0.4 (40%) Project Design, Implementation, Experiments & Paper | Mandatory | |
Details for the grading of the assignments and project will be made available during the course.
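As an illustration of how the listed weights combine, the sketch below computes a weighted average from one score per component. The weights are taken from the table above; the scores are hypothetical, and the assumption that every component is marked on the same scale is ours. The requirement that all parts be passed is not modelled, because the pass threshold is not stated here.

```python
# Weights as listed in the grading table.
weights = {
    "Programming Assignment 1": 0.1,
    "Programming Assignment 2": 0.1,
    "Programming Assignment 3": 0.1,
    "Participation": 0.1,
    "Project Presentations": 0.2,
    "Project Design, Implementation, Experiments & Paper": 0.4,
}

# Hypothetical scores, assumed to be on one common scale.
scores = {
    "Programming Assignment 1": 8.0,
    "Programming Assignment 2": 7.0,
    "Programming Assignment 3": 9.0,
    "Participation": 10.0,
    "Project Presentations": 7.5,
    "Project Design, Implementation, Experiments & Paper": 8.0,
}

# Weighted average over all components.
weighted_grade = sum(weights[item] * scores[item] for item in weights)
```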
The 'Regulations governing fraud and plagiarism for UvA students' applies to this course. This will be monitored carefully. Upon suspicion of fraud or plagiarism the Examinations Board of the programme will be informed. For the 'Regulations governing fraud and plagiarism for UvA students' see: www.student.uva.nl
| Week number | Topics | Study material |
| 8 |