6 EC
Semester 2, period 4
5294BIDA6Y
This course will provide students with a general understanding of data-related and systems-related challenges in Big Data applications. They will gain fundamental knowledge about principled approaches to tackle such challenges, with respect to systems abstractions, programming models and execution models for parallel and distributed data-intensive applications.
The course prepares students for data-related tasks in a job as a Data Engineer, ML Engineer, Applied Scientist or Researcher, and focuses on their implementation skills, for example by walking students through low-level MapReduce jobs for several data-related problems and common data processing operators. At the same time, the course highlights ongoing research problems in the area of Big Data processing. Furthermore, the course details the history of many systems currently at the forefront of computing; for example, it discusses the roots of Google's TensorFlow in previous Big Data systems at Google.
The course will also feature guest speakers from leading companies to connect students to real-world problems.
Programming exercises
Apache Hadoop, Apache Maven
Activity | Hours |
Lecture | 14 |
Laptop seminar | 14 |
Presentation | 6 |
Tutorial | 14 |
Self study | 120 |
Total | 168 (6 EC x 28 hours) |
In TER part B of this programme no requirements regarding attendance are mentioned.
Additional requirements for this course:
Participation will be measured. Attendance in the lab sessions is highly recommended in order to attain the programming skills and background required for the assignments.
Item and weight | Details |
Final grade | |
Participation | |
Programming Assignment 1 | |
Programming Assignment 2 | |
Paper Summary | |
Group project | |
The 'Regulations governing fraud and plagiarism for UvA students' applies to this course. This will be monitored carefully. Upon suspicion of fraud or plagiarism the Examinations Board of the programme will be informed. For the 'Regulations governing fraud and plagiarism for UvA students' see: www.student.uva.nl
Week number | Topics |
1 | Foundations of Scalable Data Processing |
2 | Abstractions for Massively Parallel Data Processing |
3 | Machine Learning on Distributed Dataflow Systems |
4 | Distributed Databases |
5 | Data Validation & Data Cleaning |
6 | Deep Learning Systems |
7 | Today's and Tomorrow's Challenges in Big Data Management |
8 | Final Presentations |
The schedule for this course is published on DataNose.
The course will be taught in English.
Basic knowledge of computing systems and machine learning, as well as basic programming skills, is required.