6 EC
Semester 2, period 5
5254MLIC6Y
| Owner | Master Chemistry (joint degree) |
| Coordinator | prof. dr. ir. B. Ensing |
| Part of | Master Chemistry (joint degree), track Molecular Sciences, |
| Links | Visible Learning Trajectories |
The course Machine Learning for Chemistry will provide a broad understanding of current deep learning methodologies and their application in chemical research. Rather than a formal exposition, it takes a hands-on approach tailored to students interested in applying deep learning to (molecular) scientific problems. The course is targeted at a broad audience: from theoretical chemists who wish to dive into data-driven science, to experimental chemists keen on integrating machine learning in their work.
The course will first briefly review foundational concepts from probability and information theory, together with an overview of machine learning basics (treated in more detail in the chemistry bachelor course AI for Science). We will then focus on a range of popular deep learning techniques that are particularly useful in chemistry, including graph neural networks, diffusion and flow models, transformers and large language models, and Bayesian optimisation. The deep learning models will be illustrated with relevant chemical applications, such as structure-property prediction, generation of molecules with specific properties, and guiding autonomous lab experiments.
The course is provided as a lecture series (two 2-hour lectures per week) plus hands-on laptop sessions (one 2-hour session per week). The theoretical aspects of deep learning and generative AI for molecular science, taught in the lectures, will be applied in programming assignments during the laptop sessions. In the first weeks, the laptop assignments consist of deep learning exercises provided as Jupyter notebooks, which run in an internet browser or in Microsoft Visual Studio Code on the laptop and contain information, open questions and (to be completed) Python code. In the second part, the students will work, alone or in pairs, on one larger deep learning project, during which they develop and implement a deep learning algorithm for a molecular science application. The final Jupyter notebook together with a presentation of the project in the last week will count for 25% of the final grade. The other 75% of the grade is obtained with a written exam.
Proficiency in Python programming is a very strong advantage.
Having passed the Chemistry or STI Bachelor course "AI for Science", or similar, is also advantageous, but not essential.
Book: "Deep Generative Modeling" by Jakub M. Tomczak
Python, Jupyter notebooks, Microsoft Visual Studio Code (VS Code)
Relevant libraries: NumPy, scikit-learn, PyTorch, PyTorch Geometric, BoTorch
Keynote lecture slides
| Activity | Hours |
| Lecture | 28 |
| Laptop session | 14 |
| Exam | 2 |
| Tutorial | 0 |
| Self study | 124 |
| Total | 168 (6 EC x 28 hours) |
This programme does not have requirements concerning attendance (TER part B).
Additional requirements for this course:
None of the lectures is mandatory. However, as this course does not (yet) have a course syllabus and instead refers to (rather formal) sections of books and scientific articles without a clear connection between them, it is highly recommended to attend the lectures. The lecturer aims to record all the lectures and put the videos online on Canvas, but this is not guaranteed.
| Item and weight | Details |
| Final grade | |
| 0.25 (25%) Final Project Assignment and Presentation | |
| 0.75 (75%) Exam | |
In the first three weeks, we will work on three Jupyter notebook assignments (during two hands-on laptop sessions and as homework), in which the basics of the machine learning workflow, neural networks (using PyTorch) and graph neural networks are practiced. These initial assignments serve as preparation for the larger (4-week) project in the second half of the course.
Because Jupyter notebooks can nowadays largely be completed using LLMs such as Copilot, the notebooks are not graded. Instead, there will be two short (max. 10 minutes) multiple-choice "pre-tests", in weeks 16 and 17, at the beginning of the computer classes, to test whether students have adequately studied the notebooks and acquired the relevant skills from the assignments. A maximum of 1.0 bonus point can be gained with these two tests; the bonus only applies if the final grade from the exam (75%) plus project (25%) is higher than 5.5 (in other words, the bonus cannot be used to pass the course if the total grade from the other components is insufficient). The purpose of the two "pre-tests" is (1) to encourage students to focus from the beginning on understanding their algorithms and Python code, and (2) to give students early feedback on their learning performance.
The final project (which can be carried out alone or in a team of at most 2 students) takes place during the final four weeks, is graded, and counts for 25% of the final grade (before bonus). During this final assignment, students are also allowed to use Copilot or other AI tools to develop their machine learning solution. Note, however, that the ability of the student(s) to explain their code in the final presentation counts heavily towards the grade for the project.
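As an illustration only, the weighting and bonus rule described above could be written out as in the sketch below. The function name `final_grade`, the 10-point cap, and the absence of rounding are assumptions made for this example and are not stated in the course rules; the official grade calculation is determined by the lecturer.

```python
def final_grade(exam, project, bonus=0.0):
    """Hypothetical sketch of the grading rule; not the official calculation.

    exam, project: grades on the (assumed) 1-10 scale.
    bonus: points earned in the two pre-tests, at most 1.0 in total.
    """
    # Weighted grade before bonus: 75% exam, 25% project.
    base = 0.75 * exam + 0.25 * project

    # The pre-tests can yield at most 1.0 bonus point in total.
    bonus = min(max(bonus, 0.0), 1.0)

    # The bonus only applies if the weighted grade is higher than 5.5,
    # so it cannot turn an insufficient grade into a pass.
    if base > 5.5:
        # Assumption: the result is capped at 10.
        return min(base + bonus, 10.0)
    return base


if __name__ == "__main__":
    print(final_grade(exam=6.0, project=8.0, bonus=0.5))
    print(final_grade(exam=5.0, project=6.0, bonus=1.0))
```

In the first call the weighted grade is above 5.5, so the bonus is added; in the second it is not, so the bonus is ignored.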
The 'Regulations governing fraud and plagiarism for UvA students' applies to this course. This will be monitored carefully. Upon suspicion of fraud or plagiarism the Examinations Board of the programme will be informed. For the 'Regulations governing fraud and plagiarism for UvA students' see: www.student.uva.nl
| Week number | Topics | Study material |
| 1 | Intro machine learning | Lecture slides |
| 2 | Generative AI | Lecture slides |
| 3 | Graphs and GNNs | Lecture slides |
| 4 | Diffusion and Flow models | Lecture slides |
| 5 | Transformers and language models | Lecture slides |
| 6 | Bayesian optimisation and self-driving labs | Lecture slides |
| 7 | MLIPs and surrogate models for molecular modeling | Lecture slides |
| 8 | Exam | |
A modern laptop is needed for the computer practicals. At least one week before the first lecture, information on preparing the laptop will be made available on the Canvas website. In particular, installation of (mini-)conda/mamba with an environment of Python packages is required to take part in the computer practicals. Windows users may need to set up WSL with a version of Linux, such as Ubuntu, as explained on the Canvas site.
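Once the environment has been installed, a quick check such as the sketch below can confirm that the libraries listed for the course can be found by Python. This is not part of the official instructions on Canvas: the helper name `check_environment` is hypothetical, and the module list is an assumption based on the libraries named in the course materials (the actual environment may contain other packages).

```python
from importlib.util import find_spec

# Assumed mapping from library name to its Python import name.
REQUIRED_MODULES = {
    "NumPy": "numpy",
    "scikit-learn": "sklearn",
    "PyTorch": "torch",
    "PyTorch Geometric": "torch_geometric",
    "BoTorch": "botorch",
}


def check_environment(modules=REQUIRED_MODULES):
    """Return {library name: True/False} depending on whether it is installed."""
    # find_spec returns None when a top-level module cannot be located,
    # without actually importing the (possibly large) package.
    return {label: find_spec(name) is not None for label, name in modules.items()}


if __name__ == "__main__":
    for label, found in check_environment().items():
        print(f"{label:18s} {'found' if found else 'MISSING'}")
```

Run the script inside the activated conda/mamba environment; any library reported as missing should be installed following the instructions on Canvas.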