6 EC
Semester 2, period 6
5204IEIA6Y
In this course we study techniques for making modern AI systems interpretable, and for generating explanations of the classifications, decisions and predictions that these systems make. We consider applications of these techniques in diverse subfields of AI, ranging from machine vision to translation, and from speech recognition to automatic reasoning. An important focus will be on post hoc interpretation techniques, which take an existing model (e.g., a deep learning model for object recognition, machine translation or music recommendation) and attempt to interpret its intermediate representations using visualization, attribution or probing methods. A second thread will be the study of ways to constrain or bias such models a priori so that they arrive at more interpretable solutions, e.g., by encouraging sparse representations or by generating explanations as a secondary objective. Finally, we will consider approaches that are inherently explainable, including models with rich symbolic backbones such as neurosymbolic models. A common theme throughout the course will be the comparison of symbolic and deep learning models, and we will see that classic results from symbolic AI sometimes find new relevance in helping to interpret deep learning systems, or in helping to identify their shortcomings.
Core concepts covered in the course: probing (information-theoretic/counterfactual/Pareto-optimal), occlusion, Guided Backpropagation, Deconvolution, saliency maps, DeepLIFT, Layer-wise Relevance Propagation, Integrated Gradients, Shapley values, Contextual Decomposition, (Deep)SHAP, attention flow, attention-as-explanation, challenge sets, influence functions, neurosymbolic models (semantic loss, DeepProbLog, Logic Tensor Networks), symbolic regression, and non-parametric Bayesian methods (including Dirichlet and Pitman-Yor processes, for image and text parsing).
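To give a flavour of the attribution methods listed above, the following is a minimal sketch of occlusion, one of the simplest post hoc techniques: each input feature is replaced by a baseline value in turn, and the resulting change in the model output is taken as that feature's relevance. The names `occlusion_attribution` and `toy_model`, the zero baseline and the linear toy model are assumptions made for illustration only; they are not taken from the course material.

```python
from typing import Callable, List, Sequence


def occlusion_attribution(
    model: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: float = 0.0,
) -> List[float]:
    """Score each feature by the drop in output when it is occluded."""
    reference = model(x)
    scores = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = baseline  # hide feature i behind the baseline value
        scores.append(reference - model(occluded))
    return scores


def toy_model(x: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained model: a fixed linear scorer."""
    weights = [2.0, -1.0, 0.5]
    return sum(w * v for w, v in zip(weights, x))


scores = occlusion_attribution(toy_model, [1.0, 3.0, 4.0])
print(scores)
```

For a linear model with a zero baseline, each occlusion score equals the feature's weight times its value, which makes the toy case easy to check by hand. For deep models the choice of baseline and of the occluded region (for example, an image patch rather than a single pixel) strongly affects the result.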
Original research papers, made available through Canvas
Python
Lectures and seminars given by lecturers and guest lecturers will lay the foundation and set the scope of the theoretical content. Students will work with these notions closely in extensive hands-on workshops during the lab sessions. The workshop results are to be delivered at the end of each week as a technical report, which will require a considerable amount of self-study. Every lecture has associated reading, which will be the subject of the weekly quizzes. This requires self-study of the reading in addition to the material presented, and will help students internalise these notions and develop the skills to communicate them successfully.
| Activity | Hours |
| Lecture | 12 |
| Laptop seminar | 14 |
| Presentation | 12 |
| Self study | 130 |
| Total | 168 (6 EC x 28 hours) |
This programme does not have requirements concerning attendance (OER part B).
Additional requirements for this course:
This is an on-campus course; however, attendance will only be recorded during the presentation sessions.
| Item and weight | Details |
| Final grade | |
| 0.12 (12%) Journal club presentation | |
| 0.04 (4%) Presence | |
| 0.08 (8%) Quiz 1 | |
| 0.08 (8%) Quiz 2 | |
| 0.08 (8%) Quiz 3 | |
| 0.1 (10%) Report - Week 1 | |
| 0.15 (15%) Report - Week 2 | |
| 0.15 (15%) Report - Week 3 | |
| 0.2 (20%) Final mini-project | |
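The weights above sum to 1.0, so the final grade is a weighted average of the component grades. The snippet below is a minimal sketch of that calculation. The function name `final_grade`, the 1-10 grading scale used in the example and the absence of rounding are assumptions; the official rounding and pass/fail rules are not stated here.

```python
import math
from typing import Dict

# Weights copied from the assessment table.
WEIGHTS: Dict[str, float] = {
    "Journal club presentation": 0.12,
    "Presence": 0.04,
    "Quiz 1": 0.08,
    "Quiz 2": 0.08,
    "Quiz 3": 0.08,
    "Report - Week 1": 0.10,
    "Report - Week 2": 0.15,
    "Report - Week 3": 0.15,
    "Final mini-project": 0.20,
}

# Sanity check: the components should account for the whole grade.
assert math.isclose(sum(WEIGHTS.values()), 1.0)


def final_grade(grades: Dict[str, float]) -> float:
    """Weighted average of component grades (no rounding applied)."""
    return sum(weight * grades[item] for item, weight in WEIGHTS.items())


# Hypothetical student: 7.0 on everything except a 9.0 on the mini-project.
example = {item: 7.0 for item in WEIGHTS}
example["Final mini-project"] = 9.0
print(round(final_grade(example), 2))
```

In this hypothetical case the mini-project raises the average by 0.2 x (9.0 - 7.0) = 0.4 grade points.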
Presentation sessions are based on the corresponding week's reading list, which is related to the lectures and the workshop content. Quizzes are online and multiple choice.
Once an assignment has been submitted and graded, the grade and the accompanying feedback will be uploaded to Canvas, where students can inspect them.
All assignments are done individually, except the presentations and the final mini-project.
The 'Regulations governing fraud and plagiarism for UvA students' applies to this course. This will be monitored carefully. Upon suspicion of fraud or plagiarism the Examinations Board of the programme will be informed. For the 'Regulations governing fraud and plagiarism for UvA students' see: www.student.uva.nl
| Week number | Topics | Study materials |
| 1 | Introduction and Post hoc Interpretability & Transformers Interpretability | |
| 2 | Attribution Methods & Interpretable by design I (neuro-symbolic systems) | |
| 3 | Guest Lecture I (Mech. Interp.) & Interpretable by design II | |
| 4 | Guest Lecture II (Mech. Interp.), Summary | |
For this course's website, and websites of other courses of the ILLC's 'Natural Language Processing & Digital Humanities' group, see: https://cl-illc.github.io/teaching.html