Machine Leren voor Gestructureerde Data

Machine Learning for Structured Data

6 EC

Semester 1, period 2

5083MLVG6Y

Owner: Bachelor Kunstmatige Intelligentie
Coordinator: dr. Vlad Niculae
Part of: Bachelor Kunstmatige Intelligentie, year 3; Bachelor Bèta-gamma, major Kunstmatige Intelligentie, year 3

Course manual 2024/2025

Course content

This course prepares you for handling structured inputs and outputs in deep machine learning applications. Real-world data is complex but highly structured. Natural language is organized hierarchically into units such as sentences, phrases, and words. Natural images show objects in various spatial relationships to each other. Our very DNA is made up of small discrete units that combine in complex ways. In all these cases, long-distance dependencies and constraints are essential. You will learn to use machine learning and deep neural networks to model such complex, structured phenomena. By marrying machine learning with models of symbolic, global structure, we get hybrid approaches that combine the best of both worlds.

In standard (unstructured) ML, the focus is on classification and regression problems from vector representations of data. This course covers both main situations where structure enters a model:

  • Structured inputs. Build predictive ML systems that make use of the known structure of the input data, such as social networks, documents, and images.
  • Structured outputs. Build systems that can output structured objects: alignments between proteins, syntactic parsing of language, object segmentation in images.

We will study models, learning algorithms, and evaluation methods for handling structured inputs and outputs in machine learning.

Learning objectives

  • Select an appropriate structured representation for complex data: graphs, trees, sequences, grids, alignments, as encountered in language, vision, and biology applications.
  • Design machine learning models for such structured data.
  • Evaluate performance on structured data, going beyond accuracy.
  • Model relational data using structured neural networks such as Graph Neural Networks and Transformers.
  • Implement algorithms for finding optimal structures, e.g., Viterbi variants.
  • Characterize probability distributions over structures.
  • Formulate and solve structured and discrete optimization problems with the (integer) linear programming framework.
  • Argue for or against using structured models and representations depending on the application (including big-O complexity arguments, representation power, etc.).

Teaching methods

  • Lecture (hoorcollege)
  • Laptop session (laptopcollege)
  • Self-study
  • Supervision/feedback session
  • Independent work on, e.g., a project or thesis

Distribution of learning activities

Activity                         Hours
Lecture (hoorcollege)            28
Laptop session (laptopcollege)   28
Exam                             2
Self-study                       110
Total                            168 (6 EC x 28 hours)

Attendance

Programme attendance requirements (OER-B):

  • Attendance is mandatory for practicals and tutorial sessions with assignments. How this requirement is applied may differ per course and is specified in the course manual. Students who do not meet the attendance requirement cannot complete the component with a passing grade.

Assessment

Component and weight                                      Details

Final grade
  40%  Exam                                               Must be ≥ 5
  60%  Assignments                                        Must be ≥ 5
         45%  Assignment 1: Encoding Structured Inputs
         45%  Assignment 2: Predicting Structured Outputs
         10%  Quiz grade

In addition to the standard threshold on the final grade (per OER), we also require:

  • a minimum of 5/10 on the exam, before any rounding
  • a minimum assignment grade of 5/10, before any rounding (including the quiz grade, weighted as described)

Grade calculations are performed on precise (unrounded) grades with the full available precision. Only the final grade is rounded, to half points.
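
As an illustration of the weighting described above, the following Python sketch computes the assignment grade and the final grade from the four components. It is a sketch under stated assumptions, not the official calculation: the function names are hypothetical, the tie-breaking rule when rounding to half points (exact quarters round up here) is an assumption, and the standard OER threshold on the final grade is not modelled because its value is not stated in this manual.

```python
import math

# Weights as listed in the assessment overview.
EXAM_WEIGHT = 0.40
ASSIGNMENTS_WEIGHT = 0.60
ASSIGNMENT_1_WEIGHT = 0.45
ASSIGNMENT_2_WEIGHT = 0.45
QUIZ_WEIGHT = 0.10

# Course-specific minimum, applied to unrounded grades.
MIN_COMPONENT_GRADE = 5.0


def assignments_grade(assignment_1, assignment_2, quiz):
    """Weighted assignment grade (unrounded), including the quiz grade."""
    return (ASSIGNMENT_1_WEIGHT * assignment_1
            + ASSIGNMENT_2_WEIGHT * assignment_2
            + QUIZ_WEIGHT * quiz)


def round_to_halves(grade):
    """Round to half points. Assumption: exact quarters round up."""
    return math.floor(grade * 2 + 0.5) / 2


def final_grade(exam, assignment_1, assignment_2, quiz):
    """Return the unrounded and rounded final grade plus the minimum checks."""
    assignments = assignments_grade(assignment_1, assignment_2, quiz)
    unrounded = EXAM_WEIGHT * exam + ASSIGNMENTS_WEIGHT * assignments
    return {
        "assignments": assignments,
        "unrounded": unrounded,
        # Only the final grade is rounded.
        "rounded": round_to_halves(unrounded),
        # Both minimums are checked before any rounding.
        "meets_component_minimums": (exam >= MIN_COMPONENT_GRADE
                                     and assignments >= MIN_COMPONENT_GRADE),
    }


if __name__ == "__main__":
    print(final_grade(exam=7.0, assignment_1=8.0, assignment_2=6.0, quiz=10.0))
```

Note that a high assignment grade cannot compensate for an exam grade below 5, and vice versa: in this sketch `meets_component_minimums` is false whenever either component falls short, regardless of the weighted total.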

Inspection of assessed work

Exam: via ans.app. Assignments: during the next laptop session (laptopcollege).

Assignments

Individual assignments, graded.

Feedback via lab sessions.

Fraud and plagiarism

This course follows the UvA's general Fraud and Plagiarism Regulations ('Fraude- en plagiaatregeling'). Compliance is checked carefully. If fraud or plagiarism is suspected, the programme's Examinations Board is called in. See the UvA Fraud and Plagiarism Regulations: http://student.uva.nl

Weekly schedule

As this is the first edition of the course, the planning is only tentative.

Week number   Tentative topic
1             introduction to structured data and representations, ML recap
2-3           encoding structured input data (features, graph neural networks, transformers…)
4-5           structured outputs, probabilities over structures, structured perceptrons
6             combinatorial optimization, ILP formulations: assignments, flows
7             slack time / recap

Contact information

Coordinator

  • dr. Vlad Niculae

Lecturers

  • T.J. Brouwer
  • Daan Heijke BSc
  • Maya Nachesa