Machine Leren voor Gestructureerde Data

Machine Learning for Structured Data

6 EC

Semester 1, period 1

5083MLVG6Y

Owner: Bachelor Kunstmatige Intelligentie (Artificial Intelligence)
Coordinator: dr. Vlad Niculae
Part of: Bachelor Kunstmatige Intelligentie, year 3; Bachelor Bèta-gamma, major Kunstmatige Intelligentie, year 3

Course manual 2022/2023

Course content

This course prepares you for handling structured inputs and outputs in deep machine learning applications. Real-world data is complex but highly structured. Natural language is organized hierarchically into units such as sentences, phrases, and words. Natural images show objects in various spatial relationships to each other. Our very DNA is made up of small discrete units that combine in complex ways. In all these cases, long-distance dependencies and constraints are essential. This course will prepare you to use machine learning and deep neural networks to model complex, structured phenomena. By marrying machine learning with models of symbolic, global structure, we get hybrid approaches that are the best of both worlds.

In standard (unstructured) ML, the focus is on classification and regression problems from vector representations of data. This course covers both main situations where structure enters a model:

  • Structured inputs. Build predictive ML systems that make use of the known structure of the input data, such as social networks, documents, and images.
  • Structured outputs. Build systems that can output structured objects: alignments between proteins, syntactic parsing of language, object segmentation in images.

We will study models, learning algorithms, and evaluation methods for handling structured inputs and outputs in machine learning.

Learning objectives

  • Select an appropriate structured representation for complex data: graphs, trees, sequences, grids, alignments, as encountered in language, vision, and biology applications.
  • Design machine learning models for such structured data.
  • Evaluate performance on structured data: beyond accuracy.
  • Model relational data using structured neural networks such as Graph Neural Networks and Transformers.
  • Implement algorithms for finding optimal structures: e.g., Viterbi, Kuhn-Munkres, Max Flow.
  • Characterize probability distributions over structures.
  • Formulate and solve structured and discrete optimization problems.
  • Argue for or against using structured models and representations depending on the application (including big-O complexity arguments, representation power, etc.).

Teaching methods

  • Lecture
  • Laptop seminar
  • Self-study
  • Supervision/feedback session
  • Independent work on e.g. a project/thesis

Distribution of learning activities

Activity        Hours
Lecture         28
Laptop seminar  28
Exam            2
Self-study      110
Total           168 (6 EC × 28 hours)

Attendance

Programme attendance requirements (OER-B):

  • Attendance is mandatory for practicals and tutorial sessions with assignments. How this requirement is implemented can differ per course and is stated in the course manual. Students who do not meet the attendance requirement cannot complete the course component with a passing grade.

Assessment

Component and weighting              Details
Final grade
  0.4 (40%) Exam                     Must be ≥ 5
  0.4 (40%) Assignments              Must be ≥ 5
    0.5 (50%) Assignment 1
    0.5 (50%) Assignment 2
  0.2 (20%) Quizzes and readings

In addition to the standard threshold on the final grade (per OER), we also require:

  • a minimum grade of 5 for the exam.
  • a minimum average of 5 across the two assignments.
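
The weighting and minimum requirements above can be read as a small calculation. The sketch below is illustrative only: the function name `final_grade` and its parameters are hypothetical, grades are assumed to be on a 0-10 scale, and the pass mark of 5.5 is an assumption, since the syllabus only refers to the "standard threshold (per OER)" without stating its value. Any rounding rules are not specified here and are left out.

```python
def final_grade(exam, assignment_1, assignment_2, quizzes, pass_mark=5.5):
    """Combine component grades using the weights from the assessment table.

    pass_mark is an ASSUMPTION: the syllabus defers to the OER for the
    threshold on the final grade and does not state the number.
    Returns a (grade, passed) tuple.
    """
    # The two assignments count equally towards the assignments component.
    assignments = 0.5 * assignment_1 + 0.5 * assignment_2

    # Final grade: 40% exam, 40% assignments, 20% quizzes and readings.
    grade = 0.4 * exam + 0.4 * assignments + 0.2 * quizzes

    # Both the exam and the assignment average must be at least 5,
    # in addition to the (assumed) threshold on the final grade.
    passed = exam >= 5 and assignments >= 5 and grade >= pass_mark
    return grade, passed


if __name__ == "__main__":
    grade, passed = final_grade(exam=6, assignment_1=7, assignment_2=5, quizzes=8)
    print(f"final grade: {grade:.1f}, passed: {passed}")
```

Note that a high overall grade does not compensate for an exam grade below 5: in that case the sketch reports the course as not passed regardless of the weighted total.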

Assignments

Individual assignments, graded.

Feedback via lab sessions.

Fraud and plagiarism

This course follows the UvA's general 'Fraud and Plagiarism Regulations' (Fraude- en plagiaatregeling). Compliance is checked carefully. In case of suspected fraud or plagiarism, the programme's Examinations Board is notified. See the UvA Fraud and Plagiarism Regulations: http://student.uva.nl

Weekly schedule

As this is the first edition of the course, the planning is only tentative.

Week number  Tentative topic
1            introduction to structured data and representations, ML recap
2-3          encoding structured input data (features, graph neural networks, transformers…)
4-5          structured outputs, probabilities over structures, structured perceptrons
6            combinatorial optimization, ILP formulations: assignments, flows
7            slack time / recap

Timetable

The timetable for this course can be viewed on DataNose.

Processing student feedback

Below you will find the changes made to the course design in response to student feedback.

Contact information

Coordinator

  • dr. Vlad Niculae