Natuurlijke Taalmodellen en Interfaces

Natural Language Models and Interfaces

6 EC

Semester 2, period 4

5082NTIT6Y

Owner: Bachelor Kunstmatige Intelligentie (Artificial Intelligence)
Coordinator: W. Ferreira Aziz
Part of:

  • Minor Kunstmatige Intelligentie, year 1
  • Bachelor Kunstmatige Intelligentie, year 2
  • Bachelor Future Planet Studies, major Kunstmatige Intelligentie, year 3

Course manual 2018/2019

Course content

Natural language is the main channel of communication between humans, and much of human knowledge is represented
in the form of natural language. Enabling computers to understand it is an extremely important task, and is one of the
core problems of artificial intelligence. Though full understanding still remains a remote goal, robust methods have been
developed for more shallow forms of processing, and these methods and corresponding formalisms are the focus of this
course.

In part I of this course you learn about formalisms and techniques to assign probabilities to (parts of) sentences (language
modeling) and to perform the most basic form of syntactic processing (assigning word classes to words in a sentence). In
part II of the course, we will look into techniques to predict syntactic structure (syntactic parsing) and semantic structure
(compositional semantic structure and predicate-argument structure). Together, these two parts give an overview of basic
techniques in current data-driven computational linguistics, which provide the building blocks for speech recognition,
language understanding and machine translation systems.
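
To make "assigning probabilities to sentences" concrete, here is a minimal sketch of a bigram language model with add-one smoothing. The toy corpus, the function names, and the choice of add-one smoothing are assumptions made for illustration; they are not taken from the course materials.

```python
import math
from collections import Counter

def train_bigram_model(sentences):
    """Count contexts and bigrams over tokenised sentences with boundary markers."""
    contexts, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        contexts.update(padded[:-1])             # every token that precedes another
        bigrams.update(zip(padded, padded[1:]))  # adjacent token pairs
    # Words the model can predict: all observed words plus the end marker.
    vocab = {w for tokens in sentences for w in tokens} | {"</s>"}
    return contexts, bigrams, vocab

def bigram_prob(prev, word, contexts, bigrams, vocab):
    """Add-one smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))

def sentence_log_prob(tokens, contexts, bigrams, vocab):
    """Log-probability of a sentence as a sum of bigram log-probabilities."""
    padded = ["<s>"] + tokens + ["</s>"]
    return sum(
        math.log(bigram_prob(prev, word, contexts, bigrams, vocab))
        for prev, word in zip(padded, padded[1:])
    )

# Hypothetical toy corpus, for illustration only.
corpus = [
    ["the", "cat", "sleeps"],
    ["the", "dog", "sleeps"],
    ["a", "cat", "eats"],
]
contexts, bigrams, vocab = train_bigram_model(corpus)
seen = sentence_log_prob(["the", "cat", "sleeps"], contexts, bigrams, vocab)
scrambled = sentence_log_prob(["sleeps", "cat", "the"], contexts, bigrams, vocab)
```

In this sketch the word order seen in the corpus receives a higher log-probability than the scrambled order. Words outside the toy vocabulary are not handled; a realistic model would map them to a dedicated unknown-word token.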

Study materials

Literature

  • Daniel Jurafsky & James H. Martin, 'Speech and Language Processing' (2nd Edition), Pearson Prentice Hall, 2009.
  • Additional reading: Christopher D. Manning & Hinrich Schütze, 'Foundations of Statistical Natural Language Processing', MIT Press, 1999.

Learning objectives

  1. The student is able to design statistical models to analyse and predict correlations in natural language data.
  2. The student is able to apply statistical modelling to predict linguistic generalisations such as syntactic, semantic, and morphological structure.
  3. The student can describe the computational challenges of this statistical approach.
  4. The student has experience using data structures and algorithms to deploy NLP models, implementing statistical estimators, and using NLP libraries.

Students will acquire knowledge of the major techniques used in modern computational linguistics for syntactic and semantic processing of natural language sentences. This includes n-gram- and HMM-based language modeling, PCFG-based parsing, prediction of underlying predicate-argument structure, and the basics of statistical machine translation.
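
As an illustration of the HMM-based techniques mentioned above, the following is a minimal sketch of Viterbi decoding for assigning word classes to the words of a sentence. The tag set, all probabilities, and the function name are hypothetical values chosen for the example, not course material; in practice the parameters would be estimated from annotated data.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most probable tag sequence for `words` under an HMM."""
    if not words:
        return []

    def lp(p):
        # Log-probability with a small floor so unseen events do not yield log(0).
        return math.log(max(p, floor))

    # best[i][t]: best log-probability of any tag path ending in tag t at position i.
    best = [{t: lp(start_p.get(t, 0)) + lp(emit_p[t].get(words[0], 0)) for t in tags}]
    back = [{}]  # back[i][t]: previous tag on that best path
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans_p[p].get(t, 0))) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lp(emit_p[t].get(words[i], 0))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]

# Hypothetical hand-set parameters, for illustration only.
tags = ["DET", "NOUN", "VERB"]
start_p = {"DET": 0.8, "NOUN": 0.1, "VERB": 0.1}
trans_p = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.1, "VERB": 0.8},
    "VERB": {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1},
}
emit_p = {
    "DET": {"the": 0.9, "a": 0.1},
    "NOUN": {"dog": 0.5, "cat": 0.4, "barks": 0.1},
    "VERB": {"barks": 0.7, "sleeps": 0.3},
}
tagged = viterbi(["the", "dog", "barks"], tags, start_p, trans_p, emit_p)
```

With these toy parameters the sentence "the dog barks" is tagged DET, NOUN, VERB. Working in log space avoids numerical underflow on longer sentences.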

Skills: programming and Linux skills, working jointly in groups, and applying statistical concepts to language data.

Teaching methods

  • Lecture
  • Tutorial
  • (Computer) lab session

The class will consist of a theoretical course and practical sessions. Attendance at the practical sessions is not compulsory.

Distribution of learning activities

Activity                Hours
Computer lab session    24
Partial exam            4
Lecture                 24
Self-study              116

Attendance

Programme attendance requirements (OER-B):

  • Attendance is compulsory for practicals and tutorial meetings with assignments. How this attendance requirement is applied may differ per course and is stated in the course manual. Students who do not meet this attendance requirement cannot complete the component with a passing grade.

Assessment

Component               Weight        Details
Final grade
  Exam                  50%           Must be ≥ 5
    Midterm exam        0.2 (40%)
    Final exam          0.3 (60%)
  Assignments           50%           Must be ≥ 5
    Practical 1         0.06 (12%)
    Practical 2         0.06 (12%)
    Practical 3         0.06 (12%)
    Practical 4         0.06 (12%)
    Practical 5         0.06 (12%)
    Report              0.2 (40%)     Must be ≥ 5

The final grade consists of 50% assignments (weighted average of 5 practicals and 1 written report) and 50% written exams (weighted average of midterm and final).
The partial grades (exam and assignments) and the grade for the written report must each be at least 5.0 (Dutch grading scale) in order to pass the course.
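
The weighting above can be sketched as a short calculation. The weights are those listed in this course manual; the function name and dictionary keys are hypothetical. Rounding rules and the overall pass mark are not stated here, so the sketch applies neither and only checks the three stated minimums.

```python
# Weights as fractions of the final grade, as listed in the assessment overview.
EXAM_WEIGHTS = {"midterm": 0.2, "final": 0.3}
ASSIGNMENT_WEIGHTS = {
    "practical_1": 0.06,
    "practical_2": 0.06,
    "practical_3": 0.06,
    "practical_4": 0.06,
    "practical_5": 0.06,
    "report": 0.2,
}

def weighted_average(grades, weights):
    """Weighted average of the grades named in `weights`."""
    return sum(w * grades[name] for name, w in weights.items()) / sum(weights.values())

def course_grade(grades):
    """Combine component grades and check the stated minimum requirements."""
    exam = weighted_average(grades, EXAM_WEIGHTS)
    assignments = weighted_average(grades, ASSIGNMENT_WEIGHTS)
    return {
        "exam": exam,
        "assignments": assignments,
        "final_grade": 0.5 * exam + 0.5 * assignments,
        # Exam grade, assignments grade and report grade must each be at least 5.0.
        "meets_minimums": exam >= 5.0 and assignments >= 5.0 and grades["report"] >= 5.0,
    }

# Hypothetical grades, for illustration only.
example = course_grade({
    "midterm": 6.0, "final": 8.0,
    "practical_1": 7.0, "practical_2": 7.0, "practical_3": 7.0,
    "practical_4": 7.0, "practical_5": 7.0, "report": 7.0,
})
```

For these example grades the exam component is (0.2 × 6 + 0.3 × 8) / 0.5 = 7.2 and the assignments component is 7.0, giving an unrounded final grade of 7.1.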

Fraud and plagiarism

This course applies the UvA's general 'Fraud and Plagiarism Regulations'. Compliance is checked closely. If fraud or plagiarism is suspected, the Examinations Board of the programme will be called in. See the UvA Fraud and Plagiarism Regulations: http://student.uva.nl

Timetable

The timetable for this course can be viewed on DataNose.

Additional information

The course will be taught in English.

Prerequisite skills: basic probability theory and programming in Python.

Contact information

Coordinator

  • W. Ferreira Aziz