Course guide 2020/2021

Course overview

Natural language is the main channel of communication between humans, and much of human knowledge is represented
in the form of natural language. Enabling computers to understand it is an extremely important task, and is one of the
core problems of artificial intelligence. Though full understanding still remains a remote goal, robust methods have been
developed for more shallow forms of processing, and these methods and corresponding formalisms are the focus of this
course.

In this course you learn about formalisms and techniques to assign probabilities to (parts of) sentences (language
modeling) and to perform basic forms of syntactic and semantic processing. These techniques are the foundation of current data-driven computational linguistics and provide building blocks for speech recognition,
language understanding, text summarisation, and machine translation systems.

Study materials

Literature

Daniel Jurafsky & James H. Martin, 'Speech and Language Processing' (3rd Edition), Pearson Prentice Hall, 2020.

Digital version

Learning objectives

The student can recognise real-world applications of natural language processing (NLP) technology.
The student can outline the statistical approach to NLP.
The student can explain the assumptions that underlie statistical models such as Markov models, hidden Markov models, naive Bayes classifiers, logistic regression, and conditional random fields.
The student can implement NLP models using a programming language.
The student can use statistical methods to predict properties of text (e.g., probability, syntactic structure, sentiment/opinion, topics).
The student can compare statistical models in terms of their empirical performance in a task.
The student can discuss the pros and cons of a statistical model in light of the specific NLP problem and availability of data and computational resources.
The student acknowledges the social impact of NLP technology, including the ethical considerations that arise in its deployment (e.g., demographic misrepresentation, bias confirmation, and privacy).

Teaching methods

Lecture
Tutorial
(Computer) lab

The course consists of theoretical lectures and practical sessions. Practical sessions involve coding assignments in Jupyter notebooks.

Distribution of learning activities

Activity	Number of hours
Computer lab	24
Partial exam	4
Lecture	24
Self-study	116

Attendance

Programme attendance requirements (OER-B):

Attendance is compulsory for practicals and tutorial meetings with assignments. How this attendance requirement is implemented may differ per course and is specified in the course guide. If a student does not meet the attendance requirement, the component cannot be completed with a passing grade.

Assessment

Component and weighting	Details
Final grade
0.5 (50%) Exam component	Must be ≥ 5
0.5 (50%) Assignments	Must be ≥ 5

The grade is 50% homework (weighted average of assignments: quizzes, labs, and technical report) and 50% exams (weighted average of midterm and final). Both components (exam and homework) are initially graded on a scale from 0 to 10, and each must be at least 5.0 to pass the course (see Dutch scaling below). If your exam component is below 5.0, you are eligible for a resit, which fully replaces that component.

Dutch scaling

According to official UvA regulations your final grade has to be between 1 and 10. To avoid confusion, this is how we compute your final grade: 1 + 0.9 * (0.5 * exams + 0.5 * homework). This grade is rounded to the closest half point, or to the closest point if it falls between 5 and 6.
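The scaling and rounding described above can be sketched in Python. This is a minimal illustration, not the official grading script. The function names `final_grade` and `passes` are hypothetical, and two details are assumptions not stated in this guide: ties are rounded up, and a scaled grade from 5 up to (but not including) 6 becomes 5 if it is below 5.5 and 6 otherwise.

```python
import math


def passes(exams: float, homework: float) -> bool:
    """Both components (graded 0-10) must be at least 5.0."""
    return exams >= 5.0 and homework >= 5.0


def final_grade(exams: float, homework: float) -> float:
    """Scale two 0-10 component grades to the 1-10 range and round."""
    scaled = 1 + 0.9 * (0.5 * exams + 0.5 * homework)
    scaled = round(scaled, 6)  # guard against floating-point noise
    if 5 <= scaled < 6:
        # Assumption: no half point between 5 and 6; a tie at 5.5 rounds up.
        return 6.0 if scaled >= 5.5 else 5.0
    # Round to the closest half point; assumption: ties round up.
    return math.floor(scaled * 2 + 0.5) / 2
```

Under these assumptions, component grades of exactly 5.0 give a scaled grade of 5.5, which rounds to 6.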

Reviewing assessed work

To request a review session, contact the coordinator.

Assignments

Assessed practical assignments and technical report. Feedback is provided by the TA through Canvas and/or in person.
Quizzes and reading questions available on Canvas.
Non-assessed exam-like exercises.

Fraud and plagiarism

This course applies the UvA's general 'Fraud and Plagiarism Regulations'. Compliance is checked carefully. If fraud or plagiarism is suspected, the programme's Examinations Board is notified. See the UvA Fraud and Plagiarism Regulations: http://student.uva.nl

Weekly schedule

Week	Topic	Graded assignment	Exam
1	The statistical approach to natural language processing	Manipulating text using Python
2	Assigning probability to sequences with Markov models	N-gram language models
3	Basic syntactic analysis with hidden Markov models	HMM
4			Midterm
5	Feature-rich linear models	Text classifiers I
6	Feature-rich nonlinear models	Text classifiers II
7	Overview of advanced methods	Written report
8			Final

Timetable

The timetable for this course is available on DataNose.

Additional information

The course will be taught in English.

Prerequisite skills: basic probability theory and programming in Python.

Response to course evaluations

We are keeping the midterm.
We have reduced the amount of homework.

Contact information

Coordinator

W. Ferreira Aziz

Owner	Bachelor Kunstmatige Intelligentie
Coordinator	W. Ferreira Aziz
Part of	Minor Kunstmatige Intelligentie, year 1; Minor Logic and Computation, year 1; Bachelor Kunstmatige Intelligentie, year 2; Bachelor Future Planet Studies, major Kunstmatige Intelligentie, year 3

Natuurlijke Taalmodellen en Interfaces