Natural Language Models and Interfaces
6 EC
Semester 2, period 4
5082NTIT6Y
Natural language is the main channel of communication between humans, and much of human knowledge is represented
in the form of natural language. Enabling computers to understand it is an extremely important task, and is one of the
core problems of artificial intelligence. Though full understanding remains a remote goal, robust methods have been
developed for shallower forms of processing, and these methods and corresponding formalisms are the focus of this
course.
In this course you learn about formalisms and techniques to assign probabilities to (parts of) sentences (language
modeling) and to perform basic forms of syntactic and semantic processing. These techniques are the foundation of current data-driven computational linguistics and provide building blocks for speech recognition,
language understanding, text summarisation, and machine translation systems.
Daniel Jurafsky & James H. Martin, 'Speech and Language Processing' (3rd Edition) Pearson Prentice Hall, 2019.
The class consists of lectures covering the theory and practical sessions. Practical sessions involve coding assignments using Jupyter notebooks.
| Activity | Hours |
| Computer lab | 24 |
| Partial exam | 4 |
| Lecture | 24 |
| Self-study | 116 |
Programme attendance requirements (OER-B):
| Component and weight | Details |
| Final grade | |
| 0.3 (30%) Midterm exam | |
| 0.3 (30%) Partial exam | |
| 0.4 (40%) Homework | |
The grade will be 40% homework (weighted average of 5 assignments) and 60% exams (weighted average of midterm and final). Both components (exams and homework) are initially graded on a scale from 0 to 10, and each must be at least 5.0 in order to pass the course (see Dutch scaling below). If your exam component is below 5.0, you are eligible for a resit, which fully replaces that component.
Dutch scaling
According to official UvA regulations your final grade has to be between 1 and 10. To avoid confusion, this is how we compute your final grade: 1 + 0.9 * (0.6 * exams + 0.4 * homework). This grade is rounded to the closest half point, or to the closest point if it falls between 5 and 6.
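The formula above can be sketched in Python as follows. This is an illustration only, not an official grade calculator: the function name `final_grade` is hypothetical, and the treatment of exact ties (a raw grade lying precisely halfway between two allowed values) is an assumption, since the rule above does not specify it.

```python
import math


def final_grade(exams: float, homework: float) -> float:
    """Apply the Dutch scaling to two component grades on a 0-10 scale.

    `final_grade` is a hypothetical helper name. Exact ties are rounded
    upwards here, which is an assumption and not stated in the course rules.
    """
    # Map the weighted 0-10 average onto the 1-10 scale.
    raw = 1 + 0.9 * (0.6 * exams + 0.4 * homework)

    if 5 < raw < 6:
        # Between 5 and 6 the grade is rounded to the closest whole point,
        # so 5.5 is never awarded.
        return float(math.floor(raw + 0.5))

    # Everywhere else the grade is rounded to the closest half point.
    return math.floor(raw * 2 + 0.5) / 2


if __name__ == "__main__":
    # Example: exams = 7.0 and homework = 8.0 give a raw grade of about 7.66.
    print(final_grade(7.0, 8.0))
```

With these assumptions, an exam component of 7.0 and a homework component of 8.0 give a raw grade of about 7.66, which is rounded to 7.5.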
To request an opportunity to review your graded work, contact the course coordinator.
This course applies the general 'Fraud and Plagiarism Regulations' of the UvA. Compliance is checked closely. If fraud or plagiarism is suspected, the Examinations Board of the programme will be involved. See the UvA Fraud and Plagiarism Regulations: http://student.uva.nl
| Week | Topic | Graded assignment | Exam |
| 1 | The statistical approach to natural language processing | Manipulating text using Python | |
| 2 | Assigning probability to sequences with Markov models | N-gram language models | |
| 3 | Basic syntactic analysis with Hidden Markov models | HMM | |
| 4 | Midterm | | |
| 5 | Feature-rich models | POS tagging | |
| 6 | Text classifiers | Sentiment analysis | |
| 7 | Overview of advanced methods | | |
| 8 | Final | | |
The timetable for this course can be viewed on DataNose.
The course will be taught in English.
Prerequisite skills: basic probability theory, programming in Python.