Linguistics and Language Processing
6 EC
Semester 2, period 5
5082TATA6Y
Our ability to use natural language to communicate with each other and to record information is one of the main features that makes us intelligent. However, while we use language effortlessly in our everyday life, computers have a hard time processing natural languages such as English or Dutch. Computational linguistics is a subfield of artificial intelligence at the interface of linguistic theory and computer science, which aims at endowing computers with the ability to process natural language. The ultimate goal is to develop artificial agents that can automatically acquire information from text or that can communicate with humans via intelligent interfaces or in human-robot interaction.
This course introduces students to some of the core topics in computational linguistics and natural language processing. We will focus on foundational aspects, paying special attention to rule-based methods. The course provides background for the second-year course Natuurlijke Taalmodellen en Interfaces, which focuses on data-driven probabilistic methods.
The course covers the following key topics in language processing at an introductory level:
Daniel Jurafsky and James H. Martin, Speech and Language Processing (2nd Edition), Pearson Prentice Hall, 2009. Only around seven or eight chapters will be covered in this course. The book is also used in Natuurlijke Taalmodellen en Interfaces.
Week 14. Automata, Transducers, and Morphology (lectures 1 + 2)
Week 15. Formal Grammars and Syntax (lectures 3 + 4)
Week 16. From Parsing to Computational Semantics (lectures 5 + 6)
Week 17. Midterm exam
Week 19. Compositional Semantics (lectures 7 + 8)
Week 20. Lexical Semantics (lecture 9)
Week 21. Distributional Semantics, Information Retrieval, and N-gram Models (lectures 10 + 11)
Week 22. Course recap (lecture 12) + Final exam
Week 27. Resit
The material on the Blackboard site of Practicum Academische Vaardigheden (www.practicumav.nl).
Other online materials will be pointed out during the course. Slides will be available on Canvas after each lecture.
The course consists of lectures (hoorcolleges), where the theoretical material is explained and discussed, and practical sessions (werkcolleges and laptopcolleges). In the practical sessions, students work in pairs on exercises related to the material introduced in the lectures.
| Activity | Number of hours |
| Lectures (hoorcolleges) | 24 |
| Werkcolleges | 12 |
| Laptopcolleges | 12 |
| Exams | 4 |
| Self-study | 116 |
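As a quick sanity check, the hours listed above can be added up and compared with the 6 EC of the course. The sketch below assumes the common Dutch convention of 28 study hours per EC; that figure is not stated in this guide, and the names `HOURS_PER_EC` and `total_hours` are purely illustrative.

```python
# Study-load check for this course.
# ASSUMPTION: 1 EC corresponds to 28 hours of study (not stated in this guide).
EC = 6
HOURS_PER_EC = 28

# Hours per activity, copied from the table above.
hours = {
    "lectures": 24,
    "werkcolleges": 12,
    "laptopcolleges": 12,
    "exams": 4,
    "self_study": 116,
}


def total_hours(activity_hours: dict) -> int:
    """Return the sum of all activity hours."""
    return sum(activity_hours.values())


if __name__ == "__main__":
    print(total_hours(hours))      # 168
    print(EC * HOURS_PER_EC)       # 168
```

Under that assumption, the scheduled activities and the self-study hours together account for the full study load of the course.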
Part of the training in academic skills is provided through the Practica Academische Vaardigheden (PAV). These practica are an obligatory part of this course and are taught by the academic skills coordinator (coordinator academische vaardigheden) and the tutors. Second- and third-year students can request an exemption from the PAV coordinator. Please contact the PAV coordinator for any other exemption requests.
Attendance requirements of the programme (OER-B):
Additional requirements for this course:
It is obligatory to attend a minimum of 12 out of 15 practical sessions (the count includes the werkcolleges, laptopcolleges, and PAV werkcolleges). This means that 3 practical sessions in total can be missed at the student's discretion. Absences do not need to be reported, but should be used wisely: there will be no exceptions for additional absences, except in highly exceptional circumstances discussed with your study advisor.
| Component and weight | Details |
| Final grade | |
| 0.2 (20%) Tentamen 1 (midterm exam) | |
| 0.3 (30%) Tentamen 2 (final exam) | |
| 0.3 (30%) Weekly homework | |
| 0.2 (20%) PAV project | |
In order to pass the course, you must have at least a 5.5 (weighted average) calculated on the basis of all graded components of the course, i.e., mid-term exam (Tentamen 1), final exam (Tentamen 2), weekly homework, and the PAV project. The resit (hertentamen) will replace the combined results of the two exams, whatever your original results may have been.
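The weighted average described above can be sketched as follows. This is only an illustration: the function and variable names are hypothetical, no rounding rule is applied because none is stated here, and the treatment of the resit (the resit grade taking over the combined 0.5 weight of the two exams) is one reading of the rule above, not an official formula.

```python
from typing import Optional

# Weights of the graded components, as listed in this guide.
WEIGHTS = {
    "tentamen_1": 0.2,   # midterm exam
    "tentamen_2": 0.3,   # final exam
    "homework": 0.3,     # weekly homework
    "pav_project": 0.2,  # PAV project
}
PASS_MARK = 5.5


def final_grade(tentamen_1: float, tentamen_2: float, homework: float,
                pav_project: float, resit: Optional[float] = None) -> float:
    """Return the weighted average of all graded components.

    ASSUMPTION: if a resit grade is given, it replaces both exam grades
    and carries their combined weight (0.2 + 0.3).
    """
    if resit is not None:
        exam_part = (WEIGHTS["tentamen_1"] + WEIGHTS["tentamen_2"]) * resit
    else:
        exam_part = (WEIGHTS["tentamen_1"] * tentamen_1
                     + WEIGHTS["tentamen_2"] * tentamen_2)
    return (exam_part
            + WEIGHTS["homework"] * homework
            + WEIGHTS["pav_project"] * pav_project)


def passes(grade: float) -> bool:
    """A course grade passes if it is at least 5.5."""
    return grade >= PASS_MARK


if __name__ == "__main__":
    grade = final_grade(5.0, 6.0, 7.0, 8.0)
    print(round(grade, 2))  # 6.5
    print(passes(grade))    # True
```

For example, with a 5.0 on the midterm, a 6.0 on the final exam, a 7.0 for homework, and an 8.0 for the PAV project, the weighted average is 0.2·5.0 + 0.3·6.0 + 0.3·7.0 + 0.2·8.0 = 6.5, which is above the 5.5 threshold.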
There will be weekly homework assignments, which may be completed in pairs. Students will receive feedback from their TAs on these assignments during werkcolleges and laptopcolleges. Assignments handed in up to 24h after the deadline will still be accepted but will incur a grade penalty.
This course applies the UvA's general Fraud and Plagiarism Regulations ('Fraude- en plagiaatregeling'). Compliance is checked carefully. If fraud or plagiarism is suspected, the Examinations Board (examencommissie) of the programme is notified. See the Fraud and Plagiarism Regulations of the UvA: http://student.uva.nl
| Week number | Topics | Study material |
| 1 | Automata, Transducers, and Morphology | |
| 2 | Formal Grammars and Syntax | |
| 3 | From Parsing to Computational Semantics | |
| 4 | Midterm exam (tentamen 1) | |
| 5 | Compositional Semantics | |
| 6 | Lexical Semantics | |
| 7 | Distributional Semantics, Information Retrieval, and N-gram Models | |
| 8 | Course recap + Final exam (tentamen 2) | |
The timetable for this course can be viewed on DataNose.
While the same attendance criteria apply to any student enrolled in the course, honours students can contact Noa Visser (TA coordinator) and request to be assigned to another group in case the one assigned by default overlaps with other activities.
The course will be taught in English. Basic knowledge of Python and first-order logic is assumed; no prior knowledge of linguistics is required.
Below you will find the changes made to the design of the course in response to the course evaluations.
The TAs should be the first point of contact for day-to-day issues related to the course. For unusual or extreme circumstances, e.g., exam time conflicts, contact the course coordinator.