Taaltheorie en Taalverwerking

Linguistics and Language Processing

6 EC

Semester 2, period 5

5082TATA6Y

Owner: Bachelor Kunstmatige Intelligentie
Coordinator: Michael Repplinger
Part of: Bachelor Kunstmatige Intelligentie, year 1; Minor Logic and Computation, year 1; Bachelor Bèta-gamma, major Kunstmatige Intelligentie, year 2

Course manual 2021/2022

Course content

Our ability to use natural language to communicate with each other and to record information is one of the main features that make us intelligent. However, while we use language effortlessly in everyday life, computers have a hard time processing natural languages such as English or Dutch. Computational linguistics is a subfield of artificial intelligence at the interface of linguistic theory and computer science that aims to endow computers with the ability to process natural language. The ultimate goal is to develop artificial agents that can automatically acquire information from text or communicate with humans via intelligent interfaces or in human-robot interaction.

This course introduces students to some of the core topics in computational linguistics and natural language processing. We will focus on foundational aspects, paying special attention to rule-based methods. The course provides background for the second-year course Natuurlijke Taalmodellen en Interfaces, which focuses on data-driven probabilistic methods. 

The course covers the following key topics in language processing at an introductory level:

  • formal languages and automata
  • syntactic structure and syntactic parsing
  • logic-based compositional semantics
  • word meaning and semantic similarity
  • vector space models and distributional semantics

Study materials

Literature

  • Daniel Jurafsky and James H. Martin, Speech and Language Processing (2nd Edition), Pearson Prentice Hall, 2009. Only around seven or eight chapters will be covered in this course. The book is also used in Natuurlijke Taalmodellen en Interfaces.

Other

  • The material on the Blackboard site of Practicum Academische Vaardigheden (www.practicumav.nl).

  • Other online materials will be pointed out during the course. Slides will be available on Canvas after each lecture.

Learning objectives

  • The student is able to analyse the structure and the meaning of natural language expressions
  • The student can use the basic concepts of computational linguistics to define an NLP problem formally
  • The student is able to apply computational processing to parse syntax, semantics, and morphology
  • The student can describe the computational challenges of this parsing process
  • The student is able to use common NLP libraries from a general programming language
  • The student can write a technical report on the approach taken in an NLP project

Teaching methods

  • Lecture (hoorcollege)
  • Tutorial (werkcollege)
  • Laptop session (laptopcollege)

The course consists of lectures (hoorcolleges), where the theoretical material is explained and discussed, and practical sessions (werkcolleges and laptopcolleges). In the practical sessions, students work in pairs on exercises related to the material introduced in the lectures.

Distribution of learning activities

Activity                                Hours
Lectures (hoorcolleges)                 24
Tutorials (werkcolleges)                10
Laptop sessions (laptopcolleges)        10
Exams                                   4
Self-study                              120

Academic skills

Part of the training in academic skills takes place in the Practica Academische Vaardigheden (PAV). These practica are part of this course and are taught by the academic skills coordinator and the tutors. Second- and third-year students can request an exemption from the PAV coordinator.

Attendance

Programme attendance requirements (OER-B):

  • Attendance is mandatory for practicals and for tutorial meetings with assignments. How this requirement is implemented can differ per course and is specified in the course manual. Students who do not meet the attendance requirement cannot complete the course component with a passing grade.

Additional requirements for this course:

Participation in the practical sessions is obligatory. Students may miss up to three practical sessions in total at their own discretion.

Assessment

Components and weights of the final grade:

Component               Weight
Exam 1                  20%
Exam 2                  30%
Weekly homework         30%
PAV project             20%

To pass the course, you must score at least 5.5 on each of the following: the average of the exams, the average of the weekly homework assignments, and the PAV project. To sit the exam retake, you must have scored at least 3.0 on at least one of the exams.
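
The weighting and pass rule can be sketched in Python. This is an illustrative sketch, not an official grade calculator. It reads the weights as Exam 1 20%, Exam 2 30%, weekly homework 30%, and PAV project 20%. It also assumes that the exam average is the unweighted mean of the two exams, that each of the three components is checked separately against 5.5, and that no rounding is applied. All function and variable names are hypothetical.

```python
# Assumed reading of the weighting table (weights sum to 1.0).
WEIGHTS = {"exam1": 0.20, "exam2": 0.30, "homework": 0.30, "pav": 0.20}
PASS_MARK = 5.5          # minimum per component, as stated in the pass rule
RETAKE_MIN_EXAM = 3.0    # minimum on at least one exam to sit the retake


def final_grade(exam1, exam2, homework, pav):
    """Weighted final grade; `homework` is the average homework grade."""
    return (WEIGHTS["exam1"] * exam1
            + WEIGHTS["exam2"] * exam2
            + WEIGHTS["homework"] * homework
            + WEIGHTS["pav"] * pav)


def passes(exam1, exam2, homework, pav):
    """Check the three pass conditions separately."""
    # Assumption: the exam average is the plain mean of both exams.
    exam_average = (exam1 + exam2) / 2
    return (exam_average >= PASS_MARK
            and homework >= PASS_MARK
            and pav >= PASS_MARK)


def may_sit_retake(exam1, exam2):
    """At least 3.0 on one or both exams."""
    return max(exam1, exam2) >= RETAKE_MIN_EXAM


if __name__ == "__main__":
    grades = dict(exam1=6.0, exam2=7.0, homework=8.0, pav=6.0)
    print(round(final_grade(**grades), 2), passes(**grades))
```

Note that under this reading a high final grade does not compensate for a failed component: an exam average of 5.25 fails the course even if the homework and PAV grades are high.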

Assignments

There will be five weekly homework assignments, which must be completed in pairs. Students will receive feedback on these assignments from their TAs during werkcolleges and laptopcolleges. Assignments handed in up to 24 hours after the deadline will be accepted, but will receive a reduction of 2 grade points.
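
The late-submission rule can be illustrated with a short Python sketch. It rests on assumptions the text does not state: that the grade is floored at 1.0 after the reduction, and that a submission more than 24 hours late is not graded (represented here as `None`). The function name and the floor are hypothetical.

```python
LATE_WINDOW_HOURS = 24   # submissions up to 24h late are still accepted
LATE_PENALTY = 2.0       # grade points deducted for a late submission
MIN_GRADE = 1.0          # assumption: grades are floored at 1.0


def late_adjusted_grade(grade, hours_late):
    """Return the grade after the late penalty, or None if not accepted."""
    if hours_late <= 0:
        return grade                                   # on time: no penalty
    if hours_late <= LATE_WINDOW_HOURS:
        return max(grade - LATE_PENALTY, MIN_GRADE)    # late but accepted
    return None                                        # assumed: not graded


if __name__ == "__main__":
    for hours in (0, 12, 30):
        print(hours, late_adjusted_grade(8.0, hours))
```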

Fraud and plagiarism

This course applies the UvA's general 'Fraude- en plagiaatregeling' (Fraud and Plagiarism Regulations). Compliance is checked carefully. If fraud or plagiarism is suspected, the programme's Examinations Board is notified. See the UvA Fraud and Plagiarism Regulations: http://student.uva.nl

Weekly schedule

The weekly schedule and deadlines will be published on Canvas.

Timetable

The timetable for this course can be viewed on DataNose.

Additional information

The course will be taught in English. A basic knowledge of Python and first-order logic is assumed; no prior knowledge of linguistics is required.

Contact information

Coordinator

  • Michael Repplinger

TAs are the first point of contact for day-to-day issues related to the course. For questions regarding extended absence from the course (e.g. due to health issues), please contact Lieuwe Rekker.

Teaching staff

  • Hanane El Aajati
  • F.J. Barkhof
  • Jelle Bosscher BSc
  • Sacha Buijs
  • Joy Crosbie
  • Mara Fennema BSc
  • Dionne Gantzert BSc
  • Samar Hashemi
  • Maas Hermes
  • D.P. Jensen BSc
  • Selina Khan
  • Mart Koek BSc
  • Vincent Loos
  • Saar Schnieders
  • Simon Stallinga
  • Pepijn Stoop
  • Minke Verweij
  • Noa Visser