Taaltheorie en Taalverwerking

Linguistics and Language Processing

6 EC

Semester 2, period 5

5082TATA6Y

Owner: Bachelor Kunstmatige Intelligentie
Coordinator: Dieuwke Hupkes MSc
Part of: Minor Kunstmatige Intelligentie, year 1; Bachelor Kunstmatige Intelligentie, year 1; Bachelor Bèta-gamma, major Kunstmatige Intelligentie, year 2

Course manual 2019/2020

Course content

Our ability to use natural language to communicate with each other and to record information is one of the main features that make us intelligent. However, while we use language effortlessly in our everyday life, computers have a hard time processing natural languages such as English or Dutch. Computational linguistics is a subfield of artificial intelligence at the interface of linguistic theory and computer science, which aims to endow computers with the ability to process natural language. The ultimate goal is to develop artificial agents that can automatically acquire information from text or that can communicate with humans via intelligent interfaces or in human-robot interaction.

This course introduces students to some of the core topics in computational linguistics and natural language processing. We will focus on foundational aspects, paying special attention to rule-based methods. The course provides background for the second-year course Natuurlijke Taalmodellen en Interfaces, which focuses on data-driven probabilistic methods. 

The course covers the following key topics in language processing at an introductory level:

  • formal languages and automata,
  • syntactic structure and syntactic parsing,
  • logic-based compositional semantics, and
  • word meaning and semantic similarity.

Study materials

Literature

  • Daniel Jurafsky and James H. Martin, Speech and Language Processing (2nd Edition), Pearson Prentice Hall, 2009. Only around seven or eight chapters will be covered in this course. The book is also used in Natuurlijke Taalmodellen en Interfaces.

Other

  • The material on the Blackboard site of Practicum Academische Vaardigheden (www.practicumav.nl).

  • Other online materials will be pointed out during the course. Slides will be available on Canvas after each lecture.

Learning objectives

  • The student is able to analyse the structure and the meaning of natural language expressions.
  • The student can use the basic concepts of computational linguistics to define an NLP problem formally.
  • The student is able to apply computational processing to parse syntax, semantics, and morphology.
  • The student can describe the computational challenges of this parsing process.
  • The student is able to use common NLP libraries from a general programming language.
  • The student can write a technical report on the approach taken in an NLP project.

Teaching formats

  • Lecture (hoorcollege)
  • Tutorial (werkcollege)
  • Laptop session (laptopcollege)
  • Self-study

The course consists of lectures (hoorcolleges) where the theoretical material is explained and discussed and practical sessions (werkcolleges and laptopcolleges). In the practical sessions students will work in pairs on exercises related to the contents introduced during the lectures.

Distribution of learning activities

Activity                            Hours
Lectures (hoorcolleges)             22
Tutorials (werkcolleges)            12
Laptop sessions (laptopcolleges)    12
Exams                               4
Self-study                          118

Academic skills

Training in academic skills is partly provided through the Practica Academische Vaardigheden (PAV). These practica are part of this course and are taught by the academic skills coordinator and the tutors.

For BSc KI students the PAV are an obligatory part of this course. Students from other BSc programmes are exempted. Second/third year students can request an exemption from the PAV coordinator.

Attendance

Programme attendance requirements (OER-B):

  • Attendance is compulsory for practicals and tutorial sessions with assignments. How this requirement is implemented can differ per course and is specified in the course manual. Students who do not meet the attendance requirement cannot complete the component with a passing grade.

Additional requirements for this course:

It is obligatory to attend a minimum of 9 out of 12 practical sessions. Absences do not need to be reported, but use them wisely: there will be no exceptions for additional absences, even in cases of emergency.

Assessment

Final grade

Component           Weight
Final exam          20%
Mid-term exam       20%
Technical report    30%
Homework            30%

To pass the course, you must score at least a 5.0 (weighted average) on both the practical part and the theoretical part, and at least a 5.5 overall. If you sit both exams and score at least 3.0 but less than 6.0 on one or both of them, and you have at least a 5.0 on the practical part, you may sit the retake. The retake replaces the combined result of the two exams, whatever your original results were. If you intend to sit the retake, you must let the instructor know in June. Treat the retake as a last resort: students who rely on it often get unpleasant surprises.
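As a rough illustration of how these rules combine, the sketch below computes the pass and retake conditions from four component grades. It rests on several assumptions that this course manual does not state: the theoretical part is taken to be the two exams, the practical part is taken to be the technical report plus the homework, no rounding is applied, and a grade in the 3.0 to 6.0 band on either exam is read as sufficient for retake eligibility. All function and key names are hypothetical.

```python
# Weights in percent, as listed in the assessment table of this course manual.
WEIGHTS = {
    "final_exam": 20,
    "midterm_exam": 20,
    "technical_report": 30,
    "homework": 30,
}

# ASSUMPTION: this grouping is not spelled out in the course manual.
THEORETICAL = ("final_exam", "midterm_exam")
PRACTICAL = ("technical_report", "homework")


def weighted_average(grades, components):
    """Weighted average over the given components, normalised by their total weight."""
    total_weight = sum(WEIGHTS[c] for c in components)
    return sum(WEIGHTS[c] * grades[c] for c in components) / total_weight


def passes(grades):
    """At least 5.0 on each part and at least 5.5 overall (no rounding assumed)."""
    theory = weighted_average(grades, THEORETICAL)
    practice = weighted_average(grades, PRACTICAL)
    overall = weighted_average(grades, tuple(WEIGHTS))
    return theory >= 5.0 and practice >= 5.0 and overall >= 5.5


def may_sit_retake(grades):
    """One reading of the retake rule; assumes the student sat both exams."""
    exam_in_band = any(3.0 <= grades[c] < 6.0 for c in THEORETICAL)
    practice = weighted_average(grades, PRACTICAL)
    return exam_in_band and practice >= 5.0


example = {
    "final_exam": 4.0,
    "midterm_exam": 5.0,
    "technical_report": 7.0,
    "homework": 7.0,
}
print(passes(example), may_sit_retake(example))
```

In the example, the exam average falls below 5.0, so the student does not pass on the first attempt but would be eligible for the retake under the reading assumed here. The instructor's own calculation is authoritative.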

Assignments

Weekly homework

  • Written homework exercises

Weekly lab assignments

  • Programming assignments using NLTK

There will be six weekly homework assignments, which may be completed in pairs. Students will receive feedback from their TAs on these assignments during werkcolleges and laptopcolleges. 

Fraud and plagiarism

This course follows the UvA's general 'Fraude- en plagiaatregeling' (Fraud and Plagiarism Regulations), and compliance is checked closely. If fraud or plagiarism is suspected, the programme's Examinations Board is called in. See the UvA Fraud and Plagiarism Regulations: http://student.uva.nl

Weekly schedule

The planning of this course and the corresponding reading materials will be published on the Canvas page of the course.

Timetable

The timetable for this course can be consulted on DataNose.

Additional information

The course will be taught in English. A basic knowledge of Python and first-order logic is assumed; no prior knowledge of linguistics is required.

Follow-up on course evaluations

Below you will find the changes made to the course design in response to the course evaluations.

Contact information

Coordinator

  • Dieuwke Hupkes MSc

The TAs should be the first point of contact for day-to-day issues related to the course. For unusual or extreme circumstances, e.g., exam time conflicts, contact the course coordinator.