Taaltheorie en Taalverwerking

Linguistics and Language Processing

6 EC

Semester 2, period 5

5082TATA6Y

Owner: Bachelor Kunstmatige Intelligentie
Coordinator: dr. J.A. Burgoyne
Part of: Minor Kunstmatige Intelligentie, year 1; Bachelor Kunstmatige Intelligentie, year 1; Bachelor Bèta-gamma, major Kunstmatige Intelligentie, year 2

Course manual 2017/2018

Course content

Our ability to use natural language to communicate with each other and to record information is one of the main features that make us intelligent. However, while we use language effortlessly in our everyday life, computers have a hard time processing natural languages such as English or Dutch. Computational linguistics is a subfield of artificial intelligence at the interface of linguistic theory and computer science, which aims at endowing computers with the ability to process natural language. The ultimate goal is to develop artificial agents that can automatically acquire information from text or that can communicate with humans via intelligent interfaces or in human-robot interaction.

This course introduces students to some of the core topics in computational linguistics and natural language processing. We will focus on foundational aspects, paying special attention to rule-based methods. The course provides background for the second-year course Natuurlijke Taalmodellen en Interfaces, which focuses on data-driven probabilistic methods. 

The course covers the following key topics in language processing at an introductory level:

  • formal languages and automata
  • syntactic structure and syntactic parsing
  • logic-based compositional semantics
  • word meaning and semantic similarity

Study materials

Literature

  • Daniel Jurafsky & James H. Martin, 'Speech and Language Processing' (2nd Edition), Pearson Prentice Hall, 2009. This book provides a comprehensive and in-depth overview of the field. Only around seven or eight chapters will be covered in this course. The book is also used in the second-year BSc AI course 'Natuurlijke Taalmodellen en Interfaces'. A draft of the third edition of Jurafsky and Martin is available online. Although the second edition remains the official text for the course, reading assignments will refer to the relevant chapters of the third edition wherever possible.

Software

Other

  • The material on the Blackboard site of Practicum Academische Vaardigheden (www.practicumav.nl)
  • Other online materials will be pointed out during the course. Slides will be available on Blackboard after each lecture.

Learning objectives

The overall aim of this course is to introduce students to fundamental topics in computational linguistics and natural language processing. By the end of the course, students should be able to:

  • Demonstrate an understanding of the basic concepts in computational linguistics by being able to define them and to apply them to natural language processing problems.
  • Analyse the structure and the meaning of natural language expressions and describe the main computational challenges involved in doing so automatically.
  • Use NLTK to conduct a small natural language processing project and write a technical report of the project and its results (PAV).

Teaching methods

  • Lecture (hoorcollege)
  • Tutorial (werkcollege)
  • Laptop session (laptopcollege)

The course consists of lectures (hoorcolleges) where the theoretical material is explained and discussed and practical sessions (werkcolleges and laptopcolleges). In the practical sessions students will work in pairs on exercises related to the contents introduced during the lectures. 

In addition to the lectures and practical sessions, AI students must follow the Practicum Academische Vaardigheden (1 or 2 hours per week), and all students are expected to devote 10 to 15 hours per week to self-study.

Distribution of learning activities

Activity                          Hours
Lecture (hoorcollege)             24
Laptop session (laptopcollege)    12
Exam (tentamen)                   4
Tutorial (werkcollege)            12
Self-study                        116

Academic skills

All students in the course will use NLTK to conduct a small natural language processing project and write a technical report of the project and its results (PAV). AI students must attend PAV tutoring sessions to help build up the project and the report. Non-AI students (or AI students who have already fulfilled their PAV requirement) may attend the PAV tutoring sessions, but this is not required.

Attendance

Programme attendance requirements (OER-B):

  • Attendance is compulsory for practicals and tutorial sessions with assignments. How this requirement is implemented may differ per course and is specified in the course manual. Students who do not meet the attendance requirement cannot complete the component with a passing grade.

Additional requirements for this course:

It is obligatory to attend a minimum of 9 out of 12 practical sessions. Try to attend every practical session unless it is truly impossible: after three absences, any further absence will cause a student to fail the course, even in cases of emergency.

Assessment

Component                 Weight   Requirement      Details
Final (overall) grade     100%     Must be ≥ 5.5
Practical part            35%      Must be ≥ 5      Homework assignments (20%), Technical report (15%)
Theoretical part          65%      Must be ≥ 5      Exam 1 (tentamen 1), Exam 2 (tentamen 2)

In order to pass the course, you must score at least a 5.0 (weighted average) in both the practical part and the theoretical part, and at least a 5.5 overall. If you earn at least a 3.0 on each exam but fail one or both of them, and you have at least a 5.0 in the practical part, then you may sit the resit exam (hertentamen). The resit will replace the combined results of the two exams, whatever your original results may have been. If you intend to sit the resit, you must let the instructor know in June. The resit should be a last resort: students who rely on it often get unpleasant surprises.
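
The pass rules above can be checked mechanically. The following is a minimal sketch in Python, not an official grade calculator. The function names are hypothetical, the two exams are assumed to count equally within the theoretical part (their relative weighting is not stated here), and no rounding is applied.

```python
def practical_grade(homework, report):
    """Weighted average of the practical part.

    Homework counts for 20% and the technical report for 15% of the
    final grade, so within the practical part they weigh 20:15.
    """
    return (20 * homework + 15 * report) / 35


def theoretical_grade(exam1, exam2):
    """Average of the two exams.

    ASSUMPTION: both exams weigh equally; the syllabus does not state
    their relative weights.
    """
    return (exam1 + exam2) / 2


def final_grade(homework, report, exam1, exam2):
    """Overall grade: 35% practical part, 65% theoretical part."""
    practical = practical_grade(homework, report)
    theoretical = theoretical_grade(exam1, exam2)
    return 0.35 * practical + 0.65 * theoretical


def passes(homework, report, exam1, exam2):
    """True if all three thresholds are met (5.0, 5.0, and 5.5 overall)."""
    practical = practical_grade(homework, report)
    theoretical = theoretical_grade(exam1, exam2)
    overall = final_grade(homework, report, exam1, exam2)
    return practical >= 5.0 and theoretical >= 5.0 and overall >= 5.5
```

For example, `passes(10, 10, 4, 5)` is false under these assumptions: the theoretical part averages 4.5, so a perfect practical part cannot compensate for it.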

Assignments

Outside of exam weeks, there will be weekly homework assignments. These assignments will include both a written part and an NLTK part. Students may work in pairs; each student should submit the assignments via Blackboard.

Fraud and plagiarism

This course applies the general 'Fraude- en plagiaatregeling' (fraud and plagiarism regulations) of the UvA. Compliance is checked carefully. If fraud or plagiarism is suspected, the programme's examinations board (examencommissie) will be called in. See the UvA fraud and plagiarism regulations: www.uva.nl/plagiaat

Weekly planning

The weekly planning is available on Blackboard.

Timetable

The timetable for this course can be viewed on DataNose.

Additional information

The course will be taught in English.

Second- and third-year students can request an exemption from the PAV from the academic skills coordinator (coördinator academische vaardigheden), Susanne Hendrickx MSc (s.hendrickx@uva.nl). In any case, they must contact her before the course starts, and the exemption will then be discussed. Students who are not exempted will be assigned to one of the PAV groups for this course only.

Basic knowledge of first-order logic will be taken for granted, but no previous knowledge of linguistics is required.

Contact information

Coordinator

  • dr. J.A. Burgoyne

Most questions about course material should be directed first to the student's TA. Absences from the lectures (hoorcolleges) and practical sessions (werkcolleges and laptopcolleges) need not be reported, but take care not to miss more than three practical sessions. Other questions should be directed to the course coordinator. In the case of serious personal difficulties, it is often a good idea to copy in the study adviser (studieadviseur) as well, as such problems rarely affect just one course.