6 EC
Semester 1, period 3
5264ANML6Y
Timely science supporting our future planet requires well-developed data handling, analysis and modelling skills. We therefore offer a modular course in which students can develop these skills in the field of their choice.
Modules to choose from:
These modules are all illustrated with examples from earth science and ecology.
Manual
Data
Exercises
Each module starts with an introductory lecture that explains the subject matter. Students work on the modules during scheduled computer lab sessions.
| Activity | Number of hours |
|---|---|
| Lectures | 8 |
| Short seminars | 8 |
| Self-study | 140 |
| Tests | 12 |
| Total | 160 |
Requirements of the programme concerning attendance (OER-B):
Additional requirements for this course:
Each module takes one week. At the end of the week, students must be present for the exam.
| Item and weight | Details |
|---|---|
| Final grade | |
| 1 (100%) Exam 1 | |
Ten modules are available during this course. Each student selects four modules according to their field of interest. Each module is graded at the end of its week through a submitted project, a presentation of the results, or a written exam. Each module counts for 25% of the final grade. Students may not fail more than two modules.
One-hour practical test at the end of the week.
One-hour practical test at the end of the week, covering all topics and approaches treated in the module.
One-hour practical test at the end of the week.
Statistics is the science (and art) of learning from data, and experts in many fields use statistics to understand and analyze large data sets. This module emphasizes statistical thinking: it teaches key concepts and applies them to case studies in R. It covers basic probability, descriptive statistics, hypothesis testing (two means or proportions, association between two variables, goodness-of-fit), analysis of variance, and simple and multiple regression. The module focuses on conceptual understanding and interpretation, and assumes that you have no experience with R. The final assignment is a one-hour practical test.
Statistical learning encompasses many methods, including regression modelling, that are used to build predictive models from complex data sets and systems, and to explore cause-effect relationships between variables. The focus here is on understanding the techniques at a conceptual (non-mathematical) level. You will complete practical exercises in R, so you should already have basic R programming and statistics skills. The final assignment is a one-hour practical test.
Classification techniques (logistic regression, linear discriminant analysis and K-nearest neighbors) are models used primarily to predict categorical outcomes (e.g. species distribution modelling, land use classification, medical applications). Hands-on exercises in this module will teach you to use R to develop and fit classification models, and to compare and experiment with different modelling techniques, data sets and methods. By the end of the module, you will be able to apply what you have learned by creating your own classification models in R. The final assignment is a one-hour practical test.
Agent-based models (ABMs) are increasingly used in ecology, economics and the humanities to understand the behaviour of complex systems. ABMs describe a system by specifying rules or relationships between basic entities (agents), and then use simulation and model experiments to study the consequences of these design choices. In this module you will learn to build different types of ABMs and apply them to understand typical ecological systems involving navigation, competition and evolution. The final assignment is a one-hour practical test.
Resampling, model selection algorithms and criteria, regularization, and semi-parametric methods for estimating non-linearity make machine learning ‘tick’ (in fact, these methods facilitate the learning component). In this module you will learn the statistical basics of each of these techniques and apply them in practical exercises in R. After completion, you will be able to apply resampling methods to select among different classification and regression models, understand general model selection procedures, understand the role of regularization, and evaluate non-linear relations with generalized additive models. The final assignment is a one-hour practical test.
Machine learning methods (tree-based methods, support vector machines, and unsupervised techniques such as dimension reduction and clustering) are algorithmic procedures that can find patterns in data and build predictive models, provided that sufficient relevant data are available for the system under study. In this module you will learn to apply machine learning methods to earth-scientific and ecological examples in R and to interpret the results. After completion you should be able to understand many of the machine learning techniques encountered in the current scientific literature and apply such methods to a given data set. The final assignment is a one-hour practical test.
The student selects a total of four modules from this list.
The 'Regulations governing fraud and plagiarism for UvA students' apply to this course. Compliance will be monitored carefully. Upon suspicion of fraud or plagiarism, the Examinations Board of the programme will be informed. For the 'Regulations governing fraud and plagiarism for UvA students' see: www.student.uva.nl
The schedule for this course is published on DataNose.
Basic knowledge of statistics and mathematics, as can be expected from students with a BSc in Earth Sciences, Biology, Environmental Sciences or Future Planet Studies.