Course manual 2021/2022

Course content

In this class, we will learn how to use statistical models to learn from data. Rather than
memorizing many different types of tests and formulas, you will learn the fundamentals of
building statistical models and using models to understand data. Importantly, we will show
that dozens of different tests you may have heard of in statistics are just special cases of
general linear models. We will teach you to use this flexible framework to learn from
the many types of data you may come across in your research.

Above all else, the course is based on a philosophy to promote:

critical thinking about data and models
a down-to-earth attitude to data analysis (as opposed to cookbook statistics)
a comfortable attitude towards models, math, and statistics (they should not be scary or intimidating)

If you are taking this class for the second time, note that it has been completely redesigned. While there is some overlap in the topics covered, the format has changed entirely, and you should treat this as a completely new class.

Study materials

Literature

https://bookdown.org/connect/#/apps/1c17bcd1-d444-46fd-aaed-7d00c47d2aa1/access

Syllabus

Software

R and RStudio

Objectives

Formulate general linear models to analyze data
Fit linear models to data and interpret their parameters
Interpret and use probability density functions and cumulative distribution functions
Understand and interpret uncertainties in parameter estimates of statistical models
Interpret p-values accurately and identify problematic uses of null hypothesis significance testing
Interpret the results of statistical models without relying on statistical jargon
Formulate, implement, and interpret statistical models with multiple independent variables
Recognize common pitfalls in the application and interpretation of statistical models

Teaching methods

Computer lab session/practical training
Self-study
Lecture

Lectures: We will have one lecture per week (2 hrs) for 7 weeks. During lectures, you will be encouraged to join discussions, ask questions, and take part in live polls.

Lab Practicals: We will have 4 computer practicals where you will work in self-selected groups of two. If you prefer to work alone, that is also fine. Each pair will work on a problem set to get practical experience analyzing data in R. The goal of the lab practicals is for you to learn how to apply the theory covered in the lectures to analyze data in R. Course instructors will be present to assist groups with their work and answer questions.

Assignment: In week 6, groups of two will analyze a data set and write up a short report based on their analysis. This assignment will put into practice the theory covered in the class.

Self-study: You are expected to spend six hours per week on self-study. This involves reading and watching the assigned materials, reviewing course notes and lab practicals, attending question hours, and taking the practice exam.

Learning activities

Activity	Hours
Digital Test	2
Lecture	14
Labs	8
Assignment	8
Self-study	40
Total	82	(3 EC x 28 hr)

Attendance

Programme's requirements concerning attendance (OER-B):

Participation in fieldwork is compulsory and cannot be replaced by assignments or other courses.
In case of practical sessions, the student is obliged to attend at least 90% of the sessions and to prepare adequately, unless indicated otherwise in the course manual. In case the student attends less than 90%, the practical sessions should be redone entirely.
In case of tutorials/seminars with assignments, the student is obliged to attend at least 7 out of 8 seminars and to prepare thoroughly for these meetings, unless indicated otherwise in the course manual. If the course has more than 8 seminars, the student can miss up to 1 extra meeting for every (part of) 8 tutorials/seminars. If the student attends fewer than the mandatory number of tutorials/seminars, the course cannot be completed.

Additional requirements for this course:

Attendance is mandatory for the lab practicals.

Assessment

Item and weight	Details
Final grade
1 (100%) Digital exam

10% of your grade will be based on completing the lab practicals. This portion will be graded on whether you followed instructions and thoughtfully attempted to answer every question on the practical.

20% of your grade will be based on the quality of the assignment. The assignment will be a more in-depth analysis of a complex real-world data set, which you will have one week to work on as a group. A grading rubric will be provided.

70% of your grade will be determined by your score on the week 8 exam.

Assessment diagram

Every course goal will be assessed with an equal number of questions in the digital exam.

Students who were enrolled in the course in previous years

There are no special rules for students who have taken the previous course 'From Analysis to Evidence'.

Assignments

There are no assignments in the course.

Fraud and plagiarism

The 'Regulations governing fraud and plagiarism for UvA students' applies to this course. This will be monitored carefully. Upon suspicion of fraud or plagiarism the Examinations Board of the programme will be informed. For the 'Regulations governing fraud and plagiarism for UvA students' see: www.student.uva.nl

Course structure

Week 6

Self-study:
- Watch: Everything wrong with statistics (and how to fix it) (55 minutes)
- Read Course Reader Part 1
Lecture 1: Introduction to statistical modelling/Linear models
- What are statistical models and how are they used/misused?
- Linear models (parameters, independent and dependent variables)
- Categorical vs. continuous variables

Week 7

Self-study:
- Read Course Reader Part 2
Lecture 2: Fitting models to data
- How do you ‘fit’ a model to data?
- Probability density functions
- What does it mean for a model to fit the data well?
Lab 1: Fitting models to data

Week 8

Self-study:
- Read Course Reader Part 3
Lecture 3: Quantifying uncertainty
- Sampling distributions
- Standard error of parameter values
- Cumulative distribution functions
- Confidence intervals
Lab 2: Quantifying uncertainty

Week 9

Self-study:
- Read Course Reader Part 4
Lecture 4: Null hypothesis testing
- What is a p-value?
- False positives and false negatives
- Correcting for multiple hypothesis tests
- Power analysis
Lab 3: Null hypothesis testing

Week 10

Self-study:
- Read Course Reader Part 5
Lecture 5: Multiple independent variables
- Factorial design experiments
- Interactions
Lab 4: Multiple independent variables

Week 11

Self-study:
- Read Course Reader Part 5
Lecture 6: Multiple independent variables (part 2) and review
- Controlling for a variable
- Multicollinearity
- How do I formulate, implement and interpret statistical models?
Start assignment project

Week 12

Self-study: review notes, labs, and reader
Lecture 7: Doing reproducible statistics
- Reproducibility crisis
- P-hacking
- HARKing
Turn in project assignment
Take practice test

Week 13

Self-study: review notes, labs, and reader
Test

Timetable

The schedule for this course is published on DataNose.

Last year's course evaluation

To give students some insight into how we use feedback from student evaluations to improve the quality of education, we include the table below in all course guides.

Course Name (#EC)	N
Strengths	Notes for improvement
Response lecturer:

Owner	Bachelor Future Planet Studies
Coordinator	dr. B.T. Martin
Part of	Exchange Programme Faculty of Science, specialisation BSc Future Planet Studies, year 1 Bachelor Bèta-gamma, major Aardwetenschappen, year 2 Bachelor Future Planet Studies, major Future Earth, year 2

From Data to Evidence