Time
Tues. & Thur.
13:30-15:05
Feb. 18-June 5, 2025
Venue
Lecture Hall B725,
Shuangqing Complex Building A
Speaker
Prof. Per Johansson, YMSC
OBJECTIVE
After completing the course, students should be able to independently conduct simple statistical studies, interpret the results of statistical investigations, and critically assess their credibility. To achieve this, the course combines:
· Statistical theory: to reinforce previous knowledge in statistical theory (inference) with a focus on regression models for description, prediction, and causal inference.
· Statistical programming in R and Python: for organizing and managing data as well as data analysis.
· Use of subject knowledge in relation to statistical analyses: practical examples related to, for example, economics, political science, and medicine.
A student who has completed the course should:
· Have the ability to judiciously choose an appropriate approach for a statistical analysis of data.
· Be able to use the programming language R to independently read in, organize, and process data.
· Be able to use the programming language R to sensibly apply regression analysis.
· Be able to critically review research reports based on statistical data.
Textbook(s):
Regression with R and Python - an introduction (currently a 350-page compendium)
Authors: Per Johansson and Mattias Nordin
Prerequisites:
The course is intended for students studying statistics at bachelor's level and for students at master's or postgraduate level in other subjects where regression is used for empirical analyses. We assume that students have taken a basic course in statistics and are thus familiar with elementary probability theory and inference theory. We also assume that students know the basics of how R is used. For those who have no prior experience with R, we organize an introduction to R just before the course starts.
COURSE DESCRIPTION
The course introduces linear regression and how it can be used to answer questions related to description, prediction, and analysis of causal relationships. Both simple and multiple linear regression are described, as well as how nonlinear relationships can be estimated. Furthermore, logistic regression for binary outcomes and time series analysis will be covered.
The book comes with a set of freely available datasets. Throughout the book there are red boxes with R code that use these datasets. To get as much as possible out of the book, we recommend that students work through the R code with the different datasets and reproduce the analyses presented in the red boxes.