Registration Fee: £300
Course Description
This two-day course provides practical training in statistical model building, evaluation, comparison, and selection for empirical researchers. Participants will learn principled approaches to choosing among competing models, handling multiple predictors, and accounting for model uncertainty. The course covers cross-validation, information criteria (AIC, AICc, BIC), variable selection methods including regularization (ridge, lasso, elastic net), and model averaging using Akaike weights. Special attention is given to mixed effects model selection, the problems with stepwise methods, and the critical distinction between prediction, explanation, and causal inference. Through hands-on examples with real research data, participants will develop practical workflows in R for comparing models, making model-averaged predictions, and reporting results appropriately. By the end of the course, participants will be able to move beyond automatic model selection and apply thoughtful, theory-driven approaches to their own research.
What You’ll Learn
During the course, you will learn to:
- Understand the bias-variance tradeoff, overfitting, and why in-sample fit can be misleading.
- Use cross-validation and information criteria (AIC, AICc, BIC) to evaluate out-of-sample predictive performance.
- Compare nested and non-nested models appropriately using likelihood ratio tests, F-tests, and information criteria.
- Recognize the multiple comparisons problem and distinguish between exploratory and confirmatory analysis.
- Implement variable selection methods including stepwise regression, all-subsets selection, and regularization approaches (ridge, lasso, elastic net).
- Calculate Akaike weights and make model-averaged predictions that account for model uncertainty.
- Use confidence sets of models to report when multiple models receive substantial support.
- Apply model selection methods specifically to mixed effects models, including understanding the crucial REML vs. ML distinction and strategies for comparing fixed and random effects structures.
- Distinguish between prediction, explanation, and causal inference goals and how this affects model selection.
- Report model selection results honestly and appropriately in publications.
Course Format
Interactive Learning Format
Each day features a balanced combination of lectures and hands-on practical exercises, with time set aside, where possible, for discussing participants’ own data.
Global Accessibility
All live sessions are recorded and made available on the same day, ensuring accessibility for participants across different time zones.
Collaborative Discussions
Open discussion sessions provide an opportunity for participants to explore specific research questions and engage with instructors and peers.
Comprehensive Course Materials
All code, datasets, and presentation slides used during the course will be shared with participants by the instructor.
Personalized Data Engagement
Participants are encouraged to bring their own data for discussion and practical application during the course.
Post-Course Support
Participants will receive continued support via email for 30 days following the course, along with on-demand access to session recordings for the same period.
Who Should Attend / Intended Audiences
This course is designed for empirical researchers who work with regression models and face decisions about which variables to include or which models best represent their data. If you need to compare competing hypotheses or theoretical models using observational data, this course will provide practical tools and principled frameworks for model selection and comparison.
Familiarity with R is required: you should be comfortable loading data, running basic regression models, and installing packages. No programming expertise is needed, but you should be able to follow R code and adapt examples to your own data.
A solid foundation in basic statistics is expected: you should understand linear regression, p-values, confidence intervals, and hypothesis testing. Familiarity with generalized linear models (GLMs) or mixed effects models is helpful but not required — we will briefly review these as needed.
Equipment and Software Requirements
A laptop or desktop computer with a functioning installation of R and RStudio is required. Both R and RStudio are free, open-source programs compatible with Windows, macOS, and Linux systems.
A working webcam is recommended to support interactive elements of the course. We encourage participants to keep their cameras on during live Zoom sessions to foster a more engaging and collaborative environment.
While not essential, using a large monitor – or ideally a dual-monitor setup – can significantly enhance your learning experience by allowing you to view the course materials and your R session side by side.
All necessary R packages will be introduced and installed during the workshop. A comprehensive list of required packages will also be shared with participants ahead of the course to allow for optional pre-installation.
Dr. Mark Andrews
Mark is a psychologist and statistician whose work lies at the intersection of cognitive science, Bayesian data analysis, and applied statistics. His research focuses on developing and testing Bayesian models of human cognition, with a particular emphasis on language processing and memory. He also works extensively on the theory and application of Bayesian statistical methods in the social and behavioural sciences, bridging methodological advances with real-world research challenges.
Since 2015, Mark has co-led a programme of intensive workshops on Bayesian data analysis for social scientists, funded by the UK’s Economic and Social Research Council (ESRC). These workshops have trained hundreds of researchers in the practical application of Bayesian methods, particularly through R and modern statistical packages.
Education & Career
• PhD in Psychology, Cornell University, New York (Cognitive Science, Bayesian Models of Cognition)
• MA in Psychology, Cornell University, New York
• BA (Hons) in Psychology, National University of Ireland
• Senior Lecturer in Psychology, Nottingham Trent University, England
Research Focus
Mark’s work centres on:
• Bayesian models of human cognition, especially in language processing and memory
• General Bayesian data analysis methods for the social and behavioural sciences
• Comparative studies of Bayesian vs. classical approaches to inference and model comparison
• Promoting reproducibility and transparent statistical practice in psychological research
Current Projects
• Developing Bayesian cognitive models of memory and linguistic comprehension
• Exploring Bayesian approaches to regression, multilevel, and mixed-effects models in psychology and social science research
• Co-leading ESRC-funded workshops on Bayesian data analysis for applied researchers
Professional Consultancy & Teaching
Mark provides expert training and advice in Bayesian data analysis for academic and applied research projects. His teaching portfolio includes courses and workshops on:
• Bayesian linear and generalized linear models
• Multilevel and mixed-effects models
• Cognitive modelling with Bayesian methods
• Applied statistics in R for psychologists and social scientists
He is also an advocate of open science and is experienced in communicating complex statistical methods to diverse audiences.
Teaching & Skills
• Instructor in Bayesian statistics, time series modelling, and machine learning
• Strong advocate for reproducibility, open-source tools, and accessible education
• Skilled in R, Stan, JAGS, and statistical computing for large datasets
• Experienced mentor and workshop leader at all academic levels
Links
• University Profile
• Personal Page
• ResearchGate
Session 1 (2 hours) – The Model Selection Problem
This session establishes why model selection matters for empirical research. We explore the fundamental tension between model complexity and predictive accuracy through the bias-variance tradeoff, using concrete examples to illustrate overfitting and its consequences. A key theme is distinguishing between prediction, explanation, and causal inference as fundamentally different research goals that require different modeling approaches. We review standard model fit measures and discuss why in-sample fit can be misleading, setting the stage for principled model comparison methods.
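The overfitting problem discussed in this session can be sketched in a few lines of R. This is an illustrative simulation (not course material): the data are generated from a simple linear model, yet the deliberately over-complex fit always achieves better in-sample fit, which is exactly why in-sample fit alone is misleading.

```r
# Illustrative simulation: in-sample fit always improves with complexity.
set.seed(1)
x <- runif(50)
y <- 2 * x + rnorm(50)            # the true model is linear

m_simple  <- lm(y ~ x)
m_complex <- lm(y ~ poly(x, 10))  # deliberately overfitted

# The more complex (nesting) model can never have lower in-sample R^2,
# so raw fit cannot tell us that the simple model is the right one.
summary(m_simple)$r.squared
summary(m_complex)$r.squared
```

Out-of-sample methods such as cross-validation, introduced in Session 2, are what expose the complex model's poorer predictive performance.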
Break (1 hour)
Session 2 (2 hours) – Out-of-Sample Prediction and Information Criteria
This session introduces methods for evaluating models based on their out-of-sample predictive performance. We begin with cross-validation as the conceptual gold standard, covering both k-fold and leave-one-out approaches and their practical implementation. We then turn to information criteria as computationally efficient alternatives, explaining AIC, AICc (particularly important for small samples), and BIC. The session emphasizes understanding what these criteria measure, how to interpret differences between models, and when each approach is most appropriate. Hands-on exercises demonstrate these methods with real research data.
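As a minimal sketch of the information criteria covered here, the following uses base R's `AIC()` and `BIC()` on the built-in `mtcars` data (chosen purely for illustration), plus a hand-rolled AICc implementing the standard small-sample correction AICc = AIC + 2k(k+1)/(n − k − 1).

```r
# Comparing candidate models with information criteria (lower is better).
m1 <- lm(mpg ~ wt,      data = mtcars)
m2 <- lm(mpg ~ wt + hp, data = mtcars)

AIC(m1); AIC(m2)
BIC(m1); BIC(m2)

# AICc adds a small-sample correction; k counts all estimated
# parameters, including the residual variance.
aicc <- function(fit) {
  k <- length(coef(fit)) + 1
  n <- nobs(fit)
  AIC(fit) + 2 * k * (k + 1) / (n - k - 1)
}
aicc(m1); aicc(m2)
```

The correction term is always positive, so AICc penalizes complexity more heavily than AIC, with the difference vanishing as n grows.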
Break (1 hour)
Session 3 (2 hours) – Model Comparison Frameworks
This session examines different frameworks for comparing statistical models. We distinguish between nested and non-nested model comparisons, explaining when likelihood ratio tests and F-tests are appropriate versus when information criteria are needed. A critical topic is the multiple comparisons problem: how testing many models increases the risk of spurious findings. We discuss the distinction between exploratory and confirmatory analysis and emphasize the importance of honest reporting when model selection has been performed. Practical exercises compare nested and non-nested models on real datasets.
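For nested linear models, the F-test comparison described here is a one-liner in base R via `anova()`; information criteria give a complementary view of the same comparison. The models below use the built-in `mtcars` data for illustration only.

```r
# Nested model comparison: does adding hp significantly improve fit?
m1 <- lm(mpg ~ wt,      data = mtcars)  # restricted model
m2 <- lm(mpg ~ wt + hp, data = mtcars)  # adds one predictor

anova(m1, m2)   # F-test for the nested comparison
AIC(m1, m2)     # information-criterion view of the same two models
```

For non-nested models the F-test is not valid, which is where information criteria become essential.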
Session 4 (2 hours) – Variable Selection Methods
This session addresses the common research problem of choosing among many potential predictor variables. We examine automated selection methods including stepwise regression and all-subsets selection, with particular attention to the well-documented problems with stepwise approaches. The session then introduces regularization methods (ridge regression, lasso, elastic net) as modern alternatives that handle collinearity and perform variable selection through penalization. Hands-on exercises compare these different approaches on the same dataset to illustrate how much results can vary and help participants understand which method suits which situation.
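As a self-contained sketch of penalized regression, the following fits ridge regression with `MASS::lm.ridge` (MASS ships with R); in the course, lasso and elastic net would typically use a dedicated package such as glmnet. The predictors and penalty grid here are illustrative choices.

```r
# Ridge regression: coefficients shrink toward zero as lambda grows.
library(MASS)
fit <- lm.ridge(mpg ~ wt + hp + disp, data = mtcars,
                lambda = seq(0, 20, by = 1))

# Coefficient paths across the penalty grid (intercept dropped):
matplot(fit$lambda, coef(fit)[, -1], type = "l",
        xlab = "lambda", ylab = "coefficient")

# Generalized cross-validation suggests a penalty value:
fit$lambda[which.min(fit$GCV)]
```

Unlike ridge, the lasso penalty can shrink coefficients exactly to zero, which is what makes it a variable selection method rather than only a shrinkage method.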
Break (1 hour)
Session 5 (2 hours) – Model Averaging
This session introduces multi-model inference as an alternative to selecting a single “best” model. We explore how Akaike weights quantify the relative support for competing models in a candidate set. The session focuses particularly on model-averaged predictions, which provide a principled way to account for model uncertainty when making forecasts. We discuss confidence sets of models as a way to honestly report when multiple models receive substantial support. Practical examples demonstrate workflows for calculating model-averaged predictions and interpreting results, with attention to what should and shouldn’t be averaged.
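Akaike weights follow directly from the AIC differences across a candidate set, so the calculation needs only a few lines of base R. The AIC values below are made up for illustration, not taken from real data.

```r
# Akaike weights from a candidate set's AIC values (illustrative numbers).
aic_vals <- c(m1 = 210.4, m2 = 212.1, m3 = 219.8)

delta <- aic_vals - min(aic_vals)            # AIC differences
w <- exp(-delta / 2) / sum(exp(-delta / 2))  # Akaike weights
round(w, 3)

# Weights sum to 1 and quantify the relative support for each model;
# a model-averaged prediction weights each model's prediction by w.
```

A common rule of thumb is to report a confidence set of all models whose weights sum to some threshold (e.g. 95%), rather than singling out the top model.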
Break (1 hour)
Session 6 (2 hours) – Mixed Effects Model Selection
This session addresses the special considerations that arise when selecting among mixed effects models. Mixed models present unique challenges for model selection because they involve both fixed and random effects, and standard model comparison procedures must be applied carefully. We cover the crucial distinction between REML and ML estimation and when each should be used for model comparison. The session examines strategies for comparing fixed effects structures, random effects structures, and combinations of both. We discuss practical workflows for stepwise simplification of mixed models, the use of information criteria with mixed models, and common pitfalls in mixed model selection. Hands-on examples demonstrate these principles using multilevel data.
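The REML vs. ML distinction can be sketched with the nlme package (a recommended package bundled with standard R installations) and its built-in Orthodont dataset; the course itself may use different packages and data. The key rule: models differing in fixed effects must be compared under ML, while REML is preferred for estimating the final model and for comparing random effects structures with identical fixed effects.

```r
# REML vs. ML when comparing mixed models (illustrative dataset).
library(nlme)

# Orthodont: longitudinal dental-growth data bundled with nlme.
m_ml_1 <- lme(distance ~ age,       random = ~ 1 | Subject,
              data = Orthodont, method = "ML")
m_ml_2 <- lme(distance ~ age + Sex, random = ~ 1 | Subject,
              data = Orthodont, method = "ML")

# Fixed effects differ, so the comparison must use ML fits:
anova(m_ml_1, m_ml_2)

# Refit the chosen model with REML for final parameter estimates:
m_reml <- update(m_ml_2, method = "REML")
```

Comparing REML fits with different fixed effects structures is invalid because REML likelihoods are not comparable across such models, which is one of the common pitfalls this session covers.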
Frequently Asked Questions
Everything you need to know about the course and registration.
When will I receive instructions on how to join?
You’ll receive an email on the Friday before the course begins, with full instructions on how to join via Zoom. Please ensure you have Zoom installed in advance.
Do I need administrator rights on my computer?
I’m attending the course live — will I also get access to the session recordings?
I can’t attend every live session — can I join some sessions live and catch up on others later?
I’m in a different time zone and plan to follow the course via recordings. When will these be available?
I can’t attend live — how can I ask questions?
Will I receive a certificate?
Still have questions?
Can’t find the answer you’re looking for? Please chat to our friendly team.