Registration Fee: £150
Course Description
This 1-day course provides an introduction to the theory and application of zero-inflated models in R. Many real-world datasets in ecology, epidemiology, public health, and insurance exhibit excess zeros, overdispersion, or semi-continuous behaviour that standard Poisson or Negative Binomial models fail to capture. This course builds the foundations of count data modelling, then develops practical expertise in zero-inflated, hurdle, truncated, and related models.
What You’ll Learn
During the course, you will learn to:
- Understand why zero inflation arises in real-world data across domains such as ecology, epidemiology, and insurance.
- Fit and interpret Poisson, Negative Binomial, Generalised Poisson, and Conway–Maxwell Poisson GLMs.
- Diagnose overdispersion and zero inflation using residual simulation and the DHARMa package.
- Implement and interpret Zero-Inflated Poisson (ZIP) and Zero-Inflated Negative Binomial (ZINB) models.
- Apply hurdle and truncated models, and understand when they are preferable to zero-inflated models.
Course Format
Interactive Learning Format
The course combines lectures with hands-on practical exercises and, time permitting, discussion of participants’ own data.
Global Accessibility
All live sessions are recorded and made available on the same day, ensuring accessibility for participants across different time zones.
Collaborative Discussions
Open discussion sessions provide an opportunity for participants to explore specific research questions and engage with instructors and peers.
Comprehensive Course Materials
All code, datasets, and presentation slides used during the course will be shared with participants by the instructor.
Personalized Data Engagement
Participants are encouraged to bring their own data for discussion and practical application during the course.
Post-Course Support
Participants will receive continued support via email for 30 days following the course, along with on-demand access to session recordings for the same period.
Who Should Attend / Intended Audiences
This course is designed for ecologists, environmental scientists, public-health analysts, data scientists, postgraduate students, and early-career researchers who wish to enhance their quantitative analysis skills using R. Participants are expected to have basic experience with R and RStudio, including tasks such as importing data and running simple functions. A foundational understanding of descriptive statistics and linear regression concepts is assumed, while prior familiarity with linear models will be advantageous but not essential, as key concepts will be reviewed during the course. Some experience with data wrangling using packages like dplyr or tidyr, basic data visualization with ggplot2, and interpreting model output would be beneficial but is not required.
Equipment and Software Requirements
A laptop or desktop computer with a functioning installation of R and RStudio is required. Both R and RStudio are free, open-source programs compatible with Windows, macOS, and Linux systems.
A working webcam is recommended to support interactive elements of the course. We encourage participants to keep their cameras on during live Zoom sessions to foster a more engaging and collaborative environment.
While not essential, using a large monitor—or ideally a dual-monitor setup—can significantly enhance your learning experience by allowing you to view course materials and work in R simultaneously.
All necessary R packages will be introduced and installed during the workshop. A comprehensive list of required packages will also be shared with participants ahead of the course to allow for optional pre-installation.
Dr. Niamh Mimnagh
Niamh is a statistician working at the interface of ecology, epidemiology, and data science. Her research focuses on applying and developing statistical and machine learning methods to address real-world challenges such as estimating species population sizes from count and trace data and predicting livestock disease re-emergence using sparse or imbalanced datasets. She works with a wide array of statistical approaches, including Bayesian hierarchical models, N-mixture models, anomaly detection algorithms, and spatial analysis techniques.
Niamh earned her PhD in Statistics, with a focus on multispecies abundance modelling, and holds a first-class MSc in Data Science. Alongside her research, she is actively engaged in science communication and education, running a popular blog on applied statistics for non-specialists, and regularly delivering workshops and guest lectures on topics such as GLMs and machine learning with imbalanced data.
Education & Career
- PhD in Statistics (Multispecies Abundance Modelling)
- MSc in Data Science (First Class Honours)
- Instructor, consultant, and science communicator in statistical ecology and epidemiology
Research Focus
Niamh’s work centres on extracting meaningful insights from complex ecological and epidemiological data. She is particularly interested in population estimation techniques and predictive modelling for conservation and disease management, using advanced statistical tools and reproducible workflows.
Current Projects
- Development of Bayesian and ML approaches for estimating species abundance from imperfect data
- Modelling livestock disease risk using spatial and temporal predictors
- Creating accessible educational materials for teaching applied statistics in R
Professional Consultancy
Niamh provides expert statistical support to academic and applied research projects, with a focus on ecological monitoring, conservation planning, and disease modelling. She also advises on study design and data workflows for interdisciplinary teams.
Teaching & Skills
- Teaches topics including GLMs, Bayesian statistics, machine learning for imbalanced data, and spatial statistics in R
- Advocates for reproducibility, open science, and accessible statistical training
- Experienced in communicating complex methods to broad audiences
Session 1 – 01:20:00 – Poisson and Extensions
This session begins with the Poisson GLM, covering its assumptions, likelihood, and interpretation. We then highlight the limitations of the Poisson distribution, particularly with respect to overdispersion and zero inflation. To address these issues, we introduce and compare extensions including the Negative Binomial, Generalised Poisson, and Conway–Maxwell Poisson (COMP) models.
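As a brief illustration of the starting point for this session, a Poisson GLM with a log link can be written as below. The notation (counts Y_i, covariates x_i, coefficients beta, dispersion parameter theta) is our own choice for this sketch and is not taken from the course materials.

```latex
Y_i \sim \mathrm{Poisson}(\mu_i), \qquad
\log \mu_i = \mathbf{x}_i^{\top} \boldsymbol{\beta}, \qquad
\mathrm{E}[Y_i] = \mathrm{Var}[Y_i] = \mu_i

% Negative Binomial (NB2 parameterisation, assumed here):
\mathrm{E}[Y_i] = \mu_i, \qquad
\mathrm{Var}[Y_i] = \mu_i + \frac{\mu_i^{2}}{\theta}
```

Overdispersion means the observed variance exceeds the mean, which the Poisson model cannot accommodate. Under the parameterisation assumed above, the Negative Binomial adds a quadratic variance term and approaches the Poisson as theta grows large.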
Session 2 – 01:20:00 – Model Validation
In this session, we turn to residual diagnostics, introducing simulated residuals and formal tests for model misfit. Participants will learn how to use the DHARMa package in R to detect overdispersion and zero inflation. The session concludes with a practical exercise on validating fitted models using real data.
Session 3 – 01:20:00 – Zero-Inflated Models
This session introduces the theoretical basis of zero-inflated models. We describe the mixture interpretation, in which zeros can arise either structurally or through the sampling process. We then present the Zero-Inflated Poisson (ZIP) and Zero-Inflated Negative Binomial (ZINB) models in detail, discussing their assumptions and mathematical formulation. Participants learn how to fit zero-inflated models in R using the zeroinfl function from the pscl package and the glmmTMB function from the glmmTMB package.
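The mixture interpretation described above can be sketched for the ZIP model as follows. The symbols are assumptions made for this illustration: pi is the probability of a structural zero and mu is the mean of the Poisson count process.

```latex
P(Y = 0) = \pi + (1 - \pi)\, e^{-\mu}

P(Y = y) = (1 - \pi)\, \frac{e^{-\mu} \mu^{y}}{y!}, \qquad y = 1, 2, \ldots

\mathrm{E}[Y] = (1 - \pi)\,\mu, \qquad
\mathrm{Var}[Y] = (1 - \pi)\,\mu\,(1 + \pi \mu)
```

The first line shows the two sources of zeros: structural zeros with probability pi, and sampling zeros from the Poisson component. Because the variance exceeds the mean whenever pi is greater than zero, zero inflation also induces overdispersion relative to a plain Poisson model.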
Session 4 – 01:20:00 – Hurdle and Truncated Models
In this session, we introduce hurdle models and explain their conceptual differences from zero-inflated models. We focus on the Zero-Altered Poisson and Zero-Altered Negative Binomial, which treat zeros differently from the ZIP and ZINB.
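For comparison with the zero-inflated formulation, a Zero-Altered Poisson (hurdle) model can be sketched as below. The symbols are again assumptions for this illustration: pi is the probability of observing a zero and mu is the parameter of the underlying Poisson distribution.

```latex
P(Y = 0) = \pi

P(Y = y) = (1 - \pi)\, \frac{e^{-\mu} \mu^{y} / y!}{1 - e^{-\mu}}, \qquad y = 1, 2, \ldots
```

Here all zeros come from a single binary process, and positive counts follow a zero-truncated Poisson distribution. This differs from the ZIP, where zeros can come from either the structural component or the count component.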
Session 4 – 01:00:00 – Exploratory Data Analysis for Machine Learning
Using visual and numerical summaries, this session examines how to perform exploratory data analysis (EDA) in a machine learning context. Relationships between variables, outliers, and data distributions are explored using tools such as ggplot2.
Session 5 – 01:00:00 – Model Fitting Frameworks in R
This session introduces the use of the caret and tidymodels frameworks for fitting machine learning models in R. Topics include data splitting, resampling, defining workflows, and generating predictions.
Session 6 – 01:00:00 – Baseline Models and Performance Metrics
This session focuses on constructing simple models and introducing key performance metrics. These include RMSE, MAE, accuracy, and AUC, with discussion of how and when to use each metric depending on the learning task.
Frequently asked questions
When will I receive instructions on how to join?
You’ll receive an email on the Friday before the course begins, with full instructions on how to join via Zoom. Please ensure you have Zoom installed in advance.
Do I need administrator rights on my computer?
I’m attending the course live — will I also get access to the session recordings?
I can’t attend every live session — can I join some sessions live and catch up on others later?
I’m in a different time zone and plan to follow the course via recordings. When will these be available?
I can’t attend live — how can I ask questions?
Will I receive a certificate?