PYBD01

Python for Biological Data Exploration and Visualization

Explore and visualise biological data in Python using pandas and seaborn. Ideal for applied researchers.

  • Duration: 4 Days, 7 hours per day
  • Next Date: March 2-5, 2026
  • Format: Live Online Format
TIME ZONE

UK (GMT) local time - All sessions will be recorded and made available to ensure accessibility for attendees across different time zones.

£480 Registration Fee

5.0

from 200+ reviews

Course Description

This workshop aims to give novice programmers an introduction to data visualisation using Python for research in evolutionary biology and genomics by using biological examples throughout. We will use example datasets and problems themed around sequence analysis, taxonomy and ecology, with plenty of time for participants to work on their own research data.

Much of the popularity of Python stems from the availability of high quality libraries of existing code that we can use for our own projects. Libraries (“packages” in Python terminology) are even more useful when they are designed to work together. For scientific programming, we are lucky to have a collection of mature packages which work together to form a stack:

    • NumPy for numerical processing.
    • Pandas for reading, cleaning and processing tabular data files.
    • Matplotlib as a low-level charting library.
    • Seaborn as a high-level charting library for rapid dataset exploration through visualization.

In this course we will learn how to use these packages together to quickly explore large biological datasets, find meaningful patterns in the data, and present our results clearly. We will focus on the high-level packages – pandas and seaborn – as this will allow us to do the most work with the smallest amount of code. By concentrating on just two packages for an entire course, we will be able to cover a large part of what these tools can do.

What You’ll Learn

By the end of the course, participants should be able to:

  • Apply the skills they have learned to tackle problems in their own research.
  • Continue their Python education in a self-directed way.

Course Format

Interactive Learning Format

Each day features a well-balanced combination of lectures and hands-on practical exercises, with time for discussing participants’ own data where the schedule allows.

Global Accessibility

All live sessions are recorded and made available on the same day, ensuring accessibility for participants across different time zones.

Collaborative Discussions

Open discussion sessions provide an opportunity for participants to explore specific research questions and engage with instructors and peers.

Comprehensive Course Materials

All code, datasets, and presentation slides used during the course will be shared with participants by the instructor.

Personalized Data Engagement

Participants are encouraged to bring their own data for discussion and practical application during the course.

Post-Course Support

Participants will receive continued support via email for 30 days following the course, along with on-demand access to session recordings for the same period.

Who Should Attend / Intended Audiences

This course is designed for anyone interested in using Python for the analysis and visualization of biological datasets. It assumes some prior experience with Python, as it does not cover the absolute basics of the language—participants should already be familiar with basic syntax. The Introduction to Python for Biologists course provides a suitable foundation. If you’re keen to join but have no Python experience, you’re encouraged to contact martin@pythonforbiologists.com for recommended resources to get up to speed.

A modest background in descriptive statistics and familiarity with common chart types, such as box plots and scatter plots, is also expected. As the course builds on existing Python knowledge, it is not suitable for complete beginners to programming. The course includes substantial hands-on time, including opportunities to work with your own datasets, making it particularly valuable for those at the beginning of the data analysis phase of a research project. If you’re unsure whether the course is right for you, feel free to reach out to martin@pythonforbiologists.com for guidance.

Equipment and Software requirements

A laptop computer with a working version of Python is required. Python is free and open-source software for PCs, Macs, and Linux computers.

A working webcam is recommended to support interactive elements of the course. We encourage participants to keep their cameras on during live Zoom sessions to foster a more engaging and collaborative environment.

While not essential, using a large monitor, or ideally a dual-monitor setup, can significantly enhance your learning experience by allowing you to view course materials and work in Python simultaneously.

 

Participants should be able to install additional software on their computers during the course (please ensure you have administration rights to your computer).

Dr. Martin Jones

Martin is a freelance trainer specialising in programming and Linux skills tailored for researchers in biological sciences. With a background in biology and a PhD in large-scale phylogenetics, Martin combines deep scientific expertise with practical computational training to empower researchers to leverage coding and open-source tools in their work.

His teaching primarily focuses on Python programming and Linux, with an emphasis on applications relevant to biological data analysis. Since launching Python for Biologists in 2015, Martin has dedicated himself full-time to teaching and writing, helping bridge the gap between biology and computational skills.

 

Education & Career
• PhD in Large-Scale Phylogenetics, 2007
• Former Lecturer in Bioinformatics, University of Edinburgh
• Founder of Python for Biologists (2015–present)
• Over a decade of experience training biologists in programming and Linux

 

Training & Skills
Martin specialises in:
• Python programming for biological research
• Linux command line and scripting
• Bioinformatics workflows and reproducible research practices

 

Professional Focus
Martin’s work focuses on equipping life scientists with the computational tools necessary to analyse complex biological datasets. He advocates for practical, hands-on training that helps researchers automate tasks, perform reproducible analyses, and develop programming confidence.

 

Teaching & Writing
• Full-time educator and author on programming for biologists
• Creator of tutorials, workshops, and online courses in Python and Linux
• Active in the bioinformatics and computational biology training community

 

Links
• Python for Biologists

Session 1 – 03:00:00 – Environment Setup and Data Overview

This session focuses on preparing the analysis environment and introducing the tools and data used throughout the course. Participants will set up Jupyter Notebooks, verify necessary package versions, and explore the example data files. The session also covers loading data into pandas and introduces key concepts such as series, indices, and data types, providing a foundation for later topics.

Key Topics:
Jupyter Notebooks, package versions, example data files, loading data with pandas, Series, indices, data types.
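
As a rough illustration of the loading and inspection steps this session describes, the sketch below reads a small table into pandas and looks at its data types and index. The CSV content, column names, and values are hypothetical stand-ins, not the course's actual data files.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for one of the course data files.
csv_text = """species,kingdom,genome_mb
Arabidopsis thaliana,Plantae,135.0
Escherichia coli,Bacteria,4.6
Homo sapiens,Animalia,3100.0
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Check the installed pandas version and the type inferred for each column.
print(pd.__version__)
print(df.dtypes)

# A single column is a Series; by default it carries a 0, 1, 2, ... index.
sizes = df["genome_mb"]
print(sizes.index)

# Setting a meaningful index allows label-based lookups.
by_species = df.set_index("species")
print(by_species.loc["Homo sapiens", "genome_mb"])
```

In a Jupyter Notebook the `print` calls are optional, since the last expression in a cell is displayed automatically.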

Session 2 – 03:00:00 – Series Objects and Thinking in Columns
This session introduces a key shift when moving from core Python to pandas: operating on entire columns (Series objects) rather than processing values one at a time. Through numerous hands-on examples, participants will develop an intuition for column-wise thinking, enabling concise and powerful data transformations. The session covers common tasks such as filtering rows and columns, creating new columns, sorting, and summarizing data — all using minimal code. Participants will also explore special filtering techniques that require different syntax, preparing them to tackle complex analysis challenges involving selection, filtering, and aggregation.

Key Topics:
Series objects, column-wise operations, broadcasting, filtering syntax, conditional selection, sorting, column creation, aggregation.
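
The column-wise style described above can be sketched as follows. This is a minimal example on a made-up table (the species, column names, and numbers are illustrative assumptions), showing column creation, filtering, sorting, and a summary without any explicit loop.

```python
import pandas as pd

# Hypothetical toy dataset; the values are invented for illustration.
df = pd.DataFrame({
    "species": ["A. thaliana", "D. melanogaster", "E. coli", "H. sapiens"],
    "genome_mb": [135.0, 180.0, 4.6, 3100.0],
    "gene_count": [27000, 14000, 4300, 20000],
})

# Column creation: one expression is applied to every row at once.
df["genes_per_mb"] = df["gene_count"] / df["genome_mb"]

# Filtering: the comparison produces a Series of True/False values,
# which is then used to select the matching rows.
large = df[df["genome_mb"] > 100]

# Sorting and summarising are single method calls on the result.
ranked = large.sort_values("genes_per_mb", ascending=False)
mean_density = df["genes_per_mb"].mean()

print(ranked[["species", "genes_per_mb"]])
print(mean_density)
```

Some filters need a method rather than a comparison operator. For example, `df[df["species"].isin(["E. coli", "H. sapiens"])]` selects rows whose value appears in a list.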

Session 3 – 03:00:00 – Introducing Seaborn
This session shifts the focus from data analysis to data visualization, emphasizing visual exploration as a tool for understanding data. Participants will begin with an overview of the Seaborn library and then explore core plot types used to examine distributions and relationships in data. The session covers histograms, kernel density estimates, and scatter plots, as well as advanced alternatives like hexbin and contour plots, which are helpful when working with large datasets.

Participants will also learn how to leverage Seaborn’s ability to map dataframe columns to visual elements such as color, size, and shape, and how to efficiently generate small multiple plots for deeper insights. As with pandas, the goal is to produce powerful, publication-quality visualizations with minimal and intuitive code.

Key Topics:
Seaborn syntax, distribution plots, relational plots, hexbin plots, contour plots, aesthetic mappings, small multiples, figure customization.
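
To give a feel for mapping dataframe columns to visual elements and for small multiples, here is a minimal sketch using seaborn's `scatterplot` and `relplot` functions. The dataset and column names are hypothetical, and the `Agg` backend is chosen only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import pandas as pd
import seaborn as sns

# Hypothetical values, invented for illustration only.
df = pd.DataFrame({
    "genome_mb": [135.0, 180.0, 4.6, 3100.0, 12.1, 100.0, 430.0, 5.2],
    "gene_count": [27000, 14000, 4300, 20000, 6000, 20000, 39000, 4500],
    "kingdom": ["Plantae", "Animalia", "Bacteria", "Animalia",
                "Fungi", "Animalia", "Plantae", "Bacteria"],
})

# Relational plot: x, y and colour are each mapped to a column by name.
ax = sns.scatterplot(data=df, x="genome_mb", y="gene_count", hue="kingdom")
ax.set_xscale("log")  # genome sizes span several orders of magnitude

# Small multiples: one panel per kingdom, wrapped onto two columns.
grid = sns.relplot(data=df, x="genome_mb", y="gene_count",
                   col="kingdom", col_wrap=2, height=2.5)
n_panels = len(grid.axes.flat)
print(n_panels)
```

In a notebook the figures appear inline, so the backend line can be dropped there.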

Session 4 – 03:00:00 – Categorical Axes with Seaborn
This session explores the wide range of visualization techniques available for categorical data in Seaborn. Participants will work with common plot types such as strip plots, box plots, and bar plots, as well as more advanced options including swarm plots, violin plots, and boxen plots. The variety of available chart types provides an opportunity to consider the trade-offs between simplicity and detail in data visualization.
The session also introduces techniques for customizing chart appearance, with an emphasis on effective and intentional use of color. Participants will review best practices for color selection and learn how to avoid common pitfalls that can reduce the clarity or impact of a visual.

Key Topics:
Categorical plots, strip plots, box plots, violin plots, swarm plots, boxen plots, color palettes, style customization, visual encoding, color best practices.
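
One way to balance simplicity and detail is to layer a strip plot over a box plot, so that the summary and the individual observations are visible together. The sketch below assumes a made-up dataset (habitats, sexes, and body masses are illustrative) and uses seaborn's built-in "colorblind" palette as one reasonable colour choice among several.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical measurements, invented for illustration only.
df = pd.DataFrame({
    "habitat": ["forest"] * 4 + ["grassland"] * 4 + ["wetland"] * 4,
    "sex": ["F", "M"] * 6,
    "body_mass_g": [21.0, 24.5, 19.8, 26.1, 15.2, 17.9,
                    14.8, 18.3, 30.4, 33.0, 29.1, 35.6],
})

# A palette designed to remain distinguishable for colour-blind readers.
palette = sns.color_palette("colorblind", n_colors=2)

fig, ax = plt.subplots()

# Box plot in a neutral colour gives the summary per category.
sns.boxplot(data=df, x="habitat", y="body_mass_g", color="lightgray", ax=ax)

# Strip plot on the same axes shows each observation, coloured by sex.
sns.stripplot(data=df, x="habitat", y="body_mass_g",
              hue="sex", palette=palette, ax=ax)
```

Swapping `stripplot` for `swarmplot`, or `boxplot` for `violinplot`, keeps the same arguments, which makes it easy to compare chart types on the same data.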

Session 5 – 03:00:00 – Grouping and Categories with pandas
With a foundation in data visualization established, this session returns to pandas to explore more advanced data manipulation techniques. Participants begin with an in-depth look at categorical data types and their role in organizing and interpreting datasets. The core focus of the session is the powerful groupby functionality in pandas, which enables aggregation and analysis across different groupings.

Participants will learn multiple strategies for grouping data, including grouping by existing columns, applying custom grouping functions, and creating bins from continuous values. Binning is introduced as a particularly useful method for transforming numerical data into categorical form, enabling richer comparisons and more effective visualizations.

Key Topics:
Categorical data types, grouping with groupby, aggregation, custom groupings, binning numerical data, categorical transformations, integration with visualizations.
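
Grouping by an existing column and grouping by bins of a continuous value can be sketched as below. The table, the bin edges, and the labels "small", "medium", and "large" are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical dataset; names and values are invented for illustration.
df = pd.DataFrame({
    "species": ["sp_a", "sp_b", "sp_c", "sp_d", "sp_e", "sp_f"],
    "kingdom": ["Animalia", "Animalia", "Plantae", "Plantae", "Fungi", "Animalia"],
    "genome_mb": [100.0, 3100.0, 135.0, 430.0, 12.0, 180.0],
})

# Store the repeated labels as a categorical column.
df["kingdom"] = df["kingdom"].astype("category")

# Group by an existing column and aggregate within each group.
mean_by_kingdom = df.groupby("kingdom", observed=True)["genome_mb"].mean()

# Binning: turn a continuous column into ordered categories.
df["size_class"] = pd.cut(
    df["genome_mb"],
    bins=[0, 50, 500, 5000],
    labels=["small", "medium", "large"],
)

# The new categorical column can be used for grouping like any other.
counts = df.groupby("size_class", observed=False).size()

print(mean_by_kingdom)
print(counts)
```

The binned column can also be passed straight to a categorical plot, for example as the `x` variable of a box plot.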

Session 6 – 03:00:00 – Long vs. Wide Form Data and Heatmaps
This session introduces the concepts of long-form (tidy) and wide-form (summary) data, building on earlier pandas techniques for reshaping DataFrames. Participants will learn how and when to use each format effectively, recognizing that different forms serve different purposes in data analysis and visualization.

The session also introduces the final major chart type in Seaborn: the heatmap, along with its extension, the clustermap. These visualizations directly reflect the structure of summary tables, offering a compact and powerful way to detect patterns, highlight relationships, and explore high-dimensional data. Participants will apply heatmaps to practical examples that illustrate their value in solving complex visualization challenges.

Key Topics:
Long-form vs. wide-form data, data reshaping, pivoting and melting in pandas, heatmaps, clustermaps, matrix-style visualizations, interpreting color-coded data.
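
The relationship between the two layouts can be illustrated with pandas' `pivot` and `melt` methods. The gene names, tissues, and expression values below are hypothetical.

```python
import pandas as pd

# Long-form (tidy) data: one row per observation.
# Hypothetical expression values, invented for illustration.
long_df = pd.DataFrame({
    "gene": ["g1", "g1", "g2", "g2", "g3", "g3"],
    "tissue": ["leaf", "root", "leaf", "root", "leaf", "root"],
    "expression": [5.0, 1.0, 2.0, 8.0, 0.5, 0.7],
})

# Wide-form (summary) data: one row per gene, one column per tissue.
wide_df = long_df.pivot(index="gene", columns="tissue", values="expression")

# Melting reverses the operation and recovers the long layout.
back_to_long = wide_df.reset_index().melt(
    id_vars="gene", var_name="tissue", value_name="expression"
)

print(wide_df)
print(back_to_long)
```

A rectangular table like `wide_df` is the kind of input that seaborn's `heatmap` function accepts, with rows and columns of the table becoming rows and columns of coloured cells.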

Session 7 – 03:00:00 – Dictionaries
This session introduces Python’s dictionary data structure as a powerful alternative to lists for managing paired data. Using a bioinformatics example focused on k-mer counting, participants will explore why dictionaries are better suited for specific tasks and learn the syntax for creating, accessing, and manipulating dictionaries.

Comparisons between list-based and dictionary-based solutions highlight the strengths and appropriate use cases for each approach. Additional examples demonstrate how key-value data structures are common in both bioinformatics and general programming. Practical exercises include working with dictionaries in real-world contexts such as DNA-to-protein translation.

Key Topics:
Paired data structures, hashing, key uniqueness, argument unpacking, tuples, dictionary methods.
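
A minimal sketch of k-mer counting with a dictionary is shown below. The function name `count_kmers` and the example sequence are illustrative assumptions, not taken from the course materials.

```python
def count_kmers(sequence, k):
    """Return a dict mapping each k-mer in sequence to its count."""
    counts = {}
    # Slide a window of width k along the sequence.
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        # get() returns 0 the first time a k-mer is seen.
        counts[kmer] = counts.get(kmer, 0) + 1
    return counts


dna = "ATGCATGCAT"  # made-up example sequence
kmer_counts = count_kmers(dna, 2)

for kmer, count in sorted(kmer_counts.items()):
    print(kmer, count)
```

A list-based version would need to search for each k-mer before updating its count, whereas the dictionary looks the key up directly. That difference becomes noticeable as the number of distinct k-mers grows.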

Session 8 – 03:00:00 – File Management and Housekeeping Scripts
This session explores Python’s standard library modules for file manipulation, emphasizing automation of essential housekeeping tasks commonly encountered in bioinformatics projects. Participants will learn techniques for renaming, moving, deleting files, and creating directories programmatically.

The practical component focuses on a bioinformatics data preprocessing challenge: organizing DNA sequences by length. This exercise highlights important considerations such as managing program state across multiple runs, processing multiple input files, and generating multiple outputs efficiently.

Key Topics:
File input/output, directory management, file renaming and deletion, scripting automation, handling multiple files, program state persistence.
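
The sequence-sorting exercise might be approached along the lines of the sketch below, which uses the standard library's `pathlib` and `tempfile` modules. The function name, the one-sequence-per-line input format, the output file naming, and the bin size are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def bin_sequences_by_length(input_file, output_dir, bin_size=100):
    """Append each sequence to a file named after its length range."""
    output_dir = Path(output_dir)
    # Create the output directory if it does not exist yet.
    output_dir.mkdir(parents=True, exist_ok=True)

    written = {}
    for line in Path(input_file).read_text().splitlines():
        seq = line.strip()
        if not seq:
            continue  # skip blank lines
        lower = (len(seq) // bin_size) * bin_size
        out_path = output_dir / f"length_{lower}_{lower + bin_size - 1}.txt"
        # Append mode: existing contents are kept between calls.
        with out_path.open("a") as handle:
            handle.write(seq + "\n")
        written[out_path.name] = written.get(out_path.name, 0) + 1
    return written


# Demonstration in a temporary directory that is removed afterwards.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    source = tmp / "sequences.txt"
    source.write_text("ATGC\nATGCATGCATGC\nAT\n")
    result = bin_sequences_by_length(source, tmp / "binned", bin_size=10)
    print(sorted(result.items()))
```

Because the output files are opened in append mode, running the function twice on the same input would duplicate every sequence. Deciding whether to clear the output directory first is one example of managing program state across runs.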

Testimonials

PRStats offers a great lineup of courses on statistical and analytical methods that are super relevant for ecologists and biologists. My lab and I have taken several of their courses—like Bayesian mixing models, time series analysis, and machine/deep learning—and we've found them very informative and directly useful for our work. I often recommend PRStats to my students and colleagues as a great way to brush up on or learn new R-based statistical skills.

Rolando O. Santos

PhD Assistant Professor, Florida International University

Courses attended

SIMM05, IMDL03, ITSA02, GEEE01 and MOVE07

Frequently asked questions

Everything you need to know about the product and billing.

When will I receive instructions on how to join?

You’ll receive an email on the Friday before the course begins, with full instructions on how to join via Zoom. Please ensure you have Zoom installed in advance.

Do I need administrator rights on my computer?

Yes — administrator access is recommended, as you may need to install software during the course. If you don’t have admin rights, please contact us before the course begins and we’ll provide a list of software to install manually.

I’m attending the course live — will I also get access to the session recordings?

Yes. All participants will receive access to the recordings for 30 days after the course ends.

I can’t attend every live session — can I join some sessions live and catch up on others later?

Absolutely. You’re welcome to join the live sessions you can and use the recordings for those you miss. We do encourage attending live if possible, as it gives you the chance to ask questions and interact with the instructor. You’re also welcome to send questions by email after the sessions.

I’m in a different time zone and plan to follow the course via recordings. When will these be available?

We aim to upload recordings on the same day, but occasionally they may be available the following day.

I can’t attend live — how can I ask questions?

You can email the instructor with any questions. For more complex topics, we’re happy to arrange a short Zoom call at a time that works for both of you.

Will I receive a certificate?

Yes. All participants receive a digital certificate of attendance, which includes the course title, number of hours, course dates, and the instructor’s name.

Still have questions?

Can’t find the answer you’re looking for? Please chat to our friendly team.

Tickets

PYBD01 ONLINE
25 available
£480.00
2nd March 2026 - 5th March 2026
Delivered remotely (United Kingdom), Western European Time Zone, United Kingdom