000. Course Overview

Updated on June 17, 2025

Data analysis is fundamental to analytical separation science, enabling the accurate interpretation of complex signals produced during chromatographic and electrophoretic separations. Statistical tools place the extracted results in a numerical context, for example by attaching a confidence interval, which supports reliability, reproducibility, and scientific validity.

Data processing completes the analytical workflow by converting raw detector signals into meaningful results. This involves steps such as baseline correction, peak detection, integration, and quantification, each essential for transforming experimental output into interpretable data.

With the growing complexity of analytical datasets, chemometrics has become increasingly vital for extracting relevant patterns and insights. Emerging techniques such as machine learning further enhance this capability by enabling predictive modeling, pattern recognition, and automation across high-dimensional separation data.

This online course is based on the academic curriculum taught by Dr. Bob Pirok within the UvA-VU joint degree Master’s programme in Chemistry, Analytical Sciences. While it cannot fully replace the in-person lectures and tutorials, it offers wider access to core concepts and practices through this web-based platform.

Chemometrics & Statistics #

This course treats general aspects of chemometrics and statistics as applied to analytical methods. Parameters that describe the quality of analytical methods (e.g. accuracy and precision, sensitivity, selectivity, robustness) will be defined. Basic statistical methods applied to modern analytical instrumentation will be discussed. These include data exploration and visualization, statistical inference, hypothesis testing, and calibration, applied to univariate and multivariate data.

Signal processing is another important component of the course: you will learn how to find and process useful information in signals obtained by instrumental analytical techniques.

Attention will also be given to design-of-experiments and validation procedures. One of the main objectives of the course is to acquire the skills needed to carry out data analysis in software using a high-level programming language.

Free and without advertisements

We are committed to advancing the accessibility of high-quality education in the field of analytical sciences. It is our hope that this course will serve as a valuable resource for learners worldwide.

Whether you are a student, educator, or professional aiming to deepen your understanding of separation science, this course provides a structured pathway through the textbook. It offers guided reading, supplementary materials, and exercises designed to support a rigorous and comprehensive learning experience.

Learning Goals #

After this course, you will be able to:

  • Propose suitable methods for data processing and statistical analysis.
  • Evaluate whether the applied statistical method led to a useful answer to the analytical question.
  • Examine the quality of analytical methods (e.g. accuracy and precision, sensitivity, selectivity, robustness).
  • Find the main characteristics of signals obtained by instrumental analytical techniques.
  • Optionally: Write scripts to perform statistical computations.

How to use this course #

This course assumes active reading of the accompanying book Analytical Separation Science. Throughout the course, you will be directed to specific sections of the book, as shown in the example below.

Analytical Separation Science by B.W.J. Pirok and P.J. Schoenmakers
READ SECTION 9.4.2

Outlier testing

Most of the images feature interactive options that help you explore what the graph is displaying, much as we would explain it to you in a lecture room.

Some concepts are exclusively explained through such images or the interactive exercises that we provide.

Extra Information

These boxes, scattered throughout the lessons, point you to additional information, comments, or extensions that are available in the book.

Programming #

Many of the concepts in this course are computational. Where possible, simple Excel methods will be provided for you to apply these concepts. However, some data, such as modern LC-MS datasets, are difficult to process in a spreadsheet. We therefore also provide programming scripts, currently in MATLAB and Julia.

If you are unfamiliar with programming, there is an introductory course.

				
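% MATLAB (icdf requires the Statistics and Machine Learning Toolbox)
% Assumes alpha, n, x_mean, x_std and x_dof are already defined.

% Cumulative probability for a two-sided interval at significance level alpha
p = 1-alpha/2;

% Calculation
t_crit = icdf('T',p,x_dof);          % critical t-value
x_range = t_crit*(x_std/sqrt(n));    % half-width of the interval

% Confidence Interval
x_CI = [x_mean - x_range, x_mean + x_range];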
				
			

In Excel, the T.INV() function can be used to compute the ICDF (inverse cumulative distribution function) of the t-distribution for a given probability and number of degrees of freedom.

				
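# Julia (quantile and TDist come from the Distributions.jl package)
# Assumes alpha, n, x_mean, x_std and x_dof are already defined.
using Distributions, Statistics

# Cumulative probability for a two-sided interval at significance level alpha
p           = 1 - alpha / 2

# Define the T distribution with degrees of freedom
t_dist      = TDist(x_dof)

# Calculation
t_crit      = quantile(t_dist, p)            # critical t-value
x_range     = t_crit * (x_std / sqrt(n))     # half-width of the interval

# Confidence interval
x_CI        = [x_mean - x_range, x_mean + x_range]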
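The MATLAB and Julia scripts above evaluate the same quantity: a two-sided confidence interval for a mean. Written out as a formula, it might look like the sketch below. The relation between the degrees of freedom and n is an assumption here (the usual choice for a single sample mean), since the scripts leave x_dof for the user to define.

```latex
% Two-sided confidence interval for a mean, as computed by the scripts above.
% Assumption (not stated in the scripts): \nu = n - 1.
\[
  \mathrm{CI}_{1-\alpha} \;=\; \bar{x} \;\pm\; t_{1-\alpha/2,\,\nu}\,\frac{s}{\sqrt{n}},
  \qquad \nu = n - 1
\]
```

Here \bar{x} corresponds to x_mean, s to x_std, n to the number of measurements, \nu to x_dof, and t_{1-\alpha/2,\,\nu} to t_crit, the value of the inverse cumulative t-distribution at probability p = 1 - \alpha/2. The product t_{1-\alpha/2,\,\nu} \cdot s/\sqrt{n} is the half-width stored in x_range.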
				
			

Course Overview #

At the current stage, this course contains the following components.

  1. Introduction to Chemometrics
  2. Statistics of Repeated Measurements
  3. Confidence Intervals
  4. Hypothesis Testing
  5. Comparing Two Means
  6. Power Analysis
  7. Comparing Variances
  8. Comparing Several Means (ANOVA)
  9. Pre-testing
  10. Robust Statistics (Coming July 2025)
  11. Modeling and Calibration I (Coming November 2025)
  12. Modeling and Calibration II (Coming November 2025)
  13. Design of Experiments (DoE) (Coming November 2025)
  14. Error Propagation (Coming November 2025)
  15. Introduction to Multivariate Statistics (Coming November 2025)

Advanced Chemometrics & Statistics #

A second course, entitled Advanced Chemometrics & Statistics, will be launched in February 2026. It will feature lectures including:

  1. Non-linear Regression
  2. Weighted Regression
  3. Bootstrapping
  4. Decision Trees & Random Forests
  5. SVMs and k-means Clustering
  6. Optimization Strategies I
  7. Optimization Strategies II
  8. Reinforcement Learning
  9. Bayesian Statistics