We have learned thus far to interact with the data and calculate various descriptive statistics. But how can we use these bits of information to answer our actual analytical questions? This will be the topic of the present lesson. We will see how a general hypothesis test works and learn about the first of several such tests: the comparison of our sample mean with a reference value.
Learning Goals #
- Understand how hypothesis tests are a cornerstone of analytical chemistry.
- Translate an analytical problem into hypotheses.
- Conduct a hypothesis test to compare a sample mean with a reference value.
- Gain an improved understanding of the significance level.
- Use a p-value to measure the probability of obtaining the observed results, assuming that the null hypothesis is true.

READ SECTIONS 9.3.1
Introduction to hypothesis testing
1. Hypothesis tests #
1.1. What is hypothesis testing? #
Hypothesis testing is an inferential statistical method that can be used to determine whether there is sufficient evidence in the measured data to support conclusions about the population. Hypothesis testing is a cornerstone of analytical chemistry, with applications including instrument calibration, regulatory systems compliance, and policy development.
A hypothesis test yields evidence regarding whether the hypothesis is plausible based on the available data. It does not prove that the hypothesis is true. This also means that we can draw the wrong conclusion from a hypothesis test.
1.2. The analytical question #
Before we explore how a hypothesis test works, we must first consider what we are actually after. To ensure that our answer is relevant, we need to distill the analytical question from the actual problem.

Is the factory polluting the river?
Environmental analysis is a classic example of analytical chemistry. The analytical question in this context would be: Is the concentration of pollutant X in the wastewater effluent exceeding the limit?
Is the suspect guilty?
The analysis of DNA, gunshot residue, alcohol levels, fire debris, etc. are all examples of analytical chemistry in forensics. The analytical question could be: Does the chemical profile of the sample match the one found at the suspect's house?


Is the buffer affecting retention in RPLC?
We must tune a multitude of parameters when we develop analytical separation methods. How can we test whether a parameter actually affects the results?
Is one of the instruments broken?
Hypothesis testing can also be used to aid the analytical chemist. A reference sample with a certified value can be measured. Does the measured concentration agree with the specification?

In each of these cases we will have to infer information about the population based on repeated measurements of a sample of this population. This is an example of inferential statistics.
1.3. How does a hypothesis test work? #
The mechanism of a hypothesis test will be addressed in detail in the next sections. In brief, once an analytical problem has been identified for testing, a hypothesis test roughly works like this:
- Identify the statistical method and its prerequisites.
- Formulate the hypotheses.
- Decide on the significance level.
- Calculate the test statistic.
- Conduct the statistical test and determine its outcome.
The success of a hypothesis test heavily relies on the quality of the available data. Obviously, an important step preceding the ones above is that the analytical method provides good data!
Types of hypothesis tests
There are a large number of statistical hypothesis tests. In this course we will cover:
- Comparison of a mean with a reference value (Lesson 4)
- Comparison of two means (Lesson 5)
- Comparison of a variance with a reference value (Lesson 7)
- Comparison of two variances (Lesson 7)
- Comparison of multiple means (Lesson 8)
- Normality test
- Outlier test

READ SECTIONS 9.3.2
Comparison of mean with reference value
2. Comparison of sample mean with reference (t-test) #
For this type of hypothesis test, the objective is to compare our sample of a limited number of repeated measurements to a reference value. To illustrate this, we will consider the identification of a compound in gas chromatography using retention indices. The retention index of the reference compound should be 1290. Note that this is now our true value, our \mu_{0}. Six repeated measurements are conducted, yielding
x = [1293, 1291, 1285, 1287, 1291, 1283];
The question now is whether the retention index matches. Let’s walk through the steps.
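Before walking through the steps, it can help to have the basic descriptive statistics of the sample at hand. The snippet below is a minimal sketch using Julia's standard `Statistics` library; the variable names (`mu0`, `x_mean`, `x_std`) are chosen here for illustration.

```julia
using Statistics

# Retention indices and reference value from the text
x = [1293, 1291, 1285, 1287, 1291, 1283]
mu0 = 1290

n = length(x)        # number of repeated measurements
x_mean = mean(x)     # sample mean
x_std = std(x)       # sample standard deviation (n - 1 in the denominator)

println("n = $n, mean = $(round(x_mean, digits=3)), s = $(round(x_std, digits=3))")
```

The sample mean (about 1288.333) lies slightly below the reference of 1290; the hypothesis test decides whether that difference is larger than what the spread of the data can explain.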
2.1. Step 1: Test requirements #
Each statistical test relies on certain prerequisites or assumptions. This is similar to what we saw in Lesson 2, where the mean was heavily affected by an outlier and the median was not. For the present case, we are about to do a z- or t-test comparison, and there are several relevant assumptions. The data must be:
- Pertaining to continuous (interval) or ordinal variables.
- Normally distributed.
- Free of outliers.
- Independent: a representative but random sample from the population.
To focus on the operation of the hypothesis test we will for now assume that these requirements are met. We’ll learn how to test for them in later lessons.
The t and z comparisons of a mean with a reference value require that the sample is “independent: a representative but random sample from the population”. Which of the following statements is in agreement with this?
This is correct! Objects of a sample must always be measured in random order to avoid carryover effects biasing the results (lab bias). Moreover, the contents of a vial in chromatography should be homogeneous. This means that the liquid throughout the entire vial has a similar composition, so that the location from which the liquid is taken does not matter. As for the calibration, for the resulting values from the experiment to be useful, we DO want the concentrations of the calibration standards to span the entire range of possible sample concentrations, but we do not want to measure them in order (again to avoid lab bias).
This is unfortunately not correct, yet! Objects of a sample must always be measured in random order to avoid carryover effects biasing the results (lab bias). Moreover, the contents of a vial in chromatography should be homogeneous. This means that the liquid throughout the entire vial has a similar composition, so that the location from which the liquid is taken does not matter. As for the calibration, for the resulting values from the experiment to be useful, we DO want the concentrations of the calibration standards to span the entire range of possible sample concentrations, but we do not want to measure them in order (again to avoid lab bias).
2.2. Step 2: Hypotheses #
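The null hypothesis (H_0) proposes that there is no difference between the population mean and the reference value (H_0: \mu=\mu_{0}), so that any observed deviation is attributed to random variation. The alternative hypothesis (H_1 or H_a) is the opposite. H_1 proposes that there is a difference and therefore that the effect is significant.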
Note how the two hypotheses are mutually exclusive. Only one of them can be true, and one is always true.
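For the retention index example, where the reference value is \mu_{0} = 1290 and a deviation in either direction matters, the pair of hypotheses could be written as follows (a sketch of the notation, not an additional result):

```latex
H_0: \mu = \mu_{0} = 1290
\qquad \text{versus} \qquad
H_1: \mu \neq \mu_{0}
```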
2.3. Step 3: Determine significance level #

Figure 1 shows our probability distribution for the present case, a t-distribution with five degrees of freedom. We can make several observations:
- This probability distribution assumes that H_0 is true. Given that H_0: \mu=\mu_{0}, it makes sense that, assuming H_0 is true, the probability density is highest near t = 0 (see also next section).
- For H_0 to be rejected, t_{\text{obs}} needs to deviate sufficiently from 0.
- At some point, t will exceed a threshold, the critical value t_{\text{crit}}, which is set by the significance level \alpha.
- The larger the significance level, the smaller t_{\text{crit}} will be, and thus the easier it is to reject H_0.
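The relation between the significance level and the critical value can be checked numerically. The sketch below assumes a two-sided test (the significance is split over both tails) and uses the `Distributions` package; the values 0.01 and 0.10 for the significance level are picked only for illustration alongside the usual 0.05.

```julia
using Distributions

dist = TDist(5)                 # t-distribution with five degrees of freedom
alphas = [0.01, 0.05, 0.10]     # illustrative significance levels

# Two-sided critical value: ICDF evaluated at 1 - alpha/2
t_crits = [quantile(dist, 1 - alpha / 2) for alpha in alphas]

for (alpha, t_crit) in zip(alphas, t_crits)
    println("alpha = $alpha  ->  t_crit = $(round(t_crit, digits=3))")
end
```

The critical value shrinks as the significance level grows, which is why a larger significance level makes it easier to reject H_0.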

READ SECTION 9.3.2.3
Significance level
Read now if you did not already.

In practice, \alpha is typically set at 5% (i.e. 0.05), but the actual value differs greatly per application field. We will investigate in Lesson 6 how the appropriate value, and the consequences of choosing it, differ per case.
Note that, because the retention index can deviate in either direction, we are checking for both a positive and a negative deviation. This is called a two-sided test, and this is why the significance is divided over both sides, similar to the confidence levels in Lesson 3.
2.4. Step 4: Calculate test statistic #
We have set up the entire statistical experiment. It is now time to start calculating the statistic for our sample. Here we have to make a distinction. We concluded in the previous lesson that, if the number of datapoints n is sufficiently large (i.e. n > 30), we can reasonably assume that s represents \sigma and accordingly use the z-test. In the present case, we have six retention indices, so we use the t-test instead. The z-test will be briefly treated in later sections.
We will have to calculate our statistic, t_{\text{obs}}, which is a standardized value that is calculated from sample data during a hypothesis test. The procedure that calculates the test statistic compares our data to what is expected under the H_0.
Equation 9.24: t_{\text{obs}}=\frac{|{\bar{x}-\mu_{0}}|}{s/\sqrt{n}}
Calculate t_{\text{obs}} for the sample of retention index data using Equation 9.24. See the beginning of Section 2 on this page for more information. Note that \mu_{0} = 1290. Round to 3 decimals.
This is the correct answer! For the calculation we could use Equation 9.24 and note that the equation required us to take the absolute difference between \bar{x} and \mu_{0}. \bar{x} was 1288.333, s 3.933, and n was 6 as we had six retention index values in our sample.
This is not yet the correct answer! For your calculation you can use Equation 9.24. Note that the equation requires you to take the absolute difference between \bar{x} and \mu_{0}. \bar{x} should be 1288.333, s 3.933, and n is 6 as we have six retention index values in our sample.
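The calculation in Equation 9.24 can be sketched in a few lines of Julia. The variable names (`mu0`, `std_err`) are chosen here for illustration.

```julia
using Statistics

x = [1293, 1291, 1285, 1287, 1291, 1283]   # retention indices from the text
mu0 = 1290                                  # reference value

n = length(x)
std_err = std(x) / sqrt(n)                  # denominator of Equation 9.24: s / sqrt(n)
t_obs = abs(mean(x) - mu0) / std_err        # absolute difference, scaled by the standard error

println(round(t_obs, digits=3))             # 1.038
```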
2.5. Step 5: Statistical test #
Now that we have our test statistic, we can conduct the actual test. There are two methods, which give the same outcome.
USING THE CRITICAL VALUE #
The first strategy requires t_{\text{crit}} to be calculated using the ICDF (Lesson 2). In essence this means calculating t_{\text{crit}} for the \alpha = 0.05 case in Figure 1.
H_0 is accepted if t_{\text{obs}} lies closer to 0 than t_{\text{crit}}. In other words, if t_{\text{crit,+}}>t_{\text{obs}}>t_{\text{crit,-}}.
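As a minimal sketch of the critical value strategy for the retention index data, assuming the `Distributions` package and \alpha = 0.05 (the name `reject_H0` is introduced here for illustration):

```julia
using Distributions, Statistics

x = [1293, 1291, 1285, 1287, 1291, 1283]   # retention indices from the text
mu0 = 1290                                  # reference value
alpha = 0.05                                # significance level

n = length(x)
t_obs = abs(mean(x) - mu0) / (std(x) / sqrt(n))
t_crit = quantile(TDist(n - 1), 1 - alpha / 2)   # ICDF of the t-distribution at 1 - alpha/2

# Two-sided decision rule: H0 is not rejected when -t_crit < t_obs < t_crit
reject_H0 = !(-t_crit < t_obs < t_crit)

println("t_obs = $(round(t_obs, digits=3)), t_crit = $(round(t_crit, digits=3)), reject H0: $reject_H0")
```

Because t_{\text{obs}} is an absolute value here, the lower bound is always satisfied; it is kept in the sketch only to mirror the two-sided condition stated above.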
USING THE P-VALUE #
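We have not focused on it much in this lesson so far, but recall from Lesson 2 that the area under the curve of the PDF represents a probability. The p-value is the area in the tails beyond t_{\text{obs}}: the probability of obtaining a test statistic at least as extreme as the one observed, assuming that H_0 is true. H_0 is accepted if the p-value is larger than \alpha.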
Calculate the p-value for the retention index case. Round to 3 decimals.
Correct!
Unfortunately, this is not correct. Is your answer 0.174? Then you are almost there: you calculated the p-value for one side, but still need to multiply it by two (we are doing a two-sided test). If your p-value for one side was not 0.174, then be sure you calculate the CDF correctly. You should use the same mean, number of datapoints and standard deviation as for the previous question. Don’t forget that you are using the t-distribution, which has n-1 degrees of freedom. You will also need to fill in t_{\text{obs}}.
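The p-value calculation described in the feedback above can be sketched as follows, assuming the `Distributions` package. `ccdf` is the complementary CDF, i.e. 1 - CDF, which gives the upper-tail area directly.

```julia
using Distributions, Statistics

x = [1293, 1291, 1285, 1287, 1291, 1283]   # retention indices from the text
mu0 = 1290                                  # reference value

n = length(x)
t_obs = abs(mean(x) - mu0) / (std(x) / sqrt(n))
dist = TDist(n - 1)                         # t-distribution with n - 1 degrees of freedom

p_one_sided = ccdf(dist, t_obs)             # area in the upper tail beyond t_obs
p_value = 2 * p_one_sided                   # two-sided test: count both tails

println("one-sided: $(round(p_one_sided, digits=3)), two-sided: $(round(p_value, digits=3))")
```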
Note that the test statistic of Equation 9.24 can also be expressed as t_{\text{obs}}=r_A/r_B with r_A=|\bar{x}-\mu_0| and r_B=s/\sqrt{n}. These ranges of A and B are also depicted in Figure 3. As we can see, we are in fact considering whether the difference between the sample mean and the reference (r_A) can be explained by the spread of the data (r_B).

We still have to formally establish the outcome of the hypothesis test.
What is the outcome of the hypothesis test? Does the retention index match for the sample?
Correct!
First things first. The null hypothesis H_0 is accepted regardless of which strategy was chosen. After all, t_{\text{obs}} (1.038) was smaller than t_{\text{crit,+}} (2.571) and larger than t_{\text{crit,-}} (-2.571). The p-value (0.348) is also larger than the significance level (0.05).
Now that we have that out of the way, we can talk about the second part. Technically, with what you have learned so far, your answer was also correct if you concluded that the retention index was a match. Statistically speaking, however, we can never know for sure whether a hypothesis is actually true. The only thing we can say is whether the present data allow us to reject H_0. We will see in Lessons 5 and 6 that, with proper data, we can make any test fail or pass. So, we have to be careful here.
3. z-test comparison #
The comparison of a mean with a reference value can also be conducted using the z-test. The test works exactly the same way, but the CDF and ICDF are now based on the z-distribution. Furthermore, the z-statistic must be calculated (Lesson 2) instead of the t-statistic.
Note that the z-distribution should only be used if n > 30, as only then we can assume s to be representative of \sigma (Lesson 3).
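One way to see why the size of n matters is to compare the two-sided critical values of the t-distribution with that of the z-distribution. The sketch below assumes the `Distributions` package and \alpha = 0.05; the degrees of freedom listed are chosen only for illustration.

```julia
using Distributions

alpha = 0.05
z_crit = quantile(Normal(), 1 - alpha / 2)      # standard normal (z) distribution

dofs = [5, 30, 100]                              # illustrative degrees of freedom
t_crits = [quantile(TDist(dof), 1 - alpha / 2) for dof in dofs]

println("z_crit = $(round(z_crit, digits=3))")
for (dof, t_crit) in zip(dofs, t_crits)
    println("dof = $dof  ->  t_crit = $(round(t_crit, digits=3))")
end
```

For small samples the t-distribution demands a clearly larger deviation before H_0 is rejected, while for larger samples its critical value approaches that of the z-distribution.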

READ SECTION 9.3.3
Tail testing
4. One-sided tests #
We have just compared a sample mean retention index to a reference value. In this specific case it did not matter whether any deviation was positive or negative, because we were interested in whether there was a match. In other words, both positive and negative were equally important and consequently we performed a two-tailed test.
There are, however, also cases where the interest is specifically aimed at either a negative or positive deviation. For example, to test whether a threshold has been exceeded (e.g. a pesticide concentration in surface water), the hypotheses could be:
H_0: \mu\leq\mu_0 and H_1: \mu>\mu_0
In this case, it is desirable to focus the test on the positive (right) direction of the PDF. We would then do a right-sided or right-tail test. The contrary is true if our interest is to determine whether a minimum concentration has been realised, for instance in material analysis, where a variable must be at a minimum level for a product property to be achieved. Here, the hypotheses are:
H_0: \mu\ge\mu_0 and H_1: \mu<\mu_0
This would be an example of a left-tail or left-sided test. In either case, the t statistic is calculated slightly differently:
Equation 9.26: t_{\text{obs}}=\frac{{\bar{x}-\mu_{0}}}{s/\sqrt{n}}
Considering the sign in H_1 can help in deciding what to do. Generally, if the sign in H_1 points leftwards (<) it concerns a left-tail test, if it points rightwards (>) it concerns a right-tail test, and a not-equal sign (\neq) points to a two-tailed test. One-sided t-tests or z-tests work the same as the two-sided variants, with the only exceptions being the slightly different computation of the test statistic and the fact that the significance is now placed entirely on one side.
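The three variants can be put side by side in a small helper. This is a sketch under stated assumptions: `t_test_pvalue` and its `tail` keyword are hypothetical names introduced here, and the retention index data are reused purely to illustrate the arithmetic, not because a one-sided question applies to that example.

```julia
using Distributions, Statistics

# Hypothetical helper (not from the text): p-value of a one-sample t-test
function t_test_pvalue(x, mu0; tail::Symbol = :both)
    n = length(x)
    t_signed = (mean(x) - mu0) / (std(x) / sqrt(n))   # Equation 9.26 (sign is kept)
    dist = TDist(n - 1)
    if tail == :right            # H_1: mu > mu0, area to the right of t
        return ccdf(dist, t_signed)
    elseif tail == :left         # H_1: mu < mu0, area to the left of t
        return cdf(dist, t_signed)
    elseif tail == :both         # H_1: mu != mu0, both tails beyond |t| (Equation 9.24)
        return 2 * ccdf(dist, abs(t_signed))
    else
        throw(ArgumentError("tail must be :left, :right or :both"))
    end
end

x = [1293, 1291, 1285, 1287, 1291, 1283]
p_left = t_test_pvalue(x, 1290; tail = :left)
p_right = t_test_pvalue(x, 1290; tail = :right)
p_both = t_test_pvalue(x, 1290; tail = :both)

println("left: $(round(p_left, digits=3)), right: $(round(p_right, digits=3)), two-sided: $(round(p_both, digits=3))")
```

Since the sample mean lies below the reference, the left-tail p-value is the small one, and the left- and right-tail p-values add up to 1.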

Concluding Remarks #
We have learned how to conduct our very first hypothesis test. With hypothesis testing, we assess the probability of observing our data when H_0 is true. We do not assess the probability of the hypothesis in the context of our data. There are some important consequences:
- The p-value is the probability of obtaining the observed data (or more extreme data), supposing that H_0 is true.
- The p-value is not the probability that H_0 is true given the data.
- The p-value is not the probability of wrongly rejecting H_0 (this value is \alpha).
- The p-value does not inform us at all about the validity of a certain hypothesis.
Exercise Solution #
Below are some helpful files to check whether you did the exercise correctly or to help you forward.

% MATLAB solution
% Data
x = [1293, 1291, 1285, 1287, 1291, 1283];
a = 0.05; % Significance level (alpha)
ref = 1290; % Reference value (mu_0)
% Gathering Information
n = length(x);
x_dof = n-1; % Degrees of freedom
x_mean = mean(x);
x_std = std(x);
p_crit = 1-a/2; % Cumulative probability for the two-sided critical value (1-Alpha/2)
% Step IV: Calculate Test Statistic
t_obs = abs(x_mean-ref)/(x_std/sqrt(n));
% Step V: Critical Value Approach
t_crit = icdf('T',p_crit,x_dof);
% For Two-Sided Test Specifically
if t_obs < t_crit && t_obs > -1*t_crit
    accept = 'H0';
else
    accept = 'H1';
end
% Step V: P-Value Approach
p = 2*(1-cdf('T',t_obs,x_dof)); % Two-sided p-value
if p>a
    accept = 'H0';
else
    accept = 'H1';
end
The attached file below contains a fully worked out example with additional explanation.
It can be downloaded here (CS_04_EE, .XLSX).
# Julia solution
using Distributions, Statistics
# Data
x = [1293, 1291, 1285, 1287, 1291, 1283]
a = 0.05     # significance level (alpha)
ref = 1290   # reference value (mu_0)
# Gathering information
n = length(x)
x_dof = n - 1        # degrees of freedom
x_mean = mean(x)
x_std = std(x)
p_crit = 1 - a / 2   # cumulative probability for the two-sided critical value
# Step IV: Calculate test statistic
t_obs = abs(x_mean - ref) / (x_std / sqrt(n))
# Step V: Critical value approach
dist = TDist(x_dof)
t_crit = quantile(dist, p_crit)
# For two-sided test specifically
if t_crit > t_obs > (-1*t_crit)
    accept = "H0"
else
    accept = "H1"
end
# Step V: p-value approach
p = 2 * (1 - cdf(dist, t_obs))   # two-sided p-value
if p > a
    accept = "H0"
else
    accept = "H1"
end