
CLSI EP21 Reaffirms Total Analytical Error as the primary principle for performance

In an extended review of the new CLSI EP21 standard, Dr. Paulo Pereira discusses how this guideline reaffirms total analytical error as the primary principle for quantitative performance assessment.

CLSI EP21 (3rd Ed., 2025): Reaffirming Total Analytical Error as the primary principle for quantitative performance assessment

Dr. Paulo Pereira
January 2026

Introduction - Total Analytical Error as the core decision framework for quantitative analytical performance

Clinical stakeholders do not experience “bias” and “imprecision” as separate phenomena - they experience a result that is either accurate enough for a medical decision - or not. EP21 is built around that decision-facing reality by operationalizing total analytical error (TAE) as the integrated expression of analytical deviation for a quantitative measurement procedure (1) and by linking the evaluation to predefined acceptability criteria (2) (see Guest Essay: New edition of CLSI EP46: best practices for determining allowable total error).

This framing is not “philosophy”; it is the practical translation of the two classical analytical pillars - systematic effects and random effects - into an actionable performance conclusion. EP21 therefore functions as a synthesis standard that allows laboratories, manufacturers, and assessors to answer the question that matters most: does this procedure meet clinical needs across the measuring interval?

 

1) The pillars are unchanged: bias and precision still underpin performance

EP21’s integrated view is fully consistent with metrology’s core vocabulary. In the International Vocabulary of Metrology (VIM), measurement precision is defined as closeness of agreement among indications or measured quantity values obtained by replicate measurements under specified conditions, and measurement bias is an estimate of systematic measurement error (3). These concepts remain the foundation for method validation and verification in laboratory medicine, and they map directly onto the way performance evidence is generated through established CLSI protocols for precision evaluation, verification, and method comparison (4-6).

EP21’s “reaffirmation” is therefore not a return to earlier paradigms; it formalizes the need to translate bias and imprecision into an integrated, decision-relevant statement of analytical acceptability.

 

2) EP21’s statistical “how”: a non-parametric TAE route grounded in patient samples

A key (and sometimes underappreciated) point in CLSI EP21 (3rd ed.) is that its recommended evaluation of TAE is naturally implemented as a non-parametric (distribution-free) estimate derived directly from the empirical distribution of paired patient-sample differences between a candidate and a comparator measurement procedure (7). In EP21, TAE is commonly defined as the central 95% region of observed differences, operationalized via the 2.5th and 97.5th percentiles ($P_{\mathrm{low}} = 0.025$; $P_{\mathrm{high}} = 0.975$) (1), and then judged against a pre-set allowable total error (ATE) goal/limit (2).

 

EP21 non-parametric model (percentile-based)

For each unique patient sample i (i = 1…n), compute the paired difference:

  • Difference in measurand units

$d_i = y_{\mathrm{cand},i} - \bar{y}_{\mathrm{comp},i}$

where $\bar{y}_{\mathrm{comp},i}$ is the mean of R replicate comparator results for sample i (EP21 explicitly allows R to depend on the imprecision ratio between candidate and comparator).

Then define the TAE interval estimate for a chosen central region (most often 95%):

  • TAE interval (central 95%)

$\mathrm{TAE}_{\mathrm{lower}} = Q_{0.025}(d), \quad \mathrm{TAE}_{\mathrm{upper}} = Q_{0.975}(d)$

EP21 notes that TAE is often reported either as this interval $(\mathrm{TAE}_{\mathrm{lower}}, \mathrm{TAE}_{\mathrm{upper}})$ or as a single conservative magnitude such as:

$\mathrm{TAE} = \max(|\mathrm{TAE}_{\mathrm{lower}}|, |\mathrm{TAE}_{\mathrm{upper}}|)$

and emphasizes that, by design, the EP21 protocol does not aim to separately estimate bias and imprecision from this experiment - it estimates their combined effect as it manifests in patient-sample result differences (while bias/precision can still be evaluated independently in separate studies).
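To make the computation concrete, the following Python sketch (illustrative only, not EP21's worked example) estimates the percentile-based TAE from paired patient-sample differences, assuming candidate results and comparator replicates have already been collected under an EP21-style design; the function name, variable names, and simulated data are all hypothetical.

```python
import numpy as np

def nonparametric_tae(y_cand, y_comp_reps, central=0.95):
    """Percentile-based TAE estimate from paired patient-sample differences.

    y_cand      : candidate results, one per unique patient sample (length n)
    y_comp_reps : comparator replicate results, shape (n, R)
    central     : central region of differences (0.95 -> 2.5th/97.5th percentiles)
    """
    y_cand = np.asarray(y_cand, dtype=float)
    y_comp_mean = np.asarray(y_comp_reps, dtype=float).mean(axis=1)  # mean of R comparator replicates per sample
    d = y_cand - y_comp_mean                                         # d_i, in measurand units
    alpha = 1.0 - central
    tae_lower, tae_upper = np.quantile(d, [alpha / 2.0, 1.0 - alpha / 2.0])
    tae_magnitude = max(abs(tae_lower), abs(tae_upper))              # conservative single value
    return tae_lower, tae_upper, tae_magnitude

# Illustrative use with simulated data: n = 120 unique samples, R = 2 comparator replicates
rng = np.random.default_rng(7)
true_vals = rng.uniform(50.0, 150.0, size=120)
y_comp = true_vals[:, None] + rng.normal(0.0, 1.5, size=(120, 2))   # comparator imprecision only
y_cand = true_vals + 1.0 + rng.normal(0.0, 2.0, size=120)           # candidate: bias of 1.0 plus imprecision
print(nonparametric_tae(y_cand, y_comp))
```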

 

Rationale for EP21 minimum sample size requirements in IVD-MD development

EP21 links the minimum n needed for a non-parametric estimate of the central $(1-\alpha) \cdot 100\%$ region of differences to a simple bound:

$n \ge 2/\alpha$

So, for the central 95% region (α = 0.05), n ≥ 40 is the minimum; for the central 99% region (α = 0.01), n ≥ 200. EP21 also makes clear that percentile estimates based on small n (e.g., 40) are more sensitive to extreme observed values, and therefore provides practical recommendations for typical studies (e.g., 120 unique patient samples across the analytical measuring interval (AMI), and per-subinterval minima such as 60 for 2 subintervals or 40 for ≥3 subintervals). This “minimum/robust” framing is one reason EP21 is implementable under manufacturer constraints while remaining clinically anchored.
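As a quick arithmetic check of the bound above (a minimal sketch; the function name is illustrative):

```python
import math

def min_unique_samples(central=0.95):
    """Minimum n for a non-parametric central (1 - alpha)*100% region, per n >= 2/alpha."""
    alpha = 1.0 - central
    return math.ceil(2.0 / alpha)

print(min_unique_samples(0.95))  # 40
print(min_unique_samples(0.99))  # 200
```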

 

How this differs from the classic parametric “Westgardian” route - without changing the principles

Many laboratorians and manufacturers are familiar with the classical parametric total-error formulation, which models the analytical error distribution using separately estimated components - typically bias (systematic error) and within-laboratory SD (random error) - and then derives an error limit under normality assumptions (8).

 

Classical parametric “Westgard” model (bias + z·SD)

EP46 summarizes the historical Westgard approach as an error interval driven by bias and imprecision, with a chosen normal-distribution multiplier z (e.g., 1.96 for 95% two-sided; 1.65 for 95% one-sided) (2):

  • Parametric error interval

$(|\mathrm{Bias}| - z \cdot \mathrm{SD}_{\mathrm{WL}},\ |\mathrm{Bias}| + z \cdot \mathrm{SD}_{\mathrm{WL}})$

In practice (e.g., with positive bias), the “worst-case” magnitude for the central 95% is often taken as:

$\mathrm{TAE} \approx |\mathrm{Bias}| + z \cdot \mathrm{SD}_{\mathrm{WL}}$

with analogous handling for negative bias. EP46 also explicitly notes key limitations of this parametric model in real use, e.g., it assumes normal errors and may not reflect additional real-world error sources such as rare gross outliers, interferences, drift, lot effects, and other contributors.
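For comparison, here is a minimal sketch of this parametric route, assuming bias and within-laboratory SD have already been estimated in separate studies (e.g., per EP09 and EP05); the function and variable names are illustrative, not from EP46.

```python
def parametric_tae(bias, sd_wl, z=1.96):
    """Classical bias + z*SD formulation: error interval and worst-case magnitude."""
    lower = abs(bias) - z * sd_wl
    upper = abs(bias) + z * sd_wl
    return (lower, upper), upper          # worst-case magnitude = |Bias| + z*SD_WL

interval, tae = parametric_tae(bias=1.0, sd_wl=2.0)
print(interval, tae)                      # -> (-2.92, 4.92) and 4.92
```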

 

Same principle, different statistics

EP21’s percentile-based method is different in the statistics (non-parametric, distribution-free), but identical in the governing principle:

  • quantify the combined effect of systematic and random error on patient results, and
  • judge acceptability against a predefined allowable error specification (ATE).

In other words: EP21 does not contradict the Westgard logic. It preserves the same clinical question (“How large can the analytical error be, with high probability, under intended-use conditions?”) while offering a robust estimation route that avoids imposing a parametric form when the empirical distribution of patient-sample differences does not behave ideally across the measuring interval.
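A brief numerical illustration of this equivalence-in-principle: on well-behaved (approximately normal) differences, the parametric |bias| + 1.96·SD magnitude and the percentile-based limits land in roughly the same place; the simulated data below are purely illustrative, and the percentile route simply avoids the normality assumption when differences do not behave ideally.

```python
import numpy as np

rng = np.random.default_rng(1)
d = rng.normal(loc=1.0, scale=2.0, size=120)            # simulated paired differences: bias ~1.0, SD ~2.0

parametric = abs(d.mean()) + 1.96 * d.std(ddof=1)       # classical |bias| + z*SD route
p_low, p_high = np.quantile(d, [0.025, 0.975])          # EP21-style percentile route
nonparametric = max(abs(p_low), abs(p_high))

print(f"parametric: {parametric:.2f}  non-parametric: {nonparametric:.2f}")  # similar magnitudes
```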

 

3) Why EP21’s minimum sample guidance matters - especially for IVD-MD manufacturers

EP21 explicitly provides a “minimum” approach alongside more robust options, which is one reason it is so implementable across stakeholders. Practicality is not a minor feature: IVD-MD manufacturers often face feasibility constraints (specimen availability, timelines, multi-site logistics, and iteration during development). The minimum design guidance - while still anchored in patient specimens and measuring-interval coverage - makes EP21 realistic to apply under manufacturer conditions, without requiring complex distributional modeling or very large datasets.

That feasibility advantage is amplified when EP21 is used as part of a structured CLSI evidence chain: precision characterization using EP05/EP15 (4,6), method comparison and bias estimation per EP09 (5), and a traceability context consistent with EP32 (9) - all of which are then integrated into a TAE conclusion per EP21.

 

4) EP21 and EP46 together: “estimate TAE” + “set allowable error goals/limits”

TAE is only actionable when compared with an allowable specification. EP46 is designed to support the determination of allowable total error goals and limits for quantitative measurement procedures, providing the conceptual and practical basis for defining ATE/TE goals that are fit for purpose. EP21 then provides the evaluation pathway to estimate TAE and compare it to those goals/limits.
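As a decision step, this pairing reduces to a simple comparison; the minimal sketch below (illustrative function name and values, not an EP46 worked example) accepts a procedure only when the estimated TAE interval lies within the predefined ATE limit.

```python
def tae_meets_ate(tae_lower, tae_upper, ate):
    """Accept only if the whole central-difference interval lies within ±ATE."""
    return max(abs(tae_lower), abs(tae_upper)) <= ate

print(tae_meets_ate(tae_lower=-3.1, tae_upper=4.2, ate=5.0))  # True  -> meets the allowable goal
print(tae_meets_ate(tae_lower=-3.1, tae_upper=6.4, ate=5.0))  # False -> exceeds the allowable goal
```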

This pairing is strategically important for all stakeholders:

  • Medical laboratories gain a defensible method to select/verify procedures and to justify acceptance criteria.
  • Manufacturers gain a coherent, submission-ready performance narrative that links study evidence to explicit allowable error claims.
  • Assessors and agencies gain transparency: a clear chain from allowable goals → study design → observed differences → acceptability decision.

 

5) Alignment with US and EU regulatory expectations, even when EP21 is not explicitly mandated

Even where EP21 is not directly referenced in regulation, its analytical logic aligns closely with what regulators expect manufacturers to demonstrate, document, and communicate about quantitative test performance.

  • US (labeling/performance expectations): US regulation requires IVD labeling to include “specific performance characteristics,” as appropriate - explicitly including accuracy and precision - and to summarize the supporting data, tied to generally accepted methods using biological specimens from normal and abnormal populations. This regulatory expectation naturally supports an integrated total-error narrative as a clear way to communicate whether combined bias + imprecision is acceptable for intended use (10).
  • US (FDA recognition of EP21 as a consensus standard): EP21 (3rd ed.) is listed in FDA’s Recognized Consensus Standards database for IVDs as a Complete standard (Recognition No. 7-347, effective/entry date 12/22/2025), reinforcing EP21 as a regulator-visible “best practice” route for performance substantiation and declarations of conformity where applicable (11).
  • EU (IVDR): Analytical performance is explicitly framed around trueness (bias), precision (repeatability/reproducibility), and accuracy (resulting from trueness and precision) - exactly the conceptual building blocks that EP21 integrates into a single, decision-facing acceptability evaluation via TAE against predefined allowable limits (12,13).
  • EU (MDR ecosystem + implementing rules): MDR provides the broader medical-device regulatory context that reinforces systematic demonstration of device performance and safety, while later instruments such as Commission Implementing Regulation (EU) 2022/1107 add common specifications in areas where tighter harmonization is required (notably for certain class D IVDs), shaping how performance evidence is reviewed and maintained over time (14).

Implication for global manufacturers: EP21 can function as a “best-practice backbone” across jurisdictions - compatible with the analytical-performance vocabulary embedded in EU regulation and simultaneously aligned with US regulatory expectations for performance characterization and communication - while offering a standardized, stakeholder-legible method to judge whether quantitative performance is acceptable for intended use.

 

6) EP21 in the CLSI “method evaluation workflow”: from component evidence to decision evidence

EP21 is most persuasive when it is presented not as a standalone “total error slogan,” but as the decision layer built on established evidence components:

  • Precision evaluation: EP05 (4) provides a comprehensive approach to estimating precision performance.
  • Verification in routine settings: EP15 (6) supports verification of precision and estimation of bias with practical designs suitable for many clinical laboratories.
  • Method comparison and bias estimation: EP09 (5) provides methodological foundations for comparing measurement procedures and evaluating differences across the measuring interval.
  • Traceability and comparability of results: EP32 supports the metrological traceability context needed for durable comparability (9).
  • Integration into a single acceptability conclusion: EP21 integrates observed differences into TAE conclusions against allowable limits.

This is precisely why EP21 is a strong tool for manufacturers and agencies: it converts a set of technical performance descriptors into a decision-relevant performance statement that can be communicated and defended.

7) Measurement uncertainty: essential to results, but not a replacement for EP21

Measurement uncertainty (MU) is indispensable to measurement science, but it answers a different primary question than EP21.

  • In the GUM, measurement uncertainty is a parameter associated with the result of a measurement that characterizes the dispersion of values that could reasonably be attributed to the measurand (15,16).
  • Eurachem emphasizes the conceptual boundary clearly: strictly, MU is not a performance characteristic of a measurement procedure but a property of the results obtained using that procedure (17-20).
  • ISO/TS 20914 provides practical guidance for estimating MU in medical laboratories, reinforcing MU’s role in interpreting and communicating the quality of reported results (21).

 

Why this distinction matters operationally

EP21/TAE is a performance assessment framework: it is fundamentally about deciding whether a procedure is fit for intended clinical purpose by integrating the effects of systematic and random error and comparing them to allowable limits. MU is fundamentally a result property: it describes the dispersion around a reported value under defined conditions and supports result interpretation and decision-making, particularly near clinical decision levels.
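As a result-level illustration (deliberately simplified, and not prescribed by EP21 or quoted from ISO/TS 20914), a top-down MU estimate might combine long-term within-laboratory imprecision with calibrator uncertainty and expand the result with a coverage factor; all names and numbers below are hypothetical.

```python
import math

def expanded_uncertainty(u_rw, u_cal, k=2.0):
    """Combine standard uncertainties (long-term imprecision, calibrator value) and expand (k ≈ 2 for ~95%)."""
    u_c = math.sqrt(u_rw**2 + u_cal**2)
    return k * u_c

# Hypothetical glucose result of 6.0 mmol/L with u_Rw = 0.08 and u_cal = 0.05 mmol/L:
U = expanded_uncertainty(0.08, 0.05)
print(f"6.0 mmol/L ± {U:.2f} mmol/L (k = 2)")   # dispersion attached to the reported result, not to the procedure
```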

So, MU and EP21 do not compete. A coherent, scientifically sound picture is:

  • Bias and precision drive both TAE and MU, but
  • EP21 remains primary for performance accept/reject, while
  • MU remains essential for result-level interpretation and compliance decision rules.

 

Conclusion: EP21’s reaffirmation is about decision clarity, not ideology

EP21 reaffirms that TAE is the primary principle for quantitative performance assessment because it expresses what stakeholders need to know: the combined effect of bias and imprecision on results, judged against allowable limits, across the measuring interval.

Its recommended non-parametric, percentile-based implementation provides a robust statistical route to the same core principle long used in parametric total-error practice - while often being simpler to apply and more resilient to distributional irregularities. The inclusion of a feasible “minimum” approach makes EP21 particularly actionable in IVD-MD manufacturer settings, where study logistics and development cycles demand practicable yet defensible designs.

Measurement uncertainty remains essential in ISO 15189 frameworks - but as a property of results and as an input to decision rules in conformity and compliance, not as a replacement for the acceptability framework embodied in EP21’s non-parametric route or the classical Westgardian parametric route.

 

References

  1. Clinical and Laboratory Standards Institute (CLSI). Evaluation of Total Analytical Error for Quantitative Medical Laboratory Measurement Procedures. CLSI guideline EP21. 3rd ed. Wayne (PA): CLSI; 2025.
  2. Clinical and Laboratory Standards Institute (CLSI). Determining Allowable Total Error Goals and Limits for Quantitative Medical Laboratory Measurement Procedures. CLSI guideline EP46. 1st ed. Wayne (PA): CLSI; 2025.
  3. Joint Committee for Guides in Metrology (JCGM). JCGM 200:2012 International vocabulary of metrology (VIM): Basic and general concepts and associated terms. 3rd ed. 2012.
  4. Clinical and Laboratory Standards Institute (CLSI). Evaluation of Precision of Quantitative Measurement Procedures. CLSI guideline EP05. 3rd ed (A3). Wayne (PA): CLSI; 2014.
  5. Clinical and Laboratory Standards Institute (CLSI). Measurement Procedure Comparison and Bias Estimation Using Patient Samples. CLSI guideline EP09. 3rd ed. Wayne (PA): CLSI; 2018.
  6. Clinical and Laboratory Standards Institute (CLSI). User Verification of Precision and Estimation of Bias. CLSI guideline EP15. 3rd ed (A3). Wayne (PA): CLSI; 2014.
  7. Pereira P. A non-parametric framework for evaluating total analytical error in in vitro diagnostic medical devices in transfusion medicine. Transfus Apher Sci. 2024 Dec;63(6):104026.
  8. Westgard JO, Carey RN, Wold S. Criteria for judging precision and accuracy in method development and evaluation. Clin Chem. 1974 Jul;20(7):825-833.
  9. Clinical and Laboratory Standards Institute (CLSI). Metrological Traceability and Its Implementation. CLSI guideline EP32. 2nd ed. Wayne (PA): CLSI; 2025.
  10. U.S. Food and Drug Administration. 21 CFR Part 809 - In Vitro Diagnostic Products for Human Use. In: Code of Federal Regulations, Title 21, Vol 8. 2012.
  11. U.S. Food and Drug Administration (FDA). Recognized Consensus Standards: Medical Devices-Results (In Vitro Diagnostics): CLSI EP21 3rd Edition, Evaluation of Total Analytical Error for Quantitative Medical Laboratory Measurement Procedures (Recognition No. 7-347; extent: Complete; effective/entry date 2025 Dec 22) [Internet]. Silver Spring (MD): FDA; 2025 [cited 2026 Jan 4]. Available from: https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfStandards/results.cfm?category=InVitro&effectivedatefrom=&effectivedateto=&organization=&pagenum=10&productcode=&referencenumber=&regulationnumber=&sortcolumn=pdd&start_search=1&title=&type=
  12. European Parliament, Council of the European Union. Regulation (EU) 2017/745 of 5 April 2017 on medical devices, amending Directive 2001/83/EC, Regulation (EC) No 178/2002 and Regulation (EC) No 1223/2009 and repealing Council Directives 90/385/EEC and 93/42/EEC (Text with EEA relevance). Official Journal of the European Union. 2017 May 5;L117:1-175.
  13. European Parliament, Council of the European Union. Regulation (EU) 2017/746 of 5 April 2017 on in vitro diagnostic medical devices and repealing Directive 98/79/EC and Commission Decision 2010/227/EU (Text with EEA relevance). Official Journal of the European Union. 2017 May 5;L117:176-332.
  14. European Commission. Commission Implementing Regulation (EU) 2022/1107 of 4 July 2022 laying down common specifications for certain class D in vitro diagnostic medical devices in accordance with Regulation (EU) 2017/746 (Text with EEA relevance). Official Journal of the European Union. 2022 Jul 5;L178:3-56.
  15. Joint Committee for Guides in Metrology (JCGM). JCGM 100:2008 Evaluation of measurement data-Guide to the expression of uncertainty in measurement (GUM). JCGM; 2008.
  16. Joint Committee for Guides in Metrology (JCGM). JCGM 106:2012 Evaluation of measurement data-The role of measurement uncertainty in conformity assessment. JCGM; 2012.
  17. Eurachem/CITAC. Quantifying Uncertainty in Analytical Measurement (QUAM). 3rd ed. Eurachem/CITAC; 2012.
  18. Eurachem; Cantwell H, ed. The Fitness for Purpose of Analytical Methods-A Laboratory Guide to Method Validation and Related Topics. 3rd ed. Eurachem; 2025.
  19. Eurachem; Barwick V, ed. Planning and Reporting Method Validation Studies-Supplement to Eurachem Guide on The Fitness for Purpose of Analytical Methods. 2nd ed. Eurachem; 2025.
  20. Eurachem/CITAC. Use of Uncertainty Information in Compliance Assessment. 2nd ed. Eurachem/CITAC; 2021.
  21. International Organization for Standardization (ISO). ISO/TS 20914:2019 Medical laboratories-Practical guidance for the estimation of measurement uncertainty. Geneva: ISO; 2019.
