
Part VI: The Quality of GHb Testing

February 2005
with Sten Westgard, MS
The December 2004 issue of Clinical Laboratory Strategies provides an article on “Using POCT in Diabetes Management” [1]. The recommendations discussed come from the National Academy of Clinical Biochemistry (NACB) draft of a laboratory medicine practice guideline (LMPG) that was presented at the 2004 Clinical Chemistry meeting [2]. The recommendations consider a number of tests, including glycohemoglobin (GHb). With respect to GHb, the report suggests there is “good evidence to support the use of POCT” because of the benefit of having test results available when the physician sees the patient.

While this recommendation originates as part of a study of evidence-based medicine (EBM), we question whether there really is sufficient evidence that the quality of testing is adequate for the recommended application. The medical usefulness of glycohemoglobin is not the issue. This test is generally accepted as a surrogate for clinical outcome in the treatment of diabetic patients. It is therefore one of the most important laboratory tests today, and its comprehensive use in the diabetic population is often taken as a measure of the quality of treatment provided by a healthcare organization. The issue is the analytical performance that is available from current measurement procedures.

In an earlier report [3], NACB provided more detailed guidelines for glycohemoglobin testing, as follows:

  • GHb should be measured at least twice a year;
  • Treatment goals should be based on clinical studies such as the Diabetes Control and Complications Trial (DCCT);
  • The desired GHb concentration is 7% or less;
  • Treatment should be reevaluated when GHb exceeds 8%;
  • US laboratories should use methods certified by the National Glycohemoglobin Standardization Program;
  • Methods should have an interassay CV < 5%, ideally < 3%;
  • Two control materials with different mean values (high and low) should be analyzed at the beginning and end of each day’s run.
The last point implies there should be a total of 4 control measurements per run, which is not likely to be the practice in most US laboratories today because the CLIA regulations require only 2 levels of control per day. It is interesting that GHb is not a regulated analyte, i.e., CLIA does not define any criterion for acceptable performance for GHb tests. Furthermore, this also means that proficiency testing programs generally provide only 2 specimens per testing event rather than the 5 specimens per event that would be required for regulated analytes. And performance is graded on the curve, i.e., against the observed distribution, which represents the "state of the art" for current measurement procedures, rather than any defined requirement for quality.

Materials and Methods

Sigma-metrics are estimated for glycohemoglobin testing based on proficiency testing data collected during 2004. The available data is limited because only 2 specimens are provided per testing event.

  • Given there is no CLIA criterion for acceptable performance, we initially defined TEa to be 10%. If short-term glucose testing is supposed to be correct within 10%, then long-term monitoring should be expected to be consistent with that same quality requirement. As part of this assessment, however, the effect of different quality requirements is evaluated for TEa from 10% to 25% and, in concentration units, from 1.0 %Hb to 2.5 %Hb.
  • Survey specimens were selected near the medically important decision concentration of 7 %Hb. Most samples were actually in the 8 to 9 %Hb range, which would represent patients whose treatment would need to be re-assessed if following the NACB treatment recommendations.
  • PT data comes from 2004 surveys performed by the American Academy of Family Physicians (AAFP), Medical Laboratory Evaluation (MLE), American Association of Bioanalysts (AAB), American Proficiency Institute (API), College of American Pathologists (CAP), and New York State (NY).
  • National Test Quality (NTQ) observed for a single proficiency testing sample is estimated from the allowable total error (TEa) divided by the group SD or CV, i.e., Sigma = TEa/CV. The average NTQ observed for multiple surveys is weighted for the number of laboratories participating in each survey.
  • Local Method Quality (LMQ) for a single proficiency testing sample is a weighted average of the Sigmas determined for each method subgroup without accounting for method bias, i.e., Sigma = TEa/CVmethsubgroup. The average LMQ observed for multiple surveys is weighted for the number of laboratories participating in each survey.
  • National Method Quality (NMQ) observed for a single proficiency testing sample is a weighted average of the Sigmas determined for each method subgroup taking bias into account, i.e., Sigma = (TEa – biasmethsubgroup)/CVmethsubgroup. The average NMQ observed for multiple surveys is weighted for the number of laboratories participating in each survey.
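
The Sigma definitions in the list above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' actual computation: the function names and the two method subgroups are hypothetical, and taking the absolute value of the bias is an assumption (the text writes simply TEa – bias).

```python
def sigma_metric(tea_pct, cv_pct, bias_pct=0.0):
    """Sigma = (TEa - |bias|) / CV, with all inputs in percent.

    With bias_pct left at 0 this is the NTQ/LMQ form (Sigma = TEa/CV);
    with a subgroup bias supplied it is the NMQ form.
    Using |bias| is an assumption made for this sketch.
    """
    if cv_pct <= 0:
        raise ValueError("CV must be positive")
    return (tea_pct - abs(bias_pct)) / cv_pct


def weighted_average(values, weights):
    """Average of values weighted by the number of participating labs."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical method subgroups, for illustration only (not survey data).
subgroups = [
    {"labs": 120, "cv": 4.0, "bias": 1.0},
    {"labs": 80, "cv": 5.0, "bias": 2.0},
]
tea = 10.0  # quality requirement in percent
labs = [g["labs"] for g in subgroups]

# LMQ ignores subgroup bias; NMQ subtracts it from TEa.
lmq = weighted_average([sigma_metric(tea, g["cv"]) for g in subgroups], labs)
nmq = weighted_average(
    [sigma_metric(tea, g["cv"], g["bias"]) for g in subgroups], labs
)
print(f"LMQ = {lmq:.2f}, NMQ = {nmq:.2f}")
```

Because bias is subtracted from TEa before dividing by CV, NMQ can never exceed LMQ for the same subgroups, which is why LMQ is the more optimistic estimate.
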
Predictive estimates of sigma performance were also made on the basis of the NACB recommendations for treatment guidelines and method performance specifications. These estimates make use of a clinical quality-planning model [4] that interprets the gray zone between two different treatment decisions as a “clinical decision interval” and accounts for the expected within-subject biologic variation, as well as the precision and bias of the measurement procedure and the error detection characteristics of the QC rules and numbers of control measurements. [See also on this website – Quality Planning Models] A figure of 4.1% was used for within-subject biologic variation [5]. The calculations and associated graphics were provided by the EZ Rules 3 computer program.

Further details on the methodology are discussed in an earlier essay.


Table 1 shows the proficiency testing results from approximately 5000 laboratories, about half the number included in the earlier case studies for cholesterol, glucose, and calcium. These data come from the same 5 survey programs, as shown in column 1. The numbers of labs in each of the survey programs, as shown in column 2, are also about half of the numbers in the earlier cases. The actual specimen concentrations range from 8.11 %Hb to 9.30 %Hb, with a weighted average of 9.06 %Hb. If TEa is 10%, the National Test Quality is estimated to range from 1.31 Sigma to 2.43 Sigma for the different survey programs, with a weighted average of 1.93 Sigma. A similar estimate was obtained for National Method Quality. The more optimistic estimates for Local Method Quality range from 2.33 Sigma to 2.82 Sigma, with a weighted average of 2.57 Sigma.

Table 1. Summary of Glycohemoglobin Quality for TEa = 10% from 5 national PT survey programs

Survey Program    Labs    Group Mean (%Hb)    NTQ     LMQ     NMQ     Datasheet
AAFP               209    9.30                1.82    2.76    2.12    1
MLE                342    9.03                1.31    2.33    1.15    2
AAB                885    8.11                1.53    2.50    1.82    3
API               1650    9.27                1.69    2.35    1.69    4
CAP               1980    9.30                2.43    2.82    2.29    5
Group summary     5066    9.06                1.93    2.57    1.93
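
The group summary row can be checked by weighting each survey's figures by its number of participating laboratories. The sketch below uses the rounded values printed in Table 1, so small differences in the last digit are to be expected; the variable and function names are illustrative only.

```python
# Rows of Table 1: (survey, labs, group mean %Hb, NTQ, LMQ, NMQ)
table1 = [
    ("AAFP", 209, 9.30, 1.82, 2.76, 2.12),
    ("MLE", 342, 9.03, 1.31, 2.33, 1.15),
    ("AAB", 885, 8.11, 1.53, 2.50, 1.82),
    ("API", 1650, 9.27, 1.69, 2.35, 1.69),
    ("CAP", 1980, 9.30, 2.43, 2.82, 2.29),
]


def lab_weighted(column_index):
    """Average one numeric column, weighted by the number of labs."""
    total_labs = sum(row[1] for row in table1)
    return sum(row[1] * row[column_index] for row in table1) / total_labs


total_labs = sum(row[1] for row in table1)
summary = {
    "mean_hb": lab_weighted(2),
    "NTQ": lab_weighted(3),
    "LMQ": lab_weighted(4),
    "NMQ": lab_weighted(5),
}

print(f"labs = {total_labs}")
for name, value in summary.items():
    print(f"{name} = {value:.2f}")
```

Because CAP and API together contribute more than 70% of the laboratories, their results dominate the group summary.
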

As shown in the Sigma metrics graph, these estimates of method quality are rather dismal and the methods are not controllable with Ns of 2, or even Ns of 4!


Table 2 shows the effect of the defined quality requirement on the estimation of the Sigma metrics. The quality requirement was varied both in terms of percent (10% to 25%) and in concentration units (1.0 %Hb to 2.5 %Hb). This “statistical sensitivity analysis” is useful for identifying the current level of reliable test performance. As shown in the bottom two rows of the table, current analytical methodology can achieve 5 Sigma to 6 Sigma performance when the quality requirement is 25% or 2.5 %Hb. That means that the current NACB treatment guidelines, which require that a level of 8.0 %Hb be measurably distinct from 7.0 %Hb, are not consistent with current analytical methodology. The “state of the art” today is that a level of 6.0 %Hb is distinguishable from a level of 9.0 %Hb by the analytical methods in use.

Table 2. Glycohemoglobin Quality Achievable by Current Laboratory Methods

Quality Requirement             Sigma-metrics for current laboratory methods
Allowable Total Error   Labs    Group Mean (%Hb)    NTQ     LMQ     NMQ     Datasheet
10%                     5066    9.06                1.93    2.57    1.93    Table 1
1.0 %Hb                 5066    9.06                2.12    2.85    2.20    Table 1 (2)
15%                     5066    9.06                2.89    3.86    3.21    Table 1 (3)
20%                     5066    9.06                3.86    5.15    4.50    Table 1 (4)
2.0 %Hb                 5066    9.06                4.24    5.69    5.04    Table 1 (5)
25%                     5066    9.06                4.82    6.44    5.79    Table 1 (6)
2.5 %Hb                 5066    9.06                5.30    7.11    6.47    Table 1 (7)
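
The percentage rows of Table 2 follow a simple pattern: when bias is not included (NTQ and LMQ), Sigma = TEa/CV, so Sigma scales in direct proportion to TEa. The sketch below illustrates that scaling and the conversion of a concentration-based requirement to percent. It is an approximation built on the rounded 10% figures, and the function names are hypothetical. It does not apply to NMQ, where bias is subtracted before dividing, nor exactly to the %Hb rows, where each survey has its own group mean.

```python
GROUP_MEAN_HB = 9.06  # weighted group mean from Table 1, in %Hb
NTQ_AT_10 = 1.93      # group NTQ for TEa = 10%
LMQ_AT_10 = 2.57      # group LMQ for TEa = 10%


def tea_as_percent(tea_hb, concentration_hb):
    """Express a requirement given in %Hb as a percent of the concentration."""
    return 100.0 * tea_hb / concentration_hb


def scale_sigma(sigma_ref, tea_ref_pct, tea_new_pct):
    """Rescale a bias-free Sigma (TEa/CV) from one TEa to another."""
    return sigma_ref * tea_new_pct / tea_ref_pct


for tea in (15.0, 20.0, 25.0):
    ntq = scale_sigma(NTQ_AT_10, 10.0, tea)
    lmq = scale_sigma(LMQ_AT_10, 10.0, tea)
    print(f"TEa {tea:.0f}%: NTQ ~ {ntq:.2f}, LMQ ~ {lmq:.2f}")

# A 1.0 %Hb requirement at the group mean, and at the 7 %Hb decision level.
print(f"1.0 %Hb at 9.06 %Hb = {tea_as_percent(1.0, GROUP_MEAN_HB):.1f}%")
print(f"1.0 %Hb at 7.0 %Hb  = {tea_as_percent(1.0, 7.0):.1f}%")
```

The same 1.0 %Hb requirement is a tighter percentage requirement at higher concentrations, which is one reason the percent and %Hb rows of Table 2 interleave rather than coincide.
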

Datasheets for these assessments of quality are available in the GHb data file on this website.

To understand the effect of the NACB recommendations for method performance relative to the recommendation for test interpretation, a decision interval of 14.3% (1 %Hb / 7 %Hb) was evaluated taking into account an expected within-subject biologic CV of 4.1%, method bias of 0%, and a method CV of 5%, the maximum allowable CV recommended by NACB. Using a clinical quality-planning model [4], the Sigma metrics calculations and graph predict very poor performance, consistent with what is observed in the data from PT surveys.
If a more optimistic estimate of method precision is used, i.e., the 3.0% figure that NACB recommends as optimal, the Sigma metrics graph shows that performance will be somewhat better, but still not acceptable for the recommended interpretation of the test.


To satisfy the NACB interpretive guidelines, specifications for analytical performance and QC can be derived from the clinical planning model. The input parameters are shown here, along with the Sigma metrics graph and the QC procedure that has been selected using the EZ Rules 3 computer program. To distinguish a GHb of 7.0 %Hb from 8.0 %Hb requires a method with a CV of 2.0%, bias of 0%, and a QC procedure using 2.5s control limits with 2 levels of control per run.
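
The three precision scenarios discussed above (method CVs of 5%, 3%, and 2%) can be compared with a simplified decision-interval calculation. This is only a sketch under stated assumptions, not the EZ Rules 3 computation: it assumes a reduced form of the clinical model in which within-subject and analytical variation are combined in quadrature, a z-value of 1.65, a single sample and a single measurement, and no other sources of variation. The function name is hypothetical, and the full model in [4] may include additional terms.

```python
import math

Z = 1.65  # assumed z-value (maximum 5% defect rate), an assumption of this sketch


def critical_systematic_error(decision_interval_pct, cv_within_pct,
                              cv_method_pct, bias_pct=0.0, z=Z):
    """Size of the systematic shift, in multiples of the method SD, that
    would have to be detected by QC to keep results within the decision
    interval (simplified form; see the assumptions in the text)."""
    if cv_method_pct <= 0:
        raise ValueError("method CV must be positive")
    # Within-subject and analytical variation combined in quadrature.
    combined_cv = math.sqrt(cv_within_pct ** 2 + cv_method_pct ** 2)
    return (decision_interval_pct - abs(bias_pct) - z * combined_cv) / cv_method_pct


decision_interval = 100.0 * 1.0 / 7.0  # 1 %Hb at 7 %Hb, about 14.3%
within_subject_cv = 4.1                # within-subject biologic CV, from [5]

results = {
    cv: critical_systematic_error(decision_interval, within_subject_cv, cv)
    for cv in (5.0, 3.0, 2.0)
}
for cv, shift in results.items():
    print(f"method CV {cv:.1f}%: critical shift ~ {shift:.2f} method SDs")
```

Under these assumptions the critical shift is well under 1 SD at a 5% CV, so a method that just meets the maximum allowable CV leaves almost no margin for QC to work with. The margin grows as the CV falls to 3% and then 2%, in line with the ordering described in the text.
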



The assessment of the quality of glycohemoglobin testing is more complicated because GHb is not a regulated analyte, PT data is limited, and there is no agreed-upon quality requirement. It is difficult to understand why GHb isn’t a regulated analyte, given the general acceptance that this is one of the most important tests for monitoring the effectiveness of treatment in one of the largest disease populations.

Earlier assessments of the quality of cholesterol, glucose and calcium tests were straightforward because they were regulated analytes and CLIA defined the quality required for acceptable performance in proficiency testing. As we extend the investigation of quality to other tests, our assessment methodology must also extend beyond PT data. With the GHb example, estimates of Sigma performance from predictive quality-planning models provide new insights into the quality that can be expected on the basis of treatment guidelines and performance specifications. And we can now begin to understand why the quality of a laboratory test is what it is!

For GHb, the guidelines emerging under the auspices of NACB and the principles of evidence-based medicine identify medically important cutoffs for treatment decisions as well as method performance specifications. In this case, the use of a clinical quality-planning model demonstrates that the analytical performance specifications for precision and accuracy are not consistent with the recommended interpretation of the test and the associated treatment guidelines.

With the use of quantitative models that incorporate the relationship between quality, precision, accuracy, and QC, we can also assess how current performance should be reflected in the interpretation of test results. In effect, we can work backwards from the performance observed to assess the reliability of test results for clinical use. For GHb, test results are reliable for changing treatment based on a decision interval of 3 %Hb, but not for the current recommendation, where the decision interval is from 7 %Hb to 8 %Hb. For test results to reliably distinguish a patient having a GHb of 7 %Hb from one with 8 %Hb, the measurement procedure must operate with a 2% CV and zero bias. Such performance is attainable by some of today’s analytical systems, but it is not generally achieved, as revealed by the available PT data.


This study of the quality of glycohemoglobin testing in laboratories today challenges any assumption that this testing can or should be carried out at POC sites. The quality in large laboratories employing the best automated methodology is not good enough today. The quality in smaller laboratories appears to be even worse. The evidence presented here shows that glycohemoglobin is NOT ready for prime time testing by operators with minimal analytical training and skills! GHb testing requires the best analysts working with the best measurement technology available today!


  1. Using POCT in Diabetes Management: The NACB Draft LMPG Recommendations. Clinical Laboratory Strategies 2004;9(Dec):5-8.
  2. Evidence-Based Practice for POCT.
  3. Sacks DB, Bruns DE, Goldstein DE, Maclaren NK, McDonald JM, Parrott M. Guidelines and recommendations for laboratory analysis in the diagnosis and management of diabetes mellitus. Clin Chem 2002;48:436-472.
  4. Westgard JO, Hyltoft Petersen P, Wiebe DA. Laboratory process specifications for assuring the quality in the US National Cholesterol Education Program. Clin Chem 1991;37:656-661.
  5. Lytken Larsen M, Fraser CG, Hyltoft Petersen P. A comparison of analytical goals for haemoglobin A1c assays derived using different strategies. Ann Clin Biochem 1991;28:272-278.

James O. Westgard, PhD, is a professor of pathology and laboratory medicine at the University of Wisconsin Medical School, Madison. He also is president of Westgard QC, Inc., (Madison, Wis.) which provides tools, technology, and training for laboratory quality management.