
Are Scientific Statements the Scientific Truth?

Callum G. Fraser, Ph.D.

Callum G. Fraser, Ph.D., the noted expert on biologic variation, takes an in-depth look at new guidelines for hsCRP. While the AHA/CDC has produced a scientific statement, sadly, he finds they have not found the scientific truth.

Clinical interpretation of hsCRP results
Calculation of the Number of Samples Required
Consequences of taking two samples - cholesterol
Consequences of taking two samples - hsCRP
Duplicate assays and two samples
Consequences for providing quality
Problems with the AHA/CDC proposal
The final plea
References
About Callum G. Fraser

Biochemical Medicine,
Ninewells Hospital and Medical School,
Dundee DD1 9SY Scotland.

Note: To see the calculators available with this essay, visit this link.

Dr. Westgard has spent considerable time and effort stimulating people to think objectively about the validity of guidelines, recommendations, and statements from expert groups, especially those from professional bodies. A recent essay on this vitally important quality matter can be found in The truth, the whole truth, and nothing but the truth on this website.

Some people may get tired of Dr. Westgard's evangelism about the need for correct test results and the importance of appropriate quality management. However, I agree with him and think that the basic underlying message of many of his essays is of vital importance. It is mandatory for us clinical scientists, and we are scientists first and foremost, to examine the truth of the ever-increasing number of supposedly evidence based guidelines that are published, often very widely, in the medical literature.

Clinical interpretation of hsCRP results

One of Dr. Westgard's examples of something that is "the truth and the whole truth" but unfortunately does not fulfil the final "nothing but the truth" criterion is provided by the much publicised new test - high sensitivity CRP [hsCRP]. In Quintiles and quality on this website, he has explored the problems of interpreting the hsCRP results of individuals using quintiles derived from epidemiological studies of large groups of people.

Since Dr. Westgard's recent essay on the problems of interpretation of hsCRP results using quintiles, the American Heart Association [AHA] and the Centers for Disease Control and Prevention [CDC] have published an AHA/CDC Scientific Statement. This concerns Markers of Inflammation and Cardiovascular Disease: Application to Clinical and Public Health Practice [1]. The summary of the Statement documents that, on the basis of published evidence, "it is most reasonable to limit current assays of inflammatory markers to high sensitivity CRP [hsCRP], measured twice, either fasting or nonfasting, with the average expressed in mg/L, in metabolically stable patients. Relative risk categories [low, average, high] correspond to approximate tertiles of values [<1.0, 1.0 to 3.0, and >3.0 mg/L respectively] based on an aggregation of population studies".

The real problem with the idea of interpreting data in quintiles - now apparently tertiles are better - is that biological variation is not taken into account. For hsCRP, this has been discussed by Campbell and colleagues [2], who showed that the number of samples needed to distinguish a CRP of 1.13 mg/L [the mean for the healthy group] from a CRP of 1.38 mg/L [the mean for the group of men who went on to have a stroke] was 18!

Following some background, this essay will explore the effect of performing replicate assays and replicate samples on clinical interpretation based upon quantitative knowledge of both precision and within-subject biological variation. To clarify the terminology used in this discussion:

Sample is used to represent a specimen taken from a patient. Replicate samples means obtaining multiple specimens from a patient.
Assay is used to represent a measurement made on a sample. Replicate assays means making multiple measurements on a sample or specimen.

Calculation of number of samples required

It is easy to calculate the number of samples required to obtain an estimate within a certain percentage of an individual's true homeostatic setting point, using a formula based on a simple standard error of the mean estimate [3],

n = [Z * [CV_A² + CV_I²]^1/2/D]²

where Z is the number of standard deviations appropriate to the probability - and 1.96 is very often used since this is the 95% probability [P < 0.05] level;
CV_A is the analytical precision at the level of the homeostatic setting point;
CV_I is the within-subject biological variation; and
D is the percentage deviation allowed from the true homeostatic setting point.

An online calculator is available here to perform this task. You can use this calculator to verify the numbers in the examples below.
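For readers who prefer to check the arithmetic offline, the formula above can be sketched in a few lines of Python. This is only an illustration: the function name samples_required and its argument names are my own choices, not part of the online calculator, and all CVs and D are assumed to be entered in percent.

```python
import math

def samples_required(cv_a, cv_i, d, z=1.96):
    """n = [Z * (CV_A^2 + CV_I^2)^1/2 / D]^2, with CV_A, CV_I and D in percent.

    Hypothetical helper written for this essay's formula; not the site's calculator.
    """
    return (z * math.sqrt(cv_a ** 2 + cv_i ** 2) / d) ** 2

# Cholesterol figures used in this essay: CV_A = 3.0%, CV_I = 6.0%, D = 10%
n = samples_required(3.0, 6.0, 10.0)
print(f"n = {n:.2f}; round up to {math.ceil(n)} samples")
```

Because a fractional sample cannot be taken, the result is rounded up to the next whole number when it is used in practice.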

Consequences of taking two samples - cholesterol

Recommendations from many national and international experts regarding the interpretation of cholesterol results, including the National Cholesterol Education Program [NCEP] [4], state categorically that two results should be obtained before the individual's "number" is decided. The rationale is that this is to cater for "biological variation".

Using the formula above, let us suppose that the laboratory just attains the maximum precision allowable by NCEP, that is CV = 3% and justifies this performance by reference to the published "truth". We know that the within-subject biological variation of cholesterol is 6.0% [see Biological Variation Database & Desirable Quality Specifications on this website]. If we simply want the mean to be not more than 10% from the true value for that individual with 95% probability, then, since n = [Z * [CV_A² + CV_I²]^1/2/D]², n = [1.96 * [3.0² + 6.0²]^1/2/10]² = 1.73. Thus, the NCEP and the many other guidelines that suggest two samples from an individual should be taken are actually scientifically sound; they are "nothing but the truth".

In contrast, the new AHA/CDC document [1] bases recommendations on rather less good evidence. It is stated that "most of the acute-phase reactant assays have acceptable precision. It needs to be emphasized that the assays considered [in Table 2] were for hsCRP with acceptable precisions down to or below 3.0 mg/L". In fact, the inter-assay precision quoted in Table 2 is CV < 10%. Of course, laboratories might take this, as they did the NCEP guidelines, to mean that a precision of anything up to 10.0% is acceptable.

The document also states that "Considerable within-individual variability exists, however for hsCRP….The final result is that, in a manner similar to cholesterol, two separate measurements of hsCRP are adequate to classify a person's risk level and to account for the increased within-subject variability".

Are two separate measurements satisfactory?

Consequences of taking two samples - hsCRP

Using the above formula and the lowest estimate of within-subject biological variation used by Campbell and colleagues [2] based on the work of Franzini [5] of 30.3%,
n = [Z * [CV_A² + CV_I²]^1/2/D]² so that n = [1.96 * [10.0² + 30.3²]^1/2/10]² = 39.

We need 39 samples to get an estimate within 10% of any individual's homeostatic setting point if our method has the "acceptable precision" of 10% quoted by the AHA/CDC.

What does the mean of two samples actually give us? We can rearrange the above equation to
D = (Z * [CV_A² + CV_I²]^1/2) /n^1/2 so that D = (1.96 * [10.0² + 30.3²]^1/2) /2^1/2 = 44%.

Thus, taking two samples from an individual and taking the mean gives us an estimate that could be +/- 44% of its value with 95% probability. In other words, if we obtained a mean value of 0.9 mg/L [low-risk], this could well be as high as 1.3 mg/L [in the average risk category] and, more importantly, any result more than 2.1 mg/L [well in the average risk tertile] could be above 3.0 mg/L, that is, high risk.
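The reclassification argument above can be sketched as a short Python check. It is a rough illustration only: the helper names are hypothetical, the cut-offs of 1.0 and 3.0 mg/L are assumed to be the tertile boundaries of the Scientific Statement, and the +/- D% band is applied symmetrically around the observed mean.

```python
import math

def achievable_deviation(cv_a, cv_i, n, z=1.96):
    """D = Z * (CV_A^2 + CV_I^2)^1/2 / n^1/2, in percent (rearranged sample-size formula)."""
    return z * math.sqrt(cv_a ** 2 + cv_i ** 2) / math.sqrt(n)

def risk_category(hscrp_mg_l):
    """Assumed AHA/CDC tertiles: <1.0 low, 1.0 to 3.0 average, >3.0 high (mg/L)."""
    if hscrp_mg_l < 1.0:
        return "low"
    if hscrp_mg_l <= 3.0:
        return "average"
    return "high"

d = achievable_deviation(10.0, 30.3, n=2)  # about 44%
for mean in (0.9, 2.1):
    lower = mean * (1 - d / 100)
    upper = mean * (1 + d / 100)
    print(f"mean {mean} mg/L ({risk_category(mean)}): "
          f"{lower:.2f} ({risk_category(lower)}) to {upper:.2f} ({risk_category(upper)})")
```

In both cases the upper end of the band falls in a higher risk category than the observed mean, which is the point made in the text.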

Thus, the idea of getting a risk score that we can depend on by taking two samples for hsCRP seems basically flawed and far from "nothing but the truth".

Duplicate assays and two samples

The effect of analysing one sample twice, or taking two samples and analysing them once each, or taking two samples and assaying them twice each seems poorly understood and not well documented.

We will explore this using cholesterol as the first example.

The dispersion of a result obtained by one analysis of a single sample obtained from an individual is calculated from the formula

Dispersion = Z * [(CV_A²/n_A) + (CV_I²/n_S)]^1/2

where, as above, Z is the number of standard deviations appropriate to the probability selected - 1.96 for 95% [P < 0.05] for example, the most commonly used Z-score;
CV_A is the precision at the level of the result,
n_A is the number of replicate assays or measurements,
CV_I is the within-subject biological variation, and
n_S is the number of patient samples or specimens.

An online calculator for Dispersion is available here. You can use this calculator to verify the numbers in the examples below. Note that there are default values of 1.96 for Z, 1 for n_A, and 1 for n_S.
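As an alternative to the online calculator, the dispersion formula can be sketched in Python as follows. The function name dispersion and its keyword names are hypothetical; the defaults simply mirror those described for the calculator [1.96 for Z, 1 for n_A, 1 for n_S], and all CVs are assumed to be in percent.

```python
import math

def dispersion(cv_a, cv_i, n_a=1, n_s=1, z=1.96):
    """Dispersion = Z * [(CV_A^2 / n_A) + (CV_I^2 / n_S)]^1/2, in percent.

    cv_a: analytical precision (CV, %)      n_a: replicate assays per sample
    cv_i: within-subject variation (CV, %)  n_s: number of patient samples
    """
    return z * math.sqrt(cv_a ** 2 / n_a + cv_i ** 2 / n_s)

# One sample analysed once, cholesterol figures from this essay (CV_A 3.0%, CV_I 6.0%)
print(f"{dispersion(3.0, 6.0):.1f}%")
```

Keeping n_A and n_S as separate arguments makes it easy to compare replicate assays with replicate samples, which is the comparison the rest of this section works through.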

Cholesterol is generally analytically well done by laboratories and most would achieve precision of less than the 3.0% stated as the maximum allowable by NCEP some years ago. This 3% criterion is somewhat empirical; the analysis of cholesterol and the quality specifications actually required are discussed in QC Applications: Cholesterol with analytical requirements, and QC Applications: Cholesterol with clinical requirements. We know that the biological variation of cholesterol is 6.0%.

If we have the usual laboratory mode of operation of one sample from one individual analyzed once, then the 95% dispersion = Z * [CV_A² + CV_I²]^1/2 = 1.96 * [3.0² + 6.0²]^1/2 = 13.1%

Now, we know that improving precision makes some difference, but we might suspect that this would not cause a great effect since CV_A is already 50% of CV_I at most. So, if precision was now 1.0%, then the 95% dispersion = Z * [CV_A² + CV_I²]^1/2 = 1.96 * [1.0² + 6.0²]^1/2 = 11.9% - lower than if we had 3.0% precision but not much lower.

The real problem is if precision deteriorates such as might be found in some POCT systems or self-test systems bought in drug stores, particularly when done with poor operating technique. If precision was 10%, then -

95% dispersion = Z * [CV_A² + CV_I²]^1/2 = 1.96 * [10.0² + 6.0²]^1/2 = 22.9%

Let us now see the effect of a repeat of the analysis on the same sample, taking the mean as the final result. The effect is to reduce the analytical variation of the mean by the square root of the number of analyses, here the square root of 2. The basic statistical fact used throughout this essay is that replicates make the variation smaller by a factor equal to the square root of the number of replicates. Thus, the 95% dispersions are as follows -

with 3% precision, 95% dispersion = Z * [CV_A²/n_A + CV_I²]^1/2 = 1.96 * [3.0²/2 + 6.0²]^1/2 = 12.5%
with 1% precision, 95% dispersion = Z * [CV_A²/n_A + CV_I²]^1/2 = 1.96 * [1.0²/2 + 6.0²]^1/2 = 11.8%
with 10% precision, 95% dispersion = Z * [CV_A²/n_A + CV_I²]^1/2 = 1.96 * [10.0²/2 + 6.0²]^1/2 = 18.2%

The dispersion is made less but the important fact to recognize is that the effect on dispersion of duplicate analyses is significantly greater if the precision is poor.
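The square-root rule behind these figures can be checked numerically. The sketch below is a simple simulation under assumptions of my own: analytical error is taken to be Gaussian around a true value of 100 [so that the SD equals the CV in percent], and the helper name sd_of_means, the trial count and the seed are arbitrary.

```python
import math
import random
import statistics

def sd_of_means(cv, n_replicates, trials=20000, seed=1):
    """SD of the mean of n_replicates simulated assays on one sample.

    Assumes Gaussian analytical error around a true value of 100,
    so an SD of `cv` corresponds to a CV of `cv` percent.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        results = [rng.gauss(100.0, cv) for _ in range(n_replicates)]
        means.append(statistics.fmean(results))
    return statistics.stdev(means)

single = sd_of_means(3.0, 1)
duplicate = sd_of_means(3.0, 2)
print(f"single: {single:.2f}  duplicate: {duplicate:.2f}  "
      f"ratio: {duplicate / single:.3f}  expected: {1 / math.sqrt(2):.3f}")
```

The simulated ratio should sit close to 0.707, that is, one over the square root of 2, although the exact digits depend on the seed and the number of trials.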

Alternatively, let us do one analysis on each of two samples taken from one individual. Again the effect is to reduce the variation, in this case the biological variation, by the square root of the number of samples. The 95% dispersions are as follows -

with 3% precision, 95% dispersion = Z * [CV_A² + CV_I²/n_S]^1/2 = 1.96 * [3.0² + 6.0²/2]^1/2 = 10.2%
with 1% precision, 95% dispersion = Z * [CV_A² + CV_I²/n_S]^1/2 = 1.96 * [1.0² + 6.0²/2]^1/2 = 8.5%
with 10% precision, 95% dispersion = Z * [CV_A² + CV_I²/n_S]^1/2 = 1.96 * [10.0² + 6.0²/2]^1/2 = 21.3%

Again, as was discussed above, there is merit in taking two samples from an individual, particularly if the analytical CV is small compared to the within-subject biological variation.

Finally, let us do duplicate analyses on each of two samples. The 95% dispersions are as follows -

3% precision, 95% dispersion: Z * [CV_A²/n_A + CV_I²/n_S]^1/2 = 1.96 * [3.0²/2 + 6.0²/2]^1/2 = 9.3%
1% precision, 95% dispersion: Z * [CV_A²/n_A + CV_I²/n_S]^1/2 = 1.96 * [1.0²/2 + 6.0²/2]^1/2 = 8.4%
10% precision, 95% dispersion: Z * [CV_A²/n_A + CV_I²/n_S]^1/2 = 1.96 * [10.0²/2 + 6.0²/2]^1/2 = 16.2%
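The four cholesterol scenarios worked through above can be tabulated side by side with a short Python sketch. The scenario labels and helper names are hypothetical; the values 1.0%, 3.0% and 10.0% for CV_A and 6.0% for CV_I are those used in this essay.

```python
import math

def dispersion(cv_a, cv_i, n_a=1, n_s=1, z=1.96):
    """Z * [(CV_A^2 / n_A) + (CV_I^2 / n_S)]^1/2, CVs and result in percent."""
    return z * math.sqrt(cv_a ** 2 / n_a + cv_i ** 2 / n_s)

CV_I_CHOLESTEROL = 6.0  # within-subject biological variation quoted in the essay

# (label, n_A assays per sample, n_S samples)
SCENARIOS = [
    ("1 sample, 1 assay", 1, 1),
    ("1 sample, 2 assays", 2, 1),
    ("2 samples, 1 assay each", 1, 2),
    ("2 samples, 2 assays each", 2, 2),
]

def cholesterol_table(precisions=(1.0, 3.0, 10.0)):
    """Return {(CV_A, label): 95% dispersion rounded to 0.1%}."""
    return {
        (cv_a, label): round(dispersion(cv_a, CV_I_CHOLESTEROL, n_a, n_s), 1)
        for cv_a in precisions
        for label, n_a, n_s in SCENARIOS
    }

for (cv_a, label), value in cholesterol_table().items():
    print(f"CV_A = {cv_a:4.1f}%  {label:<26} {value:5.1f}%")
```

Reading down the table for one CV_A shows how much is gained from replicate assays versus replicate samples at that level of precision.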

Consequences for providing quality

The clear messages are that the dispersion of the test result reported to the clinician can be reduced by replicate analysis and/or by taking more than one sample. From a practical point of view, the important considerations are as follows -

make imprecision low [that is, keep the analytical CV small] so as to cut down the analytical "noise" so that the biological "signal" is not confounded [See Biological variation data for setting quality specifications in laboratory medicine], and

inspect the comparative magnitudes of precision [from internal quality control or PT data] and biological variation [from the latest database]. If the analytical CV is higher than the biological variation, then reduce it through method improvement OR do the analyses in replicate. If biological variation is high, then it may be of advantage to take more than one sample and use the mean of the results.

Problems with the AHA/CDC proposal

Now, let us return to the AHA/CDC Scientific Statement [1] and assess the usefulness of taking two samples to assess the risk of an individual with hsCRP. Using the maximum precision that might be inferred from the Scientific Statement to be allowable, that is 10.0%, and a within-subject biological variation of 30.3% as shown above, then, for one analysis on one sample from one individual, the 95% dispersion = Z * [CV_A² + CV_I²]^1/2 = 1.96 * [10.0²+ 30.3²]^1/2 = 62.5%. This dispersion is almost five times bigger than cholesterol and, thus, equating hsCRP to cholesterol from this point of view seems to me to be significantly flawed.

hsCRP does not appear to me to have "a degree of measurement stability that is similar to that of total cholesterol" as suggested by Ockene et al [6], and the results in that paper do seem to provide much of the basic evidence for the analytical aspects of the AHA/CDC Scientific Statement. I agree with the criticism of this work by Campbell et al [7], which objectively considers the effect of biological variation.

As discussed above, the AHA/CDC recommendation [1] is to take two samples. If we analyzed them once, then, with 10% precision, the 95% dispersion = Z * [CV_A²/n_A + CV_I²/n_S]^1/2 = 1.96 * [10² + 30.3²/2]^1/2 = 46.3%, also much larger than the similar dispersion for cholesterol.

hsCRP is NOT like cholesterol. It has very high variability. Taking the mean of two samples certainly does cut down the dispersion but not to the same level as the dispersion for cholesterol. To get to this level would mean taking far more samples.

If we took four samples [correctly thought by Ockene et al [6] to be of little "added value" to the two sample scenario], the 95% dispersion: Z * [CV_A²/n_A + CV_I²/n_S]^1/2 = 1.96 * [10² + 30.3²/4]^1/2 = 35.6%

If we took 10 samples, the 95% dispersion: Z * [CV_A²/n_A + CV_I²/n_S]^1/2 = 1.96 * [10² + 30.3²/10]^1/2 = 27.1%.

Even if we took 10 samples and analyzed all of them in duplicate, the 95% dispersion:
Z * [CV_A²/n_A + CV_I²/n_S]^1/2 = 1.96 * [10²/2 + 30.3²/10]^1/2 = 23.3%, still far higher than cholesterol.

Even if we took 2 samples and analyzed these 10 times each, the 95% dispersion:
Z * [CV_A²/n_A + CV_I²/n_S]^1/2 = 1.96 * [10²/10 + 30.3²/2]^1/2 = 42.4%, again far higher than cholesterol.
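The hsCRP scenarios above can be recomputed in one pass with the following Python sketch. The helper names and scenario labels are hypothetical; CV_A = 10.0% and CV_I = 30.3% are the values used in this essay, and every figure is computed directly from the dispersion formula as written.

```python
import math

def dispersion(cv_a, cv_i, n_a=1, n_s=1, z=1.96):
    """Z * [(CV_A^2 / n_A) + (CV_I^2 / n_S)]^1/2, CVs and result in percent."""
    return z * math.sqrt(cv_a ** 2 / n_a + cv_i ** 2 / n_s)

CV_A_HSCRP = 10.0  # "acceptable precision" inferred from the Scientific Statement
CV_I_HSCRP = 30.3  # lowest within-subject estimate cited in the essay

# (label, n_A assays per sample, n_S samples)
SCENARIOS = [
    ("1 sample, 1 assay", 1, 1),
    ("2 samples, 1 assay each", 1, 2),
    ("4 samples, 1 assay each", 1, 4),
    ("10 samples, 1 assay each", 1, 10),
    ("10 samples, 2 assays each", 2, 10),
    ("2 samples, 10 assays each", 10, 2),
]

def hscrp_table():
    """Return {label: 95% dispersion rounded to 0.1%}."""
    return {
        label: round(dispersion(CV_A_HSCRP, CV_I_HSCRP, n_a, n_s), 1)
        for label, n_a, n_s in SCENARIOS
    }

cholesterol_single = dispersion(3.0, 6.0)  # one sample, one assay, for comparison
for label, value in hscrp_table().items():
    print(f"{label:<27} {value:5.1f}%  ({value / cholesterol_single:.1f} x cholesterol)")
```

The last column expresses each hsCRP dispersion as a multiple of the single-sample, single-assay dispersion for cholesterol, which makes the gap between the two analytes easy to see.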

Again, as for cholesterol, we can see the very important relationships of precision and within-subject biological variation to individual test result dispersion. If the analytical CV is less than the biological variation, taking two samples reduces the dispersion more than doing assays of one sample in replicate. Of course, if the analytical CV is more than the biological variation, analyzing one sample in replicate [or better, instituting quality improvement to reduce the imprecision] reduces the dispersion more than taking more than one sample.

To get the small dispersion enjoyed by cholesterol that really facilitates clinical interpretation, we would need to take a large number of samples AND improve precision. Both of these might be difficult.
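To put a number on this, the dispersion formula can be solved for n_S at a chosen target dispersion. The Python sketch below is an illustration under assumptions of my own: the target of 13.1% is the single-sample, single-assay dispersion calculated earlier for cholesterol, the improved hsCRP precision of 3.0% is purely hypothetical, and the helper name is invented.

```python
import math

def samples_for_target(target, cv_a, cv_i, n_a=1, z=1.96):
    """Smallest n_S for which Z * [(CV_A^2/n_A) + (CV_I^2/n_S)]^1/2 <= target.

    Returns None when the analytical term alone already exceeds the target,
    so that no number of samples can reach it. All CVs and target in percent.
    """
    remaining = (target / z) ** 2 - cv_a ** 2 / n_a
    if remaining <= 0:
        return None
    return math.ceil(cv_i ** 2 / remaining)

TARGET = 13.1  # cholesterol: one sample, one assay, CV_A 3.0%, CV_I 6.0%

# With the "acceptable" 10% precision the target cannot be reached at all
print(samples_for_target(TARGET, cv_a=10.0, cv_i=30.3))

# With a hypothetical improvement to 3.0% precision, many samples are still needed
print(samples_for_target(TARGET, cv_a=3.0, cv_i=30.3))
```

Under these assumptions, samples alone are not enough at 10% precision, and even at 3% precision a couple of dozen samples per individual would be needed.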

Thus, the evidence to support the thesis that "two separate measurements of hs-CRP are adequate" seems less than convincing to me when the large biological variation and large allowable imprecision are considered objectively as shown here. In Dr Westgard's words, the evidence does not seem to support the view that the Scientific Statement is "nothing but the truth".

The final plea

It is simple to calculate the number of samples needed to obtain an estimate of an individual's homeostatic setting point within a stated closeness at a predetermined probability. It is easy to calculate the dispersion of a single test result. The effects of analysing a sample more than once and/or taking more than one sample are also easy to calculate. All of these calculations require numerical knowledge of both precision and biological variation. Within-laboratory precision data are easily available from internal quality control or published PT survey results, and biological variation data are usually available in the published database. Those who produce allegedly evidence-based guidelines, recommendations, and scientific statements are urged to do all of these calculations and think on their ramifications for clinical utility before disseminating their work.

References

1. Pearson TA, Mensah GA, Alexander RW, et al. AHA/CDC Scientific Statement. Markers of inflammation and cardiovascular disease. Application to clinical and public health practice. A statement for healthcare professionals from the Centers for Disease Control and Prevention and the American Heart Association. Circulation 2003;107:499-511.

2. Campbell B, Badrick T, Flatman R, Kanowski D. Limited clinical utility of high-sensitivity plasma C-reactive protein assays. Ann Clin Biochem 2002;39:85-8.

3. Fraser CG. Biological Variation: From Principles to Practice. Washington, DC. AACC Press, 2001.

4. National Cholesterol Education Program Laboratory Standardization Panel. Current status of blood cholesterol measurement in clinical laboratories in the United States. Clin Chem 1988;34:193-201.

5. Franzini C. Need for correct estimates of biological variation: the example of C-reactive protein. Clin Chem Lab Med 1998;36:131-2.

6. Ockene IS, Matthews CE, Rifai N, Ridker PM, Reed G, Stanek E. Variability and classification accuracy of serial high-sensitivity C-reactive protein assays in healthy adults. Clin Chem 2002;48:444-50.

7. Campbell B, Flatman R, Badrick T, Kanowski D. Problem with high-sensitivity C-reactive protein. Letter to the Editor. Clin Chem 2003;49:201.

Special thanks to Dr. Martin Holland, who provided much of the base code for the calculators here.

About Callum G. Fraser

Callum G Fraser is currently Clinical Leader of Biochemical Medicine, Tayside, and Honorary Senior Lecturer in the University of Dundee. He has published widely on the generation and application of data on biological variation.
