Have you checked the math on your reports lately?

Once again my lab was questioned by a research study’s principal investigator and study coordinator about why our lung volume results came out significantly lower than another lab’s. To take part in this study a subject has to have an RV greater than 150% of predicted. The RV we obtained on a subject referred to the study was over a liter lower than the result they had brought with them from another lab, and for this reason the patient no longer qualified.

When I reviewed the subject’s test data from my lab it was clear to me that our test quality was good and more than met the ATS/ERS reproducibility criteria. We were given a copy of the subject’s report from the other lab and, at first glance, the results looked very typical for emphysema. Specifically, the report showed very severe airway obstruction, a normal TLC, an elevated FRC and RV consistent with hyperinflation, and a severely reduced DLCO. Our results, however, showed a mixed defect with severe obstruction and a mildly reduced TLC.

Getting accurate lung volume measurements is hard. Regardless of which measurement technique you use, in most instances any errors tend to cause lung volumes to be overestimated. When very severe airway obstruction is present, plethysmography will often overestimate FRC and TLC unless you are careful about panting frequency, and that may be what happened in this case.

But this isn’t about test quality or the reasons why I believe my lab is better than most others. Although the report was from a nearby hospital with a reputation for the quality of its patient care, when I started reviewing it I immediately started to see math errors among the predicted values. I’ve run across these kinds of errors before, but this report was from a different equipment manufacturer than last time, and this means that these kinds of errors are probably far more common than I ever would have expected.

            Predicted   Pre-BD   %Predicted   Post-BD   Post %Pred   %Change
FVC:             2.80     0.91          32%      1.01          36%      +12%
FEV1:            2.14     0.49          23%      0.42          19%      -15%
FEV1/FVC:          77       54                     41
PEF:             5.55     1.86          33%      2.31          42%      +25%
FIVC:            2.60     0.95          36%      1.06          41%      +11%
TLC:             4.44     4.62         104%
FRC:             2.54     3.68         145%
RV:              1.85     3.63         197%
SVC:             2.26     0.99          44%
IC:                       0.94
ERV:                      0.07
DLCO:            21.2     4.60          21%
VA:                       2.17
DL/VA:           3.70     2.11          57%

What first caught my eye was that the predicted TLC, RV and SVC did not add up correctly. Specifically, the predicted TLC was not equal to the predicted RV plus the predicted SVC, or if you want to put it another way, the predicted TLC minus the predicted RV did not equal the predicted SVC. Next I noticed that the predicted SVC, FVC and FIVC were all different. Finally I noticed that the observed IC and ERV did not add up to the observed SVC.
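The checks described above are simple enough to script. What follows is a minimal sketch in Python using the values printed on the report; the dictionary names, the `gap` helper and the 0.01 L tolerance are assumptions made for illustration and are not taken from any manufacturer's software.

```python
# Values copied from the report (litres). All names here are illustrative.
predicted = {"TLC": 4.44, "RV": 1.85, "SVC": 2.26, "FVC": 2.80, "FIVC": 2.60}
observed = {"TLC": 4.62, "RV": 3.63, "SVC": 0.99, "IC": 0.94, "ERV": 0.07}

TOLERANCE = 0.01  # assumed allowance of one displayed digit, in litres


def gap(lhs, rhs):
    """Return lhs - rhs to two decimals, or 0.0 when within the tolerance."""
    diff = round(lhs - rhs, 2)
    return diff if abs(diff) > TOLERANCE else 0.0


checks = {
    "predicted TLC - RV vs predicted SVC":
        gap(predicted["TLC"] - predicted["RV"], predicted["SVC"]),
    "predicted FVC vs predicted SVC":
        gap(predicted["FVC"], predicted["SVC"]),
    "predicted FIVC vs predicted FVC":
        gap(predicted["FIVC"], predicted["FVC"]),
    "observed IC + ERV vs observed SVC":
        gap(observed["IC"] + observed["ERV"], observed["SVC"]),
    "observed TLC - RV vs observed SVC":
        gap(observed["TLC"] - observed["RV"], observed["SVC"]),
}

for name, diff in checks.items():
    status = "ok" if diff == 0.0 else f"off by {diff:+.2f} L"
    print(f"{name}: {status}")
```

With the report's numbers, the only check that passes is observed TLC minus RV against observed SVC. The predicted TLC minus RV is 0.33 L larger than the predicted SVC, and the observed IC plus ERV is 0.02 L larger than the observed SVC.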

The reference equations were not specified on the report, so I did some sleuthing and, using the patient’s demographic information, found that the spirometry equations came from the Hankinson et al (NHANESIII) study and the lung volume equations came from Stocks and Quanjer (ERS).

It’s not clear to me, however, where the predicted SVC came from, since it did not match the SVC that would have been derived from the ERS reference equations (2.59 L) and also did not match any of the other reference equations I have on hand. Along the same lines, I am not sure why the predicted FIVC is smaller than the predicted FVC (or where it came from). I am unaware of any guideline indicating that the FIVC should be smaller than the FVC, and would note that the primary ATS/ERS recommendation for SVC is for it to be an inspiratory maneuver.

When I checked the math of the observed data I also found a number of simple math errors that are probably due to truncating and rounding digits. It’s hard to be sure these are truncation and rounding errors rather than true math errors because raw test data is commonly stored with more digits after the decimal point than show up on a report. For example, an FVC could be stored as 5.32857 in the database but would be either truncated or rounded and then appear on a report as 5.32 or 5.33. I understand that these extra digits happen because of multiplication and division but I don’t necessarily agree with keeping them. First, test equipment isn’t accurate to a single milliliter, let alone fractions of a milliliter, so why are results stored with so many digits? More importantly, when math is performed on the raw data it involves these hidden digits. For example, the IC is 0.94 L and the ERV is 0.07 L. Added together that is 1.01 L, but the report showed the SVC to be 0.99 L. My guess is that in this instance the IC and ERV were truncated for the report but that when added, the result was rounded.
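To make the hidden-digit effect concrete, here is a small Python sketch using the standard `decimal` module. The stored FVC of 5.32857 is the example from the paragraph above; the stored IC and ERV values are hypothetical, chosen only so that they display as 0.94 and 0.07, and the helper names are made up for this illustration.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

HUNDREDTH = Decimal("0.01")


def truncate(value):
    """Drop everything after the second decimal place."""
    return value.quantize(HUNDREDTH, rounding=ROUND_DOWN)


def round_half_up(value):
    """Round to two decimal places, halves going up."""
    return value.quantize(HUNDREDTH, rounding=ROUND_HALF_UP)


stored_fvc = Decimal("5.32857")
print(truncate(stored_fvc), round_half_up(stored_fvc))

# Hypothetical stored values; only their displayed forms match the report.
stored_ic = Decimal("0.9449")
stored_erv = Decimal("0.0749")

# What a reader gets by adding the two numbers printed on the report.
sum_of_displayed = truncate(stored_ic) + truncate(stored_erv)

# What software gets by adding the stored values and rounding the result.
displayed_sum_of_stored = round_half_up(stored_ic + stored_erv)

print(sum_of_displayed, displayed_sum_of_stored)
```

Here the two printed components add up to 1.01 while the sum computed from the stored values displays as 1.02, so the report would appear to contain an arithmetic error even though no single step was wrong. Truncating or rounding at two decimals moves each displayed value by less than 0.01 L, so this mechanism accounts for discrepancies of a hundredth or two; the stored values themselves would be needed to explain any particular figure on a report.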

It’s not clear who is responsible for all of these math errors. It appears to me that the observed results are reported with a mix of truncation and rounding. Although these errors aren’t large and probably don’t make a significant difference when results are reviewed, they are nevertheless present, and this is the responsibility of the equipment manufacturer. For the predicted values, however, I know that it’s possible for a user to select a set of reference equations (like NHANESIII or ERS) and then modify an individual component (such as SVC), so this could be the responsibility of the end user instead of the manufacturer.
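One way to look for a mix of truncation and rounding is to recompute each percent predicted from the printed numbers and see which convention reproduces the reported figure. The Python sketch below does this with exact fractions to avoid floating-point noise. It works only from the displayed digits, so hidden digits in the stored values could change any individual verdict; the row labels and the `classify` helper are assumptions made for this illustration.

```python
import math
from fractions import Fraction

# (label, predicted, observed, reported % predicted), copied from the report.
rows = [
    ("FVC pre", "2.80", "0.91", 32),
    ("FVC post", "2.80", "1.01", 36),
    ("FEV1 pre", "2.14", "0.49", 23),
    ("FEV1 post", "2.14", "0.42", 19),
    ("PEF pre", "5.55", "1.86", 33),
    ("PEF post", "5.55", "2.31", 42),
    ("FIVC pre", "2.60", "0.95", 36),
    ("FIVC post", "2.60", "1.06", 41),
    ("TLC", "4.44", "4.62", 104),
    ("FRC", "2.54", "3.68", 145),
    ("RV", "1.85", "3.63", 197),
    ("SVC", "2.26", "0.99", 44),
    ("DLCO", "21.2", "4.60", 21),
    ("DL/VA", "3.70", "2.11", 57),
]


def classify(predicted, observed, reported):
    """Say which convention reproduces the reported percent predicted."""
    pct = Fraction(observed) / Fraction(predicted) * 100
    truncated = math.floor(pct)
    rounded = math.floor(pct + Fraction(1, 2))  # round half up
    if truncated == rounded == reported:
        return "either"
    if reported == truncated:
        return "truncated"
    if reported == rounded:
        return "rounded"
    return "neither"


results = {label: classify(pred, obs, rep) for label, pred, obs, rep in rows}

for label, verdict in results.items():
    print(f"{label}: {verdict}")
```

From the displayed digits alone, some rows (FVC pre, PEF pre, DLCO) match truncation, others (FEV1 pre, PEF post, FRC, SVC) match rounding, and the reported RV of 197% matches neither, since 3.63 / 1.85 works out to about 196.2%.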

There are no ATS/ERS guidelines about normalizing reference values when reference equations from different sources are mixed and matched. I also haven’t seen any particular consensus about how this should be handled from the different equipment manufacturers either.

For example, in my lab’s test systems the predicted RV is derived from the ERS equations but the predicted SVC is actually the FVC from NHANESIII. The software then re-calculates the predicted TLC as predicted RV + predicted SVC. This means that everything adds up, but it also means that the predicted TLC, IC and ERV are different than they would have been from the original ERS equations. This is a solution of some kind, but why is this okay? Just as importantly, why is it the TLC that is adjusted and not the RV?
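The choice of which value to adjust is not cosmetic. As a rough illustration in Python, take the predicted values printed on the report discussed above and assume they are what the ERS and NHANESIII equations give for this patient. Option A recomputes the TLC as described; option B, recomputing the RV instead, is purely hypothetical and is included only to show the size of the effect. The variable names are invented for the sketch.

```python
# Predicted values as printed on the report (litres).
ers_tlc = 4.44      # Stocks and Quanjer (ERS)
ers_rv = 1.85       # Stocks and Quanjer (ERS)
nhanes_fvc = 2.80   # Hankinson et al (NHANESIII), used as the predicted SVC

# Observed values from the same report (litres).
observed_tlc = 4.62
observed_rv = 3.63


def percent(observed, predicted):
    """Percent of predicted, rounded to a whole number."""
    return round(100 * observed / predicted)


# Option A: keep the ERS RV and recompute TLC = RV + SVC.
tlc_option_a = round(ers_rv + nhanes_fvc, 2)

# Option B (hypothetical): keep the ERS TLC and recompute RV = TLC - SVC.
rv_option_b = round(ers_tlc - nhanes_fvc, 2)

print("TLC % predicted:", percent(observed_tlc, ers_tlc),
      "vs", percent(observed_tlc, tlc_option_a))
print("RV % predicted:", percent(observed_rv, ers_rv),
      "vs", percent(observed_rv, rv_option_b))
```

Under these assumptions the predicted TLC moves from 4.44 L to 4.65 L under option A, which shifts the observed TLC from 104% to 99% of predicted. Under option B the predicted RV would drop from 1.85 L to 1.64 L and the observed RV would go from 196% to 221% of predicted. For a study with an entry criterion based on RV percent predicted, that is not a trivial difference.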

I’ve also seen results from another lab where the predicted TLC and RV came from the ERS equations and the SVC came from the NHANESIII reference equations, but they were simply placed side by side and TLC minus RV did not equal SVC. This is also a solution of some kind, but the predicted and percent predicted numbers never add up, so why is this okay?

In this report’s case the TLC and RV came from the ERS equations but the SVC (and FIVC) did not come from any known source. I’m most concerned that the lung volume subdivisions didn’t come even close to adding up but in the absence of ATS/ERS guidance I can’t quite say it is wrong. Confusing, yes, but not necessarily wrong.

My personal opinion is that none of us would use the FVC from one set of reference equations and the FEV1 from a different one. Why then is it okay to shoehorn the predicted FVC from NHANESIII into the predicted TLC and RV from ERS? For me the simplest answer would be that we should not mix and match reference equations. This would mean that the predicted SVC would not be equal to the predicted FVC but since they (and the predicted DLCO as well) come from different study populations this is okay.

If it is a concern that vital capacities taken from different reference populations can differ, then the other answer is to take all of the reference equations (spirometry, lung volumes and DLCO) from the same study. The number of studies where all three types of testing were performed on the same study population is small (Gutierrez et al, Marsh et al) but it can be done.

Over the years I haven’t had the opportunity to review all that many reports from other PFT labs, and up until recently I wouldn’t have thought to check the math in the predicted values. In the last four months, however, I’ve seen reports from two different labs with significant (and different) math errors. This says to me that this problem is probably very common.

There appears to be a strong belief that the predicted SVC should be the same as the predicted FVC (although the SVC in this report flies in the face of this). Part of this is based on the fact that studies like NHANESIII have a very large study population and use more sophisticated statistical analysis than did the ERS study (or any other lung volume study for that matter), and that this makes the NHANESIII FVC “better” than the ERS SVC. Part of it is based on the thought that a vital capacity is a vital capacity and that all the vital capacities on a report should therefore be the same. Trying to merge vital capacities from different reference equations seems to create more problems than it solves, however.

Reference equations are often viewed as an arcane and mysterious subject area and for this reason many labs are reluctant to “tamper” with the default settings of their test system software (after all, surely the manufacturer knows best, don’t they?). Since there is only minimal official guidance and no particular consensus on using reference equations many labs may also be reluctant to make any changes, particularly since that means they’d have to decide for themselves what’s the best approach. But if your reports have math errors, you have a problem and doing nothing is still making a decision.

So, have you checked the math on your reports lately?

References:

Gutierrez C, et al. Reference values of pulmonary function tests for Canadian Caucasians. Can Respir J 2004; 11(6): 414-424.

Hankinson JL, Odencrantz JR, Fedan KB. Spirometric reference values from a sample of the general U.S. population. Am J Respir Crit Care Med 1999; 159: 179-187.

Marsh S, Aldington S, Williams M, Weatherall M, Shirtcliffe P, McNaughton A, Pritchard A, Beasley R. Complete reference ranges for pulmonary function tests from a single New Zealand population. New Zealand Med J 2006; 119: N1244.

Stocks J, Quanjer PH. Reference values for residual volume, functional residual capacity and total lung capacity. Eur Respir J 1995; 8: 492-506.

Creative Commons License
PFT Blog by Richard Johnston is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
