Top 10 spirometry errors and mistakes

A couple of days ago my medical director and I had a short discussion about teaching pulmonary fellows to read PFTs and agreed that in order to be good at interpreting PFTs it isn’t the basic algorithms that are hard, it’s gaining an understanding of test quality and testing problems. My medical director then suggested this topic. At first I wasn’t sure I could find 10 errors but after spending a couple hours digging through my teaching files I managed to come up with just a few more than that. So strictly speaking it’s not a top 10 list but I kept the title because I liked it.

Spirometry errors and mistakes seem to fall into four categories: demographics, reference equations, testing and interpretation.

Demographics:

Normal values are based on an individual’s age, height and gender. When this information is entered incorrectly the normal reference values will also be incorrect. These errors often go uncaught because whoever reviews and interprets reports usually isn’t the same person who sees the patient and performs the tests. This type of error often doesn’t get corrected until the results are uploaded into a hospital information system or the patient returns for a second (or third or fourth) visit.

1. Wrong gender.

Pulmonary function reference equations are gender specific and for individuals with the same age and height, men will have a larger FVC and FEV1 than women do. When a patient’s demographics information is manually entered into a PFT system it’s always possible for somebody to enter the wrong gender. When this happens the predicted values will be either over- or under-estimated. This happens in my lab at least a half a dozen times a year and it’s why when I review reports I try to check the patient’s gender right after reading their name.

This is also a problem area for individuals who have gone through gender reassignment (transsexuals). An individual’s physiologic/developmental gender needs to be used to generate predicted values but this may be at odds with their gender recorded in a hospital’s information system. Some PFT lab systems populate their demographics information from their hospital’s information system when an order is received and it may or may not be possible to alter gender once this has happened. In other cases, an individual’s demographics may be cross-referenced when PFT results are uploaded into a hospital information system and may throw an error if the wrong gender is present.

2. Wrong height

All lung volumes and capacities scale with height. Like any other manual entry height can be mis-entered and the most common error I’ve seen is for somebody to enter 60 inches when they meant 6 feet 0 inches.

Height can also be mis-measured if the patient isn’t asked to remove their shoes or to stand straight, or if the patient is asked for their height and it isn’t even measured. An error of an inch or two probably won’t make a big difference in a patient’s predicted values (particularly given the discrepancies between different reference equations) but for somebody who’s on the edge of normal and abnormal it can make a significant difference in how a report is interpreted.

3. Wrong age

Volumes and flow rates tend to increase until about age 20 or so and then decline thereafter. A patient’s date of birth is usually used to determine their current age and this, of course, is another opportunity for a manual entry error. The most common errors I’ve seen are for today’s date to be entered instead of the patient’s DOB or for the wrong decade to be entered.

This type of error used to be caught when we uploaded patient results into our hospital’s information system but for both good and bad reasons this no longer happens. Patients are usually asked to confirm their date of birth when they check in but this is confirmed from the hospital’s information system, not the lab’s software, and there is no process available to cross-check these other than manually. This means that unless staff are paying attention or unless there is a significant error in the patient’s age, this type of error will not be caught.
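As a rough illustration of what an automated cross-check might look like, here is a minimal Python sketch. It is not something my lab's software does; the function names and the idea of having the hospital's DOB available for comparison are assumptions made purely for the example. It flags the two entry errors mentioned above: today's date entered as the DOB, and a DOB that is off by whole decades.

```python
from datetime import date


def age_in_years(dob, test_date):
    """Whole years between the date of birth and the test date."""
    years = test_date.year - dob.year
    # Subtract one if the birthday has not yet occurred in the test year.
    if (test_date.month, test_date.day) < (dob.month, dob.day):
        years -= 1
    return years


def dob_warnings(entered_dob, test_date, hospital_dob=None):
    """Return a list of warnings for a manually entered DOB (hypothetical check)."""
    warnings = []
    if entered_dob >= test_date:
        # Typical result of entering today's date instead of the DOB.
        warnings.append("DOB is on or after the test date")
    if hospital_dob is not None and entered_dob != hospital_dob:
        gap = abs(entered_dob.year - hospital_dob.year)
        same_day = (entered_dob.month, entered_dob.day) == (hospital_dob.month, hospital_dob.day)
        if gap and gap % 10 == 0 and same_day:
            warnings.append("DOB differs from hospital record by whole decades")
        else:
            warnings.append("DOB differs from hospital record")
    return warnings


print(dob_warnings(date(1960, 6, 15), date(2015, 3, 1), hospital_dob=date(1950, 6, 15)))
```

A check like this only helps if both dates are available to the same piece of software, which is exactly what is missing in the situation described above.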

Reference equations:

The ATS/ERS guidelines say that every pulmonary function lab is supposed to select the appropriate reference equations for the population it serves. This makes eminent sense but unfortunately there aren’t any recommendations on how this is actually supposed to be done. This is also complicated by the fact that reference equations themselves are limited by the size and limited range of ethnicities, ages and heights in the population they study, and the statistics used to analyze them.

4. Limits of reference equations for ethnicity

When a population shares an environment, diet and a large number of genes it is likely that their lung function will be similar as well. This is the basis for ethnicity-based reference equations but environment, diet and genes are rapidly changing and it is not as clear as it once was what ethnicity means. This can make selecting the appropriate ethnicity-based reference equations difficult. Even when an individual’s ethnicity is relatively evident, a PFT lab’s software may not have an appropriate reference equation available or it may make selecting the appropriate reference equation difficult.

As importantly, many test systems just subtract a specific percentage for Blacks or Asians from a reference equation for Caucasians but this makes their results dependent on a reference equation intended for a population they are not part of. This was a common practice up until about a dozen years ago when the ATS/ERS standards recommended the use of ethnicity-based reference equations instead.

As an example of this problem, the following individual was born and raised in India and referred for screening spirometry.

Asian Indian    Observed:    %Predicted:    Predicted:
FVC:            3.40         73%            4.67
FEV1:           2.78         77%            3.58
FEV1/FVC:       82           107%           77

My lab’s software doesn’t allow us to enter Asian Indian as an ethnicity, nor does it have any Asian Indian reference equations. Depending on which reference equations you compare, Asian Indian FVCs and FEV1s are approximately 20% less than those for Caucasians. That means these results are probably not abnormal, particularly since the individual was asymptomatic, and continued to be asymptomatic even when followed up over several years.
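To make the arithmetic concrete, the sketch below rescales the Caucasian predicted values from the table by the approximate 20% figure mentioned above and recomputes the percent predicted. The 0.80 factor is only that rough approximation, and a fixed scaling like this is an illustration, not a substitute for a proper reference equation.

```python
def percent_predicted(observed, predicted):
    """Observed value as a percentage of the predicted value."""
    return 100.0 * observed / predicted


# Values taken from the table above.
observed = {"FVC": 3.40, "FEV1": 2.78}
caucasian_predicted = {"FVC": 4.67, "FEV1": 3.58}

# Assumption: the "approximately 20% less" figure from the text.
ADJUSTMENT = 0.80

adjusted_percent = {}
for name, value in observed.items():
    adjusted_predicted = caucasian_predicted[name] * ADJUSTMENT
    adjusted_percent[name] = percent_predicted(value, adjusted_predicted)
    print(f"{name}: {percent_predicted(value, caucasian_predicted[name]):.0f}% "
          f"unadjusted, {adjusted_percent[name]:.0f}% adjusted")
```

With the adjustment, both values land above 90% of predicted, which is consistent with the conclusion that these results are probably not abnormal.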

5. Limitations of reference equations for height

The limits of the statistics used to generate reference equations often become quite clear when an individual is very short or very tall. Unfortunately there is no clear definition of what the normal limits for height are and most of the studies that the reference equations come from do not provide any statistics on the height range of their study population. As importantly there are almost no studies whatsoever on individuals at the extreme ends of the human height range.

The following individual is 48” tall and the first set of results were generated by the Morris reference equations:

48” Morris    Observed:    %Predicted:    Predicted:
FVC:          1.73         102%           1.69
FEV1:         1.36         82%            1.65
FEV1/FVC:     79           97%            81

Note that the predicted FVC and FEV1 are almost identical. When the predicted values are re-calculated using the NHANESIII reference equations the results look more normal:

48” NHANESIII    Observed:    %Predicted:    Predicted:
FVC:             1.73         86%            2.01
FEV1:            1.36         82%            1.66
FEV1/FVC:        79           101%           78

But realistically, there’s no way to determine whether they are normal or not since this person is over a foot shorter than anybody tested in the NHANESIII population. To some extent there is a similar problem with the very tall as well. The following individual was 84” tall, which is half a foot taller than anybody in the NHANESIII study:

84” NHANESIII    Observed:    %Predicted:    Predicted:
FVC:             8.25         103%           8.04
FEV1:            5.40         85%            6.38
FEV1/FVC:        65           80%            81

The percent predicted results indicate this individual likely has airway obstruction but the NHANESIII FEV1/FVC ratio is calculated without the use of height. Almost all spirometry reference equations (and this includes the NHANESIII FVC and FEV1) show that the FEV1/FVC ratio normally decreases with increasing height. So is this really airway obstruction or an artifact of the reference equations?

6. Limitations of reference equations for age

Reference equations are limited at the extremes of age. For the very young there can be different reference equations for infants, children and adolescents. The dividing line between these categories can be unclear particularly since developmental age is not the same thing as chronological age.

For the elderly, most study populations usually have a limited number of subjects over the age of 70, rarely over the age of 80 and no study has ever had subjects over the age of 90. These results came from an individual that was 97 years old:

97 y/o       Observed:    %Predicted:    Predicted:
FVC:         1.17         108%           1.08
FEV1:        0.95         126%           0.75
FEV1/FVC:    81           116%           70

This looks normal, but there is no way to be sure since the slope at which FVC and FEV1 decline with age is determined primarily by a population that is at least 20 years younger. This is complicated by the fact that the NHANESIII reference equations show that the decline with age accelerates with increasing age whereas many other reference equations show a linear decline with age and it’s not clear which of these observations is correct.

Testing:

7. Back extrapolation

Nobody is able to start exhaling and to reach their maximum expiratory flow instantaneously. Back extrapolation is a process that uses the slope of the highest expiratory flow to determine a standardized beginning of a forced vital capacity effort. When the back extrapolated volume is high the beginning of the spirometry effort becomes more indeterminate and the FEV1 is more likely to be overestimated.

[Figure: back extrapolation FEV1 error]

             Red:    Blue:
FVC:         2.74    2.67
FEV1:        0.61    0.95
FEV1/FVC:    22      36

The ATS/ERS standard for spirometry states the extrapolated volume “must be <5% of the FVC or 0.150 L, whichever is greater.” Testing software doesn’t usually accept results with too large of a back extrapolation but in this case the technician overrode the computer because the effort had a “better” FEV1. Even so, some patients are unable to perform spirometry without a slow start and a large amount of back extrapolation on every effort and when this happens the computer will likely report the highest FEV1 regardless of whether it had the smallest or largest amount of back extrapolation.
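For anybody who wants to see the mechanics, here is a minimal Python sketch of back extrapolation on a sampled volume-time curve, followed by the ATS/ERS limit quoted above. It assumes the recorded volume starts at zero and uses the steepest segment between two samples as the tangent; the sample data and function names are invented for illustration, and real test systems will differ in the details.

```python
def interpolate(times, volumes, t):
    """Linearly interpolate the volume (L) at time t (s)."""
    if t <= times[0]:
        return volumes[0]
    for i in range(1, len(times)):
        if t <= times[i]:
            frac = (t - times[i - 1]) / (times[i] - times[i - 1])
            return volumes[i - 1] + frac * (volumes[i] - volumes[i - 1])
    return volumes[-1]


def back_extrapolate(times, volumes):
    """Return (new time zero, back extrapolated volume)."""
    steepest, at = 0.0, None
    for i in range(1, len(times)):
        slope = (volumes[i] - volumes[i - 1]) / (times[i] - times[i - 1])
        if slope > steepest:
            steepest, at = slope, i
    if at is None:
        raise ValueError("no expiratory flow found")
    # Follow the steepest slope back to zero volume to get the new time zero.
    new_time_zero = times[at - 1] - volumes[at - 1] / steepest
    # The volume already exhaled at the new time zero is the extrapolated volume.
    return new_time_zero, interpolate(times, volumes, new_time_zero)


def extrapolated_volume_acceptable(bev, fvc):
    """Extrapolated volume must be < 5% of the FVC or 0.150 L, whichever is greater."""
    return bev < max(0.05 * fvc, 0.150)


# Invented samples showing a slow start before peak flow is reached.
times = [0.0, 0.1, 0.2, 0.3, 0.4]
volumes = [0.00, 0.05, 0.15, 0.85, 1.35]
time_zero, bev = back_extrapolate(times, volumes)
print(round(time_zero, 3), round(bev, 3), extrapolated_volume_acceptable(bev, fvc=2.67))
```

Note that for a small FVC the 0.150 L floor is the limit that applies, since 5% of a 2.67 L FVC is only about 0.13 L.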

8. Short effort

The FVC is supposed to be the maximal amount of air an individual can exhale after a maximal inhalation but some patients stop early because of glottal closure or because they feel they’ve exhaled enough. When this happens the FVC will be underestimated and the FEV1/FVC ratio overestimated. This may not be evident when the numerical results are reviewed:

             Observed:    %Predicted:    Predicted:
FVC:         2.99         79%            3.79
FEV1:        2.51         88%            2.85
FEV1/FVC:    84           111%           76
Time:        6.9 sec

But it becomes more noticeable when the volume-time curve is viewed:

[Figure: abrupt stop, volume-time curve]

The reported expiratory time is often incorrect and this is because computer software usually doesn’t use an expiratory flow of zero as an indication that exhalation has ended but instead uses an inspiratory effort (or a manual override from the technician) as the marker for the end of exhalation.
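To show the difference between a reported expiratory time and the point where exhalation actually stopped, here is a small Python sketch that looks for the start of a volume plateau. The plateau thresholds (less than 0.025 L of change over 1 second) are my reading of the 2005 ATS/ERS end-of-test criteria and should be treated as assumptions, as should the invented sample data. This is not how any particular test system works.

```python
def plateau_start(times, volumes, min_change=0.025, window=1.0):
    """Earliest time (s) after which volume rises by less than min_change (L)
    over the following `window` seconds, or None if no plateau is found."""
    for i, t in enumerate(times):
        for j in range(i + 1, len(times)):
            # First sample at least `window` seconds later.
            if times[j] - t >= window - 1e-9:
                if volumes[j] - volumes[i] < min_change:
                    return t
                break
    return None


# Invented effort sampled every 0.5 s: flow stops at about 2 seconds,
# but the recording continues to 7 seconds before the patient inhales.
times = [0.5 * k for k in range(15)]
volumes = [0.0, 1.8, 2.6, 2.9, 2.99] + [2.99] * 10

print("recording length:", times[-1], "s; plateau begins at:", plateau_start(times, volumes), "s")
```

In this made-up effort a system that waits for an inspiratory effort would report something close to 7 seconds, even though nothing was exhaled after about 2 seconds.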

9. FEV1 underestimated due to an expiratory pause

The ATS/ERS standards state that an FVC effort should be free of coughs and artifacts that will affect the measurement of FEV1. An early expiratory pause of any kind most often causes the FEV1 and the FEV1/FVC ratio to be underestimated. Software is generally poor at recognizing artifacts of this kind however, and it is only by inspecting the graphics from a spirometry effort that they may be noticeable at all.

When expiratory flow stops, a flow-volume loop will show a notch:

[Figure: expiratory pause, FEV1 underestimated, flow-volume loop]

But a flow-volume loop does not contain time, and a volume-time curve actually gives a better estimate of the effect an expiratory pause has on FEV1:

[Figure: expiratory pause, FEV1 underestimated, volume-time curve]

10. FVC underestimated from a mid-expiratory inspiration

A cough during exhalation doesn’t necessarily just pause expiratory flow, it may also cause a small inhalation to occur. Although the ATS/ERS standards give criteria for determining an adequate expiratory flow rate at the end of a test, they don’t address what level of inspiratory flow should terminate an expiratory effort and this issue is left to each manufacturer to decide for themselves. This example actually reported an expiratory time of greater than 6 seconds even though it stopped measuring the exhaled volume at the first inhalation:

[Figure: mid-expiratory inspiration, FVC underestimated, flow-volume loop]

And although it is noticeable on the flow-volume loop, once again it is more evident on the volume-time curve:

[Figure: mid-expiratory inspiration, FVC underestimated, volume-time curve]

11. Inadequate inspiration

Although there are a number of criteria for judging the end of exhalation it is actually very difficult to determine whether or not a patient has taken a maximal inhalation. The ATS/ERS standards recommend that a forced expiratory effort be followed by a maximal inhalation. This is so that the maximum inspiratory flows can be measured but occasionally the FIVC is greater than the FVC.

[Figure: hidden FIVC]

When this happens the FVC is certainly underestimated and since the expiratory effort didn’t start from a maximal inhalation, the FEV1 is likely underestimated as well. Sometimes however, the indications that the patient didn’t take a full inhalation are more subtle.

[Figure: FVC with no IC]

In this instance the inspiratory capacity was almost nonexistent. Inspiratory capacity can be reduced in severe COPD due to expiratory flow limitation and gas trapping but in this case the tidal loop doesn’t show this and the small IC is more likely due to a submaximal effort.

12. FVC underestimated from a patient leak

The mouthpieces that are used for spirometry are frequently circular and some patients may have difficulty maintaining a seal around them. When they leak, they may leak more during exhalation than during inhalation, or vice versa. When this happens, the tidal loop prior to the start of the FVC maneuver may drift:

[Figure: tidal loop drift, flow-volume loop]

A patient may be able to maintain a tight seal at the beginning of the test, but may get distracted and loosen their lips during the test itself. When this happens they leak during exhalation and their inspired volume will be greater than their exhaled volume:

[Figure: leak, FVC underestimated, flow-volume loop]

In either case the FVC will usually be underestimated and the FEV1/FVC ratio will likely be overestimated.

13. Zero offset error

This is a hardware error that isn’t terribly common any more, but it can still happen. Spirometry is often performed using flow sensors and the signal from them is analog. The zero level from analog circuitry tends to drift but most test systems usually re-zero this before each test. Sometimes this process goes wrong. When it does, it can add (or subtract) an extra amount to the flow signal throughout exhalation:

[Figure: zero offset error]

Occasionally, it may just add an offset to the beginning of the exhalation:

[Figure: FVC zero offset error, volume-time curve]

In these examples the FVC would be overestimated and the FEV1/FVC ratio underestimated but the zero offset can also work in the opposite direction.
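The reason the FVC is affected more than the FEV1 is that volume is the integral of flow, so a constant offset adds a volume equal to the offset multiplied by the elapsed time. The Python sketch below shows this with a synthetic, exponentially decaying flow signal; the curve shape, the 6 second duration and the 0.05 L/sec offset are all invented for the example.

```python
import math

DT = 0.01          # sampling interval in seconds (assumed)
DURATION = 6.0     # length of the exhalation in seconds (assumed)
OFFSET = 0.05      # constant zero offset in L/sec (assumed)

samples = int(round(DURATION / DT))
# Synthetic expiratory flow in L/sec, not real patient data.
true_flow = [4.0 * math.exp(-(i * DT) / 0.6) for i in range(samples)]
offset_flow = [f + OFFSET for f in true_flow]


def exhaled_volume(flow, seconds):
    """Integrate flow (L/sec) over the first `seconds` using a simple sum."""
    return sum(flow[:int(round(seconds / DT))]) * DT


fvc_true = exhaled_volume(true_flow, DURATION)
fev1_true = exhaled_volume(true_flow, 1.0)
fvc_offset = exhaled_volume(offset_flow, DURATION)
fev1_offset = exhaled_volume(offset_flow, 1.0)

ratio_true = 100.0 * fev1_true / fvc_true
ratio_offset = 100.0 * fev1_offset / fvc_offset

print(f"FVC error:  {fvc_offset - fvc_true:.2f} L")
print(f"FEV1 error: {fev1_offset - fev1_true:.2f} L")
print(f"FEV1/FVC:   {ratio_true:.0f}% true, {ratio_offset:.0f}% with offset")
```

The offset adds six times as much volume to the FVC as it does to the FEV1, which is why the ratio falls. A negative offset would do the reverse.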

Interpretation:

The ATS/ERS guidelines for interpreting pulmonary function results are fairly straightforward, but they are also necessarily simplistic and for this reason there are some deficiencies. In addition, which values should be used to interpret spirometry has changed over time and not everybody agrees with these changes.

14. Not using peak flow when selecting results

The ATS/ERS standards state that the largest FEV1 and largest FVC should be reported, regardless of which effort they came from. This is somewhat at odds with another part of the ATS/ERS standards, which states that an FVC effort should be performed with maximal effort. Peak flow occurs during the effort-dependent part of the FVC maneuver and numerous investigators have shown that efforts with the largest peak flow often do not have the largest FEV1. This means that if an FEV1 is reported based solely on being the largest value from a group of efforts, it may also be from a submaximal effort.

[Figure: FEV1 vs PEF, flow-volume loop]

                Blue:    Red:
FVC (L):        2.72     3.06
FEV1 (L):       1.73     1.99
PEF (L/sec):    6.28     3.82

Although not currently part of the ATS/ERS guidelines it is widely agreed that FEV1 should be selected from an FVC effort with the highest peak flow or at least near the highest peak flow.
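The two selection rules can be compared directly using the efforts from the table above. This is only a sketch: the 10% tolerance used to define “near the highest peak flow” is my own assumption for the example and does not come from any standard.

```python
# The two efforts from the table above.
efforts = [
    {"label": "Blue", "fvc": 2.72, "fev1": 1.73, "pef": 6.28},
    {"label": "Red", "fvc": 3.06, "fev1": 1.99, "pef": 3.82},
]


def largest_fev1(efforts):
    """Select the effort with the largest FEV1, whatever its peak flow."""
    return max(efforts, key=lambda e: e["fev1"])


def largest_fev1_near_peak_flow(efforts, tolerance=0.10):
    """Select the largest FEV1 among efforts whose PEF is within `tolerance`
    of the highest PEF (the tolerance is a hypothetical choice)."""
    best_pef = max(e["pef"] for e in efforts)
    candidates = [e for e in efforts if e["pef"] >= (1.0 - tolerance) * best_pef]
    return max(candidates, key=lambda e: e["fev1"])


print("Largest FEV1:           ", largest_fev1(efforts)["label"])
print("Largest FEV1 near PEF:  ", largest_fev1_near_peak_flow(efforts)["label"])
```

The first rule picks the red effort with its 1.99 L FEV1, while the second picks the blue effort because the red effort's peak flow is far below the best peak flow.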

15. Not using SVC and IVC when they are available

The ATS/ERS standards state that the largest vital capacity (VC), regardless of its source, should be reported and used to calculate the FEV1/VC ratio. When lung volumes or a diffusing capacity are measured along with spirometry, there are potentially two additional vital capacity measurements, the slow vital capacity (SVC) from lung volumes and inspired volume (IVC) from the DLCO. If either of these is larger than the FVC, then it should be substituted for the FVC and the FEV1/FVC ratio re-calculated accordingly. Doing this often reveals airway obstruction that was not evident using the FVC from the spirometry effort alone.
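As a quick sketch of the substitution, the Python function below uses whichever vital capacity is largest as the denominator. The numbers are made up for illustration and do not come from a real patient.

```python
def fev1_vc_ratio(fev1, fvc, svc=None, ivc=None):
    """FEV1 as a percentage of the largest available vital capacity."""
    available = [vc for vc in (fvc, svc, ivc) if vc is not None]
    return 100.0 * fev1 / max(available)


# Hypothetical values in liters.
spirometry_only = fev1_vc_ratio(fev1=2.10, fvc=2.80)
with_svc = fev1_vc_ratio(fev1=2.10, fvc=2.80, svc=3.20, ivc=3.05)

print(f"FEV1/FVC: {spirometry_only:.0f}%   FEV1/VC: {with_svc:.0f}%")
```

In this made-up case a ratio of 75% using the FVC drops to about 66% once the larger SVC is used, which could move a result from normal to obstructed depending on the lower limit of normal.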

16. Using FEF25-75 (aka MMEF)

FEF25-75 is the measurement that will not die. When it was originally defined in the late 1960s it was touted as a way to measure the flow through smaller airways and to be able to diagnose “small airways disease”. It has since been shown by numerous investigators that it is a poorly reproducible measurement that is highly dependent on the FVC, that it actually says very little about the actual flows in the middle of an exhalation and that it’s usually reduced only when the FEV1 is reduced. As importantly, the ATS/ERS criterion used to select the FEF25-75 from different efforts (i.e., the effort with the largest combined FVC and FEV1) is a way to standardize selection and is not a way to determine the best FEF25-75 (if there is such a thing).

The ATS/ERS guidelines state that the sole indicator of airway obstruction is a reduced FEV1/FVC ratio and discourage the use of the FEF25-75.

17. Using a change in the FEV1/FVC ratio to indicate a positive bronchodilator response

The FEV1/FVC ratio is dependent on both the FVC and the FEV1. An increase in the FEV1/FVC ratio can just as easily be due to a decrease in FVC as it is to an increase in FEV1. For this reason the FEV1/FVC ratio should not be used to assess the response to a bronchodilator. In addition, the ATS/ERS guidelines state that a positive response to a bronchodilator is a 12% (and 0.20 L) increase in either FEV1 or FVC, which means that both the FEV1 and FVC can increase significantly without a change in the FEV1/FVC ratio.
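Both points can be shown with a few lines of Python. The function applies the 12% and 0.20 L criterion quoted above; the pre- and post-bronchodilator values are invented for the example.

```python
def significant_increase(pre, post, min_percent=12.0, min_liters=0.20):
    """True if the increase is at least 12% of baseline and at least 0.20 L."""
    change = post - pre
    return change >= min_liters and 100.0 * change / pre >= min_percent


def ratio(fev1, fvc):
    """FEV1/FVC as a percentage."""
    return 100.0 * fev1 / fvc


# Hypothetical case 1: FEV1 and FVC both rise by 15%, ratio unchanged.
pre_fev1, pre_fvc = 2.00, 3.00
post_fev1, post_fvc = 2.30, 3.45
print(significant_increase(pre_fev1, post_fev1),
      significant_increase(pre_fvc, post_fvc),
      round(ratio(pre_fev1, pre_fvc), 1),
      round(ratio(post_fev1, post_fvc), 1))

# Hypothetical case 2: FEV1 unchanged, FVC falls, ratio rises anyway.
fallen_fvc = 2.70
print(significant_increase(pre_fev1, pre_fev1),
      round(ratio(pre_fev1, fallen_fvc), 1))
```

In the first case there is a clear bronchodilator response with no change in the ratio. In the second the ratio goes up by several percentage points even though nothing improved.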

18. Not comparing current values to trends

A patient’s initial spirometry can be useful in diagnosing an underlying condition, but given the range of possible normal reference values there is always some uncertainty involved in this process. Spirometry, however, is just as useful, if not more so, when monitoring the progress of a patient’s disorder or their improvement from therapy. In order to get the maximum amount of information out of a spirometry test, current results must always be compared to prior results.

* * *

If I was asked to take these and make a real top ten list, in order of importance or how frequently I think they occur (from least to most), they would be:

Importance:    Error / Mistake:
10             #18 Not comparing trends
9              #16 Using FEF25-75
8              #15 Not using SVC and IVC
7              #14 Not using peak flow when selecting results
6              #4 Limits of ethnicity based reference equations
5              #11 Inadequate inspiration
4              #2 Wrong height
3              #7 Back extrapolation
2              #12 Patient leak
1              #8 Short effort

One reason that spirometry isn’t relied on as much as it should be is that it has a relatively high rate of false positives and false negatives. To (badly) paraphrase Clausewitz, spirometry is simple but when testing people even the simple is very difficult. Spirometry looks simple but the number of possible ways in which it can be performed incorrectly is immense. It’s up to those of us that perform spirometry and those of us that interpret spirometry to be aware of the most common failure modes so that, as best as possible, we can reduce the false positives and false negatives.

This list is an attempt to categorize the most common mistakes and errors seen in spirometry. I suspect that there will be at least some disagreement about what is – and is not – included in this list. These are, however, what (beyond the basic algorithms) I would want a pulmonary fellow being taught PFT interpretation to remember and to use.

References:

Brusasco V, Crapo R, Viegi G. ATS/ERS task force: Standardisation of lung function testing. Standardisation of spirometry. Eur Respir J 2005; 26: 319-338.

Brusasco V, Crapo R, Viegi G. ATS/ERS task force: Standardisation of lung function testing. Interpretive strategies for lung function tests. Eur Respir J 2005; 26: 948-968.

Creative Commons License
PFT Blog by Richard Johnston is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License

4 thoughts on “Top 10 spirometry errors and mistakes”

  1. Hi, Richard.

    My understanding is starting the FVC below TLC is the most common technical error in spirometry performance. This is why there’s a growing sentiment that the simple spirometry devices that only measure TLC to RV (not displaying the initial Vt to TLC, or an inspiratory flow or volume curve) shouldn’t be used. As you said, at times this can be difficult to determine.

  2. I was curious as to whether the results can be impacted due to the health of the patient at the time of the test. I’m normally pretty healthy, but in April I contracted a pretty bad virus while in Canada and had a bad cough, congestion, mucus for almost 3 months and I’m sure my lungs had a tough workout over those 3 months. The test was administered toward the end of that 3 months.

    • Ed –

      From my own experience I can say that it does. A number of years back I had a very bad case of the flu and was out for several days. Even though I was feeling better when I came back my spirometry results were over 10% down (I was doing weekly QC on myself) and it took a couple weeks before they got back to my normal values. The effect of being “under the weather” is not predictable, however, and I would suspect that any decrease (and the ability to recover from it) will depend on an individual’s baseline level of health and how much the specific illness affects their lungs.

      Regards, Richard
