Colonoscopy is the cornerstone of colorectal cancer screening programs. There is significant variability in the quality of colonoscopy between endoscopists. Colonoscopy quality assessment tracks various metrics to improve the effectiveness of colonoscopy, aiming at reducing the incidence and mortality from colorectal cancer. Adenoma detection rate is the prime metric, because it is associated with the risk of interval cancer. Implementing processes to measure and improve the adenoma detection rate is essential to improve the quality of colonoscopy.
Key points
• Colonoscopy is the cornerstone of colorectal cancer screening programs.
• There is significant variability in the quality of colonoscopy between endoscopists.
• Colonoscopy quality assessment tracks various metrics to improve the effectiveness of colonoscopy, aiming at reducing the incidence and mortality from colorectal cancer.
• Adenoma detection rate is the prime metric, because it is associated with the risk of interval cancer. Implementing processes to measure and improve the adenoma detection rate is essential to improve the quality of colonoscopy.
Introduction
Colonoscopy is the cornerstone of colorectal cancer (CRC) screening programs, whether used as a primary modality or as follow-up to another screening test. Therefore, adequate prevention of CRC relies heavily on colonoscopy.
Although studies from the 1990s indicated that colonoscopy could prevent up to 90% of CRC, more recent research suggests that the reduction in CRC incidence might be lower, particularly in the right colon. Interval CRCs (or postcolonoscopy CRCs) account for 3.4% to 9% of all CRC cases and are primarily diagnosed in the right colon.
Several factors can contribute to the limited effectiveness of colonoscopy; however, data suggest that endoscopist-related factors are the most important. In fact, 71% to 86% of interval cancers can be attributed to endoscopist-related factors, because they represent missed lesions or result from incompletely resected lesions.
These concerns have led to an increased emphasis on improving the quality of colonoscopy to maximize its ability to reduce CRC incidence. Gastroenterology societies have proposed various indicators of colonoscopy quality. Choosing and measuring the most appropriate metrics to assess the quality of colonoscopy for practitioners, and improving them when required, remain a work in progress. This article discusses the value and limitations of quality metrics and considerations for implementing them in practice to improve colonoscopy performance.
Colonoscopy quality process measures
The debate continues regarding the ideal quality measure for colonoscopy, because each of the indicators has advantages and drawbacks. It is now beyond debate, though, that colonoscopy performance should be regularly assessed for every endoscopist. Although adenoma detection rate (ADR) is largely viewed as the best validated practical outcome measure, 2 other process measures warrant consideration.
Cecal Intubation Rate
The use of cecal intubation rate as a quality metric derives from the fact that colonoscopists should have the ability to perform a complete examination to the cecum in the great majority of procedures. Based on established benchmarks, effective colonoscopists should be able to intubate the cecum in 90% or more of all cases, including 95% or more of screening colonoscopies in healthy adults.
Although some data from the late 1990s suggested that these targets were attained by only 55% of practitioners in North America, more recent evidence shows improved cecal intubation rates. A study of 10 endoscopists from the University of Maryland reported cecal intubation rates ranging between 88% and 97%, except for one endoscopist, with a rate of 63%. This study highlights the importance of measuring cecal intubation rates to identify outliers.
Failure to routinely intubate the cecum is, in fact, one of the factors that limit colonoscopy’s effectiveness. A study from Japan and another from New Zealand found that interval cancers were due to failure to reach the cecum in 27% and 54% of cases, respectively. Furthermore, Baxter and colleagues found that patients who underwent colonoscopy by endoscopists with cecal intubation rates of 95% or greater were less likely to have interval cancers compared with patients who underwent colonoscopy by endoscopists with intubation rates less than 80% (distal odds ratio [OR], 0.73; 95% confidence interval [CI], 0.54–0.97; proximal OR, 0.72; 95% CI, 0.53–0.97). Thus, evaluating cecal intubation rates is an important first step toward assessing overall colonoscopy quality.
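As an illustration of how the cecal intubation benchmarks could be checked for one endoscopist, the following is a minimal sketch. The `IntubationCounts` record, its field names, and the example counts are assumptions made for illustration; they do not come from any of the cited studies.

```python
from dataclasses import dataclass

# Benchmarks quoted in the text: cecum reached in >=90% of all cases
# and in >=95% of screening colonoscopies in healthy adults.
OVERALL_TARGET = 0.90
SCREENING_TARGET = 0.95


@dataclass
class IntubationCounts:
    """Hypothetical per-endoscopist tallies taken from procedure reports."""
    all_total: int                # all colonoscopies performed
    all_cecum_reached: int        # of those, examinations that reached the cecum
    screening_total: int          # screening colonoscopies in healthy adults
    screening_cecum_reached: int  # of those, examinations that reached the cecum


def cecal_intubation_report(c: IntubationCounts) -> dict:
    """Return both intubation rates and whether each meets its benchmark."""
    if c.all_total <= 0 or c.screening_total <= 0:
        raise ValueError("at least one procedure is needed in each group")
    overall = c.all_cecum_reached / c.all_total
    screening = c.screening_cecum_reached / c.screening_total
    return {
        "overall_rate": overall,
        "screening_rate": screening,
        "meets_overall": overall >= OVERALL_TARGET,
        "meets_screening": screening >= SCREENING_TARGET,
    }


# Invented counts for two endoscopists, one of them a clear outlier.
outlier = cecal_intubation_report(IntubationCounts(400, 252, 200, 130))
typical = cecal_intubation_report(IntubationCounts(400, 380, 200, 192))
print(outlier)
print(typical)
```

Running the same report for every endoscopist in a unit is one simple way to surface outliers of the kind described above.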
Withdrawal Time
Spending sufficient time between intubating the cecum and removing the colonoscope from the patient is necessary to perform a thorough mucosal inspection, with a low miss rate for significant lesions. The suggested benchmark for withdrawal time (WT) is an average of at least 6 minutes in examinations where no biopsies or polypectomies are performed (derived from a large number of colonoscopies in patients without prior surgical resection).
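A minimal sketch of how the WT benchmark could be computed, assuming each procedure record carries a withdrawal time in minutes and a flag for whether a biopsy or polypectomy was performed. The field names and the sample values are hypothetical.

```python
from statistics import mean

# Benchmark quoted in the text: mean WT of at least 6 minutes in
# examinations where no biopsies or polypectomies are performed.
WT_BENCHMARK_MIN = 6.0


def mean_withdrawal_time(procedures):
    """Mean WT (minutes) over examinations without biopsy or polypectomy.

    `procedures` is an iterable of dicts with the hypothetical keys
    'withdrawal_min' and 'biopsy_or_polypectomy'. Returns None when no
    examination qualifies.
    """
    times = [
        p["withdrawal_min"]
        for p in procedures
        if not p["biopsy_or_polypectomy"]
    ]
    return mean(times) if times else None


# Invented records; the 12-minute examination is excluded because a
# polypectomy lengthened the withdrawal.
sample = [
    {"withdrawal_min": 7.5, "biopsy_or_polypectomy": False},
    {"withdrawal_min": 5.0, "biopsy_or_polypectomy": False},
    {"withdrawal_min": 12.0, "biopsy_or_polypectomy": True},
    {"withdrawal_min": 6.5, "biopsy_or_polypectomy": False},
]
wt = mean_withdrawal_time(sample)
meets_wt_benchmark = wt is not None and wt >= WT_BENCHMARK_MIN
print(wt, meets_wt_benchmark)
```

Note that the benchmark applies to the mean across examinations, so a single short withdrawal does not by itself indicate a shortfall.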
The utility of WT as a quality measure has been somewhat disputed, because the evidence is inconsistent regarding its correlation with adenoma or polyp detection rate (PDR). A recent large observational study from England, which included more than 31,000 colonoscopies, supports the value of WT as a quality metric. Colonoscopists with WT less than 7 minutes had significantly lower ADRs compared with those with WT of 11 minutes or longer (42.5% vs 47.1%, P <.001), with 50% fewer right-sided adenomas detected per procedure. The authors also found no incremental benefit in ADR beyond 10 minutes of WT (which does not include the duration of polyp removal). Data from a recent study (currently in abstract form) by Shaukat and colleagues suggest that WT might be a more sensitive indicator of interval cancer than ADR. The authors analyzed records of more than 76,000 colonoscopies performed by 51 gastroenterologists in Minnesota; they found that colonoscopists’ average annual WTs were inversely associated with interval cancers (P <.0001), whereas physicians’ ADRs were not (P = .40). Compared with WT greater than 6 minutes, the adjusted incidence rate ratio for WT less than 6 minutes was 2.3 (95% CI, 1.5–3.4; P <.0001).
It remains likely, though, that a longer mean WT reflects better withdrawal technique (including cleaning of residual fluid, adequate colonic distention, and proper examination of the proximal side of folds). Therefore, colonoscopy experts and the American Society for Gastrointestinal Endoscopy/American College of Gastroenterology task force recommend using WT mainly as a quality indicator for colonoscopists who have low ADR, rather than as a stand-alone metric.
Adenoma detection rate
Although it would be most meaningful to measure CRC incidence and mortality, or alternatively, the incidence of interval cancers, as outcomes of colonoscopy, these are not practical for timely quality interventions. Thus, ADR, or the proportion of screening colonoscopies where at least one adenoma is detected, has been proposed as the best, most reliable and practical surrogate quality metric. Current ADR benchmarks are 25% or greater for men and 15% or greater for women.
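The definition above translates directly into a calculation. The sketch below assumes one record per screening colonoscopy with the patient's sex and the number of adenomas confirmed by pathology; the record layout and the sample data are invented for illustration.

```python
# Benchmarks quoted in the text: ADR >= 25% for men and >= 15% for women.
ADR_BENCHMARK = {"M": 0.25, "F": 0.15}


def adenoma_detection_rate(screening_exams, sex):
    """Proportion of screening colonoscopies with at least one adenoma.

    `screening_exams` is a list of dicts with the hypothetical keys 'sex'
    ('M' or 'F') and 'adenoma_count' (pathology-confirmed adenomas).
    Returns None when there are no examinations for the requested sex.
    """
    exams = [e for e in screening_exams if e["sex"] == sex]
    if not exams:
        return None
    positive = sum(1 for e in exams if e["adenoma_count"] >= 1)
    return positive / len(exams)


# Invented records for one endoscopist.
exams = [
    {"sex": "M", "adenoma_count": 0},
    {"sex": "M", "adenoma_count": 2},
    {"sex": "M", "adenoma_count": 0},
    {"sex": "M", "adenoma_count": 1},
    {"sex": "F", "adenoma_count": 0},
    {"sex": "F", "adenoma_count": 0},
    {"sex": "F", "adenoma_count": 1},
    {"sex": "F", "adenoma_count": 0},
    {"sex": "F", "adenoma_count": 0},
]
adr_men = adenoma_detection_rate(exams, "M")
adr_women = adenoma_detection_rate(exams, "F")
print(adr_men, adr_men >= ADR_BENCHMARK["M"])
print(adr_women, adr_women >= ADR_BENCHMARK["F"])
```

Because ADR only asks whether at least one adenoma was found, a procedure with two adenomas counts the same as a procedure with one. A sample this small is shown only to make the arithmetic visible.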
Adenoma Detection Rate Correlation with Risk of Interval Cancer
Two large studies from Poland and from the United States validated the ADR as a quality metric by showing its association with the risk of interval CRC. The Polish study was conducted by Kaminski and colleagues; the authors evaluated 45,026 subjects undergoing screening colonoscopy, performed by 186 endoscopists. The results, published in 2010, showed that patients whose endoscopists had an ADR less than 20% were at least 10 times more likely to be diagnosed with an interval CRC, compared with those whose endoscopists had ADR 20% or greater (P = .008). Interval cancer risk increased as ADR decreased, and there were no other factors associated with interval CRC besides age.
The US study was carried out by Corley and colleagues; the authors reviewed 314,872 screening colonoscopies performed between 1998 and 2010 by 136 high-volume gastroenterologists on patients in the Kaiser Permanente health plan. The results, published in 2014, showed that ADR is an independent predictor of interval CRC, with a hazard ratio of 0.52 (95% CI, 0.39–0.69) for patients whose examinations were performed by colonoscopists with ADR greater than 33.5%, compared with those whose colonoscopists had ADR less than 19%. Interval cancer risk decreased linearly with increasing endoscopist ADR, overall and separately in the proximal colon and in the distal colon, and there was no ADR threshold above which there was no further protective benefit; the risk of interval cancer decreased by 3% for each 1% increase in ADR (hazard ratio 0.97; 95% CI, 0.96–0.98). In addition, there was an inverse association between CRC mortality and ADR: patients whose colonoscopists were in the highest ADR quintile had a 62% (95% CI, 35%–88%) reduction in fatal CRC, as compared with patients whose physicians were in the lowest quintile.
Adenoma Detection Rate Measurement
Measuring the ADR of every colonoscopist is a priority for colonoscopy quality improvement. It should be noted though that a reliable ADR assessment likely requires a large sample of colonoscopies per practitioner. In a mathematical model developed by Do and colleagues, at least 500 procedures were needed in the ADR calculation to provide narrow 95% confidence intervals that more accurately reflect performance.
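To see why the sample size matters, the sketch below computes a 95% confidence interval for an observed ADR at two sample sizes. It uses the Wilson score interval, which is a standard choice for proportions and is an assumption here; it is not necessarily the method used in the model by Do and colleagues, and the counts are invented.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# The same observed ADR of 25%, measured on 100 versus 500 procedures.
low_100, high_100 = wilson_interval(25, 100)
low_500, high_500 = wilson_interval(125, 500)

print(f"n=100: {low_100:.3f} to {high_100:.3f}")
print(f"n=500: {low_500:.3f} to {high_500:.3f}")
```

With 100 procedures the interval spans roughly 17 percentage points, which is too wide to tell whether an endoscopist with an observed ADR of 25% truly meets a 25% target. With 500 procedures the interval narrows to less than 8 percentage points.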
Nevertheless, ADR is relatively easy to measure, even when it requires manual chart reviews. Although periodically reviewing pathology data is somewhat time-consuming, the process is largely facilitated by an objective approach, limited to a binary query (ie, presence or absence of adenoma, without further characteristics or count). Serrated lesions should not be counted toward the ADR. An added strength of ADR as a quality metric is that it indirectly reflects other factors, including WT, colonoscopist technique and motivation, and bowel preparation quality.
The importance of ADR measurement is underscored by the wide variability among colonoscopists, with reported rates ranging from less than 10% to more than 50%. The impact of the endoscopist on ADR is substantial, accounting for up to a 10-fold difference between colonoscopists, and exceeding the effect of patient age and gender. The baseline colonoscopy is decisive for effective CRC screening, because it impacts initial clearance from neoplastic lesions and also dictates subsequent surveillance intervals. Therefore, colonoscopists with high ADR provide their patients with double protection: more complete baseline clearance and shorter surveillance intervals. Thus, assessing and improving ADR is at the core of a successful CRC prevention program.
Polyp Detection Rate
PDR has received some attention as a surrogate for ADR, because it is automatically collected by colonoscopists while generating procedure reports and/or billing codes, making it more practical to measure than ADR. PDR was validated as a surrogate for ADR in 2 studies by Williams and colleagues. In the first study, conducted at a single academic institution, the authors showed a strong correlation between ADR and PDR (correlation 0.86, P <.001). In the second study, which included 60 endoscopists at multiple practice sites, they again found a high correlation between colonoscopists’ PDR and ADR in both men and women (correlation 0.91, P <.001) as well as a correlation between PDR and advanced adenoma detection.
In addition, Baxter and colleagues carried out a large observational study in Ontario, Canada, that showed an association between the PDR and the incidence of interval cancers in the right colon. Patients who underwent colonoscopy by endoscopists with PDR 30% or greater were less likely to have proximal interval cancers, compared with patients whose colonoscopists had PDR less than 10% (OR, 0.61; 95% CI, 0.42–0.89, P <.0001).
To date, there is no prospective study that evaluates PDR as a quality measure, and there are 2 main predicaments that can hinder the use of PDR. First, there are no recommended benchmarks from the leading US Gastroenterology societies; potentially, the targets suggested by Williams and colleagues can be used (40% for men and 30% for women based on correlation with ADR targets of 25% for men and 15% for women, respectively), but these still await validation and official endorsement. Second, PDR is more prone to corruption than ADR, because it can be inflated by resection of nonneoplastic diminutive polyps or by biopsy of normal tissue. A remedy as suggested by Williams and colleagues could be periodic audits; however, these could negate to some extent the practicality of PDR.
Adenoma detection rate: limitations
Expecting colonoscopy to prevent all cases of CRC is unrealistic; however, its effectiveness is based on the ability of endoscopists to detect the vast majority of neoplastic lesions and to remove them completely. Although ADR has certainly proven to be a reliable measure of colonoscopy quality, it can be criticized for its inability to gauge whether a colonoscopist is leaving behind additional adenomas, other precancerous lesions, or fragments of identified/resected adenomas.
Adenomas per Colonoscopy
ADR can overlook differences in colonoscopy quality, because endoscopists with the same ADR could be detecting highly variable numbers of adenomas per colonoscopy. Denis and colleagues analyzed more than 42,000 colonoscopies performed by 316 endoscopists in France. They evaluated the mean number of adenomas per procedure (MAP, total number of adenomas detected divided by the number of colonoscopies); for colonoscopists with ADR around 35%, the MAP varied markedly between 0.36 and 0.98. Wang and colleagues compared 2 groups of endoscopists serving the same patient pool in Los Angeles, California, at a tertiary care teaching hospital and at 3 nonteaching facilities. The authors assessed the MAP as well as another suggested metric, the ADR-Plus (mean number of adenomas detected after the first adenoma, in procedures in which at least one adenoma was found). Both groups had comparable ADRs (28.8% and 25.7%; P = .052), but the teaching group had 23.5% higher MAP and 29.5% higher ADR-Plus.
Coupling ADR with another total adenoma metric has therefore been suggested by both previous groups to better assess colonoscopists’ performance. Similarly, Lee and colleagues proposed 2 measures of total adenoma detection to complement the ADR: MAP and MAP+ (mean adenoma per positive procedure, calculated as total number of adenomas detected divided by the number of colonoscopies in which at least one adenoma was detected).
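The total adenoma metrics defined above can be computed from the same per-procedure adenoma counts used for ADR. The sketch below is illustrative only; the function name and the two invented sets of counts are assumptions, chosen so that both endoscopists have the same ADR.

```python
def adenoma_metrics(adenoma_counts):
    """Compute ADR, MAP, MAP+ and ADR-Plus from per-procedure adenoma counts.

    ADR      = procedures with >= 1 adenoma / all procedures
    MAP      = total adenomas / all procedures
    MAP+     = total adenomas / procedures with >= 1 adenoma
    ADR-Plus = mean number of adenomas found after the first one, in
               procedures with >= 1 adenoma (equal to MAP+ minus 1)
    """
    n = len(adenoma_counts)
    if n == 0:
        raise ValueError("at least one procedure is required")
    total = sum(adenoma_counts)
    positive = sum(1 for c in adenoma_counts if c >= 1)
    map_plus = total / positive if positive else None
    adr_plus = (total - positive) / positive if positive else None
    return {
        "ADR": positive / n,
        "MAP": total / n,
        "MAP+": map_plus,
        "ADR-Plus": adr_plus,
    }


# Two hypothetical endoscopists with identical ADR but different yield.
endoscopist_a = adenoma_metrics([0, 0, 1, 1])
endoscopist_b = adenoma_metrics([0, 0, 3, 1])
print(endoscopist_a)
print(endoscopist_b)
```

Both hypothetical endoscopists find an adenoma in half of their procedures, yet the second detects twice as many adenomas per colonoscopy, which is the kind of difference that ADR alone cannot show.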
Adding a total adenoma detection metric provides a more comprehensive assessment of the thoroughness of a colonoscopic examination, provides better discrimination between endoscopists, and limits potential inclination to “gaming” (ie, a less thorough examination once a first adenoma is detected and resected). However, this makes the assessment much more labor-intensive. Furthermore, this could be an incentive to separate polyps from the same colon segment into different specimen bottles, leading to increased colonoscopy costs. In the absence of formally established benchmarks, and while adherence to measuring ADR remains far from optimal in routine practice, total adenoma detection metrics will primarily be applied in research settings in the short term.
Adenoma Detection Rate Benchmarks
The ADR targets originally defined in 2002 (≥25% for men and ≥15% for women) may be outdated. Several recent studies report ADRs in the 40% to 50% range or even higher in average-risk individuals. Demographic features other than gender also affect the prevalence of adenomas; these include increasing prevalence with age and more proximal adenomas in Black compared with White individuals. As ADR use becomes more prevalent in clinical practice and in different populations, benchmark adjustments will be necessary based on the specific characteristics of the screened demographic group and the most recent ADRs reported in the literature that have been proven to provide cancer prevention.
Advanced Adenomas
Measuring the advanced ADR might be more clinically relevant, because advanced adenomas are more prone to progress to CRC. Similar to ADR, there is significant variability in advanced ADR between colonoscopists. Among 9 colonoscopists in Indianapolis, detection of adenomas 1 cm in size and larger varied from 1.7% to 6.2%. Among 14 colonoscopists in Chicago, advanced ADR varied from 2% to 18.18%.
The Chicago study raised concerns, because it showed that colonoscopists’ advanced ADRs were independent of their nonadvanced ADRs (correlation −0.42; 95% CI −0.77 to 0.14, P = .13). In addition, the study by Wang and colleagues showed that the teaching group had a 28.7% higher advanced ADR compared with the nonteaching group, whereas ADRs were similar for both groups. These results suggest that some colonoscopists might have adequate ADRs while missing a significant number of advanced adenomas. These findings warrant future investigation, because they could carry important implications for assessing colonoscopy quality. However, using advanced ADR as a quality metric will likely prove to be very challenging, because of variable polyp size measurement by endoscopists and large interobserver variability for villous elements among pathologists.
Serrated Polyps
An inherent limitation of the ADR is that it does not account for the detection of serrated polyps; this might be problematic as recent evidence suggests that the serrated pathway likely accounts for up to one-third of all CRCs. Endoscopic detection of serrated polyps is more challenging than detection of adenomatous lesions, because they have a subtle, pale appearance and indistinct margins. The prevalence of serrated polyps in the right colon in average-risk patients is higher than previously reported, with a mean of 13% in a recent study.
The variability in detecting proximal colon serrated lesions is more striking compared with ADR, ranging from 5-fold to 18-fold among endoscopists, whereas ADR variation is typically around 3-fold to 4-fold. In 2 recent studies, detection rates varied from 1% to 18% among 15 academic gastroenterologists who performed a total of 6681 screening colonoscopies, and from 6% to 22% among 5 colonoscopists who completed 1354 screening examinations.
Interestingly, despite the variation in detection of proximal serrated lesions, the detection rate for these lesions seems to correlate strongly with ADR, in both men (correlation 0.71, P = .003) and women (correlation 0.73, P = .002). Therefore, although it does not directly capture serrated polyps, ADR might still be valid by itself to assess the detection of both adenomatous and serrated lesions. This validity has important practical implications because of several problems impacting the use of serrated PDR as a quality metric: there are currently no endorsed benchmarks for serrated lesion detection rates; the measurement should only include lesions proximal to the sigmoid colon to avoid the confounding effect of hyperplastic polyps from the rectosigmoid (whereas it is difficult to reliably identify the junction between the sigmoid and the descending colon); and the histologic classification of serrated lesions remains challenging and variable in practice.
Incomplete Adenoma Resection
Several studies evaluating interval CRCs found that these cancers can be attributed to incomplete polypectomy in about 10% to 30% of cases. Because ADR is essentially geared toward assessing detection and not resection, the inadequate quality in these cases would not be reflected by the ADR. Competent colonoscopists should be able to completely resect most sessile polyps up to 2 cm in size and all mucosally based pedunculated polyps.
In the CARE study, Pohl and colleagues assessed the completeness of resection based on 346 polypectomies performed in 269 patients by 11 colonoscopists. Polyps assessed were sessile or flat, ranged from 5 to 20 mm in size, and 59% were in the right colon. The evaluation was based on a biopsy protocol of the margins of the polypectomy site and revealed an incomplete resection in 10.1% of cases. In addition, there was a wide variability in incomplete resection rate, ranging from 6.5% to 22.7% between endoscopists. Of note, serrated lesions were more likely to be incompletely resected compared with adenomatous polyps (31% vs 7.2%, P <.001).
Polypectomy technique and effectiveness are key components of colonoscopy quality. Although awareness is increasing about its variability between colonoscopists, assessing resection quality is more difficult than assessing detection quality. However, this should not be an absolute deterrent. A group from the United Kingdom recently published work to develop and validate a tool to assess polypectomy skills using video reviews, but this area requires further research.