Improving the Accuracy of CTC Interpretation: Computer-Aided Detection




Computer-aided polyp detection aims to improve the accuracy of the colonography interpretation. The computer searches the colonic wall to look for polyplike protrusions and presents a list of suspicious areas to a physician for further analysis. Computer-aided polyp detection has developed rapidly in the past decade in the laboratory setting and has sensitivities comparable with those of experts. Computer-aided polyp detection tends to help inexperienced readers more than experienced ones and may also lead to small reductions in specificity. In its currently proposed use as an adjunct to standard image interpretation, computer-aided polyp detection serves as a spellchecker rather than an efficiency enhancer.


Computer-aided detection (CAD) for computed tomographic colonography (CTC) was introduced in the late 1990s. CAD has developed rapidly and early clinical trials of CAD are beginning to appear in the literature. This article presents a brief overview of the current clinical status of CTC CAD. The article concludes with a description of some advanced computerized display technologies that assist CTC readings and may play an important role in improving the diagnostic efficacy of CTC.


Rationale for CAD


It has been shown that perceptual error reduces the sensitivity of CTC by 14% for polyps 1 cm in size or larger. Given the multitude of images in a CTC study, the causes of perceptual error are not mysterious. Depending on the reconstruction interval, there can be 1200 images or more to interpret. Images in both the prone and supine positions must be interpreted. Some investigators examine the colon antegrade and retrograde, and in lung and soft tissue windows. Three-dimensional virtual endoscopic views may also be needed for problem solving. Average interpretation times ranging from 15 to 25 minutes per study have been reported in the literature.


Interpretive errors can lead to substantial reductions in the sensitivity of polyp detection. Polyps can be missed if they are located between or behind haustral folds, in areas of poor bowel preparation or inadequate distention or because of inconspicuousness caused by flat shape. Factors affecting the ability to perceive abnormalities on large two-dimensional CT data sets and three-dimensional endoluminal fly-through images require further study.




Effect of reader fatigue


There is as yet little or no information about the effect of reader fatigue on the diagnostic efficacy of CTC interpretation. Anecdotally, radiologists report an upper limit on the number of CTC cases they can interpret per day, typically fewer than 10. Because interpretation of the CTC data is complex and requires manipulation of different types of images and sustained concentration, it is likely that fatigue is an issue. In addition, without addressing the lengthy interpretive process, it is unlikely that costs for CTC can be substantially reduced. It is therefore likely that CAD implementations that reduce fatigue will be beneficial for improving accuracy and reducing costs. Although some benefits of CAD in improving radiologist performance have been demonstrated, it has not yet been shown that these benefits accrue because of a reduction in fatigue. However, fatigue and perceptual errors are closely intertwined. More research is needed in this area.


Performance of 1 Reader Versus 2 Readers (Single vs Double Reading) Without CAD


Double reading of medical images has been shown to increase sensitivity in certain settings, for example in interpretation of mammograms. There has been relatively little work on double reading of CTC. In a study using 3 readers, Johnson and colleagues found that the per patient and per polyp sensitivities tended to be higher and the specificity lower with double reading than for some single readings. However, there was considerable variability in the sensitivities of the 3 readers for polyp detection. Part of the hope for CAD is that it will provide a benefit similar to that of double reading but without the additional cost of a second human interpreter.
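The trade-off reported for double reading (higher sensitivity, lower specificity) can be illustrated with a small sketch. It assumes a simple combination rule, namely that a patient is called positive if either reader calls the patient positive; the source does not state which rule Johnson and colleagues used, and the function names and toy data in the usage note are hypothetical.

```python
def double_read(reader_a, reader_b):
    """Combine per-patient calls from two readers.

    Assumed rule (not from the source): positive if EITHER reader is positive.
    """
    return [a or b for a, b in zip(reader_a, reader_b)]


def sens_spec(calls, truth):
    """Return (sensitivity, specificity) of per-patient calls against a reference standard."""
    tp = sum(1 for c, t in zip(calls, truth) if c and t)
    tn = sum(1 for c, t in zip(calls, truth) if not c and not t)
    positives = sum(1 for t in truth if t)
    negatives = len(truth) - positives
    return tp / positives, tn / negatives
```

With made-up calls for six patients (three with polyps, three without), reader A alone finds one of three polyps with one false call, whereas the combined read finds two of three polyps but accumulates the false calls of both readers. Under this rule the combined sensitivity can never be lower, and the combined specificity can never be higher, than that of either reader alone.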








Principles of CAD


The purpose of CAD is to locate possible polyps automatically and annotate the images or present a list of image locations. The radiologist reviews the output of the CAD and makes the final diagnosis.


The main function of the CAD software is to identify sites with features characteristic of polyps. Examples of useful features for CAD include surface shape and CT attenuation. Once these features are identified, the CAD software classifies sites of detection as potential polyps or false-positive detections. A suitable CAD system has high sensitivity for detection of clinically significant polyps (those exceeding a size threshold, eg, 0.5 or 1.0 cm) and a low number of false-positive detections. All current CTC CAD systems produce on average at least 1 false-positive detection per CTC examination. Hence, review of the CAD marks by a trained reader is still required to prevent unnecessary referrals for colonoscopic polypectomy.
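The detect-then-classify flow described above can be sketched as follows. This is only a minimal illustration: the Candidate fields, the rule-based classify function, and every threshold value are hypothetical and are not taken from any published or commercial CAD system, which would typically use a trained classifier rather than fixed cutoffs.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A hypothetical CAD detection site on the colonic wall."""
    shape_score: float          # 0..1, higher = more polyp-like protrusion (assumed scale)
    mean_attenuation_hu: float  # mean CT attenuation in Hounsfield units
    size_mm: float              # largest dimension in millimeters


def classify(c, min_shape=0.8, hu_range=(0.0, 100.0), min_size_mm=5.0):
    """Return True if the site is kept as a potential polyp.

    All thresholds are illustrative assumptions, not validated values.
    """
    lo, hi = hu_range
    return (
        c.shape_score >= min_shape
        and lo <= c.mean_attenuation_hu <= hi
        and c.size_mm >= min_size_mm
    )


def cad_marks(candidates):
    """Return the candidates that would be presented to the radiologist."""
    return [c for c in candidates if classify(c)]
```

Lowering min_shape or min_size_mm in such a scheme raises sensitivity at the cost of more false-positive marks, which is the trade-off the reader ultimately has to resolve.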


Once potential polyps are detected by CAD, they must be shown to the radiologist who makes the final diagnosis. There are several ways to do this. One way is to label sites directly on CTC images to show the radiologist where the potential polyps may be found. These labels can be turned on or off so that they do not obscure the original images. To save time, the radiologist can jump directly to the labeled images. Labels can be applied to the two-dimensional cross-sectional and three-dimensional endoluminal images.




Common false-positives


The most common CAD false-positives are on the ileocecal valve, thick haustral folds, residual fecal matter, and the rectal tube. It is possible to reduce the numbers of these false-positives through various techniques. For example, the ileocecal valve can be identified because it tends to be large and contain fat leading to lower CT attenuation than that of polyps. The rectal tube can be identified by its location in the rectum and by detecting its hollow channel.
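The two rules of thumb just described can be written as a simple filter. This is a sketch under stated assumptions: the Detection fields and the size and attenuation cutoffs are hypothetical placeholders chosen for illustration, and real systems would derive such criteria from training data.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """A hypothetical CAD detection with the attributes used by the rules below."""
    size_mm: float
    mean_attenuation_hu: float  # fat lowers the mean attenuation
    in_rectum: bool
    has_hollow_channel: bool


def likely_false_positive(d, valve_min_size_mm=15.0, valve_max_hu=-10.0):
    """Return a label for a suspected false-positive, or None if no rule fires.

    Cutoff values are illustrative assumptions only.
    """
    # Ileocecal valve: tends to be large and to contain fat (low attenuation).
    if d.size_mm >= valve_min_size_mm and d.mean_attenuation_hu <= valve_max_hu:
        return "ileocecal valve"
    # Rectal tube: located in the rectum and has a hollow channel.
    if d.in_rectum and d.has_hollow_channel:
        return "rectal tube"
    return None
```

Detections for which the function returns None would remain on the list shown to the radiologist.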


False-positives caused by residual fecal matter and thick haustral folds can be more difficult to eliminate. High-quality bowel preparation and adequate colonic distention can reduce these problems.




Reasons for false-negatives


The most common reasons for CAD false-negatives are flat polyp shape, inadequate colonic distention, residual fecal matter, adherent contrast medium, polyp location at the air-fluid boundary, and small polyp size.


A flat polyp has a low elevation above the surface of the adjacent colonic mucosa. Flatter polyps are less conspicuous to radiologists and the CAD software. Hyperplastic polyps tend to be flatter than adenomatous polyps making them less conspicuous and more difficult to diagnose. The poorer sensitivity of CTC for detecting hyperplastic polyps may be beneficial by avoiding unnecessary colonoscopy and polypectomy for these lesions, which have lower malignant potential.


Inadequate colonic distention can be prevented by careful technique and the use of carbon dioxide insufflators. Quality assessment software can identify poor colonic distention in real time and allow correction before the patient leaves the examination room.


Residual fecal matter and fluid can cover polyps and obscure them. Fecal and fluid tagging with barium- and iodine-based contrast materials enables visualization of such polyps.


The CT attenuation of polyps adjacent to endoluminal contrast material can be artificially increased. This phenomenon, known as pseudoenhancement, can prevent polyp detection because the inflated CT values may greatly exceed typical soft tissue attenuation values. Software corrections can greatly improve the sensitivity for detecting such polyps, particularly those submerged under contrast-enhanced fluid.


Contrast material can adhere to some polyps. CAD software must be able to identify such polyps. Software that identifies polyps with adherent contrast material is under development.


Polyps at the air-fluid boundary can be difficult to detect whether or not the fluid is tagged with contrast material. Software that improves electronic fluid subtraction at the air-fluid boundary may enable detection of such polyps.


CAD performance tends to decrease for smaller polyps. Polyps from 6 to 9 mm in size are of particular interest because patient management (surveillance vs immediate polypectomy) may depend on whether polyp size is at the high or low end of this range. Even with the use of modern thin-section CT scanners, it is likely that CAD sensitivity for 6- and 7-mm polyps is substantially less than that of 8- and 9-mm polyps although these size subcategories are usually not reported separately.




Current status of CTC CAD


CTC CAD is at an advanced stage of development. Several small clinical trials have been published. Several commercial and precommercial CAD systems have been developed and have undergone or are undergoing regulatory review. In stand-alone CAD trials in the computer laboratory, as opposed to observer studies in which the performance of radiologists with CAD assistance is evaluated, the baseline sensitivities for detecting large (≥10 mm) polyps are as high as 85% to 100% with fewer than 10 false-positives per patient. These sensitivities reach or exceed those achieved by radiologists.


CAD has not yet been developed to handle the problem of extracolonic findings. The multiplicity of potential sites and types of extracolonic findings makes it particularly difficult to develop a CAD system to detect them all.




Stand-alone CAD trials: baseline performance of CAD in the laboratory


In 2005, Summers and colleagues published the results of a large stand-alone CAD trial. The investigators trained their CAD system on CTC data sets of 394 patients and tested 792 data sets, both sets taken from the Department of Defense screening CTC data set reported earlier by Pickhardt and colleagues. The reference standard was segmentally unblinded optical colonoscopy. For the test set, per polyp and per patient sensitivities for CAD were both 89.3% (25 of 28 polyps) for detecting identifiable adenomatous polyps at least 1 cm in size. The false-positive rate was 2.1 per patient. The CAD system detected 1 cancer originally missed by the colonoscopists. At 8-mm and 10-mm adenoma size thresholds, the per patient sensitivities of CAD (85.4% and 89.3%, respectively) were not significantly different from those of optical colonoscopy before segmental unblinding.
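As a quick check of the figures quoted above, the sensitivity and the false-positive rate are simple ratios. The helper names below are hypothetical; only the counts 25 and 28 come from the text.

```python
def sensitivity(detected, total):
    """Fraction of reference-standard lesions (or patients) that CAD detected."""
    if total <= 0:
        raise ValueError("total must be positive")
    return detected / total


def false_positives_per_patient(total_false_positives, n_patients):
    """Mean number of false-positive CAD marks per patient."""
    if n_patients <= 0:
        raise ValueError("n_patients must be positive")
    return total_false_positives / n_patients


# 25 of 28 adenomatous polyps at least 1 cm in size were detected.
print(f"{sensitivity(25, 28):.1%}")  # 89.3%
```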


Halligan and colleagues published an external validation of a CAD system for CTC. External validation refers to the assessment of CAD applied to data different from that on which the CAD software was trained. The results of the external validation provide information about the generalizability of the CAD to different patient populations. The per polyp sensitivity of their CAD system was 94% for detecting polyps 6 mm or larger, indicating good generalizability. The false-positive rates ranged from 14 to 43 depending on the settings of a sphericity filter.


Summers and colleagues reported an external validation study of their CAD system. Their CAD system had per polyp sensitivities of 91.5% for adenomas 10 mm or larger and 82.1% for adenomas 6 to 9 mm. The per patient sensitivities were 97.6% and 82.4%, respectively. The mean and median false-positive rates were 9.6 and 7.0 per patient, respectively.


Van Ravesteijn and colleagues reported CAD sensitivities for polyps 6 mm or larger ranging from 85% to 100% with between 4 and 6 false-positives per scan. They applied their CAD system to 4 different data sets. They also performed a cross-center external evaluation and found that the trained CAD system generalized to data from different medical centers and with different patient preparations.


Lee and colleagues reported the sensitivity of 3 different CAD systems for detecting simulated polyps in an anthropomorphic colonic phantom. For polyps 6 mm or larger, the differences in the per polyp sensitivities amongst the 3 CAD systems were not statistically significant. Sensitivities were lowest for flat polyps, intermediate for sessile polyps, and greatest for pedunculated polyps. The false-positive rates ranged from 2.6 to 4.6 per scan and were not statistically different but the distribution of causes of false-positives did differ amongst the 3 CAD systems.




Effect of CAD on observer performance: CAD as a first, concurrent, or second reader


The stand-alone performance of CAD software in the laboratory, described in the previous section, represents the theoretical best performance achievable. However, when used in the clinic, CAD software rarely achieves its full potential. To assess the likely clinical benefit of CAD, researchers conduct observer performance experiments in which radiologists use CAD to read unknown cases. The experiments are typically conducted in a simulated clinical setting and to date have not been prospective clinical trials.


Radiologists may use CAD in 1 of 3 ways: as a first, concurrent, or second reader (Fig. 1). It is not yet clear how well the 3 methods compare with one another and this may depend on the particular CAD implementation. Therefore, the observations in this section should be regarded as preliminary.




Fig. 1


Simplified depiction of the three CAD reading paradigms. Horizontal bars (clear, gray, black) represent CTC images. A clear bar indicates an image that has no CAD marks and is not reviewed by the reader. A gray bar indicates an image that has CAD marks and is reviewed by the reader. A black bar indicates an image that has no CAD marks and is reviewed by the reader. In first reader mode, the reader reviews only images with CAD marks. In concurrent reader mode, CAD marks are present during the reader’s first pass through all the images. In second reader mode, all images are reviewed first without CAD marks, then the reader reviews only images with CAD marks to arrive at the final diagnosis.


In the first reader paradigm, the radiologist only reviews the CAD results and does not review the entire colon. This method has the potential advantage of reduced interpretation time and high specificity (because the choice of false-positives is limited to the CAD findings) but the potential disadvantage of lower sensitivity relative to the concurrent and second reader paradigms. At present, radiologists are naturally reluctant to use the first reader paradigm because only the computer reviews the entire CTC data set.


In the concurrent reader paradigm, the CAD marks are visible during the radiologist’s primary interpretation of the images. The radiologist evaluates the CAD marks as they appear in the image. The potential advantages of this method are improved sensitivity and reduced interpretation time. These advantages may not actually be realized because the CAD marks could distract the radiologist from other findings in the vicinity of a mark, leading to “satisfaction of search” errors. The radiologist could also mischaracterize CAD false-positives, leading to decreased specificity as well.


In the second reader paradigm, the radiologist reviews the images, arrives at a preliminary diagnosis, reviews the CAD findings, and revises the preliminary diagnosis to arrive at a final diagnosis. Because CAD is not perfect, the radiologist should not disregard polyp candidates that he or she identified but that CAD did not find. The potential advantage of this technique is a sensitivity higher than that of either the first or concurrent reader paradigm. The disadvantages are the longest interpretation time and potentially reduced specificity compared with the first or concurrent reader paradigms.
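The difference among the three paradigms comes down to which images the reader looks at and whether CAD marks are visible at the time. The sketch below encodes that as a list of review passes; the function name, the paradigm labels, and the image-level representation are simplifying assumptions made for illustration.

```python
def review_plan(has_mark, paradigm):
    """Return the reader's passes for one study.

    has_mark: list of booleans, one per image, True if CAD marked that image.
    Each pass is a tuple (marks_visible, image_indices_reviewed).
    """
    all_images = list(range(len(has_mark)))
    marked = [i for i, m in enumerate(has_mark) if m]

    if paradigm == "first":
        # Reader reviews only the images that carry CAD marks.
        return [(True, marked)]
    if paradigm == "concurrent":
        # Reader makes a single pass through all images with marks shown.
        return [(True, all_images)]
    if paradigm == "second":
        # Unassisted pass over all images, then a pass over the marked images.
        return [(False, all_images), (True, marked)]
    raise ValueError(f"unknown paradigm: {paradigm!r}")
```

For a four-image study with marks on the second and fourth images, the first reader plan covers 2 images, the concurrent plan covers 4, and the second reader plan covers 4 followed by 2, which mirrors the ordering of interpretation times suggested in the text.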


Although a CAD system may be marketed as being optimized for 1 of these 3 reading methods, it is quite possible that radiologists will adapt their reading style to another reading method based on personal choice and experience.


Several research publications have recently evaluated the performance of radiologists assisted by CAD. These publications are preliminary works with small numbers of cases and readers. Some of the relevant findings include:

