The role for artificial intelligence in evaluation of upper GI cancer

Abstract

With the application of deep learning to artificial intelligence (AI), it has become possible to develop AI systems that can be used clinically even in upper endoscopy, a field in which diagnosis has been considered difficult. This review summarizes current studies of AI and deep learning in the upper gastrointestinal tract. At present, AI research on gastric cancer detection, Helicobacter pylori infection diagnosis, and esophageal cancer detection is progressing, and AI may also be able to assist in diagnosing the depth of invasion of gastric and esophageal cancers. The studies reviewed suggest that AI for diagnosing cancer in the upper gastrointestinal tract, as in the lower gastrointestinal tract, where research is more advanced, will be introduced into clinical practice in a form that contributes to detecting suspected cancer lesions, determining treatment policies, and improving examination accuracy.

1 Introduction

In recent years, the image diagnostic capability of artificial intelligence (AI) has been found to surpass that of human beings owing to 3 factors: deep learning (in particular, the convolutional neural network [CNN]), high-performance computing hardware (the graphics processing unit [GPU]), and an increasingly vast amount of digitized image data. AI has been introduced into medicine mainly through diagnostic imaging. Image classification and image detection AIs have been developed for diagnosing skin cancer, diabetic retinopathy, and colonic polyps. This paper discusses the latest research findings on deep learning-based imaging diagnostic AI for upper gastrointestinal tract cancer. It provides an overview of the role of AI in such diagnoses and discusses future directions.

2 Gastric cancer diagnosis

Gastric cancer often arises from atrophic gastritis. Thus, when gastric cancer resembles gastritis, it is sometimes difficult to detect in the early stages. The false-negative rate for detecting gastric cancer with esophagogastroduodenoscopy is 4.6%-25.8%. Furthermore, inexperienced endoscopists tend to overlook early gastric cancer because it often shows only subtle morphological changes that are difficult to distinguish from background mucosa with atrophic gastritis. AI is expected to serve as a tool that compensates for such disparities among human observers. There are few reports of AI using deep learning in the endoscopic diagnosis of gastric cancer because such diagnosis is more difficult than that of cancers of other organs, such as esophageal and colorectal cancer.

Hirasawa et al reported the world’s first AI system for detecting gastric cancer using deep learning (Fig 1). The training dataset comprised 13,584 high-quality endoscopic images of the stomach collected from 2639 patients with histologically proven gastric cancer. All images were marked manually by an expert on gastric cancer and linked to clinical data. The AI system was verified using 2296 endoscopic images of 69 consecutive cases (77 lesions) of gastric cancer. The AI system detected 71 of the 77 gastric cancer lesions, a sensitivity of 92.2%. Of the 6 gastric cancer lesions that the AI system could not detect, 5 were minute lesions of diameter 5 mm or less. When the analysis was limited to gastric cancers of diameter 6 mm or more, the AI detected 70 of 71 lesions (98.6% sensitivity). The time required to analyze the 2296 images was 47 seconds (0.02 seconds per image), far faster than any human reader. The positive predictive value (PPV) was 30.6%; in other words, 69.4% of the lesions diagnosed as gastric cancer by the AI system were benign. The most common causes of misdiagnosis were gastritis with atrophy and intestinal metaplasia, findings that are sometimes difficult even for experienced endoscopists to distinguish from gastric cancer.
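
The lesion-level sensitivities and the per-image analysis time quoted above follow directly from the reported counts. A minimal Python sketch of that arithmetic, using only the figures given in the text (the function name `sensitivity` and the variable names are illustrative choices, not taken from the study):

```python
def sensitivity(detected: int, total: int) -> float:
    """Fraction of true lesions that were detected (true-positive rate)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return detected / total


# Counts reported for the Hirasawa et al verification set.
all_lesions = sensitivity(71, 77)         # all 77 gastric cancer lesions
lesions_6mm_plus = sensitivity(70, 71)    # lesions of diameter 6 mm or more
seconds_per_image = 47 / 2296             # total analysis time / number of images

print(f"{all_lesions:.1%}")               # 92.2%
print(f"{lesions_6mm_plus:.1%}")          # 98.6%
print(f"{seconds_per_image:.2f} s")       # 0.02 s
```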

Ishioka et al took an AI trained to diagnose early gastric cancer from still images and evaluated it on videos of 68 cases of early gastric cancer. The AI system detected 64 of the 68 early gastric cancers (94.1%) in the videos, a detection rate comparable to that reported for still images. In addition, the AI took a median of only 1 second to recognize a lesion as cancerous after it appeared on the screen.

Recently, another AI system for detecting gastric cancer was reported by Wu et al. The researchers constructed the system using 3170 still images of gastric cancer and 5981 still images of benign findings, and verified it on 200 independent images. Wu et al’s AI system had 94.0% sensitivity, 91.0% specificity, 92.5% accuracy, a 91.3% PPV, and a 93.8% negative predictive value for gastric cancer detection, outperforming 21 endoscopists. Although these were still-image studies, the results suggest that PPV can be improved by using a new convolutional neural network (CNN)-based AI algorithm.
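
The five reported figures are mutually consistent if the 200 verification images are assumed to be split evenly into 100 cancer and 100 benign images. That split is an assumption made here for illustration; it is not stated in this review. Under it, the counts below reproduce every reported metric. A minimal sketch (the class and field names are hypothetical):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # cancer images flagged as cancer
    fn: int  # cancer images missed
    tn: int  # benign images correctly passed
    fp: int  # benign images flagged as cancer

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fn + self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


# ASSUMPTION: 100 cancer and 100 benign images (not stated in the text).
wu = ConfusionMatrix(tp=94, fn=6, tn=91, fp=9)

print(f"sensitivity {wu.sensitivity:.1%}")  # 94.0%
print(f"specificity {wu.specificity:.1%}")  # 91.0%
print(f"accuracy    {wu.accuracy:.1%}")     # 92.5%
print(f"PPV         {wu.ppv:.1%}")          # 91.3%
print(f"NPV         {wu.npv:.1%}")          # 93.8%
```

With a balanced test set, PPV is much higher than the 30.6% reported for consecutive cases above, which is one reason PPV values from different study designs are hard to compare directly.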

Zhu recently reported AI diagnosis of the invasion depth of gastric cancer. The researcher used 790 gastric cancer images to develop an AI that distinguished cancers with invasion no deeper than sm1 from those invading to sm2 or deeper. The AI achieved 89.1% accuracy in this test, exceeding the endoscopists' average of 77.5%. Because the choice between endoscopic resection and surgery depends on whether the depth of invasion is within sm1 or deeper, AI may support endoscopists not only in cancer detection but also in deciding on a treatment plan.

3 H. pylori infection

An AI system that diagnoses the presence or absence of Helicobacter pylori infection from endoscopic images has also been reported. Shichijo et al constructed an AI system by deep learning with 32,205 training images from 735 H. pylori-positive cases and 1015 H. pylori-negative cases. Verification used 11,481 images from 397 independent cases. The AI system showed 88.9% sensitivity, 87.4% specificity, and 87.7% accuracy for H. pylori infection, whereas 23 endoscopists averaged 79.0% sensitivity, 83.2% specificity, and 82.4% accuracy. The AI system therefore performed at least as well as the endoscopists. Diagnosing the 397 cases took the AI system 198 seconds, considerably less than the 230 minutes required by the human endoscopists.
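
To put the timing comparison on a per-case basis, the two durations can be divided by the number of verification cases. The sketch below reads both durations as totals for the 397-case set, which is an interpretation of the figures above rather than something the text states explicitly; the variable names are illustrative.

```python
N_CASES = 397
AI_TOTAL_SECONDS = 198           # reported for the AI system
HUMAN_TOTAL_SECONDS = 230 * 60   # 230 minutes, converted to seconds

ai_seconds_per_case = AI_TOTAL_SECONDS / N_CASES
human_seconds_per_case = HUMAN_TOTAL_SECONDS / N_CASES
speedup = HUMAN_TOTAL_SECONDS / AI_TOTAL_SECONDS

print(f"AI:    {ai_seconds_per_case:.1f} s per case")     # about 0.5 s
print(f"Human: {human_seconds_per_case:.1f} s per case")  # about 35 s
print(f"Ratio: {speedup:.0f}x")                           # about 70x
```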

Shichijo et al also constructed an AI system to differentiate not only H. pylori-negative and H. pylori-positive cases but also cases after H. pylori eradication. The researchers trained the AI using 98,564 images from 742 H. pylori-positive cases, 3649 H. pylori-negative cases, and 845 H. pylori-eradicated cases. Verification used 23,699 images from 847 independent cases. The AI system achieved 80% accuracy for negative diagnoses, 84% accuracy for eradicated diagnoses, and 48% accuracy for positive diagnoses.

4 Gastric cancer screening

AI cannot fully support cancer detection unless all parts of the stomach are clearly imaged. If AI can recognize the anatomical site shown in each image, it becomes possible to check whether the stomach has been observed completely. Takiyama et al constructed an AI to classify the anatomical location of images of the upper digestive tract. A total of 27,335 images, each labeled as pharynx, esophagus, upper stomach, middle stomach, lower stomach, or duodenum, were used to train the AI. The area under the receiver operating characteristic curve (ROC-AUC) was as high as 1.0 for the pharynx and esophagus and 0.99 for the stomach and duodenum.
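
For a multi-class site classifier, a per-site ROC-AUC is commonly computed one-vs-rest: images of the site in question are the positives and all other images are the negatives. The following is a minimal sketch of that calculation, not the evaluation code of the study; the scores and labels are hypothetical values invented for illustration.

```python
def roc_auc(scores, labels):
    """ROC-AUC as the probability that a randomly chosen positive scores
    higher than a randomly chosen negative (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# HYPOTHETICAL one-vs-rest scores for the class "esophagus".
is_esophagus = [1, 1, 1, 0, 0, 0]

well_separated = [0.98, 0.95, 0.91, 0.10, 0.05, 0.02]
auc_perfect = roc_auc(well_separated, is_esophagus)

# Same data, but one esophagus image scores below one non-esophagus image.
one_misranked = [0.98, 0.95, 0.08, 0.10, 0.05, 0.02]
auc_imperfect = roc_auc(one_misranked, is_esophagus)

print(auc_perfect)               # 1.0
print(round(auc_imperfect, 3))   # 0.889
```

An AUC of 1.0 therefore means that every image of the site was scored above every image of any other site, regardless of the decision threshold.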

Wu et al developed an AI, WISENSE (now renamed ENDOANGEL), to check in real time which stomach sites have been observed. They collected images of 26 typical sites inside the stomach and used them to train the AI. A site was counted as observed if the AI recognized it during the examination. In a clinical trial of 324 patients, they reported that using the AI during gastroscopy reduced missed sites by 15%. AI is thus expected to help ensure comprehensive observation (checking all sites of the stomach) during gastric examination.
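
The site-check logic described above can be sketched as a simple checklist that is updated frame by frame. This is an illustrative sketch only, not the WISENSE implementation: the integer site IDs, the confidence threshold of 0.9, and the `CoverageTracker` class are all assumptions introduced here.

```python
# ASSUMPTION: the 26 sites are represented by placeholder IDs 1..26.
REQUIRED_SITES = frozenset(range(1, 27))


class CoverageTracker:
    """Tracks which required sites a (hypothetical) classifier has recognized."""

    def __init__(self, required=REQUIRED_SITES, threshold=0.9):
        self.required = frozenset(required)
        self.threshold = threshold  # ASSUMED minimum confidence to count a site
        self.observed = set()

    def update(self, site_id, confidence):
        """Record one frame-level prediction."""
        if site_id in self.required and confidence >= self.threshold:
            self.observed.add(site_id)

    def missing(self):
        """Sites not yet recognized, in ascending order."""
        return sorted(self.required - self.observed)

    def complete(self):
        return not self.missing()


tracker = CoverageTracker()
for site in range(1, 25):         # sites 1..24 recognized confidently
    tracker.update(site, 0.95)
tracker.update(25, 0.50)          # site 25 seen, but below the threshold

print(tracker.missing())          # [25, 26]
print(tracker.complete())         # False
```

Reporting the missing sites before the endoscope is withdrawn is what would allow the examiner to go back and observe them.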

5 Esophageal cancer diagnosis

Esophageal cancer is the eighth most common cancer worldwide and the sixth most common cause of cancer-related mortality. Esophageal squamous cell carcinoma (ESCC) is the common histological type in Asia (particularly Japan), the Middle East, Africa, and South America, while the incidence of esophageal adenocarcinoma (EAC) is increasing in the United States and Europe.

When esophageal cancer is diagnosed at an advanced stage, it requires highly invasive treatment, and its prognosis is poor. Therefore, early detection is of great importance. However, it is difficult to diagnose esophageal cancer at an early stage by conventional endoscopy using white light imaging. Iodine staining has been used to detect ESCC in high-risk patients; however, it is associated with problems such as chest pain or discomfort and increased procedural time.

Narrow band imaging (NBI) is a revolutionary image-enhanced endoscopy technology that has facilitated more frequent detection of superficial ESCC without iodine staining. NBI is superior to iodine staining for screening endoscopy because of its ease of use (a single button press) and the lack of discomfort for patients. However, NBI has shown an insufficient sensitivity of 53% for detecting ESCC when used by inexperienced endoscopists, indicating that training and experience are required to use NBI effectively.

Horie et al reported the use of a CNN-based AI diagnosis system to detect esophageal cancer, including both ESCC and EAC (Fig 2). The CNN was trained on 8428 endoscopic images of esophageal cancer from 397 lesions, including 365 ESCC and 32 EAC lesions. The system was then validated on an independent test data set of 1118 images from 97 cases, including 47 cases with esophageal cancer and 50 cases without. The AI diagnosis system analyzed the 1118 images in 27 seconds and detected 98% (46/47) of the cases of esophageal cancer. Notably, it detected all 7 small lesions of less than 10 mm. On a per-image basis, the sensitivity of the AI diagnosis system was 77%, with a specificity of 79%, a PPV of 39%, and a negative predictive value of 95%. Moreover, the AI diagnosis system classified cancers as superficial or advanced with 98% accuracy. Although the PPV was quite low, further learning from additional images may overcome this limitation. Because the system analyzes images quickly enough to work on video, it could be used during screening endoscopy to help endoscopists avoid missing ESCCs.
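
The per-image PPV and negative predictive value depend not only on sensitivity and specificity but also on the share of test images that actually show cancer. The sketch below applies Bayes' rule to the reported sensitivity and specificity. The 15% prevalence is an assumption chosen here because it approximately reproduces the reported PPV and negative predictive value; it is not a figure given in the text.

```python
def ppv(sens: float, spec: float, prev: float) -> float:
    """Positive predictive value from sensitivity, specificity, prevalence."""
    true_pos = sens * prev
    false_pos = (1 - spec) * (1 - prev)
    return true_pos / (true_pos + false_pos)


def npv(sens: float, spec: float, prev: float) -> float:
    """Negative predictive value from sensitivity, specificity, prevalence."""
    true_neg = spec * (1 - prev)
    false_neg = (1 - sens) * prev
    return true_neg / (true_neg + false_neg)


SENS, SPEC = 0.77, 0.79        # per-image values reported above
ASSUMED_PREVALENCE = 0.15      # ASSUMPTION: share of images showing cancer

ppv_at_15 = ppv(SENS, SPEC, ASSUMED_PREVALENCE)
npv_at_15 = npv(SENS, SPEC, ASSUMED_PREVALENCE)
ppv_at_50 = ppv(SENS, SPEC, 0.50)   # same classifier, balanced image set

print(f"PPV at 15% prevalence: {ppv_at_15:.0%}")   # 39%
print(f"NPV at 15% prevalence: {npv_at_15:.0%}")   # 95%
print(f"PPV at 50% prevalence: {ppv_at_50:.0%}")   # 79%
```

Under these assumptions, the same classifier would show a much higher PPV on a balanced image set, so a low per-image PPV partly reflects how rare cancer images are in the test data.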