Ulcerative Colitis Inflammation Status Using AI Technology
Ulcerative colitis (UC) is a chronic inflammatory bowel disease (IBD) characterized by inflammation of the colon (large intestine) and rectum. It is one of the two main types of IBD, the other being Crohn’s disease. UC primarily affects the innermost lining of the colon and rectum, leading to inflammation and the formation of ulcers. The inflammation tends to be continuous and typically starts from the rectum, extending into various parts of the colon.
Common symptoms of ulcerative colitis include abdominal pain, diarrhea (often bloody), rectal bleeding, urgency to have a bowel movement, weight loss, fatigue, and sometimes fever. The severity of symptoms can vary widely among individuals.
People with ulcerative colitis may be at a higher risk of developing certain complications, such as arthritis, skin issues, and eye problems. This underlines the importance of holistic healthcare for individuals with this condition, addressing both gastrointestinal symptoms and potentially related issues in other parts of the body.
Study Overview
Current endoscopic scores for ulcerative colitis (UC) objectively classify disease severity by identifying the presence or absence of endoscopic findings. However, this approach may only capture part of the spectrum of clinical severity within each category.
In contrast, expert endoscopists specializing in inflammatory bowel disease (IBD) assess the severity and provide an overall impression of the inflammation level. This research focuses on creating an artificial intelligence (AI) system capable of accurately reflecting the evaluation of UC endoscopic severity by IBD expert endoscopists.
Methods
A multicenter, retrospective study was conducted using 39,553 endoscopic images obtained from 424 patients with UC who underwent colonoscopy at Keio University Hospital between April 2012 and October 2019.
Additionally, 10,256 endoscopic images were collected from 388 patients with UC who underwent colonoscopy at the Japanese Red Cross Kyoto Daini Hospital during the same period, resulting in a total of 49,809 images.
Images were acquired with standard endoscopes: the PCF-H290ZI, PCF-H290I, CF-H290I, CF-H260AI, or CF-H260AZI (Olympus Medical Systems, Tokyo, Japan) and the EC-L600ZP7, EC-L600ZP7/L, or EC-L600XP7/L (Fujifilm Corporation, Tokyo, Japan).
The study received approval from the ethics committee of each medical institution (Keio Ethics Committee approval no. 20180315), and all participants provided informed consent before participating.
A ranking-convolutional neural network (ranking-CNN) was trained on comparative data on UC severity from 13,826 pairs of endoscopic images, curated by IBD expert endoscopists.
To make the AI’s assessment of inflammation consistent with that of IBD expert endoscopists, the Mayo Endoscopic Subscore (MES) and the Ulcerative Colitis Endoscopic Index of Severity (UCEIS) were not used directly as labels for a classification task. Instead, the AI was trained on a ranking task, using comparative information on UC severity from image pairs annotated by IBD expert endoscopists.
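To make the idea of training on a ranking task concrete, here is a minimal sketch of a pairwise ranking loss in the RankNet style. This is an illustrative assumption, not the loss used in the study, which is not specified here. The function name `pairwise_ranking_loss` and the encoding of the expert's judgment as 1.0, 0.0, or 0.5 are hypothetical.

```python
import math


def pairwise_ranking_loss(score_a, score_b, target):
    """RankNet-style loss for one image pair (illustrative, not the study's loss).

    score_a, score_b: scalar severity scores a model assigns to images A and B.
    target: 1.0 if the expert judged A more severe, 0.0 if B more severe,
            0.5 if the two were judged approximately equal (assumed encoding).
    """
    diff = score_a - score_b
    # Model's probability that A is more severe: p = sigmoid(diff).
    # Cross-entropy between target and p simplifies to
    #   (1 - target) * diff + log(1 + exp(-diff)),
    # where the log term is evaluated in a numerically stable way.
    softplus_neg = max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return (1.0 - target) * diff + softplus_neg


# The loss is small when the score ordering agrees with the expert...
agree = pairwise_ranking_loss(2.0, 0.0, 1.0)
# ...and large when it disagrees.
disagree = pairwise_ranking_loss(0.0, 2.0, 1.0)
# For a pair judged approximately equal, equal scores give the lowest loss.
tie_equal = pairwise_ranking_loss(1.0, 1.0, 0.5)
tie_apart = pairwise_ranking_loss(1.0, 0.0, 0.5)
```

Because the loss depends only on the difference between the two scores, a model trained this way learns a relative ordering of severity rather than a fixed category per image.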
In the MES and UCEIS systems, findings that reflect inflammation are translated into numeric values, which form the basis for assessment. These findings include the visibility of the vascular pattern, adherent mucopurulent matter, the depth and number of erosions and ulcers, the degree of mucosal edema, the degree and range of redness, mucosal friability, and the extent of regenerated mucosa.
However, assessments based solely on these criteria may not accurately replicate the composite evaluations of UC severity made by IBD expert endoscopists. Therefore, the experts' assessments were made without being restricted to detailed severity criteria.
The output of the trained ranking-CNN was then expressed on the UC Endoscopic Gradation Scale (UCEGS) to quantify severity. Correlation coefficients were computed to assess the consistency of severity assessments between the AI-diagnosed UCEGS and the Mayo Endoscopic Subscore. Correlation coefficients were also determined between the AI's UCEGS and the mean UCEGS assessments of test images by four IBD expert endoscopists.
The study engaged seven IBD expert endoscopists, three involved in AI development and four in validation. An IBD expert endoscopist was defined as one with over 15 years of experience who had performed endoscopy on more than 2,000 UC patients and had published at least one report on UC endoscopic diagnosis.
To build the AI image dataset, an IBD expert endoscopist was shown two randomly arranged images from the training pool and judged which indicated greater severity, or whether the two were approximately equal. MES data were used for pretraining and for evaluating AI performance during data preparation. The training pool contained 14,208 images, of which 7,897, 2,847, 2,684, and 780 had MES values of 0, 1, 2, and 3, respectively. Blurred or out-of-focus images and narrow-band imaging images were excluded during preprocessing.
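The MES breakdown of the training pool can be checked with a few lines of arithmetic. The sketch below uses only the counts reported above. The variable names are illustrative.

```python
# Image counts per MES grade, as reported for the training pool.
mes_counts = {0: 7897, 1: 2847, 2: 2684, 3: 780}

# The four grades should add up to the stated pool size of 14,208.
total = sum(mes_counts.values())

# Share of the pool held by each grade, to show the class imbalance.
shares = {grade: count / total for grade, count in mes_counts.items()}

for grade, share in shares.items():
    print(f"MES {grade}: {mes_counts[grade]} images ({share:.1%})")
```

MES 0 accounts for roughly 56% of the pool and MES 3 for roughly 5.5%, so the most severe grade is the least represented.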
Results
Correlation coefficients were computed using 1,479 images, previously assessed with MES by an IBD expert endoscopist, to assess the consistency of severity assessments between the novel AI-diagnosed UCEGS and the existing MES.
The UCEGS results for 50 test images, assessed by four IBD expert endoscopists not involved in the AI development, were also analyzed for correlation coefficients, standard errors of the means, and standard deviation to assess diagnostic variability among the endoscopists.
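As a sketch of how variability among raters can be summarized for a single image, the snippet below computes the mean, sample standard deviation, and standard error of the mean. The four UCEGS values are made-up placeholders, not data from the study, and `rater_summary` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, stdev


def rater_summary(scores):
    """Return (mean, sample SD, SEM) for one image's scores across raters."""
    n = len(scores)
    m = mean(scores)
    sd = stdev(scores)   # sample standard deviation (divides by n - 1)
    sem = sd / sqrt(n)   # standard error of the mean
    return m, sd, sem


# Hypothetical UCEGS values given to one test image by four endoscopists.
example_scores = [3.0, 3.5, 4.0, 3.5]
m, sd, sem = rater_summary(example_scores)
print(f"mean={m:.2f}, SD={sd:.3f}, SEM={sem:.3f}")
```

A small SD relative to the 0 to 10 scale would indicate that the raters largely agree on that image. Repeating the calculation over all test images gives a picture of overall diagnostic variability.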
Additionally, the system’s ability to diagnose inflammation within the correct range was confirmed by evaluating endoscopic images taken from different UC patients and on different dates. The test set comprised 1,479 images selected independently of the training set, and 50 test images were randomly chosen for assessment.
Spearman’s correlation coefficient between the AI-diagnosed UCEGS and the Mayo Endoscopic Subscore was approximately 0.89. The correlation coefficients between the evaluations of the IBD expert endoscopists and those of the AI were consistently higher than 0.95 (P<0.01).
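Spearman’s correlation compares the rank order of two sets of scores, which suits a comparison between an ordinal score such as MES and a continuous one such as UCEGS. The following is a minimal pure-Python sketch that assigns average ranks to ties. The sample values are invented for illustration and are not study data.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical MES grades and AI-assigned UCEGS values for five images.
mes = [0, 0, 1, 2, 3]
ucegs = [0.5, 1.0, 3.2, 6.1, 9.0]
rho = spearman(mes, ucegs)
```

Because MES takes only four values, many images share a grade. The average-rank handling of ties is what keeps the coefficient well defined in that situation.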
Conclusion
In this study, a novel AI was developed to evaluate ulcerative colitis (UC) in a way that reflects the judgment of IBD expert endoscopists. Unlike conventional scoring systems such as MES and UCEIS, this AI quantifies inflammation on a gradient, providing an automated visualization of the endoscopist’s assessment. Previous AIs were built to reproduce scoring indices, but expert endoscopists weigh nuanced factors beyond conventional scores, such as the amount of mucus, edema, erythema, and regenerative epithelium. The AI’s output, the UCEGS, is expressed on a continuous scale from 0 to 10 and was found to reliably reproduce IBD expert assessments, offering a potential computerized substitute.
The study introduced a deep learning method, ranking-CNN (described in the study’s “AI algorithm” section), which accurately orders images by severity and whose output is expressed as the UCEGS. The AI’s strength lies in representing inflammation on the fine-grained UCEGS, allowing a more nuanced assessment than conventional systems. It can distinguish severity levels even among images that share the same MES grade, which supports finer evaluation of treatment efficacy and mucosal healing.
To facilitate clinical use, a user interface (UI) depicting inflammation severity and distribution was developed. The UI overlays the UCEGS on the colon image, aiding in immediate comprehension of inflammation status and treatment response. While the AI has limitations, such as potential selection bias in image collection and the need for future validation in clinical practice, it presents a promising tool for comprehensive UC severity assessment. Future studies should explore its responsiveness to effective therapies.