Replication Data for: CAPTIV8 : A comprehensive large scale CAPsule endoscopy dataset for Integrated diagnosis


General description and ethics approvals: The dataset contains images and videos from wireless capsule endoscopy examinations of the colon in 10 patients, conducted using the PillCam COLON 2 capsule manufactured by Medtronic. In addition to images and videos, it includes alphanumeric metadata comprising diagnostic summaries from capsule endoscopy, colonoscopy and histopathology reports. The dataset includes 8 different types of pathologies in addition to symptoms of ulcerative colitis. The examinations were conducted in 2021 at the Innlandet Hospital Trust, Gjøvik (Norway) on patients with confirmed ulcerative colitis. All patients gave written informed consent, and ethical approval to publish the anonymized image, video and text data was obtained from the director of medicine and health at Innlandet Hospital Trust in 2021. To preserve anonymity, patient information was not linked to the study and pseudo IDs were assigned instead.

Data acquisition procedure: The patients underwent a capsule endoscopy examination on the first day, followed by a colonoscopy the next day. Tissue samples were retrieved during colonoscopy from different bowel segments and sent for histopathology. The histopathology report covers 5 sections of the colon, numbered 1 to 5, which can be interpreted as follows: 1: cecum/ascending, 2: transverse, 3: descending, 4: sigmoid, 5: rectum.
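The section numbering above can be captured in a small lookup table when reading the histopathology reports programmatically. The following is a minimal sketch in Python; the names COLON_SECTIONS and section_name are hypothetical and are not part of the dataset or its files.

```python
# Illustrative lookup only: COLON_SECTIONS and section_name are hypothetical
# names, not identifiers used in the dataset.
COLON_SECTIONS = {
    1: "cecum/ascending",
    2: "transverse",
    3: "descending",
    4: "sigmoid",
    5: "rectum",
}


def section_name(section_number: int) -> str:
    """Return the colon segment for a histopathology section number (1 to 5)."""
    try:
        return COLON_SECTIONS[section_number]
    except KeyError:
        raise ValueError(
            f"section number must be between 1 and 5, got {section_number!r}"
        ) from None
```

For example, section_name(4) returns "sigmoid", and any number outside 1 to 5 raises a ValueError instead of silently returning nothing.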

Annotation procedures: The annotations were performed by an experienced gastroenterologist in the Rapid Reader software. Clean and representative normal as well as abnormal frames were selected from the video, and a text description was written for each selected image. A short video segment of approximately 150 frames was extracted around each of these normal/abnormal images; these segments are available in the dataset. Each video segment can be assumed to carry the same weak label as its frame. Certain video fragments have intentionally been cut shorter than 150 frames to prevent accidental identification before or after capsule ingestion.
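The relationship between an annotated frame, its surrounding video segment and the weak label can be illustrated as follows. This is a minimal sketch in Python under stated assumptions: the description does not say how the window was positioned, so a window centered on the annotated frame and clamped to the video bounds is assumed here, and the names Clip, clip_around and frame_labels are hypothetical. The segments shipped in the dataset are already extracted, so this only shows how a frame-level label could be propagated to every frame of a segment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    start: int  # first frame index (inclusive)
    end: int    # last frame index (exclusive)
    label: str  # weak label inherited from the annotated frame


def clip_around(frame_index: int, label: str, video_length: int,
                clip_length: int = 150) -> Clip:
    """Place a window of up to clip_length frames around the annotated frame.

    Assumption: the window is centered on the frame and shifted or shortened
    so that it stays inside [0, video_length).
    """
    half = clip_length // 2
    start = max(0, frame_index - half)
    end = min(video_length, start + clip_length)
    start = max(0, end - clip_length)  # shift back if the window hit the end
    return Clip(start, end, label)


def frame_labels(clip: Clip) -> list[str]:
    """Propagate the clip's weak label to each of its frames."""
    return [clip.label] * (clip.end - clip.start)
```

Because some segments in the dataset are intentionally shorter than 150 frames, code that consumes them should read the actual segment length rather than assume a fixed value.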

Creator Vats, Anuja; Ahmed, Bilal; Floor, Pål Anders; Mohammed, Ahmed; Pedersen, Marius; Hovde, Øistein
Publisher DataverseNO
Contributor Vats, Anuja; Pedersen, Marius; NTNU – Norwegian University of Science and Technology; NTNU Colourlab
Publication Year 2024
Rights CC0 1.0; info:eu-repo/semantics/openAccess
OpenAccess true
Contact Vats, Anuja (NTNU – Norwegian University of Science and Technology); Pedersen, Marius (NTNU – Norwegian University of Science and Technology)
Resource Type Dataset
Format text/plain; application/zip; application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Size 8933; 9113; 61062; 25982; 91783; 3951201762; 3487302
Version 1.2
Discipline Life Sciences; Medicine