The Prostate_Cancer_CISH_HE_Epithelium_Segmentation dataset

The Prostate_cISH_Epithelium_Segmentation Dataset

Corresponding author: Henrik Sahlin Pettersen (henrik.s.pettersen@ntnu.no)
Consultant Pathologist / Associate Professor, St. Olav's Hospital / NTNU, Trondheim, Norway

Short Webpage Description

This dataset provides high-resolution histopathological images and corresponding expert-annotated segmentation masks, specifically designed for developing AI models for prostate epithelium segmentation. The images feature Chromogenic In Situ Hybridization (cISH) staining for various miRNAs, alongside controls and standard Hematoxylin & Eosin (HE) stains. Data originates from 70 patients: 30 with prostate cancer (PCa) and 40 with benign prostatic hyperplasia (BPH).

Sample Collection

Prostate Cancer (PCa): 30 patients. Samples include normal glandular epithelium (core type 'a'), Gleason 3 pattern (core type 'b'), and Gleason 4 pattern (core type 'c'), ideally in triplicate for each marker.
Benign Prostatic Hyperplasia (BPH): 40 patients. Samples consist of triplicate normal glandular epithelium for each marker.

Markers and Controls

Data is provided for the following stains, each organized into its own top-level folder:

miRNAs: miR‑550A, miR‑1246, miR‑3614, miR‑4326, miR‑4632, miR‑4742, miR‑4754, miR‑7850
Controls: U6 (Positive), Scr (Negative)
Standard Stain: Hematoxylin & Eosin (HE)

Data Format & Organization

Each high-resolution image (.jpg) has a corresponding pixel-level segmentation mask (.png) delineating the prostate epithelium, suitable for training deep learning models. Segmentation masks are single-channel images where pixel value 0 indicates background and pixel value 255 indicates epithelium.
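
As a minimal sketch of how the mask encoding described above might be checked, the function below takes the flat 8-bit pixel buffer of one mask and returns the fraction of pixels labelled epithelium. The function name `epithelium_fraction` is hypothetical, and reading the buffer with Pillow (mentioned in the docstring) is an assumption about tooling rather than something the dataset prescribes.

```python
def epithelium_fraction(mask_bytes: bytes) -> float:
    """Return the fraction of pixels labelled epithelium (value 255).

    `mask_bytes` is the flat single-channel 8-bit pixel buffer of one mask,
    e.g. obtained with Pillow (an assumption about tooling):
        Image.open(mask_path).convert("L").tobytes()
    """
    if not mask_bytes:
        raise ValueError("empty mask")
    epithelium = mask_bytes.count(255)  # pixel value 255 = epithelium
    background = mask_bytes.count(0)    # pixel value 0 = background
    if epithelium + background != len(mask_bytes):
        # The description states masks contain only 0 and 255.
        raise ValueError("mask contains values other than 0 and 255")
    return epithelium / len(mask_bytes)
```

Rejecting any value other than 0 or 255 is a deliberate choice here: it surfaces masks that were resized or re-encoded with interpolation before they reach a training pipeline.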


The data is organized first by marker, then by tissue type/origin. Within each marker's top-level folder (e.g., HE/, 550A/), the structure is:

[Marker]/
├── Normal/
│   ├── Normal_TURP_BPH/ (Images/masks from BPH patients)
│   └── Normal_Prostatectomy/ (Normal core 'a' images/masks from PCa patients)
└── Cancer/
    └── Cancer_Prostatectomy/ (Gleason 3/4 core 'b'/'c' images/masks from PCa patients)

All image and mask files are located directly within the innermost folders (Normal_TURP_BPH, Normal_Prostatectomy, Cancer_Prostatectomy). Filenames encode marker, patient type, anonymized ID, sample number, core type, and optional experimental details.
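
The folder layout above could be traversed along the following lines. This is a sketch under stated assumptions: the helper name `pair_images_and_masks` is hypothetical, and the pairing rule (an image and its mask share the same filename stem) is assumed for illustration, since the description does not spell out how mask filenames relate to image filenames.

```python
from pathlib import Path

# Innermost folders, relative to a marker's top-level folder (e.g. HE/).
INNERMOST_FOLDERS = (
    Path("Normal") / "Normal_TURP_BPH",
    Path("Normal") / "Normal_Prostatectomy",
    Path("Cancer") / "Cancer_Prostatectomy",
)


def pair_images_and_masks(marker_dir):
    """Yield (folder_name, image_path, mask_path) for one marker folder.

    ASSUMPTION: an image and its mask share the same filename stem
    (x.jpg / x.png); the dataset description does not state this.
    Images without a matching mask are skipped.
    """
    marker_dir = Path(marker_dir)
    for rel in INNERMOST_FOLDERS:
        folder = marker_dir / rel
        if not folder.is_dir():
            continue  # tolerate partially extracted archives
        for image_path in sorted(folder.glob("*.jpg")):
            mask_path = image_path.with_suffix(".png")
            if mask_path.is_file():
                yield rel.name, image_path, mask_path
```

A call such as `list(pair_images_and_masks("HE"))` would then return one tuple per image/mask pair found, with the innermost folder name available as a coarse tissue-origin label.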

Terms of Use

Distributed under a CC0 license for open research and development.
Identifier
DOI https://doi.org/10.18710/EGRQRC
Metadata Access https://dataverse.no/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.18710/EGRQRC
Provenance
Creator Pettersen, Henrik Sahlin; Wiik, Erik Nesje
Publisher DataverseNO
Contributor Pettersen, Henrik Sahlin; Faculty of Medicine and Health Sciences; NTNU – Norwegian University of Science and Technology
Publication Year 2025
Rights CC0 1.0; info:eu-repo/semantics/openAccess; http://creativecommons.org/publicdomain/zero/1.0
OpenAccess true
Contact Pettersen, Henrik Sahlin (NTNU – Norwegian University of Science and Technology)
Representation
Resource Type Dataset
Format text/plain; application/zip
Size 8698; 22184678661; 23224072355; 23385715897; 23240968934; 22663312845; 23547707975; 22436951809; 22488217403; 19266063123; 47422123185; 42155; 58038637903
Version 1.0
Discipline Life Sciences; Medicine