HeLaCytoNuc: fluorescence microscopy dataset with segmentation masks for cell nuclei and cytoplasm


Data Description:

This dataset comprises fluorescence micrographs of HeLa cells, stained to label the nuclei and the cell cytoplasm. The images were acquired as a technical calibration for a high-content screening study published in [1].

The HeLa cell line (ATCC-CCL-2), a widely used immortalised cell line in laboratory research, was cultured under standard conditions. After cultivation, the cells were fixed and stained with fluorescent dyes to visualise the nuclei and cytoplasm. The nuclei were stained with DAPI (4',6-diamidino-2-phenylindole), a blue-fluorescent DNA stain, while fluorescently labelled phalloidin was used to detect actin filaments and delineate the cytoplasm. Cell culture, fixation, staining, and imaging all followed the protocols described in [1].

The preprocessed dataset includes 2,676 8-bit RGB images, each 520 x 696 pixels in size. Only two of the RGB channels are used: the red channel represents the cytoplasm, and the blue channel represents the nuclei. The dataset is divided into training, validation, and test subsets in a 70:20:10 ratio. All images are accompanied by instance segmentation masks for nuclei and cytoplasm objects, generated with the CellProfiler software [2]. In addition, the test subset was annotated manually by a specialist to provide high-quality reference annotations. The original raw images have a higher resolution, 1040 x 1392 pixels, and a bit depth of 16 bits, and so retain more detail for further analyses.
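The channel layout described above can be illustrated with a minimal Python sketch. It assumes that an image has already been read into a NumPy array of shape height x width x 3 in RGB order (for example with a TIFF reader such as tifffile), and that 520 x 696 means 520 rows by 696 columns; neither assumption is stated in this record. The function name split_channels is hypothetical, and a synthetic array stands in for a real image.

```python
import numpy as np


def split_channels(rgb):
    """Return (cytoplasm, nuclei) planes from an H x W x 3 RGB array.

    Assumes RGB channel order: red = cytoplasm, blue = nuclei.
    """
    rgb = np.asarray(rgb)
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError(f"expected an H x W x 3 array, got shape {rgb.shape}")
    cytoplasm = rgb[:, :, 0]  # red channel (phalloidin / actin signal)
    nuclei = rgb[:, :, 2]     # blue channel (DAPI signal)
    return cytoplasm, nuclei


# Synthetic stand-in for one preprocessed image (assumed 520 rows x 696 columns).
image = np.zeros((520, 696, 3), dtype=np.uint8)
image[100:200, 100:200, 0] = 180  # a fake "cell" in the red channel
image[130:170, 130:170, 2] = 255  # a fake "nucleus" in the blue channel

cytoplasm, nuclei = split_channels(image)
print(cytoplasm.shape, nuclei.shape, cytoplasm.dtype)
```

The green channel is simply ignored here, since the description says it carries no signal.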

File Description:

The file structure of the zip files is as follows:

HeLaCytoNuc_{train/validation/test}.zip ->

  • images -> {filename}.tif

  • nuclei_masks  -> {filename}.tif

  • cytoplasm_masks  -> {filename}.tif

HeLaCytoNuc_raw_images.zip -> {filename}.tif

HeLaCytoNuc_test_cellprofiler_masks.zip ->

  • nuclei_masks  -> {filename}.tif

  • cytoplasm_masks  -> {filename}.tif 
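As a sketch of how this layout might be used, the following Python snippet builds the three paths that belong to one sample. It assumes that a subset archive has been extracted into a folder containing the images, nuclei_masks, and cytoplasm_masks subfolders directly, and that an image and its masks share the same file name, as the listing suggests. The function name sample_paths, the folder name, and the file name "example" are hypothetical.

```python
from pathlib import Path


def sample_paths(subset_dir, filename):
    """Build the image and mask paths for one sample of an extracted subset."""
    root = Path(subset_dir)
    name = f"{filename}.tif"
    return {
        "image": root / "images" / name,
        "nuclei_mask": root / "nuclei_masks" / name,
        "cytoplasm_mask": root / "cytoplasm_masks" / name,
    }


# Hypothetical extraction folder and file name, for illustration only.
paths = sample_paths("HeLaCytoNuc_train", "example")
for role, path in paths.items():
    print(f"{role}: {path}")
```

The actual folder names after extraction should be checked before relying on these paths.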

References:

1. Rämö, Pauli, Anna Drewek, Cécile Arrieumerlou, Niko Beerenwinkel, Houchaima Ben-Tekaya, Bettina Cardel, Alain Casanova et al. "Simultaneous analysis of large-scale RNAi screens for pathogen entry." BMC Genomics 15 (2014): 1-18.

2. Carpenter, Anne E., Thouis R. Jones, Michael R. Lamprecht, Colin Clarke, In Han Kang, Ola Friman, David A. Guertin et al. "CellProfiler: image analysis software for identifying and quantifying cell phenotypes." Genome Biology 7 (2006): 1-11.

Identifier
DOI https://doi.org/10.14278/rodare.3001
Related Identifier IsIdenticalTo https://www.hzdr.de/publications/Publ-39181
Related Identifier IsPartOf https://doi.org/10.14278/rodare.3000
Related Identifier IsPartOf https://rodare.hzdr.de/communities/health
Related Identifier IsPartOf https://rodare.hzdr.de/communities/rodare
Metadata Access https://rodare.hzdr.de/oai2d?verb=GetRecord&metadataPrefix=oai_datacite&identifier=oai:rodare.hzdr.de:3001
Provenance
Creator De, Trina; Urbanski, Adrian; Thangamani, Subasini; Wyrzykowska, Maria; Yakimovich, Artur
Publisher Rodare
Publication Year 2024
Rights Creative Commons Attribution 4.0 International; Open Access; https://creativecommons.org/licenses/by/4.0/legalcode; info:eu-repo/semantics/openAccess
OpenAccess true
Contact https://rodare.hzdr.de/support
Representation
Language English
Resource Type Dataset
Version Version 1
Discipline Life Sciences; Natural Sciences; Engineering Sciences