This dataset originates from the public repository TCGA (https://portal.gdc.cancer.gov/) and contains several files, each corresponding to a given omic measured on the same individuals with breast cancer. Raw data were obtained from the mixOmics case study described at http://mixomics.org/mixdiablo/case-study-tcga/ [link accessed on August 18, 2021] and were made available by the package authors at http://mixomics.org/wp-content/uploads/2016/08/TCGA.normalised.mixDIABLO.RData_.zip (R data format). Data in the zip file had been normalised for technical biases by the package authors.
Data from the train and test sets were exported as TXT/CSV files and supplemented with miRNA expression for the same individuals, together with toy datasets designed to illustrate missing-value cases and similar situations. They serve as a basis for illustrating the web data analysis tool ASTERICS (Project 20008788, funded by Région Occitanie).
R, 4.0.4
The Cancer Genome Atlas (TCGA) https://portal.gdc.cancer.gov/
The data dictionary is available on the TCGA website: https://docs.gdc.cancer.gov/Data_Dictionary/viewer/
The data originate from a public repository, from which the raw original data may be retrieved. Data were preprocessed (normalized) by the mixOmics package authors as described in Supplementary Section S2 of [Singh et al., 2019], where the origin of the dataset is also fully described.