Created
August 13, 2023
Created by
Status
In Progress
Tags
Reading Notes
Hypothesis Annotations: https://hyp.is/go?url=https%3A%2F%2Farxiv.org%2Fpdf%2F2011.13971.pdf&group=o94Eonpx
My main goals for reading this paper are to understand the following things: [1] how people implement SSL for histopathology and what kinds of architecture they have tried for histopathology tasks, [2] any data-specific decisions in model design, [3] evaluation tasks, [4] data curation procedure, [5] evaluation metrics.
‣
- Train SimCLR on a multiorgan pathology dataset without supervision
- The paper shows that pretraining on unlabelled histopathology images can improve performance over ImageNet pretraining.
‣
- SimCLR
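Note to self on what SimCLR optimises: each patch is augmented twice, and the NT-Xent loss pulls the two views of the same patch together while pushing them away from every other view in the batch. Below is a minimal pure-Python sketch of that loss, not the paper's code. The function names, the toy embeddings, and the temperature of 0.5 are my own assumptions, and a real implementation would use a tensor library and handle zero-norm embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """NT-Xent loss over N positive pairs.

    view_a[i] and view_b[i] are embeddings of two augmentations of the
    same patch. The temperature default is an illustrative assumption.
    """
    n = len(view_a)
    embeddings = list(view_a) + list(view_b)  # 2N embeddings in total
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of patch i
        # Scaled similarities to every embedding except the anchor itself.
        logits = {
            j: cosine_similarity(embeddings[i], embeddings[j]) / temperature
            for j in range(2 * n)
            if j != i
        }
        # Log-sum-exp with the max subtracted for numerical stability.
        max_logit = max(logits.values())
        log_denominator = max_logit + math.log(
            sum(math.exp(value - max_logit) for value in logits.values())
        )
        # Cross-entropy of picking the positive among all other embeddings.
        total += log_denominator - logits[positive]
    return total / (2 * n)


# Toy example: two patches, two views each (made-up 2-D embeddings).
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[1.0, 0.1], [0.1, 1.0]]
aligned_loss = nt_xent_loss(view_a, view_b)
# Swapping the second views breaks the pairing, so the loss should go up.
shuffled_loss = nt_xent_loss(view_a, [view_b[1], view_b[0]])
```

Correctly paired views should give a lower loss than mismatched ones, which is the signal the encoder is trained on.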
‣
- A more diverse training set and a dataset-varying sampling strategy help contrastive learning, which benefits from visually diverse data.
- The majority of the WSI datasets are from TCGA and CPTAC, plus a variety of public challenge datasets.
‣
- Settings Variation
- Single Dataset Pretraining
- Different Size of Pretraining Dataset
- Task Variation
- BACH (four-class breast cancer classification)
- Lymph (three-class malignant lymph node cancer classification)
- BreakHisv1 (binary breast cancer classification)
- NCT-CRC-HE100K (nine-class colorectal cancer tissue classification)
- Gleason2019 (five-class prostate cancer classification)
- DigestPath2019 (WSI segmentation)
- BreastPathQ (single regression dataset)
‣
For later.