Now 300,000 Whole Slide Images in the Data Hub


HistAI Data Hub

The World's Largest WSI Data Hub

A curated repository of whole slide images spanning hundreds of tissue types, stains, and disease states — fully licensed for commercial AI development and available for instant download.

301K whole slide images (301,208 files)
76K unique cases (76,355 patients)
218K H&E slides (plus 83K IHC & special stains)
100% commercial license (perpetual, transparent terms)

Real, diverse cases

76,355 de-identified patient cases across cancer and non-cancer diagnoses, with paired metadata for cohort building.

H&E + IHC + special stains

218K H&E slides plus 83K IHC and special stains spanning 400+ markers and clones.

Commercial-ready rights

Full perpetual commercial license. Train, fine-tune, and deploy AI models with no usage restrictions and no hidden fees.

Dataset at a glance

The full distributions behind the Data Hub — explore the demographics, organ systems, diagnoses, scanners, and stains that make up the dataset.

Cancer vs. non-cancer

Diagnosis category across all cases

Cancer: 27,913 (37%)
Non-cancer: 48,442 (63%)
Total: 76K

Gender

Patient gender across all cases

Female: 51,762 (68%)
Male: 24,503 (32%)
Not specified: 90 (0.1%)
Total: 76K

Age distribution

Cases grouped into 10-year age bands.

0–9: 402
10–19: 1,138
20–29: 4,764
30–39: 13,196
40–49: 13,967
50–59: 12,504
60–69: 15,361
70–79: 10,732
80–89: 3,865
90–100: 423
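For readers who want to reproduce this grouping on their own cohort metadata, the following is a minimal sketch of the banding. It assumes ages are recorded as whole years and that the last band runs from 90 to 100 inclusive, as the "90–100" label suggests. The function names and the example ages are hypothetical and are not taken from the Data Hub.

```python
from collections import Counter


def age_band(age: int) -> str:
    """Return the 10-year band label for an age in whole years.

    Assumption: the final band covers 90 through 100 inclusive,
    matching the "90-100" label on the chart. Labels use a plain
    hyphen here for simplicity.
    """
    if not 0 <= age <= 100:
        raise ValueError(f"age out of range: {age}")
    if age >= 90:
        return "90-100"
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def band_counts(ages):
    """Count how many cases fall into each age band."""
    return Counter(age_band(a) for a in ages)


# Made-up ages for illustration only, not real Data Hub records.
example_ages = [4, 37, 42, 42, 68, 95, 100]
counts = band_counts(example_ages)
```

Keeping the band logic in one small function makes it easy to change the top band if your own metadata caps ages differently.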

Scanners

Number of slide files by scanner

Leica Aperio GT 450: 293,493
3DHISTECH Panoramic 250: 4,030
Not specified: 3,636
Hamamatsu NanoZoomer S360: 49

Top organ systems

Number of cases by organ system (top 20)

Gastrointestinal: 25,532
Skin: 15,203
Gynecological: 10,512
Breast: 6,117
Genitourinary: 3,547
Integumentary: 2,743
Head: 1,637
Neck: 1,609
Respiratory: 1,163
Soft Tissue: 949
Lymphatic: 937
Hematolymphoid: 553
Endocrine: 465
Hematopoietic: 413
Musculoskeletal: 311
Hematologic: 301
Hepatobiliary: 234
Nervous: 210
Peritoneal: 104
Lymphoid: 27

Top cancer diagnoses

Top 20 by case count

Invasive breast carcinoma NOS: 4,798
Basal cell carcinoma: 2,410
Adenocarcinoma: 1,721
Acinar adenocarcinoma: 1,507
Squamous cell carcinoma: 631
Invasive lobular carcinoma: 396
Superficial spreading melanoma: 373
Ductal carcinoma in situ (DCIS): 323
Squamous cell carcinoma keratinizing: 262
Carcinoma: 237
Rectal adenocarcinoma: 237
Serous carcinoma: 233
Gastric adenocarcinoma: 230
Papillary urothelial carcinoma: 227
Adenocarcinoma differentiated: 220
Neuroendocrine tumor: 214
Tumor: 209
Endometrial adenocarcinoma: 203
Tubular adenocarcinoma: 198
Follicular lymphoma: 192

Top non-cancer diagnoses

Top 20 by case count

Intradermal melanocytic nevus: 4,577
Compound melanocytic nevus: 3,487
Hyperplastic polyp: 2,078
Tubular adenoma low grade: 1,824
Chronic gastritis: 1,731
Seborrheic keratosis: 1,336
Chronic antral gastritis: 1,154
Norm: 846
Endometrial polyp: 768
Tubular adenoma: 760
Complex melanocytic nevus: 612
Intradermal papillomatous melanocytic nevus: 566
Chronic endometritis: 535
Melanocytic nevus: 495
Ectopy: 444
Chronic cervicitis: 414
Chronic tonsillitis: 403
Proliferative phase endometrium: 385
Chronic non-specific endometritis: 379
Sessile serrated lesion: 360

Top 100 stains

Files by stain. 218,434 H&E slides plus a long tail of IHC clones and special stains.

# | Stain | Files | Share
1 | H&E | 218,434 | 77%
2 | Ki67 (MM1) | 3,765 | 1.3%
3 | ER (6F11) | 3,446 | 1.2%
4 | Giemsa | 3,217 | 1.1%
5 | Progesterone Receptor | 2,968 | 1.0%
6 | HER2 (c-erbB-2) | 2,355 | 0.8%
7 | CD10 (56C6) | 1,446 | 0.5%
8 | CD20 (MJ1) | 1,413 | 0.5%
9 | Ki67 (30-9) | 1,300 | 0.5%
10 | CK7(OV-TL 12/30) | 1,270 | 0.4%
11 | Bcl-6 (LN22) | 1,187 | 0.4%
12 | MCK (AE1& AE3) | 1,183 | 0.4%
13 | Bcl-2 (bcl-2/100/D5) | 1,168 | 0.4%
14 | HER2/neu | 1,163 | 0.4%
15 | CK20 (Ks20.8) | 1,141 | 0.4%
16 | Ki67 (SP6) | 1,079 | 0.4%
17 | CD3 (LN10) VET | 1,079 | 0.4%
18 | CD5 (4C7) | 1,035 | 0.4%
19 | Synaptophysin | 1,022 | 0.4%
20 | p63 7JUL | 1,000 | 0.4%
21 | S100(4C4.9) | 975 | 0.3%
22 | CD23 (1B12) | 945 | 0.3%
23 | TTF-1(SPT24) | 944 | 0.3%
24 | MUM1 Protein (MUM1p) | 825 | 0.3%
25 | Pax-8 (Polyclonal) | 813 | 0.3%
26 | HER2/neu (4B5) | 801 | 0.3%
27 | GATA-3 (L50-823) | 800 | 0.3%
28 | CDX2 (EPR2764Y) | 798 | 0.3%
29 | Cyclin D1 (D1-GM) | 784 | 0.3%
30 | CD34 (QBEnd/10) | 783 | 0.3%
31 | CD138 (B-A38) | 762 | 0.3%
32 | PD-L1 (28-8) | 727 | 0.3%
33 | MCK (AE1/AE3) | 712 | 0.2%
34 | CD30 (1G12) | 696 | 0.2%
35 | CD45 (X16/99) | 664 | 0.2%
36 | CD20(L26) | 617 | 0.2%
37 | CK14 (LL002) | 596 | 0.2%
38 | P40 (BC28) | 591 | 0.2%
39 | Chromogranin A (LK2H10) | 578 | 0.2%
40 | SMA(1A4) | 555 | 0.2%
41 | Ki67 (MIB-1) VET | 554 | 0.2%
42 | ER (SP1) | 548 | 0.2%
43 | Vimentin (V9) | 536 | 0.2%
44 | CK HMW | 523 | 0.2%
45 | AMACR | 522 | 0.2%
46 | P53 (DO-7) | 507 | 0.2%
47 | PR (16) | 503 | 0.2%
48 | MSH6 (PU29) | 494 | 0.2%
49 | PR (1E2) | 481 | 0.2%
50 | WT1 WT49 | 477 | 0.2%
51 | CD56 (CD564) | 475 | 0.2%
52 | SOX-10 (EP268) | 459 | 0.2%
53 | Pax-5 (1EW) | 447 | 0.2%
54 | Desmin (DE-R-11) | 446 | 0.2%
55 | CD138 (MI15) | 435 | 0.2%
56 | E-Cadherin (36B5) | 435 | 0.2%
57 | Epstein Barr Virus (CS1-4) | 434 | 0.2%
58 | Chromogranin A(DAK-A3) | 413 | 0.1%
59 | CK7(RN7) | 401 | 0.1%
60 | CD56 (123C3.D5) | 373 | 0.1%
61 | CD3 (MRQ -39) | 361 | 0.1%
62 | MLH1 (MLH1) | 361 | 0.1%
63 | S100 (Polyclonal) | 354 | 0.1%
64 | PMS2 (MRQ-28) | 342 | 0.1%
65 | p63 (i27-i) | 340 | 0.1%
66 | CD117 (T595) | 333 | 0.1%
67 | Melan-A (A103) | 332 | 0.1%
68 | ER (1D5) | 326 | 0.1%
69 | Estrogen Receptor | 309 | 0.1%
70 | MSH2 (25D12) | 292 | 0.1%
71 | Progesterone Receptor (PgR636) | 288 | 0.1%
72 | EMA (GP1.4) | 286 | 0.1%
73 | HMB45 (HMB-45) | 275 | 0.1%
74 | CD68 (514H12) | 267 | 0.1%
75 | CD4 (4B12) | 266 | 0.1%
76 | Napsin A (poly) | 257 | 0.1%
77 | P16(R19-D) | 256 | 0.1%
78 | CK 5/6 D5/16B4 | 254 | 0.1%
79 | TTF1 (SPT24) | 248 | 0.1%
80 | Van Gieson | 244 | 0.1%
81 | MSH2 (G219-1129) | 237 | 0.1%
82 | CK 8 & 18 (B22.1&B23.1) | 226 | 0.1%
83 | Calretinin (CAL6) | 226 | 0.1%
84 | Calponin-1 (EP798Y) | 224 | 0.1%
85 | Argentum | 223 | 0.1%
86 | Lambda Light Chain (SHL53) | 221 | 0.1%
87 | Kappa Light Chain (CH15) | 218 | 0.1%
88 | CD15 (MMA) | 204 | 0.1%
89 | CD99 (EPR3097Y) | 202 | 0.1%
90 | DOG-1 (K9) | 200 | 0.1%
91 | PAS | 194 | 0.1%
92 | Cyclin D1 (SP4-R) | 191 | 0.1%
93 | SATB2 (EP281) | 186 | 0.1%
94 | Synaptophysin (27G12) | 183 | 0.1%
95 | Arginase-1 (EP261) | 183 | 0.1%
96 | CD79a (11E3) | 179 | 0.1%
97 | INSM1 | 176 | 0.1%
98 | Alcian Blue Shifa | 172 | 0.1%
99 | CK5/6 (D5/16B4) | 170 | 0.1%
100 | PD-L1 (ZR3) | 169 | 0.1%

Build the next generation of pathology AI.

Filter, preview, and download cohorts tailored to your research — with transparent pricing and a commercial license that travels with the data.