5/26/2026
HistAI CellDX Data Hub Reaches 300,000 Whole Slide Images
HistAI today announced that its CellDX Data Hub has surpassed 300,000 whole slide images (WSIs) — a major milestone that cements the platform as one of the largest commercially accessible collections of digitized pathology data available to the AI and research community.
The collection spans a diverse range of benign and malignant cases across various organs and indications, each accompanied by rich, fully digitized pathology reports. This breadth and depth of coverage make the Data Hub well suited for training robust, generalizable AI models in computational pathology.
A Self-Service Platform for Histopathology Data
CellDX Data Hub is designed to remove the friction traditionally associated with acquiring quality pathology datasets. The platform provides a convenient, self-service experience for data exploration and slide purchase — no lengthy negotiations, no opaque licensing, no waiting.
Users can navigate the interface at celldx.hist.ai to browse slides; build cohorts by organ, stain type, diagnosis, and other parameters; review detailed digital reports; and purchase exactly the data they need.
AI-Native Access via Agent Skills
Beyond the web UI, CellDX Data Hub is built for the agentic AI era. Users can task their AI agents — including Claude Code, Gemini, and Codex — to search the Data Hub, explore available datasets, and assemble well-defined data cohorts programmatically.
HistAI's agent skills are open-source and available at github.com/histai/skillsets, enabling seamless integration into any compatible agent runtime or ML pipeline.
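As a rough illustration of what assembling a cohort by organ, stain type, and diagnosis can look like in code, the sketch below filters a small in-memory list of slide records. It does not call the Data Hub or the published agent skills: the SlideRecord fields, the build_cohort helper, and the sample slide IDs are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlideRecord:
    """Hypothetical slide metadata; field names are assumptions, not the Data Hub schema."""
    slide_id: str
    organ: str
    stain: str
    diagnosis: str


def build_cohort(records, **criteria):
    """Return the records whose fields equal every given criterion (case-insensitive)."""
    def matches(rec):
        return all(
            str(getattr(rec, field)).lower() == str(value).lower()
            for field, value in criteria.items()
        )
    return [rec for rec in records if matches(rec)]


# Made-up sample catalog, used only to exercise the filter.
catalog = [
    SlideRecord("demo-001", "breast", "H&E", "malignant"),
    SlideRecord("demo-002", "breast", "IHC", "benign"),
    SlideRecord("demo-003", "colon", "H&E", "malignant"),
    SlideRecord("demo-004", "breast", "H&E", "benign"),
]

# Select breast slides stained with H&E, regardless of diagnosis.
cohort = build_cohort(catalog, organ="breast", stain="h&e")
print([rec.slide_id for rec in cohort])  # ['demo-001', 'demo-004']
```

Keeping the criteria as keyword arguments means the same helper can narrow a cohort further, for example by adding diagnosis="malignant", without any change to the function itself.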
Democratizing Access to Pathology Data
HistAI remains committed to democratizing access to valuable pathology resources for the broader community. The company offers two complementary paths:
Commercial licensing. A full-fledged, perpetual commercial license with transparent pricing and volume discounts — no per-inference royalties, no recurring fees. Full details are available in HistAI's Data License Agreement.
Open-source contributions. HistAI continues to open-source large collections of data and AI models through its Hugging Face organization, including the HISTAI dataset of 100,000+ WSIs, the SPIDER multi-organ patch dataset, and the Hibou family of pathology foundation models with over 1.5M downloads.
What's Next
The path from 100,000 to 300,000 slides has been driven by strong partnerships and growing community demand. HistAI will continue to expand the Data Hub with new organs, indications, and stain types — building the foundational data infrastructure that computational pathology needs to scale.
Explore the Data Hub today at celldx.hist.ai.