About Root Cause Analytics
Root Cause Analytics builds document extraction products and pre-labelled synthetic document libraries for teams working with healthcare, insurance and other privacy-sensitive documents. RCA Extract (formerly MEDISCAN) is our self-hosted extraction container. The RCA libraries are the test data the product is built against, sold separately.
Making document data work
Healthcare and insurance organisations process millions of paper and digital documents each year: discharge summaries, referral letters, pathology reports, broker submissions, policy schedules. The information locked inside these documents has enormous operational value, yet most of it remains inaccessible because extracting it manually is too slow, too expensive, and too error-prone.
We built RCA Extract to change that for healthcare PDFs, and we built the RCA Medical and Insurance libraries to make sure RCA Extract (and other extraction pipelines) have somewhere safe to be tested.
RCA Extract runs as a self-hosted Docker container inside the customer's own environment. The libraries ship as direct downloads with ground truth, bounding boxes and scanned variants for every document.
What we stand for
Our values guide every product decision, from architecture choices to pricing models.
Patient-First Design
Every feature is designed with the downstream impact on patient care in mind. Better data quality leads to better clinical decisions.
Security by Architecture
We chose a zero data movement architecture not as a feature but as a foundational design principle. Patient data stays where it belongs.
Evidence Over Marketing
We publish per-document-type evaluation alongside benchmark releases rather than headline accuracy numbers. Numbers without methodology do not help buyers.
Simplicity at Scale
Enterprise data challenges should not require enterprise-scale implementation projects. A self-hosted container that runs in your environment reflects that belief.
Founder-led
Root Cause Analytics is a specialist document AI and healthcare data company based in Sydney, Australia.
Jack Webb
Founder & Lead Data Engineer
Builds Root Cause Analytics from Sydney. Background in healthcare data engineering. Direct contact below.
jack.webb@rootcauseanalytics.com.au
Technical capabilities
The product line combines healthcare-specific OCR and NLP, deployed as a self-hosted container, alongside synthetic training document libraries used internally for validation and sold externally for QA.
- Healthcare-specific OCR fine-tuned on clinical documents
- Self-hosted container for zero data egress
- FHIR-aligned output schemas for interoperability
- Synthetic training document libraries shipped with ground truth and bounding boxes
- Deterministic generators, reproducible by seed
- AU-specific document conventions: NSW postcodes, Medicare format, provider postnominals
Security & Synthetic Safety
Get in touch
Request a free preview pack from one of the libraries, talk to us about deploying RCA Extract in your environment, or reach out about a custom library scoped to your document types.