RCA Insurance Library
Synthetic commercial P&C submission packs
Pre-labelled broker submissions for QA, evaluation and training of document extraction pipelines. Built and shipped by Root Cause Analytics.
What is in the library
Each submission pack is a complete broker submission as you would receive it in a real underwriting inbox: cover note, attachments, supporting forms. Pack composition varies by submission type (new business, renewal with claims, FNOL).
| Document | Contents |
|---|---|
| Broker submission email | Cover note, named attachments, broker signature block |
| Loss run report | Last 5 years of claims, per-claim rows, displayed totals, status |
| Statement of values | Per-location rows, building, contents and BI values, displayed totals |
| Policy schedule | Insurer schedule with limits, deductibles, endorsements |
| Certificate of currency | Broker-issued confirmation of cover |
| Insurance application | New business questionnaire |
| FNOL form | First notice of loss form |
| Claim report | Incumbent renewal claim narrative |
Engineered red flags
A subset of packs is deliberately broken. Each carries cross-document inconsistencies of the kind we have seen in real submissions, engineered in at known positions so your extraction or validation pipeline has a controlled target to flag.
| Red flag | What is wrong |
|---|---|
| Loss run total mismatch | Displayed total disagrees with the sum of the claim rows |
| Statement of values total mismatch | Displayed total disagrees with the sum of the location rows |
| Missing attachment | The broker email lists a doc that is not in the pack |
| ABN formatting inconsistency | Same ABN formatted differently across documents in the same pack |
| Policy number mismatch | Certificate of currency disagrees with the policy schedule |
| Location address mismatch | Statement of values address disagrees with the policy schedule |
| Claim after policy end | A loss date is outside the policy period |
| Currency mismatch | A non-AUD currency on a single location row inside an otherwise AUD submission |
The red flag inventory ships with each pack as red_flags_summary.csv. Its where_to_review column points to the two documents to compare, which makes this file the most useful artefact for QA workflows.
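As a rough illustration of the validation side, three of the red flags above can be expressed as small checks. This is a minimal sketch, not the library's own tooling: the `incurred` key, the function names and the one-cent tolerance are all assumptions made for the example, and the ABN shown is a made-up value.

```python
import re
from datetime import date
from decimal import Decimal


def loss_run_total_mismatch(claim_rows, displayed_total, tolerance=Decimal("0.01")):
    """True when the displayed total disagrees with the sum of the claim rows.

    Assumes each row carries its amount under a hypothetical 'incurred' key.
    """
    row_sum = sum((Decimal(str(r["incurred"])) for r in claim_rows), Decimal("0"))
    return abs(row_sum - Decimal(str(displayed_total))) > tolerance


def normalise_abn(raw):
    """Keep digits only, so spacing and punctuation differences disappear."""
    return re.sub(r"\D", "", raw)


def abn_formatting_inconsistent(abn_strings):
    """True when one ABN appears under more than one raw formatting."""
    distinct_raw = set(abn_strings)
    distinct_digits = {normalise_abn(a) for a in abn_strings}
    return len(distinct_digits) == 1 and len(distinct_raw) > 1


def loss_outside_policy_period(loss_date, period_start, period_end):
    """True when the loss date falls before the start or after the end of cover."""
    return not (period_start <= loss_date <= period_end)


if __name__ == "__main__":
    rows = [{"incurred": "12500.00"}, {"incurred": "4300.50"}]
    print(loss_run_total_mismatch(rows, "18800.50"))
    print(abn_formatting_inconsistent(["12 345 678 901", "12345678901"]))
    print(loss_outside_policy_period(date(2025, 7, 1), date(2024, 6, 30), date(2025, 6, 30)))
```

Using `Decimal` rather than floats avoids spurious mismatches from binary rounding when summing dollar amounts. A pipeline's flags can then be compared against the rows in red_flags_summary.csv.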
Bbox structure: per-row, not just per-document
Most synthetic libraries return one bounding box per document. The RCA Insurance Library returns a bbox for every labelled field in the document, plus a per-row bbox for every claim in claim_rows_json and every location in location_rows_json.
This gives a LayoutLMv3 or Donut fine-tune per-claim and per-location supervision rather than a single document-level label. It also lets a reviewer click any row in the structured ground truth and highlight the exact pixels on the rendered PDF.
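To feed per-row boxes into a LayoutLM-family model, coordinates are typically rescaled to the 0 to 1000 integer range those models expect. The sketch below assumes each row stores its box as `[x0, y0, x1, y1]` in page units with a top-left origin, and that page width and height are available. The `row_index`, `claim_number` and `bbox` keys are hypothetical, so check the shipped ground truth for the actual layout.

```python
def to_layoutlm_box(bbox, page_width, page_height):
    """Rescale an [x0, y0, x1, y1] box to the 0-1000 integer grid.

    Assumes the box and the page size share the same units and origin.
    """
    x0, y0, x1, y1 = bbox

    def scale(value, extent):
        # Clamp so boxes that touch the page edge stay inside 0..1000.
        return max(0, min(1000, int(1000 * value / extent)))

    return [
        scale(x0, page_width),
        scale(y0, page_height),
        scale(x1, page_width),
        scale(y1, page_height),
    ]


if __name__ == "__main__":
    # Hypothetical per-claim row on an A4 page measured in points (595 x 842).
    claim_row = {"row_index": 0, "claim_number": "CLM-0001", "bbox": [72.0, 310.0, 523.0, 328.0]}
    print(to_layoutlm_box(claim_row["bbox"], page_width=595, page_height=842))
```

The same function applies unchanged to per-location rows, since only the box and page size matter.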


Same shape on statements of values
Per-location rows from location_rows_json get the same treatment. Each address, occupancy, building value, contents value, stock value and BI value lands as its own bbox keyed by row index. A statement of values with five sites ships roughly 41 labelled-field bboxes.
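The "keyed by row index" idea can be sketched as a lookup from `(row_index, field)` to a bbox. The nested `bboxes` key and the field names below are assumptions chosen to mirror the six fields listed above, not a documented schema.

```python
import json

# Hypothetical field names mirroring the six per-location fields.
ROW_FIELDS = (
    "address",
    "occupancy",
    "building_value",
    "contents_value",
    "stock_value",
    "bi_value",
)


def index_row_bboxes(location_rows_json):
    """Build {(row_index, field): bbox} from a JSON array of location rows.

    Assumes each row holds its boxes under a 'bboxes' mapping of field name
    to [x0, y0, x1, y1]. Fields without a box are skipped.
    """
    rows = json.loads(location_rows_json)
    index = {}
    for row_index, row in enumerate(rows):
        for field in ROW_FIELDS:
            bbox = row.get("bboxes", {}).get(field)
            if bbox is not None:
                index[(row_index, field)] = bbox
    return index


if __name__ == "__main__":
    # Five synthetic sites, each with all six fields boxed.
    sample = json.dumps(
        [{"bboxes": {f: [10, 20 + 30 * i, 200, 40 + 30 * i] for f in ROW_FIELDS}} for i in range(5)]
    )
    print(len(index_row_bboxes(sample)))
```

With five sites and six fields per row, the row-level boxes alone number 30. The rest of the roughly 41 would then be document-level fields such as displayed totals, though that split is an inference rather than something stated here.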


Diversity controls
Each PDF is rendered with a deterministically chosen style profile, and each profile is modelled on a real underwriting-inbox archetype.
Each document type has three named template families that vary header, footer and section ordering without changing field labels or ground truth values. The chosen profile and family are recorded per row in the ground truth.
Pricing
| Tier | Scale | Price | Delivery |
|---|---|---|---|
| Free preview | 2 packs | Free | Same day on request |
| QA Sprint Pack | 10 packs + red flag summary + 30-min handover | AUD $2,500 | 48 to 72 hours |
| QA library | 25, 100, 500 packs | On request | Scoped per order |
| Bulk training library | 5,000+ packs | On request | Scoped per order |
| Custom variants | Your document types or red flag set | On request | Scoped per order |
Synthetic safety
Every PDF carries a visible synthetic disclaimer on every page. All broker names, insurer names, insured business names, ABNs, addresses, phone numbers, policy numbers, claim numbers and dollar values are computer-generated and do not refer to any real organisation, broker, insurer or claim.
Not for underwriting, claims handling, accounting, or regulatory use.
Try the free 2-pack preview
Two complete submission packs with ground truth, bboxes and scanned variants. The preview ships with a five-minute review path.