CLASSIFIED · KiwiQA RESEARCH & DEVELOPMENT

The way you test AI for hallucinations
is about to                    

Try Grounded AI already catches hallucinations in under 60 seconds. What's coming next will fundamentally change how teams ship AI.

WE CAN'T SAY MORE. YET.

10 detection layers live today · Built by KiwiQA · Something bigger is almost ready
ALREADY LIVE — IN PRODUCTION

10 layers of hallucination detection.
Shipping in production today.

LIVE · RESPONSE AUDIT
10-Layer Hallucination Detection
Consistency · Confidence · Model Agreement · Semantic Drift · Domain Rules · Custom Rules · RAG Citation Map · Doc Grounding · Structured Data Fidelity · Source Attribution. One GR score. 60 seconds.
LIVE · BATCH & CONVERSATION
Batch Audit & Conversation Analysis
Upload a CSV of up to 50 responses. Or analyse a full chatbot transcript — per-turn GR scores, persona drift detection, and cross-turn contradiction checks.
LIVE · RISK PROFILE
Risk Profile, PDF Reports & Model Changelog
GR trend charts, failure rates by layer, automated action plans, A4 PDF exports, and a Model Changelog that diffs any two models side by side.
WHAT'S COMING 1 MAY 2026 — PARTIALLY DECLASSIFIED

We're building the next layer.
Here's what we can reveal.

01 / CONFIRMED
GitHub Actions CI/CD Gate
Try Grounded AI runs as a step in your pipeline. Every pull request gets a GR score. Set hallucination thresholds as merge requirements. Ship only when AI is reliable.
02 / CONFIRMED
Golden Dataset Baselines
Lock in a verified answer set once. Every future model update or prompt change runs against it automatically. Know the exact moment your AI started drifting.
03 / PARTIALLY REVEALED
Async Batch & Webhook API
Submit 10,000+ responses. Results delivered to your webhook endpoint when ready. Zero blocking. Enterprise-scale continuous testing without SDK setup.
04 / CONFIRMED
Team Workspace & SSO
Shared test history, role-based access, and enterprise SSO. One GR score standard across your entire QA, engineering, and compliance org.
05 / PARTIALLY REVEALED
Compliance Evidence Package
Automatically generate audit-ready AI governance documentation. Timestamped GR reports structured for ISO 42001 and enterprise AI risk frameworks.
06 / [REDACTED]
████████ ██ ████ ████
Clearance level: restricted. This feature was not on our original roadmap. Our beta testers asked for it. We built it. You will understand why the moment you see it at launch.
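As a rough illustration of the CI/CD gate in item 01 above, the merge requirement can be thought of as a threshold check on the GR score. The sketch below is not Try Grounded AI's actual GitHub Action: the `merge_gate` function, the default floor of 76 (the bottom of the GR-4 "Reliable" band), and the exit-code convention are all assumptions made for illustration.

```python
def merge_gate(score: float, min_score: float = 76.0) -> int:
    """Return a process exit code: 0 allows the merge, 1 blocks it.

    Hypothetical helper. The default threshold is an assumption, chosen
    to match the floor of the GR-4 band.
    """
    return 0 if score >= min_score else 1


# Example: a pull request whose responses averaged 72.5 would be blocked.
exit_code = merge_gate(72.5)
print("blocked" if exit_code else "allowed")
```

In a pipeline step, a script like this would pass the result to `sys.exit`, since a non-zero exit code is what makes a CI step fail.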
COMING 1 MAY 2026 — FEATURE PREVIEW

Three ways to test your AI.
All in one platform.

Choose the test type that matches your workflow. Every test returns a GR score, layer-by-layer breakdown, and PDF report.

RESPONSE AUDIT — WITH CRM DATA VALIDATION
GR-2 HIGH RISK
CRM RECORD — SALESFORCE ACCOUNT
Acme Corp — Enterprise Plan
Contract Value: $148,000 / yr
Plan: Enterprise Pro
Renewal Date: 14 Aug 2025
Account Manager: Sarah Chen
Support Tier: Priority 24/7
AI RESPONSE UNDER TEST
"Acme Corp is on our Business plan at $120,000 per year. Their contract renews in November 2025 and their account manager is Sarah Chen. They have standard business hours support."
↑ 3 logical hallucinations detected against CRM record
STRUCTURED DATA FIDELITY — FIELD DIFF
✗ CONTRACT VALUE (MISMATCH): AI said "$120,000 / yr"; CRM record "$148,000 / yr"
✗ RENEWAL DATE (MISMATCH): AI said "November 2025"; CRM record "14 Aug 2025"
✗ SUPPORT TIER (MISMATCH): AI said "Standard Biz Hours"; CRM record "Priority 24/7"
✓ ACCOUNT MANAGER (MATCH): Sarah Chen, verified against CRM
GR SCORE: 52 · GR-2 HIGH RISK · 3 CRM fields incorrect
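The field diff above boils down to comparing each value the AI claimed against the matching field in the source record. Here is a minimal sketch of that comparison, assuming the claimed values have already been extracted from the response text (the hard part, not shown here). The `field_diff` function and the field keys are hypothetical and do not reflect Try Grounded AI's internals.

```python
def field_diff(record: dict[str, str], claimed: dict[str, str]) -> dict[str, bool]:
    """Map each claimed field to True (match) or False (mismatch).

    Hypothetical helper: compares values case-insensitively after trimming.
    """
    return {
        field: record.get(field, "").strip().lower() == value.strip().lower()
        for field, value in claimed.items()
    }


# Source-of-truth values from the CRM record in the example above.
crm_record = {
    "contract_value": "$148,000 / yr",
    "renewal_date": "14 Aug 2025",
    "support_tier": "Priority 24/7",
    "account_manager": "Sarah Chen",
}

# Values assumed to have been extracted from the AI response already.
claimed = {
    "contract_value": "$120,000 / yr",
    "renewal_date": "November 2025",
    "support_tier": "Standard Biz Hours",
    "account_manager": "Sarah Chen",
}

result = field_diff(crm_record, claimed)
mismatches = [field for field, ok in result.items() if not ok]

for field, ok in result.items():
    print("MATCH   " if ok else "MISMATCH", field)
```

Exact string comparison is the simplest possible rule. A real checker would likely need to normalise dates and currency before comparing.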
BATCH AUDIT — CSV UPLOAD · 6 RESPONSES
3 PASS 2 WARN 1 FAIL
#   QUESTION                                          GR SCORE  VERDICT  TOP FINDING
01  What is the SGC rate for 2024-25?                 GR-2 54   FAIL     Domain Rules: Incorrect SGC rate
02  Is paracetamol safe for children under 2?         GR-4 81   PASS     All checks passed
03  What are Australian unfair dismissal laws?        GR-3 67   WARN     Consistency: answer shifted on rephrase
04  What is OWASP SQL injection prevention?           GR-5 91   PASS     All checks passed
05  What is the Python GIL and does it affect async?  GR-4 79   PASS     All checks passed
06  What are Fair Work Act redundancy entitlements?   GR-3 63   WARN     Domain Rules: notice period incorrect
AVG SCORE: 72.5
AVG GR: GR-3 CONDITIONAL
EXPORT CSV → PDF AVAILABLE
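The summary figures for this batch can be reproduced from the six rows alone. The short sketch below tallies the verdicts and averages the scores. The tuple layout is an assumption made for illustration, not the format of the exported CSV.

```python
from collections import Counter
from statistics import mean

# (row number, score, verdict) copied from the batch table above.
rows = [
    ("01", 54, "FAIL"),
    ("02", 81, "PASS"),
    ("03", 67, "WARN"),
    ("04", 91, "PASS"),
    ("05", 79, "PASS"),
    ("06", 63, "WARN"),
]

# Count how many rows received each verdict.
verdicts = Counter(verdict for _, _, verdict in rows)

# Average the numeric scores across the whole batch.
avg_score = mean(score for _, score, _ in rows)

print(dict(verdicts))
print(avg_score)
```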
CONVERSATION ANALYSIS — 4-TURN CHATBOT TRANSCRIPT
AVG GR-3 · DRIFT DETECTED
USER · TURN 1
What are the income tax brackets in Australia for 2024-25?
AI · TURN 1
The 2024-25 tax-free threshold is $18,200. Income between $18,201–$45,000 is taxed at 16%, then 30% up to $135,000...
GR-4 · 78 Domain Rules ✓ · Consistency ✓
USER · TURN 2
Does this apply to Medicare levy too?
AI · TURN 2
Yes, the Medicare levy of 2% applies to taxable income above $26,000 for individuals. You may qualify for a reduction if your income is below $26,000...
GR-4 · 76 Grounding ✓ · Consistency ✓
USER · TURN 3
What about for self-employed people?
AI · TURN 3
Self-employed individuals pay both the 2% Medicare levy plus compulsory superannuation of 9.5% — though note that for employees this is paid by the employer...
GR-2 · 51 ⚠ SGC rate wrong: 9.5% → 11.5%
USER · TURN 4
So the super rate hasn't changed recently?
AI · TURN 4
That's correct — the superannuation rate has been steady at 9.5% for a while now and there are no current plans to increase it...
GR-1 · 38 ✗ CONTRADICTION: doubled down on wrong rate
CROSS-TURN ANALYSIS
⚠ CONTRADICTION DETECTED
AI stated 9.5% SGC in Turn 3, then confirmed the error as fact in Turn 4.
PERSONA DRIFT
Confidence decreased turn-by-turn as topics shifted from ATO to superannuation.
TURN SCORES
T-1: 78 · T-2: 76 · T-3: 51 · T-4: 38
OVERALL: GR-3
AVG SCORE: 60.75
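One simple way to picture the per-turn view is as a scan for sharp drops between consecutive turns. This is a sketch only: the `flag_drops` helper and the 15-point threshold are assumptions for illustration, and the real drift and contradiction checks are not described on this page.

```python
from statistics import mean

# Per-turn scores from the transcript above.
turn_scores = {"T-1": 78, "T-2": 76, "T-3": 51, "T-4": 38}

# Hypothetical threshold: flag a turn that falls this far below the previous one.
DROP_THRESHOLD = 15


def flag_drops(scores: dict[str, int], threshold: int) -> list[str]:
    """Return the turns whose score fell by at least `threshold` from the prior turn."""
    turns = list(scores.items())
    return [
        name
        for (_, previous), (name, current) in zip(turns, turns[1:])
        if previous - current >= threshold
    ]


flagged = flag_drops(turn_scores, DROP_THRESHOLD)
average = mean(turn_scores.values())

print(flagged, average)
```

With these numbers only T-3 is flagged (a 25-point fall from T-2). T-4 falls a further 13 points, which is under the assumed threshold.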
10 DETECTION LAYERS LIVE
4+ NEW FEATURES SHIPPING
█,███ ON THE EARLY ACCESS LIST
MAY 1 · GO-LIVE DATE · 2026
WHAT'S SHIPPED · WHAT'S NEXT

The roadmap, as much as we can share.

SHIPPED · Q1 2026
10-Layer Hallucination Detection · LIVE
Consistency, Doc Grounding, Confidence, Model Agreement, Semantic Drift, Domain Rules, Custom Rules, RAG Citation Map, Structured Data Fidelity, Source Attribution.
SHIPPED · Q1 2026
Batch Audit, Conversation Analysis & Model Changelog · LIVE
CSV batch runs up to 50 rows, full chatbot transcript per-turn scoring, cross-turn contradiction detection, and side-by-side model diff reports.
SHIPPED · Q1 2026
Risk Profile, PDF Reports & Enterprise Theme · LIVE
GR-1 through GR-5 trend dashboard, failure rates by layer, auto-generated action plans, A4 PDF exports, and a fully redesigned enterprise-grade UI.
LAUNCHING · 1 MAY 2026
GitHub Actions + Golden Dataset + Async Webhook API · SOON
CI/CD hallucination gates on every pull request, locked baseline datasets for regression detection, and enterprise-scale async batch testing with webhook delivery.
COMING · Q2/Q3 2026
Team Workspace, SSO & Compliance Package · NEXT
Multi-user workspaces with role-based access, enterprise SSO, and auto-generated AI governance documentation for ISO 42001 and SOC 2.
H2 2026
█████████ ██ ████ ████ · LATER
We'll announce this when it's ready. It's going to matter to every team shipping AI to customers.
WHO IS TRY GROUNDED AI FOR

Built for people who ship AI
and need to know it won't lie.

If your work involves building, testing, or approving AI-generated content — Try Grounded AI was made for you.

Test Analysts
Testing AI-powered features as part of a release cycle and need a repeatable, evidence-based pass/fail signal beyond manual spot-checks.
needs a GR score before every release
AI / ML Engineers
Building LLM pipelines, RAG systems, or fine-tuned models and need to validate outputs against ground truth before deploying to production.
runs Try Grounded AI as part of CI/CD
Product Managers at AI Companies
Responsible for the reliability of an AI feature and need a clear, defensible metric to report to stakeholders — not just vibes-based testing.
tracks GR trend across sprints
Compliance & Risk Officers
In regulated industries (healthcare, finance, legal, HR) where an AI hallucination is a regulatory event, not just a UX issue. Need audit-ready evidence.
exports PDF reports for governance files
Teams Building Chatbots & RAG Products
Shipping a customer-facing chatbot or retrieval-augmented product where every wrong answer erodes user trust. Need per-turn scoring and source attribution.
runs conversation analysis on every release
AI Auditors
Running AI quality assessments for clients and need a professional-grade, vendor-neutral tool that generates client-ready reports with verifiable findings.
white-labels reports for client deliverables
NOT THE RIGHT FIT
Try Grounded AI is not a general chatbot, a prompt engineering tool, or a model training platform. It is specifically a post-generation validation layer — it tests what an AI already said, not how to make it say something better.
THE GR RATING SYSTEM

Every AI response gets a score.
Five levels. One verdict.

Try Grounded AI runs up to 10 independent detection layers and returns a single GR-rated score — the same way a credit rating tells you exactly where you stand, at a glance.

GR-1
SCORE 0 – 41
Critical
Major hallucinations detected. Fabricated facts, invented citations, or dangerous misinformation. Do not ship.
REQUIRES IMMEDIATE ACTION
GR-2
SCORE 42 – 59
High Risk
Significant reliability issues. Multiple unverified claims or factual inconsistencies that could mislead users.
SIGNIFICANT REVISION NEEDED
GR-3
SCORE 60 – 75
Conditional
Use with caution. Some claims unverified or inconsistent. Human review recommended before customer-facing use.
REVIEW BEFORE USE
GR-4
SCORE 76 – 87
Reliable
Generally accurate and consistent. Minor caveats may apply. Suitable for most use-cases with standard oversight.
SAFE FOR MOST USE-CASES
GR-5
SCORE 88 – 100
Verified
All checks passed. High confidence across all 10 detection layers. Ship with confidence.
CLEARED FOR PRODUCTION
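The five bands above amount to a simple lookup from score to rating. The sketch below copies the ranges from the table. The `gr_rating` helper is hypothetical, and so is the handling of fractional scores (an average such as 41.5 stays in the lower band until it reaches the next floor), since the page only lists whole-number ranges.

```python
from typing import NamedTuple


class GRBand(NamedTuple):
    level: str
    label: str
    low: int
    high: int


# Ranges copied from the rating table above.
GR_BANDS = [
    GRBand("GR-1", "Critical", 0, 41),
    GRBand("GR-2", "High Risk", 42, 59),
    GRBand("GR-3", "Conditional", 60, 75),
    GRBand("GR-4", "Reliable", 76, 87),
    GRBand("GR-5", "Verified", 88, 100),
]


def gr_rating(score: float) -> GRBand:
    """Return the band for a score (hypothetical helper, not a published API)."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    # Walk from the highest band down and take the first whose floor is met.
    # Assumption: fractional scores belong to the band whose floor they reach.
    for band in reversed(GR_BANDS):
        if score >= band.low:
            return band
    raise AssertionError("unreachable: GR-1 starts at 0")


print(gr_rating(52).level, gr_rating(52).label)
```

Under these ranges, a score of 52 falls in GR-2 (High Risk) and an average of 72.5 falls in GR-3 (Conditional).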

What does your AI score?

GR score scale: 0 · 42 · 60 · 76 · 88 · 100
EARLY VALIDATION PARTNER

Building an AI product?
Let us test it for you.

sales at kiwiqa dot com

Share your AI product use-case with us. We'll review it and arrange early beta access to run Try Grounded AI against your AI — and deliver a full hallucination audit report.

PRODUCT TYPES WE CAN TEST
Chatbots & Virtual Assistants · RAG-Powered Products · Healthcare AI · Legal & Compliance AI · Finance & FinTech AI · HR & Workplace AI · AI Copilots & Code Assistants · Customer Support AI · Content Generation Tools · Security & IT AI

SUBJECT: TRY GROUNDED AI — EARLY VALIDATION PARTNER

What beta access includes
We'll run Try Grounded AI's full detection stack against your AI product and deliver a complete hallucination audit — no setup required on your end.
SELECTED VALIDATION PARTNERS ONLY — WE REVIEW EACH USE-CASE
INDUSTRIES WE CAN TEST
10-LAYER DETECTION · GR SCORE REPORT · PDF EVIDENCE PACK · DOMAIN RULES · SOURCE ATTRIBUTION · CUSTOM RULES · DEDICATED SUPPORT
CRAFTED BY

Try Grounded AI is built by the team at KiwiQA — helping engineering teams ship AI they can trust.

Be first to know.
Before we tell anyone else.

We're notifying our early access list the moment it launches on 1 May 2026. Get your name on it now.