CLASSIFIED · KiwiQA RESEARCH & DEVELOPMENT

The way you test AI for hallucinations
is about to

Try Grounded AI already catches hallucinations in under 60 seconds. What's coming next will fundamentally change how teams ship AI.

WE CAN'T SAY MORE. YET.


JOIN THE EARLY ACCESS LIST · NO SPAM · UNSUBSCRIBE ANYTIME


10 detection layers live today · Built by KiwiQA · Something bigger is almost ready

ALREADY LIVE — IN PRODUCTION

10 layers of hallucination detection.
Shipping in production today.

LIVE · RESPONSE AUDIT

10-Layer Hallucination Detection

Consistency · Confidence · Model Agreement · Semantic Drift · Domain Rules · Custom Rules · RAG Citation Map · Doc Grounding · Structured Data Fidelity · Source Attribution. One GR score. 60 seconds.

LIVE · BATCH & CONVERSATION

Batch Audit & Conversation Analysis

Upload a CSV of up to 50 responses. Or analyse a full chatbot transcript — per-turn GR scores, persona drift detection, and cross-turn contradiction checks.
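As a rough illustration of the 50-response limit, the sketch below checks a CSV before upload. The column names `question` and `response` and the helper `load_batch` are assumptions for illustration only; the actual upload format is not described here.

```python
import csv
import io

MAX_BATCH_ROWS = 50  # batch limit stated above


def load_batch(csv_text: str) -> list:
    """Parse CSV text into row dicts, rejecting batches over the row limit."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if len(rows) > MAX_BATCH_ROWS:
        raise ValueError(f"batch has {len(rows)} rows; the limit is {MAX_BATCH_ROWS}")
    return rows


# Hypothetical two-column layout: one question and one AI response per row.
sample = "question,response\nWhat is the Python GIL?,A lock in CPython.\n"
print(len(load_batch(sample)))  # 1
```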

LIVE · RISK PROFILE

Risk Profile, PDF Reports & Model Changelog

GR trend charts, failure rates by layer, automated action plans, A4 PDF exports, and a Model Changelog that diffs any two models side by side.

WHAT'S COMING 1 MAY 2026 — PARTIALLY DECLASSIFIED

We're building the next layer.
Here's what we can reveal.

01 / CONFIRMED

GitHub Actions CI/CD Gate

Try Grounded AI runs as a step in your pipeline. Every pull request gets a GR score. Set hallucination thresholds as merge requirements. Ship only when AI is reliable.
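The merge-gate idea can be sketched in a few lines. This is not the product's GitHub Action; the function `gate` and the threshold of 76 (the lower bound of GR-4) are assumptions chosen for illustration. A CI step fails when its process exits with a non-zero code, so a gate only has to turn scores into an exit code.

```python
def gate(scores: list, min_score: float) -> int:
    """Return a process exit code: 0 if every score meets the threshold, else 1."""
    failing = [s for s in scores if s < min_score]
    for s in failing:
        print(f"GR score {s} is below the merge threshold {min_score}")
    return 1 if failing else 0


# Hypothetical scores for three responses checked on a pull request.
exit_code = gate([81, 91, 54], min_score=76)
print(exit_code)  # 1
```

In a pipeline step, the returned value would be passed to `sys.exit` so that a failing score blocks the merge.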

02 / CONFIRMED

Golden Dataset Baselines

Lock in a verified answer set once. Every future model update or prompt change runs against it automatically. Know the exact moment your AI started drifting.
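One way to picture a baseline check is a comparison of stored scores against a fresh run. The sketch below is an assumption about how such a comparison could work, not a description of the product; `find_regressions`, the question IDs, the scores, and the 5-point tolerance are all hypothetical.

```python
def find_regressions(baseline: dict, current: dict, tolerance: float = 5.0) -> dict:
    """Return {question_id: (baseline_score, current_score)} for drops beyond tolerance."""
    return {
        qid: (base, current[qid])
        for qid, base in baseline.items()
        if qid in current and base - current[qid] > tolerance
    }


# Hypothetical scores: the locked baseline versus a run after a prompt change.
baseline = {"q1": 91, "q2": 81, "q3": 79}
current = {"q1": 90, "q2": 63, "q3": 79}
print(find_regressions(baseline, current))  # {'q2': (81, 63)}
```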

03 / PARTIALLY REVEALED

Async Batch & Webhook API

Submit 10,000+ responses. Results delivered to your webhook endpoint when ready. Zero blocking. Enterprise-scale continuous testing without SDK setup.
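On the receiving side, a webhook endpoint is an HTTP handler that accepts a POST and acknowledges it quickly. The sketch below uses only Python's standard library. The payload shape (`batch_id`, `results`, `id`, `score`) is entirely hypothetical, since the API is only partially revealed here.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize(payload: dict) -> str:
    """Build a one-line summary from a (hypothetical) batch-result payload."""
    results = payload["results"]
    worst = min(results, key=lambda r: r["score"])
    return (
        f"batch {payload['batch_id']}: {len(results)} results, "
        f"lowest score {worst['score']} (id {worst['id']})"
    )


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the sender declared.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        print(summarize(payload))
        # Acknowledge with 204 No Content; slow work belongs outside the request.
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Start the receiver; call this from your own entry point."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```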

04 / CONFIRMED

Team Workspace & SSO

Shared test history, role-based access, and enterprise SSO. One GR score standard across your entire QA, engineering, and compliance org.

05 / PARTIALLY REVEALED

Compliance Evidence Package

Automatically generate audit-ready AI governance documentation. Timestamped GR reports structured for ISO 42001 and enterprise AI risk frameworks.

06 / [REDACTED]

████████ ██ ████ ████

Clearance level: restricted. This feature was not on our original roadmap. Our beta testers asked for it. We built it. You will understand why the moment you see it at launch.

COMING 1 MAY 2026 — FEATURE PREVIEW

Three ways to test your AI.
All in one platform.

Choose the test type that matches your workflow. Every test returns a GR score, layer-by-layer breakdown, and PDF report.

RESPONSE AUDIT — WITH CRM DATA VALIDATION

GR-2 HIGH RISK

CRM RECORD — SALESFORCE ACCOUNT

Acme Corp — Enterprise Plan

Contract Value: $148,000 / yr
Plan: Enterprise Pro
Renewal Date: 14 Aug 2025
Account Manager: Sarah Chen
Support Tier: Priority 24/7

AI RESPONSE UNDER TEST

"Acme Corp is on our Business plan at $120,000 per year. Their contract renews in November 2025 and their account manager is Sarah Chen. They have standard business hours support."

↑ 3 logical hallucinations detected against CRM record

STRUCTURED DATA FIDELITY — FIELD DIFF

FIELD	AI SAID	CRM RECORD	RESULT
Contract Value	$120,000 / yr	$148,000 / yr	✗ MISMATCH
Renewal Date	November 2025	14 Aug 2025	✗ MISMATCH
Support Tier	Standard Biz Hours	Priority 24/7	✗ MISMATCH
Account Manager	Sarah Chen	Sarah Chen	✓ MATCH (verified against CRM)

GR SCORE: 52 · GR-2 HIGH RISK · 3 CRM fields incorrect
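The field diff above amounts to a key-by-key comparison between what the AI claimed and what the record holds. The sketch below assumes the claims have already been extracted from the response text into a dictionary, which is the hard part and is not shown. The field names and `field_diff` are hypothetical, and the example is limited to the four fields the diff displays.

```python
def field_diff(ai_claims: dict, crm_record: dict) -> dict:
    """Map each claimed field to True (match) or False (mismatch)."""
    return {
        field: claimed.strip().lower() == crm_record.get(field, "").strip().lower()
        for field, claimed in ai_claims.items()
    }


# Values copied from the CRM record and AI response shown above.
crm_record = {
    "contract_value": "$148,000 / yr",
    "renewal_date": "14 Aug 2025",
    "support_tier": "Priority 24/7",
    "account_manager": "Sarah Chen",
}
ai_claims = {
    "contract_value": "$120,000 / yr",
    "renewal_date": "November 2025",
    "support_tier": "Standard Biz Hours",
    "account_manager": "Sarah Chen",
}

result = field_diff(ai_claims, crm_record)
mismatches = [field for field, ok in result.items() if not ok]
print(mismatches)  # ['contract_value', 'renewal_date', 'support_tier']
```

Exact string comparison is the simplest possible rule. A production check would likely need normalisation for dates and currency formats.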

BATCH AUDIT — CSV UPLOAD · 6 RESPONSES

3 PASS · 2 WARN · 1 FAIL

#	QUESTION	GR	SCORE	VERDICT	TOP FINDING
01	What is the SGC rate for 2024-25?	GR-2	54	FAIL	Domain Rules: Incorrect SGC rate
02	Is paracetamol safe for children under 2?	GR-4	81	PASS	All checks passed
03	What are Australian unfair dismissal laws?	GR-3	67	WARN	Consistency: answer shifted on rephrase
04	What is OWASP SQL injection prevention?	GR-5	91	PASS	All checks passed
05	What is the Python GIL and does it affect async?	GR-4	79	PASS	All checks passed
06	What are Fair Work Act redundancy entitlements?	GR-3	63	WARN	Domain Rules: notice period incorrect

AVG SCORE: 72.5

AVG GR: GR-3 CONDITIONAL

EXPORT CSV → PDF AVAILABLE
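The batch summary figures can be reproduced from the six rows in the table. The sketch below assumes the average is a plain arithmetic mean and that the average GR rating is looked up from that mean. Both assumptions are consistent with the numbers shown but are not confirmed as the product's method.

```python
from collections import Counter
from statistics import mean

# (row number, score, verdict) copied from the six-row batch table above.
rows = [
    (1, 54, "FAIL"),
    (2, 81, "PASS"),
    (3, 67, "WARN"),
    (4, 91, "PASS"),
    (5, 79, "PASS"),
    (6, 63, "WARN"),
]

# Lower bound of each GR band, highest first (from the GR rating system).
GR_FLOORS = [(88, "GR-5"), (76, "GR-4"), (60, "GR-3"), (42, "GR-2"), (0, "GR-1")]


def gr_rating(score: float) -> str:
    """Return the first band whose lower bound the score reaches."""
    return next(name for floor, name in GR_FLOORS if score >= floor)


avg_score = mean(score for _, score, _ in rows)
verdicts = Counter(verdict for _, _, verdict in rows)

print(avg_score, gr_rating(avg_score))  # 72.5 GR-3
print(dict(verdicts))  # {'FAIL': 1, 'PASS': 3, 'WARN': 2}
```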

CONVERSATION ANALYSIS — 4-TURN CHATBOT TRANSCRIPT

AVG GR-3 · DRIFT DETECTED

USER · TURN 1

What are the income tax brackets in Australia for 2024-25?

AI · TURN 1

The 2024-25 tax-free threshold is $18,200. Income between $18,201–$45,000 is taxed at 19%, then 32.5% up to $120,000...

GR-4 · 78 · Domain Rules ✓ · Consistency ✓

USER · TURN 2

Does this apply to Medicare levy too?

AI · TURN 2

Yes, the Medicare levy of 2% applies to taxable income above $26,000 for individuals. You may qualify for a reduction if your income is below $26,000...

GR-4 · 76 · Grounding ✓ · Consistency ✓

USER · TURN 3

What about for self-employed people?

AI · TURN 3

Self-employed individuals pay both the 2% Medicare levy plus compulsory superannuation of 9.5% — though note that for employees this is paid by the employer...

GR-2 · 51 · ⚠ SGC rate wrong: 9.5% → 11.5%

USER · TURN 4

So the super rate hasn't changed recently?

AI · TURN 4

That's correct — the superannuation rate has been steady at 9.5% for a while now and there are no current plans to increase it...

GR-1 · 38 · ✗ CONTRADICTION: doubled down on wrong rate

CROSS-TURN ANALYSIS

⚠ CONTRADICTION DETECTED

AI stated 9.5% SGC in Turn 3, then confirmed the error as fact in Turn 4.

PERSONA DRIFT

Confidence decreased turn-by-turn as topics shifted from ATO to superannuation.

TURN SCORES: T-1 78 · T-2 76 · T-3 51 · T-4 38

OVERALL: GR-3 · AVG SCORE: 60.75
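The per-turn numbers above can be checked with simple arithmetic. The sketch below reproduces the average and finds the steepest turn-to-turn drop. How the product itself detects drift is not described here, so treating the largest drop as the drift signal is an assumption.

```python
from statistics import mean

# Per-turn scores from the four-turn transcript above.
turn_scores = [78, 76, 51, 38]

avg_score = mean(turn_scores)

# (turn number, points lost compared with the previous turn)
drops = [
    (turn, prev - cur)
    for turn, (prev, cur) in enumerate(zip(turn_scores, turn_scores[1:]), start=2)
]
steepest_turn, steepest_drop = max(drops, key=lambda d: d[1])

print(avg_score)  # 60.75
print(steepest_turn, steepest_drop)  # 3 25
```

The steepest drop lands on Turn 3, the turn where the incorrect 9.5% rate first appears.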

10 DETECTION LAYERS LIVE · 4+ NEW FEATURES SHIPPING · █,███ ON THE EARLY ACCESS LIST · GO-LIVE DATE 1 MAY 2026

WHAT'S SHIPPED · WHAT'S NEXT

The roadmap, as much as we can share.

✓ SHIPPED · Q1 2026

10-Layer Hallucination Detection · LIVE

Consistency, Doc Grounding, Confidence, Model Agreement, Semantic Drift, Domain Rules, Custom Rules, RAG Citation Map, Structured Data Fidelity, Source Attribution.

✓ SHIPPED · Q1 2026

Batch Audit, Conversation Analysis & Model Changelog · LIVE

CSV batch runs up to 50 rows, full chatbot transcript per-turn scoring, cross-turn contradiction detection, and side-by-side model diff reports.

✓ SHIPPED · Q1 2026

Risk Profile, PDF Reports & Enterprise Theme · LIVE

GR-1 through GR-5 trend dashboard, failure rates by layer, auto-generated action plans, A4 PDF exports, and a fully redesigned enterprise-grade UI.

→ LAUNCHING · 1 MAY 2026

GitHub Actions + Golden Dataset + Async Webhook API · SOON

CI/CD hallucination gates on every pull request, locked baseline datasets for regression detection, and enterprise-scale async batch testing with webhook delivery.

→ COMING · Q2/Q3 2026

Team Workspace, SSO & Compliance Package · NEXT

Multi-user workspaces with role-based access, enterprise SSO, and auto-generated AI governance documentation for ISO 42001 and SOC 2.

? H2 2026

█████████ ██ ████ ████ · LATER

We'll announce this when it's ready. It's going to matter to every team shipping AI to customers.

WHO IS TRY GROUNDED AI FOR

Built for people who ship AI
and need to know it won't lie.

If your work involves building, testing, or approving AI-generated content — Try Grounded AI was made for you.

Test Analysts

Testing AI-powered features as part of a release cycle and need a repeatable, evidence-based pass/fail signal beyond manual spot-checks.

needs a GR score before every release

AI / ML Engineers

Building LLM pipelines, RAG systems, or fine-tuned models and need to validate outputs against ground truth before deploying to production.

runs Try Grounded AI as part of CI/CD

Product Managers at AI Companies

Responsible for the reliability of an AI feature and need a clear, defensible metric to report to stakeholders — not just vibes-based testing.

tracks GR trend across sprints

Compliance & Risk Officers

In regulated industries (healthcare, finance, legal, HR) where an AI hallucination is a regulatory event, not just a UX issue. Need audit-ready evidence.

exports PDF reports for governance files

Teams Building Chatbots & RAG Products

Shipping a customer-facing chatbot or retrieval-augmented product where every wrong answer erodes user trust. Need per-turn scoring and source attribution.

runs conversation analysis on every release

AI Auditors

Running AI quality assessments for clients and need a professional-grade, vendor-neutral tool that generates client-ready reports with verifiable findings.

white-labels reports for client deliverables

NOT THE RIGHT FIT

Try Grounded AI is not a general chatbot, a prompt engineering tool, or a model training platform. It is specifically a post-generation validation layer — it tests what an AI already said, not how to make it say something better.

THE GR RATING SYSTEM

Every AI response gets a score.
Five levels. One verdict.

Try Grounded AI runs up to 10 independent detection layers and returns a single GR-rated score — the same way a credit rating tells you exactly where you stand, at a glance.

GR-1

SCORE 0 – 41

Critical

Major hallucinations detected. Fabricated facts, invented citations, or dangerous misinformation. Do not ship.

REQUIRES IMMEDIATE ACTION

GR-2

SCORE 42 – 59

High Risk

Significant reliability issues. Multiple unverified claims or factual inconsistencies that could mislead users.

SIGNIFICANT REVISION NEEDED

GR-3

SCORE 60 – 75

Conditional

Use with caution. Some claims unverified or inconsistent. Human review recommended before customer-facing use.

REVIEW BEFORE USE

GR-4

SCORE 76 – 87

Reliable

Generally accurate and consistent. Minor caveats may apply. Suitable for most use-cases with standard oversight.

SAFE FOR MOST USE-CASES

GR-5

SCORE 88 – 100

Verified

All checks passed. High confidence across all 10 detection layers. Ship with confidence.

CLEARED FOR PRODUCTION
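The five bands above translate directly into a lookup. The sketch below copies the boundaries from the rating table. The names `GRBand` and `gr_band` are illustrative, and the handling of fractional scores (such as an average of 60.75) is an assumption, since the table lists whole-number ranges only.

```python
from typing import NamedTuple


class GRBand(NamedTuple):
    rating: str
    label: str
    low: int   # inclusive lower bound of the band
    high: int  # inclusive upper bound of the band


# Band boundaries copied from the rating table above.
GR_BANDS = [
    GRBand("GR-1", "Critical", 0, 41),
    GRBand("GR-2", "High Risk", 42, 59),
    GRBand("GR-3", "Conditional", 60, 75),
    GRBand("GR-4", "Reliable", 76, 87),
    GRBand("GR-5", "Verified", 88, 100),
]


def gr_band(score: float) -> GRBand:
    """Return the band for a score between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    # Walk from the highest band down, so a fractional score stays in the
    # lower band until it reaches the next lower bound (assumption).
    for band in reversed(GR_BANDS):
        if score >= band.low:
            return band
    raise AssertionError("unreachable: GR-1 starts at 0")


band = gr_band(52)
print(band.rating, band.label)  # GR-2 High Risk
```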

What does your AI score?

Score scale: 0 · 42 · 60 · 76 · 88 · 100

EARLY VALIDATION PARTNER

Building an AI product?
Let us test it for you.

sales at kiwiqa dot com

Share your AI product use-case with us. We'll review it and arrange early beta access to run Try Grounded AI against your AI — and deliver a full hallucination audit report.

PRODUCT TYPES WE CAN TEST

Chatbots & Virtual Assistants · RAG-Powered Products · Healthcare AI · Legal & Compliance AI · Finance & FinTech AI · HR & Workplace AI · AI Copilots & Code Assistants · Customer Support AI · Content Generation Tools · Security & IT AI

SUBJECT: TRY GROUNDED AI — EARLY VALIDATION PARTNER

What beta access includes

We'll run Try Grounded AI's full detection stack against your AI product and deliver a complete hallucination audit — no setup required on your end.

SELECTED VALIDATION PARTNERS ONLY — WE REVIEW EACH USE-CASE

INCLUDED WITH BETA ACCESS

10-LAYER DETECTION · GR SCORE REPORT · PDF EVIDENCE PACK · DOMAIN RULES · SOURCE ATTRIBUTION · CUSTOM RULES · DEDICATED SUPPORT

CRAFTED BY

Try Grounded AI is built by the team at KiwiQA — helping engineering teams ship AI they can trust.

Be first to know.
Before we tell anyone else.

We're notifying our early access list the moment it launches on 1 May 2026. Get your name on it now.
