LLM Audit Proposal – UK Insurance Sector

Project Summary

This summary outlines the estimated time, team roles, and cost of a benchmark-driven audit of Large Language Models (LLMs) in the UK insurance sector. AI tooling accelerates key phases, including prompt generation, batch execution, and preliminary scoring, while expert oversight and FCA-aligned rigour are maintained throughout.

Time and Cost Breakdown

Total Estimated Time: 15–20 days (3–4 weeks)
Total Estimated Cost: £24,400 – £34,100

Optional Add-ons

GPT API tokens (3–5 runs × 60–80 prompts): £300 – £800
Retesting on updated models (e.g. GPT-4.5, Claude 3): £2,000 – £4,000
Custom audit dashboard (e.g. Power BI, Tableau): £1,500 – £3,000
Regulator-facing summary pack: £1,500 – £2,500
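
As a quick check on the add-on figures, the sketch below works out the number of API calls implied by "3–5 runs × 60–80 prompts" and the combined range if every add-on were taken. The amounts are copied from the list above; the variable and function names are illustrative only, and taking all four add-ons together is an assumption rather than something the proposal requires.

```python
# Figures copied from the Optional Add-ons list; each range is (low, high) in GBP.
ADD_ONS = {
    "GPT API tokens": (300, 800),
    "Retesting on updated models": (2000, 4000),
    "Custom audit dashboard": (1500, 3000),
    "Regulator-facing summary pack": (1500, 2500),
}

RUNS = (3, 5)        # runs per audit, as stated in the list
PROMPTS = (60, 80)   # prompts per run, as stated in the list


def api_call_range(runs, prompts):
    """Return the (low, high) number of API calls implied by runs x prompts."""
    return runs[0] * prompts[0], runs[1] * prompts[1]


def total_range(items):
    """Sum the low ends and the high ends of a mapping of (low, high) ranges."""
    low = sum(lo for lo, _ in items.values())
    high = sum(hi for _, hi in items.values())
    return low, high


calls = api_call_range(RUNS, PROMPTS)
add_on_total = total_range(ADD_ONS)
print(f"API calls: {calls[0]}-{calls[1]}")
print(f"All add-ons: £{add_on_total[0]:,} - £{add_on_total[1]:,}")
```

On these figures the token budget covers 180 to 400 API calls, and the four add-ons together would come to £5,300 – £10,300.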

Audit Phases and Costs

Phase	Key Tasks	Time (Days)	Roles Involved	Estimated Cost (£)
Scoping & Design	Define audit scope, KPIs, regulatory fit	3–5	Project Lead, Compliance Lead	£3,600 – £6,000
Query Development (GPT)	Generate audit queries by theme/type	1–2	Prompt Engineer + GPT	£1,600 – £2,400
Execution Design	Define GPT run protocol, test format	1	AI Engineer	£900
Query Execution (GPT API)	Run tests using GPT API	1–2	LLM Engineer + GPT	£800 – £1,600
Response Evaluation (GPT + Human)	GPT scoring + human review	3–4	Evaluator + GPT	£2,400 – £3,200
Expert Validation	Spot-check high-risk outputs	2–3	Insurance/Regulatory SME	£2,000 – £3,000
Reporting & Dashboards	Generate reports using GPT + review	1–2	Analyst + GPT	£1,200 – £2,000
Final QA & Delivery	Assemble full audit pack	1	Project Lead + Editor	£900
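
The phase rows above can be totalled with a few lines of Python. This is only an arithmetic sketch: the day and cost figures are copied from the table, single values such as "1 day" or "£900" are treated as a range with equal ends, and the names used are illustrative.

```python
# Each row: (phase, (min_days, max_days), (min_cost_gbp, max_cost_gbp)),
# copied from the Audit Phases and Costs table.
PHASES = [
    ("Scoping & Design", (3, 5), (3600, 6000)),
    ("Query Development (GPT)", (1, 2), (1600, 2400)),
    ("Execution Design", (1, 1), (900, 900)),
    ("Query Execution (GPT API)", (1, 2), (800, 1600)),
    ("Response Evaluation (GPT + Human)", (3, 4), (2400, 3200)),
    ("Expert Validation", (2, 3), (2000, 3000)),
    ("Reporting & Dashboards", (1, 2), (1200, 2000)),
    ("Final QA & Delivery", (1, 1), (900, 900)),
]


def sum_ranges(ranges):
    """Add up the low ends and the high ends of a list of (low, high) pairs."""
    return sum(lo for lo, _ in ranges), sum(hi for _, hi in ranges)


days = sum_ranges([d for _, d, _ in PHASES])
cost = sum_ranges([c for _, _, c in PHASES])
print(f"Phase days: {days[0]}-{days[1]}")
print(f"Phase cost: £{cost[0]:,} - £{cost[1]:,}")
```

Summed this way, the itemised phases come to 13–20 days and £13,400 – £20,000. That is lower than the headline cost of £24,400 – £34,100 given under Time and Cost Breakdown, and the difference is not itemised in this proposal, so it should be confirmed before the figures are relied on.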

Compliance and Transparency

This proposal assumes that GPT tooling is used alongside expert oversight to maintain transparency, objectivity, and regulatory alignment. The audit structure is designed to ensure:

FCA-aligned governance
Objective scoring methods
Transparent documentation
Reusability across LLMs and scenarios
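
One way the scoring and expert-validation steps could fit together is sketched below: preliminary per-criterion scores are combined into a weighted total, and any response that is marked high-risk or falls below a threshold is queued for SME review. Everything here is hypothetical and for illustration only. The criterion names, weights, threshold, model labels, and prompt IDs are placeholders, not part of the proposal, and the real rubric would be agreed during Scoping & Design.

```python
from dataclasses import dataclass

# Hypothetical rubric: criteria and weights are placeholders, not from the proposal.
RUBRIC_WEIGHTS = {"accuracy": 0.5, "regulatory_fit": 0.3, "clarity": 0.2}
REVIEW_THRESHOLD = 0.7  # assumed cut-off below which an expert reviews the output


@dataclass
class ScoredResponse:
    model: str        # label of the LLM under audit (placeholder values below)
    prompt_id: str    # identifier of the audit query
    scores: dict      # criterion -> preliminary score in [0, 1]
    high_risk: bool = False  # set when the query theme is deemed high-risk


def weighted_score(resp):
    """Combine per-criterion scores using the rubric weights."""
    return sum(RUBRIC_WEIGHTS[c] * resp.scores[c] for c in RUBRIC_WEIGHTS)


def needs_expert_review(resp):
    """Route high-risk or low-scoring responses to the SME spot-check."""
    return resp.high_risk or weighted_score(resp) < REVIEW_THRESHOLD


responses = [
    ScoredResponse("model-a", "Q001",
                   {"accuracy": 0.9, "regulatory_fit": 0.8, "clarity": 1.0}),
    ScoredResponse("model-a", "Q002",
                   {"accuracy": 0.5, "regulatory_fit": 0.5, "clarity": 0.5}),
    ScoredResponse("model-b", "Q001",
                   {"accuracy": 1.0, "regulatory_fit": 1.0, "clarity": 1.0},
                   high_risk=True),
]

review_queue = [f"{r.prompt_id}/{r.model}" for r in responses if needs_expert_review(r)]
print(review_queue)
```

Because the rubric and threshold are plain data, the same routine can be rerun against a different model or scenario without changing the scoring logic, which is the sense of reusability intended above.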