LLM Audit Proposal – UK Insurance Sector
Project Summary
This summary sets out the estimated time, team roles, and cost for a benchmark-driven audit of Large Language Models (LLMs) in the UK insurance sector. AI is used to accelerate key phases, including prompt generation, batch execution, and preliminary scoring, while expert oversight and FCA-aligned rigour are maintained throughout.
Time and Cost Breakdown
- Total Estimated Time: 15–20 days (3–4 weeks)
- Total Estimated Cost: £24,400 – £34,100
Optional Add-ons
- GPT API tokens (3–5 runs × 60–80 prompts): £300 – £800
- Retesting on updated models (e.g. GPT-4.5, Claude 3): £2,000 – £4,000
- Custom audit dashboard (e.g. Power BI, Tableau): £1,500 – £3,000
- Regulator-facing summary pack: £1,500 – £2,500
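As a rough sizing aid, the add-on figures above can be tallied with a short script. This is a minimal sketch: the ranges are copied from the list, and it assumes one API call per prompt per run, which the proposal does not state. The variable names are illustrative only.

```python
# Add-on cost ranges in GBP, copied from the Optional Add-ons list: (low, high).
ADD_ONS = {
    "GPT API tokens": (300, 800),
    "Retesting on updated models": (2000, 4000),
    "Custom audit dashboard": (1500, 3000),
    "Regulator-facing summary pack": (1500, 2500),
}

# Run and prompt counts from the API token line item.
RUNS = (3, 5)
PROMPTS_PER_RUN = (60, 80)

# Assumption: each prompt is sent once per run, so calls = runs x prompts.
api_calls = (RUNS[0] * PROMPTS_PER_RUN[0], RUNS[1] * PROMPTS_PER_RUN[1])

# Sum the low ends and the high ends separately to get the overall range.
add_on_total = (
    sum(low for low, _ in ADD_ONS.values()),
    sum(high for _, high in ADD_ONS.values()),
)

print(f"API calls: {api_calls[0]}-{api_calls[1]}")
print(f"All add-ons: £{add_on_total[0]:,} - £{add_on_total[1]:,}")
```

Under these assumptions the token budget covers roughly 180 to 400 API calls, and taking every add-on would add £5,300 to £10,300 to the engagement.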
Audit Phases and Costs
Phase | Key Tasks | Time (Days) | Roles Involved | Estimated Cost (£) |
---|---|---|---|---|
Scoping & Design | Define audit scope, KPIs, regulatory fit | 3–5 | Project Lead, Compliance Lead | £3,600 – £6,000 |
Query Development (GPT) | Generate audit queries by theme/type | 1–2 | Prompt Engineer + GPT | £1,600 – £2,400 |
Execution Design | Define GPT run protocol, test format | 1 | AI Engineer | £900 |
Query Execution (GPT API) | Run tests using GPT API | 1–2 | LLM Engineer + GPT | £800 – £1,600 |
Response Evaluation (GPT + Human) | GPT scoring + human review | 3–4 | Evaluator + GPT | £2,400 – £3,200 |
Expert Validation | Spot-check high-risk outputs | 2–3 | Insurance/Regulatory SME | £2,000 – £3,000 |
Reporting & Dashboards | Generate reports using GPT + review | 1–2 | Analyst + GPT | £1,200 – £2,000 |
Final QA & Delivery | Assemble full audit pack | 1 | Project Lead + Editor | £900 |
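
The phase rows can be cross-checked against the headline figures with a simple sum. The sketch below copies the day and cost ranges from the table and adds the low and high ends separately; the tuple layout and function name are illustrative choices, not part of the audit method.

```python
# Phase rows copied from the table: (phase, (min_days, max_days), (min_cost, max_cost)).
PHASES = [
    ("Scoping & Design", (3, 5), (3600, 6000)),
    ("Query Development (GPT)", (1, 2), (1600, 2400)),
    ("Execution Design", (1, 1), (900, 900)),
    ("Query Execution (GPT API)", (1, 2), (800, 1600)),
    ("Response Evaluation (GPT + Human)", (3, 4), (2400, 3200)),
    ("Expert Validation", (2, 3), (2000, 3000)),
    ("Reporting & Dashboards", (1, 2), (1200, 2000)),
    ("Final QA & Delivery", (1, 1), (900, 900)),
]


def total_range(phases, index):
    """Sum the low ends and the high ends of the range stored at `index`."""
    low = sum(phase[index][0] for phase in phases)
    high = sum(phase[index][1] for phase in phases)
    return low, high


days = total_range(PHASES, 1)
cost = total_range(PHASES, 2)

print(f"Days: {days[0]}-{days[1]}")
print(f"Cost: £{cost[0]:,} - £{cost[1]:,}")
```

Summed this way, the itemised phases come to 13–20 days and £13,400 – £20,000. The headline totals in the Time and Cost Breakdown are higher, so they presumably include items not itemised in this table; the sketch makes no assumption about what those are.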
Compliance and Transparency
This proposal assumes that GPT technologies are used efficiently alongside expert oversight, so that the audit remains transparent, objective, and aligned with regulation. The audit structure is designed to ensure:
- FCA-aligned governance
- Objective scoring methods
- Transparent documentation
- Reusability across LLMs and scenarios