Train Your Team to Spot AI-Generated Documents
Practical guide to training staff on visual cues for AI-generated document fraud: seven key indicators, a three-tier programme, and UK AML obligations for 2025.

Visual cues for AI-generated documents are structural, typographic and metadata anomalies that betray a synthetic origin: inconsistent fonts, artificially uniform guilloche patterns, the absence of scan artefacts, and backgrounds that are too clean. Recognising these signals requires deliberate training, because modern forgery tools reproduce official layouts convincingly while leaving detectable traces for a trained eye. This guide covers the seven essential cues and a structured method for teaching them to your team.
AI-Generated Document Fraud: A Rapidly Escalating Threat
AI-generated document fraud has grown at an unprecedented rate over the past two years, driven by the widespread availability of consumer-grade generative models. According to CheckFile's analysis of over 180,000 documents processed monthly, AI-generated documents now account for 12% of detected fraud attempts in 2025, up from just 3% in 2024.
This fourfold increase in a single year reflects a structural shift in the threat landscape. Producing a convincing forged payslip, P60, or bank statement no longer requires specialist skills. Freely available tools, some pre-trained on UK document templates, can generate a visually plausible forgery in under ten minutes. The document passes standard OCR checks because the data fields are internally consistent; it is the provenance that is fabricated.
For UK regulated firms, the implications extend beyond detection. The Financial Conduct Authority (FCA) expects firms to maintain systems and controls that are commensurate with evolving fraud risks. Regulation 24 of the Money Laundering, Terrorist Financing and Transfer of Funds (Information on the Payer) Regulations 2017 (as amended in 2022) requires that relevant employees are regularly given training in how to recognise and deal with activity that may be related to money laundering or terrorist financing, a requirement that on any reasonable interpretation now encompasses AI-generated document fraud.
For a deeper technical analysis of automated detection approaches, see our guide on deepfake document detection.
Seven Visual Cues to Teach Your Team
The seven cues below cover the most consistently observed signatures of AI-generated documents. Each can be identified with the naked eye or a digital zoom, with no specialist forensic equipment required.
| # | Visual Cue | What It Reveals | Frequency |
|---|---|---|---|
| 1 | Overly uniform or repetitive guilloche | AI model reproducing a pattern without random variation | Very common |
| 2 | Mixed fonts within a single field | Characters assembled from different training sources | Common |
| 3 | Physically impossible shadows or reflections | Artificial lighting inconsistent with document geometry | Common |
| 4 | Unnaturally sharp photo edges | Algorithmic cropping with no scan bleed | Common |
| 5 | Absent progressive pixellation at corners | No JPEG compression artefacts on a supposedly scanned document | Moderate |
| 6 | Perfectly regular text alignment | No micro-variation from printing tolerances | Moderate |
| 7 | Absent or anachronistic metadata | PDF creation date later than the document's stated date | Variable |
1. Overly Uniform Guilloche
Guilloche (the fine interlaced line patterns used as a security background on identity documents, payslips and bank statements) contains deliberate micro-variations in authentic documents. These variations arise from the intaglio printing process and are difficult to reproduce mathematically. An AI model generating a guilloche produces a pattern that is mathematically perfect and sometimes identical across multiple forgeries generated from the same template. Train your team to zoom to 400% on the background zone and look for exact tile repetitions.
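The "exact tile repetition" check can also be illustrated in code. The following is a minimal sketch, not a production detector: it assumes the background zone has already been extracted as a grid of grayscale values, and the function name `find_repeated_tiles` and the toy data are hypothetical. It only catches pixel-identical tiles aligned to the grid, which is plausible for a digitally generated image but would never occur in a genuine scan, where noise makes every tile unique.

```python
from collections import defaultdict


def find_repeated_tiles(pixels, tile_size):
    """Group non-overlapping tiles by exact pixel content.

    `pixels` is a list of rows of grayscale values (0-255).
    Returns a list of groups; each group holds the (top, left)
    origins of tiles whose pixels are identical.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    seen = defaultdict(list)
    for top in range(0, height - tile_size + 1, tile_size):
        for left in range(0, width - tile_size + 1, tile_size):
            tile = tuple(
                tuple(pixels[top + dy][left:left + tile_size])
                for dy in range(tile_size)
            )
            # Skip flat tiles (blank paper), which repeat legitimately.
            if len({value for row in tile for value in row}) > 1:
                seen[tile].append((top, left))
    return [origins for origins in seen.values() if len(origins) > 1]


# Toy 4x8 background: the left and right 4x4 halves are identical.
motif = [
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 10, 200, 200],
    [200, 200, 10, 10],
]
background = [row + row for row in motif]
print(find_repeated_tiles(background, tile_size=4))
```

A non-empty result means at least two patterned tiles are byte-for-byte identical, which is the same signal a reviewer looks for by eye at 400% zoom.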
2. Mixed Fonts
AI-generated documents frequently combine typographic fragments from multiple training sources. On a forged payslip, the employee's name may appear in a slightly condensed Arial variant while salary figures are set in standard Helvetica. The difference is imperceptible at normal zoom but becomes visible at 150%, particularly on accented characters and numerals. Authentic UK payslips and P60s use a consistent typeface throughout; any inconsistency warrants escalation.
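Where a document arrives as a text-based PDF, the same cue can be checked programmatically. The sketch below assumes a hypothetical extractor has already produced `(field_name, font_name)` pairs for each text span; the function name and sample data are illustrative only. It strips the six-letter subset prefix that PDF files add to embedded fonts (for example `ABCDEF+`), so the same typeface embedded twice is not reported as a mismatch.

```python
from collections import defaultdict


def fields_with_mixed_fonts(spans):
    """Return {field: sorted font names} for fields set in more than one font.

    `spans` is an iterable of (field_name, font_name) pairs, as a
    hypothetical PDF text extractor might report them.
    """
    fonts_by_field = defaultdict(set)
    for field, font in spans:
        # Embedded subset fonts carry a prefix such as "ABCDEF+".
        base_font = font.split("+", 1)[-1]
        fonts_by_field[field].add(base_font)
    return {
        field: sorted(fonts)
        for field, fonts in fonts_by_field.items()
        if len(fonts) > 1
    }


# Illustrative spans from a payslip under review.
spans = [
    ("employee_name", "ABCDEF+ArialNarrow"),
    ("employee_name", "Helvetica"),
    ("net_pay", "Helvetica"),
    ("net_pay", "GHIJKL+Helvetica"),
]
print(fields_with_mixed_fonts(spans))
```

Here only `employee_name` is reported, because `net_pay` uses a single typeface once the subset prefix is removed. A hit is a reason to escalate, not proof of forgery.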
3. Physically Impossible Shadows
Diffusion models generate locally coherent shadows that are globally inconsistent. A shadow cast by a company stamp may indicate a light source on the left, while a shadow on the photograph suggests the right. This cue is particularly effective on documents containing multiple graphical elements: logos, stamps, photographs and watermarks. Check that all shadows across the document originate from a single consistent light source.
4. Unnaturally Sharp Photo Edges
A photograph on a scanned document always carries compression artefacts and a slight bleed at its edges caused by the scanning process. An AI-generated identity photo is cropped algorithmically, producing a pixel-perfect boundary between the photo and its background. Look also for the absence of scanner grain in the zone immediately surrounding the photograph; on a genuine scanned document this zone should show the same grain as the rest of the page.
5. Absent Progressive Pixellation
Genuine PDF documents produced by scanning at 300 dpi contain the progressive pixellation left by JPEG compression, visible at the corners and in high-contrast zones when zoomed to 200%. A document generated directly as a PDF, rather than produced by scanning a physical original, displays uniform quality throughout with no such artefacts. This cue is particularly useful for bank statements and utility bills, which are almost always provided as scanned originals or official bank-generated PDFs with predictable compression signatures.
6. Perfectly Regular Text Alignment
Physical printing introduces micro-variations in line spacing and character alignment that are imperceptible at normal magnification but visible at 300% zoom. Rows of figures on an authentic payslip or P60 show fractional inter-line variation. An AI-generated document produces text with mathematically uniform spacing throughout. This cue is most useful when examining columns of figures such as National Insurance contribution tables and monthly salary breakdowns.
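The spacing check can be expressed as a simple statistic. The sketch below assumes the vertical baseline position of each text row has already been measured (for example, in pixels from the top of the page); the function names and the `tolerance` value are hypothetical and would need calibrating on known-genuine scans. Note that it applies only to documents presented as scans of printed pages: a legitimate born-digital PDF is also perfectly uniform.

```python
from statistics import pstdev


def line_gap_spread(baselines):
    """Population standard deviation of the gaps between successive baselines."""
    ordered = sorted(baselines)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three baselines")
    return pstdev(gaps)


def looks_machine_uniform(baselines, tolerance=0.01):
    """True when row spacing varies by no more than `tolerance` (assumed value)."""
    return line_gap_spread(baselines) <= tolerance


# Illustrative measurements for a five-row salary table.
generated_rows = [100.0, 114.0, 128.0, 142.0, 156.0]
scanned_rows = [100.0, 114.2, 127.9, 142.1, 155.8]
print(looks_machine_uniform(generated_rows))
print(looks_machine_uniform(scanned_rows))
```

The first set has identical gaps and is flagged; the second shows the fractional inter-line variation expected from physical printing and is not.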
7. Absent or Anachronistic Metadata
A document presented as a bank statement from September 2024 but whose PDF metadata records a creation date of February 2025 is a clear indicator of fabrication. Teach staff to open the file properties (right-click > Properties on Windows, Cmd+I on macOS) and the document properties in their PDF reader (Ctrl+D or Cmd+D in Adobe Acrobat), then verify the creation date, the authoring application and the PDF version number. The operating-system dates can simply reflect when the file was downloaded or copied, so the embedded PDF metadata is the more reliable source. Authentic bank-generated PDFs typically show the bank's own PDF generation software as the author; a document showing Adobe Photoshop or Canva as the authoring application warrants immediate escalation.
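The metadata comparison described above is easy to script once the PDF's information dictionary has been read. The following sketch assumes the metadata is already available as a plain dictionary with the standard `CreationDate`, `Creator` and `Producer` keys; the function names, the 45-day `grace_days` allowance and the sample version strings are assumptions for illustration. PDF date strings use the form `D:YYYYMMDDHHmmSS` followed by an optional timezone.

```python
import re
from datetime import date

PDF_DATE = re.compile(r"^D:(\d{4})(\d{2})?(\d{2})?")
SUSPECT_TOOLS = ("photoshop", "canva")  # the two editors named in the text


def parse_pdf_date(raw):
    """Extract the calendar date from a PDF date string.

    Month and day default to 1 when the string omits them.
    """
    match = PDF_DATE.match(raw.strip())
    if not match:
        raise ValueError(f"not a PDF date string: {raw!r}")
    year, month, day = (int(part) if part else 1 for part in match.groups())
    return date(year, month, day)


def metadata_flags(info, stated_period_end, grace_days=45):
    """List the metadata anomalies worth escalating (grace_days is assumed)."""
    flags = []
    raw_created = info.get("CreationDate")
    if not raw_created:
        flags.append("creation date missing")
    else:
        gap = (parse_pdf_date(raw_created) - stated_period_end).days
        if gap > grace_days:
            flags.append(f"created {gap} days after the stated period")
    for key in ("Creator", "Producer"):
        tool = info.get(key, "")
        if any(name in tool.lower() for name in SUSPECT_TOOLS):
            flags.append(f"{key} is {tool}")
    return flags


# Illustrative metadata for a statement covering September 2024.
info = {
    "CreationDate": "D:20250214093000Z",
    "Creator": "Adobe Photoshop 25.0",
    "Producer": "Adobe PDF library 17.0",
}
print(metadata_flags(info, stated_period_end=date(2024, 9, 30)))
```

The grace period exists because a genuine statement is always created some time after the period it covers; the right allowance depends on the document type and is a policy decision rather than something the code can settle.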
Building a Three-Tier Training Programme
An effective training programme must differentiate by role and responsibility level. Generic training delivered to all staff simultaneously produces limited retention and no meaningful improvement in detection rates.
Tier 1: Awareness (all staff). A two-hour session covering the seven visual cues, hands-on exercises using training documents, and an overview of regulatory obligations under the Money Laundering Regulations 2017. The objective is that every staff member can recognise obvious indicators and knows when to escalate. This tier should be completed during induction and refreshed annually.
Tier 2: Analytical depth (document reviewers and KYC analysts). A two-day programme covering lightweight forensic analysis (metadata examination, PDF structure inspection, font analysis at digital microscope zoom) alongside escalation procedures and case documentation requirements. These staff form the second line of defence and handle all documents escalated from automated screening.
Tier 3: Expert capability (compliance managers and financial crime officers). A five-day programme with internal certification, covering advanced forensic tools, interpretation of AI-generated analysis reports, complex case management and the process for updating the firm's internal cue library as new fraud techniques emerge. Tier 3 personnel are also responsible for designing and updating training content for Tiers 1 and 2.
Programme Implementation Checklist
- Build a training document library using anonymised fraud cases from industry groups or verification providers
- Define certification criteria and pass marks for each tier
- Schedule quarterly refresher sessions for Tier 2 to incorporate new fraud typologies
- Establish an internal escalation channel and case logging system
- Record training completion in the compliance register, since HMRC and FCA supervisory visits may request this evidence
- Assess staff through practical exercises on training documents, not multiple-choice tests alone
- Appoint a document fraud lead for each team to maintain awareness between formal sessions
- Review and update training materials when new AI generation techniques are identified
For complementary guidance on structuring anti-fraud controls across document processing workflows, see our article on anti-fraud best practices for document teams.
Regulatory Obligations Under UK AML Rules
UK firms have specific legal obligations that make staff training on AI document fraud a compliance requirement, not a discretionary investment. Understanding the applicable framework helps compliance teams build a defensible training programme.
The Money Laundering Regulations 2017 (as amended by the Money Laundering and Terrorist Financing (Amendment) Regulations 2022) impose a risk-based approach to customer due diligence under Regulation 28. Where a firm cannot verify the authenticity of a supporting document, it cannot complete CDD. Regulation 24 requires firms to ensure that relevant employees are regularly given training, a requirement that encompasses emerging fraud techniques, including AI-generated documents.
The FCA's Financial Crime Guide (FCG) provides detailed expectations for supervised firms. FCG 3.2 sets out the systems and controls that the FCA expects to see in place to counter document fraud, including staff training records. The FCA has used failures in document verification training as grounds for enforcement action in several cases between 2023 and 2025.
The Joint Money Laundering Steering Group (JMLSG) guidance, last substantively updated in 2024, explicitly addresses the growing risk of AI-generated documentation in its sector-specific guidance for retail banking, consumer credit and wealth management. Firms that follow JMLSG guidance as their primary compliance framework should review Part I, Section 5 on CDD and document verification in light of synthetic fraud.
The Information Commissioner's Office (ICO) has issued guidance on the use of AI tools in identity verification contexts, clarifying that the use of automated document analysis must be disclosed in privacy notices and that human review must be available where automated decisions have legal or significant effects. This directly affects the hybrid human-AI workflow design discussed below.
For UK firms with EU counterparties, the EU AI Act (Regulation EU 2024/1689) remains relevant: Article 50 imposes disclosure obligations on systems generating synthetic content, and Article 5 prohibits the deployment of AI systems specifically designed to produce fraudulent identity documents. Awareness of these provisions is relevant to the legal arguments available in fraud recovery actions.
Integrating Human Verification into Automated Workflows
Human verification does not replace automated tools; it completes them by handling the cases that algorithms cannot resolve without contextual judgement. Designing this handoff correctly is a core operational challenge.
The CheckFile platform processes documents in real time and generates a risk score alongside the specific indicators detected. Documents where the score exceeds the escalation threshold (configurable based on the firm's risk appetite and document type) are automatically routed to a human analyst with a detailed report listing the anomalies identified. This eliminates the need for analysts to re-examine every document from scratch.
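The routing rule in the previous paragraph can be sketched generically. The code below is not the CheckFile API: the class, the queue names, the 0-to-1 score scale and every threshold value are hypothetical, chosen only to show a per-document-type threshold with a default fallback.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds; a firm would set these from its own risk appetite.
DEFAULT_THRESHOLD = 0.70
THRESHOLDS = {"bank_statement": 0.60, "payslip": 0.65}


@dataclass
class ScreeningResult:
    document_id: str
    document_type: str
    risk_score: float  # assumed scale: 0.0 (clean) to 1.0 (almost certainly forged)
    indicators: list = field(default_factory=list)


def route(result):
    """Send a document to human review when its score exceeds the threshold."""
    threshold = THRESHOLDS.get(result.document_type, DEFAULT_THRESHOLD)
    if result.risk_score > threshold:
        return {
            "queue": "human_review",
            "document_id": result.document_id,
            "report": list(result.indicators),  # anomalies for the analyst
        }
    return {"queue": "no_escalation", "document_id": result.document_id}


flagged = ScreeningResult(
    "doc-1", "bank_statement", 0.62, ["creation date after stated period"]
)
print(route(flagged)["queue"])
```

The same score of 0.62 would not escalate a payslip under these sample values, which is the point of making the threshold depend on document type.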
Three operational principles govern effective human-automated integration:
Document contextualisation: A human analyst can cross-reference a document against other elements in the application file, such as consistency between the payslip and the stated bank account, or correspondence between the address on a utility bill and the declared address. Automated tools analyse each document in isolation; human review adds the cross-document layer that catches composite fraud.
Ambiguous case adjudication: Some documents present anomalies with innocent explanations, such as a low-quality scanner, a foreign document with an unfamiliar layout, or a legitimate employer using non-standard payslip software. The human analyst provides the contextual judgement needed to avoid costly false positives and the operational disruption they cause.
Feedback loop: Human analyst decisions feed back into continuous model improvement. Each case resolved by an expert enriches the training corpus and improves the precision of future automated alerts. This feedback mechanism is what prevents automated detection systems from becoming stale against evolving fraud techniques.
Our banking KYC solution integrates this hybrid workflow with real-time supervision dashboards designed for compliance teams. The security page details the data protection measures applied to processed documents, and pricing scales with monthly document volume.
Frequently Asked Questions
Can automated detection tools replace staff training?
No. Automated tools detect the majority of documents produced by currently known generative models, but they face two structural limitations. First, fraud generation models evolve continuously: a tool trained on 2024 fraud patterns will miss techniques that emerge in 2026. Second, false positives must be adjudicated by an analyst capable of applying contextual judgement. Staff training ensures that the second line of defence functions correctly when the first is insufficient or presents an ambiguous result.
How often should refresher training be scheduled?
Quarterly refresher sessions are the minimum recommended frequency for document reviewers (Tier 2), given the pace at which fraud techniques evolve. For Tier 1 staff, annual training is acceptable if supplemented by internal alerts whenever a new fraud typology is identified on the platform. Compliance managers (Tier 3) should monitor FCA publications, JMLSG updates and industry intelligence continuously rather than relying solely on scheduled sessions.
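The cadence above is simple enough to track in a spreadsheet, but a short sketch shows the logic. It assumes tiers are numbered 1 to 3 and that a refresher falls due a fixed number of months after the last completed session; the function names are hypothetical. Tier 3 returns no date because its monitoring is continuous rather than scheduled.

```python
import calendar
from datetime import date

# Months between refreshers: Tier 1 annual, Tier 2 quarterly.
REFRESH_MONTHS = {1: 12, 2: 3}


def add_months(start, months):
    """Add calendar months, clamping to the last day of a shorter month."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def next_refresher_due(tier, last_completed):
    """Return the next due date, or None for tiers without a fixed schedule."""
    months = REFRESH_MONTHS.get(tier)
    if months is None:
        return None
    return add_months(last_completed, months)


print(next_refresher_due(2, date(2025, 11, 30)))
print(next_refresher_due(1, date(2025, 5, 12)))
```

Clamping matters for month-end dates: a Tier 2 session completed on 30 November falls due on the last day of February rather than on a date that does not exist.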
What UK documents are most commonly targeted by AI-generated fraud?
Driving licences, P60s, payslips and bank statements account for the largest proportion of AI-generated fraud attempts detected across financial services, lettings and professional services. HMRC self-assessment returns are increasingly targeted because real-time verification by third parties is difficult. Utility bills, from suppliers including British Gas, EDF and Thames Water, are also frequently fabricated for use as proof of address, particularly in lettings and mortgage applications. The absence of a real supplier reference on a utility bill is one of the more reliable secondary indicators.
Does the UK's departure from the EU affect AMLD6 obligations?
UK firms are not directly subject to AMLD6 (EU Directive 2024/1640), which applies in EU member states. However, UK AML obligations under the Money Laundering Regulations 2017 (as amended) set comparable standards, and HM Treasury's approach to future AML reforms is expected to maintain alignment with EU frameworks where practical. UK firms with EU subsidiaries, EU counterparties or EU customer bases may be subject to AMLD6 requirements through those relationships and should take legal advice on their specific exposure.
How should training records be maintained to satisfy FCA supervision?
Training records should capture: the date of each training session, the name and role of each participant, the content covered, the assessment method used and the result, and the name of the trainer or training provider. These records should be retained for a minimum of five years, consistent with the general records retention requirement under Regulation 40 of the Money Laundering Regulations 2017. During FCA supervisory visits, firms are routinely asked to produce evidence that relevant employees have received and understood AML training, including on document fraud.
This article is provided for informational purposes only and does not constitute legal, financial or regulatory advice. Regulatory references are accurate as of the publication date (May 2026). Specific obligations vary depending on your firm's regulatory status, the nature of your business and the volume of documents processed. Consult a qualified professional (for example, a solicitor specialising in financial regulation or a compliance consultant authorised by a relevant professional body) for guidance specific to your situation. CheckFile accepts no liability for decisions taken on the basis of this article.
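The fields listed above map naturally onto a structured record. The sketch below is one possible shape, not a prescribed format: the class and field names are hypothetical, and counting the five-year period from the session date is an assumption that a firm should confirm against its own retention policy.

```python
from dataclasses import dataclass
from datetime import date

RETENTION_YEARS = 5  # minimum retention period stated in the text


@dataclass(frozen=True)
class TrainingRecord:
    session_date: date
    participant_name: str
    participant_role: str
    content_covered: str
    assessment_method: str
    assessment_result: str
    trainer: str

    def retain_until(self):
        """Earliest disposal date, counted from the session date (assumption)."""
        target_year = self.session_date.year + RETENTION_YEARS
        try:
            return self.session_date.replace(year=target_year)
        except ValueError:
            # 29 February has no counterpart in a non-leap year; use 1 March.
            return date(target_year, 3, 1)


record = TrainingRecord(
    session_date=date(2025, 3, 14),
    participant_name="A. Example",
    participant_role="KYC analyst",
    content_covered="Seven visual cues for AI-generated documents",
    assessment_method="Practical exercise on training documents",
    assessment_result="Pass",
    trainer="Internal document fraud lead",
)
print(record.retain_until())
```

Making the record immutable (`frozen=True`) is a small design choice that suits evidence a supervisor may later ask to see: corrections are made by issuing a new record rather than editing an old one.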