Document Forgery Detection API: Integration Guide 2026
Integrate a document forgery detection API into your verification workflows. OAuth 2.0 authentication, endpoints, webhooks, confidence scores, and FCA/UK GDPR compliance.

A document forgery detection API is a programmatic interface that automatically analyses submitted documents for signs of tampering, fabrication, metadata manipulation, or AI-generated content, returning a structured verdict and confidence score that your application can act upon in real time. As fraud techniques grow more sophisticated and regulators impose stricter documentation obligations on financial services, recruitment, and property sectors, integrating automated forgery detection directly into your verification workflow has shifted from a competitive advantage to a compliance necessity.
This guide covers everything a developer or compliance engineer needs to know to connect a document forgery detection API to an existing workflow: authentication, endpoint structure, response interpretation, webhook configuration, and the regulatory requirements governing deployment in the United Kingdom and European Union.
Why Integrate a Document Forgery Detection API
Integrating a document forgery detection API directly addresses the core weakness of human-led review: manual processes catch only a fraction of fraudulent submissions. According to the ACFE 2024 Report to the Nations, manual detection accounts for just 37% of fraud discovery across all sectors, meaning nearly two-thirds of document fraud either goes undetected or is caught through other, often costlier, means. Separately, PwC's 2025 Global Economic Crime and Fraud Survey found that 69% of businesses were affected by fraud in the preceding two years, with document manipulation identified as one of the most prevalent methods.
These figures reflect a structural problem: human reviewers experience fatigue, apply inconsistent standards, and cannot scale to match the volume of submissions that digital-first onboarding generates. An API-based approach applies the same detection logic to every document, every time, at machine speed.
Regulatory pressure compounds the urgency. The Financial Conduct Authority's Consumer Duty requires firms to take reasonable steps to prevent foreseeable harm, which regulators have increasingly interpreted to include robust identity and document verification controls. The UK Information Commissioner's Office makes clear under UK GDPR that personal data submitted during onboarding must be processed securely and proportionately. Meanwhile, the EU AI Act, which applies to systems deployed or used in the European Union, classifies document verification tools as high-risk AI systems under Annex III, triggering obligations around transparency, human oversight, and technical documentation.
For firms operating across both jurisdictions, or handling documents from EU-based customers, the compliance landscape is bilateral. A well-integrated API that produces auditable outputs can help satisfy both regimes more efficiently than a manual process.
Technical Architecture: How API Integration Works
A document forgery detection API operates over HTTPS using a REST architecture, with OAuth 2.0 client credentials flow as the standard authentication mechanism. Your backend exchanges a client ID and client secret for a bearer token, which is then included in the Authorization header of every subsequent request. Tokens are typically short-lived (60 minutes) to limit exposure in the event of interception.
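The token exchange described above can be sketched as follows. This is a minimal illustration, not the provider's documented client: the token URL is a placeholder, and sending the credentials via HTTP Basic authentication (one of the options RFC 6749 allows) is an assumption. The `TokenCache` class reuses a token until shortly before it expires, falling back to the 60-minute lifetime mentioned above when the response carries no `expires_in`.

```python
import base64
import json
import time
import urllib.parse
import urllib.request
from typing import Callable, Optional

# Hypothetical token endpoint: the guide does not state the real URL.
TOKEN_URL = "https://api.example.com/oauth/token"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build a client-credentials token request (RFC 6749, section 4.4)."""
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    # Assumption: credentials are sent with HTTP Basic auth rather than in the body.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def fetch_token(client_id: str, client_secret: str) -> dict:
    """Perform the network call and return the parsed JSON token response."""
    request = build_token_request(client_id, client_secret)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


class TokenCache:
    """Reuse a bearer token until shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], dict],
        clock: Callable[[], float] = time.monotonic,
        skew: float = 60.0,
    ) -> None:
        self._fetch = fetch
        self._clock = clock
        self._skew = skew  # refresh this many seconds before expiry
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        if self._token is None or self._clock() >= self._expires_at:
            payload = self._fetch()
            self._token = payload["access_token"]
            # Default to 3600 s (the 60-minute lifetime) if expires_in is missing.
            lifetime = float(payload.get("expires_in", 3600))
            self._expires_at = self._clock() + lifetime - self._skew
        return self._token

    def auth_header(self) -> dict:
        """Header to attach to every subsequent API request."""
        return {"Authorization": f"Bearer {self.get()}"}
```

In practice you would construct the cache once per process, for example `TokenCache(lambda: fetch_token(client_id, client_secret))`, and call `auth_header()` before each request.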
The integration offers two processing modes depending on your latency requirements:
Synchronous mode returns the analysis result within the HTTP response body, typically within 2 to 8 seconds. This suits low-volume workflows where a user is waiting for an inline result, for example a KYC onboarding screen that displays a verification status before proceeding.
Asynchronous mode accepts the document, returns a job_id immediately (HTTP 202 Accepted), and processes the document in the background. Once analysis is complete, the platform delivers the result to a pre-registered webhook endpoint on your server via an HTTP POST. This mode suits high-volume batch processing or workflows where a document may require deeper forensic analysis that exceeds synchronous timeout thresholds.
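Where a webhook is not available, or as a fallback if a callback is missed, the job can also be polled. The sketch below is a hedged illustration of polling with exponential backoff: `get_json` is a hypothetical helper you would supply to perform an authenticated GET and return parsed JSON, and the only status value taken from this guide is `"complete"`; anything else is treated as still in progress.

```python
import time
from typing import Callable


def wait_for_result(
    job_id: str,
    get_json: Callable[[str], dict],
    sleep: Callable[[float], None] = time.sleep,
    initial_delay: float = 1.0,
    max_delay: float = 30.0,
    max_attempts: int = 10,
) -> dict:
    """Poll GET /v1/documents/{job_id} until the job reports status "complete".

    get_json is a caller-supplied function (hypothetical) that takes a request
    path and returns the decoded JSON body.
    """
    delay = initial_delay
    for attempt in range(max_attempts):
        result = get_json(f"/v1/documents/{job_id}")
        if result.get("status") == "complete":
            return result
        if attempt < max_attempts - 1:
            sleep(delay)
            # Double the wait each time, capped at max_delay.
            delay = min(delay * 2, max_delay)
    raise TimeoutError(f"job {job_id} not complete after {max_attempts} attempts")
```

Injecting `get_json` and `sleep` keeps the loop testable without network access or real waiting.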
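Webhook payloads are signed using HMAC-SHA256. Your server should recompute the HMAC-SHA256 digest of the raw payload body using your webhook secret, compare it with the value in the X-CheckFile-Signature header using a constant-time comparison, and reject any payload where the two do not match. This blocks spoofed or tampered callbacks. Signature verification alone does not stop an attacker replaying a previously valid payload, so also handle deliveries idempotently, for example by ignoring a job_id you have already processed.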
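A minimal sketch of the signature check follows. It assumes the header carries a hex-encoded digest; this guide does not specify the encoding, so confirm it against the provider's reference before relying on this. The important details are hashing the raw bytes exactly as received (not a re-serialised JSON object) and comparing in constant time.

```python
import hashlib
import hmac


def sign(raw_body: bytes, secret: bytes) -> str:
    """Return the hex-encoded HMAC-SHA256 digest of the raw request body."""
    return hmac.new(secret, raw_body, hashlib.sha256).hexdigest()


def verify_webhook(raw_body: bytes, signature_header: str, secret: bytes) -> bool:
    """Check the X-CheckFile-Signature header value against the body.

    Assumption: the header value is a hex digest, compared case-insensitively.
    """
    if not signature_header:
        return False
    expected = sign(raw_body, secret).encode("ascii")
    received = signature_header.strip().lower().encode("utf-8")
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected, received)
```

A handler would call `verify_webhook(request_body_bytes, header_value, secret)` before parsing the JSON and respond with an error status when it returns `False`.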
For a broader overview of how document verification APIs fit into multi-stage verification pipelines, see our document verification API developer integration guide and the automation verification guide, which covers orchestration patterns for end-to-end workflows.
Core API Endpoints and Request Format
The primary endpoint for document submission is POST /v1/documents/analyze, which accepts a multipart form upload or a base64-encoded payload and returns a structured analysis object. The table below summarises the key endpoints:
| Endpoint | HTTP Method | Description | Typical Response Time |
|---|---|---|---|
| /v1/documents/analyze | POST | Submit document for forgery analysis | 2–8 s (sync) |
| /v1/documents/{job_id} | GET | Poll result for async job | Immediate |
| /v1/documents/{job_id}/report | GET | Retrieve full forensic report PDF | 1–3 s |
| /v1/webhooks | POST | Register or update webhook endpoint | Immediate |
| /v1/webhooks/{id} | DELETE | Remove a registered webhook | Immediate |
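As an illustration of the base64 submission option, the sketch below builds a request for POST /v1/documents/analyze. The base URL and the JSON field names (`filename`, `document`) are assumptions made for the example; the guide does not list the actual request schema.

```python
import base64
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host, not taken from the guide


def build_analyze_request(document: bytes, filename: str, token: str) -> urllib.request.Request:
    """Build a JSON request carrying the document as a base64 string."""
    payload = {
        "filename": filename,  # assumed field name
        "document": base64.b64encode(document).decode("ascii"),  # assumed field name
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/documents/analyze",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Base64 inflates the payload by roughly a third, so a multipart upload is usually the better choice for larger files.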
A typical synchronous response for a flagged document looks as follows:
{
  "job_id": "doc_8f3a2c1b9e47",
  "status": "complete",
  "document_type": "UK_PASSPORT",
  "jurisdiction": "GB",
  "confidence_score": 87,
  "risk_level": "critical",
  "signals": [
    {
      "signal_type": "metadata_inconsistency",
      "detail": "Document creation timestamp post-dates purported issue date by 14 months",
      "severity": "high"
    },
    {
      "signal_type": "font_anomaly",
      "detail": "Machine-readable zone uses non-standard character spacing inconsistent with ICAO 9303",
      "severity": "medium"
    }
  ],
  "ocr_language": "en",
  "processing_mode": "synchronous",
  "created_at": "2026-06-19T09:14:32Z"
}
The signals array itemises each detected anomaly with a severity rating, giving your review team a specific starting point for manual escalation rather than a binary pass/fail verdict. CheckFile's platform analyses over 3,200 document types across 32 jurisdictions, with OCR support for 24 languages, making it suitable for international onboarding workflows without separate regional configurations.
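To hand reviewers the most serious findings first, the response can be parsed into typed objects and the signals ordered by severity. This is a sketch based only on the fields shown in the sample response; the `"low"` severity value is an assumption, and unknown severities are sorted last.

```python
import json
from dataclasses import dataclass
from typing import List

# "high" and "medium" appear in the sample response; "low" is assumed.
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}


@dataclass(frozen=True)
class Signal:
    signal_type: str
    detail: str
    severity: str


@dataclass(frozen=True)
class AnalysisResult:
    job_id: str
    status: str
    confidence_score: int
    risk_level: str
    signals: List[Signal]


def parse_analysis(raw: str) -> AnalysisResult:
    """Parse a response body and order its signals from most to least severe."""
    data = json.loads(raw)
    signals = [
        Signal(s["signal_type"], s["detail"], s["severity"])
        for s in data.get("signals", [])
    ]
    # Unrecognised severities sort after every known one.
    signals.sort(key=lambda s: SEVERITY_ORDER.get(s.severity, len(SEVERITY_ORDER)))
    return AnalysisResult(
        job_id=data["job_id"],
        status=data["status"],
        confidence_score=data["confidence_score"],
        risk_level=data["risk_level"],
        signals=signals,
    )
```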
Interpreting Responses and Confidence Scores
A confidence score ranging from 0 to 100 indicates the platform's certainty that a detected anomaly represents genuine forgery rather than a benign artefact such as compression noise or scanning distortion. The higher the score, the stronger the evidence that the document has been manipulated. The risk_level field translates the raw score into an actionable category:
| Score Range | Risk Level | Recommended Action |
|---|---|---|
| 0–24 | Low | Accept; retain record for audit trail |
| 25–49 | Medium | Flag for secondary automated check or rule-based review |
| 50–74 | High | Route to manual review queue; request re-submission if inconclusive |
| 75–100 | Critical | Reject automatically; log for compliance reporting; consider SAR if applicable |
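The bands in the table can be encoded directly, which is useful for cross-checking the `risk_level` returned by the API or for routing. In this sketch the band boundaries come from the table, while the action labels are hypothetical shorthand for the recommended actions.

```python
from typing import Tuple

# (lowest score in band, risk level, shorthand action), highest band first.
# Action labels are illustrative names, not values defined by the API.
BANDS = [
    (75, "critical", "reject"),
    (50, "high", "manual_review"),
    (25, "medium", "secondary_check"),
    (0, "low", "accept"),
]


def classify(score: int) -> Tuple[str, str]:
    """Map a 0-100 confidence score to (risk_level, action) per the table."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for floor, level, action in BANDS:
        if score >= floor:
            return level, action
    raise AssertionError("unreachable: the lowest band starts at 0")
```

For example, `classify(45)` returns `("medium", "secondary_check")`, which your own policy may then override for higher-stakes document types.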
It is important to contextualise scores within your specific use case. A score of 45 on a scanned bank statement may warrant different handling than the same score on a passport submitted for a mortgage application, given the relative consequences of a false negative in each context. The platform's signals array provides the evidence base for that contextual judgement.
Your integration should not rely solely on risk_level for automated decisions. Storing the full response payload against the customer record satisfies audit requirements under FCA Consumer Duty and supports downstream Subject Access Requests under UK GDPR. For more on the underlying detection methodology, see our AI document fraud detection techniques article.
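One way to retain the full payload for audit purposes is to wrap it in a record that also captures when it was received and a digest of the exact bytes. The record layout and the `customer_id` field below are hypothetical; adapt them to your own schema.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional


def build_audit_record(
    customer_id: str,
    response_body: bytes,
    received_at: Optional[datetime] = None,
) -> dict:
    """Wrap the raw API response in a JSON-serialisable audit record."""
    if received_at is None:
        received_at = datetime.now(timezone.utc)
    payload = json.loads(response_body)
    return {
        "customer_id": customer_id,  # hypothetical key linking to your customer record
        "job_id": payload["job_id"],
        "received_at": received_at.isoformat(),
        # Digest of the bytes as received, so later changes to the stored copy are detectable.
        "payload_sha256": hashlib.sha256(response_body).hexdigest(),
        "payload": payload,
    }
```

Keeping the complete `payload`, rather than only `risk_level`, preserves the signals that justified the decision if it is later contested.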
Use Cases by Industry Sector
Document forgery detection APIs apply across any sector where identity or financial documents are submitted as part of an application, claim, or onboarding process. The table below maps common sectors to the documents they receive and the fraud signals the API is configured to detect:
| Sector | Document Type | Primary Detectable Signal |
|---|---|---|
| Retail lending | Bank statements, payslips | Fabricated transaction history, altered employer details |
| Insurance claims | Medical certificates, repair invoices | Metadata tampering, font substitution |
| KYC onboarding | Passports, driving licences, utility bills | MRZ anomalies, security feature absence, AI-generated faces |
| Rental applications | Employment letters, P60s | Letterhead cloning, date manipulation |
| Mortgage underwriting | Companies House extracts, audited accounts | Structural document forgery, altered financial figures |
| HR and recruitment | Degree certificates, professional licences | Hologram removal artefacts, registry mismatch |
For regulated firms, HMRC-issued documents such as SA302 tax calculations and P60 end-of-year certificates are common fraud vectors in mortgage and rental applications. CheckFile's detection layer cross-references document structure against known HMRC template specifications for those document types, flagging deviations that a human reviewer would typically miss without specialist training.
Our KYC banking solution is built on the same detection infrastructure, providing a worked example of how these endpoints integrate into a regulated onboarding context.
Compliance: FCA, UK GDPR, and Regulatory Requirements
Article 6 of the EU AI Act classifies document verification systems as high-risk AI, which means organisations deploying such tools within the EU or against EU data subjects must maintain technical documentation, implement human oversight mechanisms, and register the system in the EU database for high-risk AI before deployment. This obligation applies regardless of where the API provider is based.
Under UK GDPR Article 32, controllers must implement technical and organisational measures appropriate to the risk of processing. Automated document analysis involves special category-adjacent personal data and generates outputs that may directly affect individuals' access to financial products or services. Your integration must therefore document the legal basis for processing (typically legitimate interests or contractual necessity), define a maximum data retention period for submitted documents and analysis results, and ensure that stored payloads are encrypted at rest and in transit.
The FCA's Consumer Duty, in force since July 2023, requires firms to be able to demonstrate that their verification processes are proportionate and produce good outcomes for consumers. An automated forgery detection layer supports this by creating a consistent, auditable decision record, but firms must also configure escalation paths that allow for human review of contested or borderline results. A fully automated rejection with no appeals mechanism would likely conflict with the spirit of the Duty.
Data minimisation is a practical as well as a legal concern. The API accepts document images but your integration should not transmit higher-resolution files than the analysis requires, and should not store raw document images on your own servers beyond the retention period specified in your privacy notice. Where possible, store the job_id and reference the CheckFile audit report rather than duplicating raw personal data across systems. Further detail on our security architecture and data handling practices is available at CheckFile Security.
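A retention policy is only effective if something enforces it. The sketch below shows one way a scheduled purge job might select records whose retention period has lapsed; the `StoredDocument` shape and the retention length are assumptions that should come from your own privacy notice and storage layer.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class StoredDocument:
    job_id: str
    stored_at: datetime  # timezone-aware, UTC


def expired_job_ids(
    records: Iterable[StoredDocument],
    retention_days: int,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return the job_ids of records stored longer ago than the retention period."""
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    # A record stored exactly at the cutoff is still within its retention period.
    return [r.job_id for r in records if r.stored_at < cutoff]
```

The returned identifiers can then drive deletion of any locally held document images while the job_id reference is kept or removed according to your policy.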
To explore the full capability of the forgery detection platform including deepfake document and AI-generated identity detection, visit CheckFile's AI deepfake detection.
This article is for informational purposes only and does not constitute legal advice. Regulatory requirements vary by jurisdiction and sector. Organisations should seek independent legal and compliance counsel before implementing automated document verification systems in regulated contexts.
To see where this fits within the CheckFile offering, read about our AI and deepfake detection approach.
Frequently Asked Questions
What is a document forgery detection API?
A document forgery detection API is a programmatic service that accepts document files and returns a structured analysis indicating whether the document shows signs of manipulation, fabrication, or AI generation. It works by combining OCR extraction, metadata analysis, structural template comparison, and machine learning models trained on known forgery patterns, then expressing its findings as a confidence score, risk level, and itemised list of detected signals.
How does a document forgery detection API differ from standard OCR?
Standard OCR extracts text content from a document but makes no judgement about whether that content is genuine. A document forgery detection API goes further by comparing the extracted content and document structure against authoritative templates, checking metadata consistency, analysing font and typographic properties, and examining security features, returning a forensic verdict rather than simply a transcript of the document's text.
Is the API compliant with UK GDPR and the EU AI Act?
CheckFile's API infrastructure is designed to support compliance with both frameworks, including data residency options, encrypted transmission, configurable retention policies, and audit log exports. However, regulatory compliance is a shared responsibility: the controller (your organisation) must establish a lawful basis for processing, maintain appropriate records of processing activities, and configure the integration in accordance with your own data protection policies. The EU AI Act's high-risk classification for document verification systems also places obligations on the deploying organisation, not only the API provider.
How long does it take to integrate a document forgery detection API?
A basic integration using synchronous endpoints (covering authentication, document submission, and response handling) typically takes a backend developer two to four days to implement and test against the sandbox environment. Adding webhook handling for asynchronous workflows, configuring retry logic, and building a manual review queue for flagged documents extends this to approximately one to two weeks. Full production deployment including compliance documentation, staff training, and sign-off from your DPO typically takes four to six weeks depending on your organisation's change management processes.
What file formats does a document forgery detection API accept?
The CheckFile API accepts JPEG, PNG, PDF, TIFF, and HEIC file formats. PDF submissions may contain multiple pages; the API analyses each page individually and returns signals at the page level where relevant. Maximum file size per submission is 20 MB. For mobile capture workflows, HEIC files from iOS devices are accepted without pre-conversion. Where documents are submitted as scanned PDFs, the platform's OCR layer extracts text prior to structural analysis, with support for 24 languages covering the majority of documents encountered in UK and European onboarding contexts.
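Checking format and size before upload avoids a round trip for files the API would reject. This sketch makes two assumptions: the file extensions mapped to each accepted format, and that "20 MB" means 20,000,000 bytes, which is the more conservative reading. A production check would also inspect the file's actual content type rather than trusting its name.

```python
from pathlib import Path
from typing import List

# Assumed extension mapping for JPEG, PNG, PDF, TIFF and HEIC.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf", ".tif", ".tiff", ".heic"}
# Assumption: 20 MB is read as 20,000,000 bytes (decimal megabytes).
MAX_BYTES = 20_000_000


def validate_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes <= 0:
        problems.append("file is empty")
    elif size_bytes > MAX_BYTES:
        problems.append(f"file exceeds {MAX_BYTES} bytes")
    return problems
```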