Document Forgery Detection API: Integration Guide 2026

Integrate a document forgery detection API into your verification workflows. OAuth 2.0 authentication, endpoints, webhooks, confidence scores, and FCA/UK GDPR compliance.

CheckFile Team·June 19, 2026

A document forgery detection API is a programmatic interface that automatically analyses submitted documents for signs of tampering, fabrication, metadata manipulation, or AI-generated content — returning a structured verdict and confidence score that your application can act upon in real time. As fraud techniques grow more sophisticated and regulators impose stricter documentation obligations on financial services, recruitment, and property sectors, integrating automated forgery detection directly into your verification workflow has shifted from a competitive advantage to a compliance necessity.

This guide covers everything a developer or compliance engineer needs to know to connect a document forgery detection API to an existing workflow: authentication, endpoint structure, response interpretation, webhook configuration, and the regulatory requirements governing deployment in the United Kingdom and European Union.

Why Integrate a Document Forgery Detection API

Integrating a document forgery detection API directly addresses the core weakness of human-led review: manual processes catch only a fraction of fraudulent submissions. According to the ACFE 2024 Report to the Nations, manual detection accounts for just 37% of fraud discovery across all sectors, meaning nearly two-thirds of the fraud that is discovered comes to light through other, often costlier, means. Separately, PwC's 2025 Global Economic Crime and Fraud Survey found that 69% of businesses were affected by fraud in the preceding two years, with document manipulation identified as one of the most prevalent methods.

These figures reflect a structural problem: human reviewers experience fatigue, apply inconsistent standards, and cannot scale to match the volume of submissions that digital-first onboarding generates. An API-based approach applies the same detection logic to every document, every time, at machine speed.

Regulatory pressure compounds the urgency. The Financial Conduct Authority's Consumer Duty requires firms to take reasonable steps to prevent foreseeable harm, which regulators have increasingly interpreted to include robust identity and document verification controls. The UK Information Commissioner's Office makes clear under UK GDPR that personal data submitted during onboarding must be processed securely and proportionately. Meanwhile, the EU AI Act, which applies to systems deployed or used in the European Union, classifies document verification tools as high-risk AI systems under Annex III, triggering obligations around transparency, human oversight, and technical documentation.

For firms operating across both jurisdictions — or handling documents from EU-based customers — the compliance landscape is bilateral. A well-integrated API that produces auditable outputs satisfies both regimes more efficiently than any manual process.

Technical Architecture: How API Integration Works

A document forgery detection API operates over HTTPS using a REST architecture, with OAuth 2.0 client credentials flow as the standard authentication mechanism. Your backend exchanges a client ID and client secret for a bearer token, which is then included in the Authorization header of every subsequent request. Tokens are typically short-lived (60 minutes) to limit exposure in the event of interception.
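
The token exchange described above can be sketched as follows. This is a minimal illustration rather than CheckFile's documented client: the token URL is a placeholder, sending the credentials via HTTP Basic is an assumption (the guide does not say whether the secret travels in the header or the body), and the `expires_in` field is assumed to follow the standard OAuth 2.0 token response.

```python
import base64
import json
import time
import urllib.parse
import urllib.request

# Placeholder token endpoint; the guide does not give the real URL.
TOKEN_URL = "https://api.example.com/oauth/token"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build an OAuth 2.0 client-credentials token request."""
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode()
    # Assumption: the provider accepts the credentials as HTTP Basic auth.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def fetch_token_over_http(client_id: str, client_secret: str) -> dict:
    """Perform the network call and return the decoded token response."""
    with urllib.request.urlopen(build_token_request(client_id, client_secret)) as resp:
        return json.load(resp)


class TokenCache:
    """Reuses a bearer token and refreshes it shortly before it expires."""

    def __init__(self, fetch_token, skew_seconds: int = 60, clock=time.time):
        self._fetch_token = fetch_token  # callable returning the token response dict
        self._skew = skew_seconds
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def auth_header(self) -> dict:
        if self._token is None or self._clock() >= self._expires_at - self._skew:
            payload = self._fetch_token()
            self._token = payload["access_token"]
            # Fall back to a 60-minute lifetime if expires_in is absent.
            self._expires_at = self._clock() + payload.get("expires_in", 3600)
        return {"Authorization": f"Bearer {self._token}"}
```

In use, the cache would be constructed once per process, for example `TokenCache(lambda: fetch_token_over_http(client_id, client_secret))`, and `auth_header()` merged into the headers of every API call. Refreshing a minute early avoids sending a token that expires in flight.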

The integration offers two processing modes depending on your latency requirements:

Synchronous mode returns the analysis result within the HTTP response body, typically within 2–8 seconds. This suits low-volume workflows where a user is waiting for an inline result — for example, a KYC onboarding screen that displays a verification status before proceeding.

Asynchronous mode accepts the document, returns a job_id immediately (HTTP 202 Accepted), and processes the document in the background. Once analysis is complete, the platform delivers the result to a pre-registered webhook endpoint on your server via an HTTP POST. This mode suits high-volume batch processing or workflows where a document may require deeper forensic analysis that exceeds synchronous timeout thresholds.
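
Webhooks are the primary delivery path in asynchronous mode, but a polling fallback against GET /v1/documents/{job_id} is useful when a callback is delayed or lost. The sketch below shows one way to write that fallback. The `fetch_job` callable, the backoff values, and the "failed" status are assumptions for illustration; only "complete" appears in this guide.

```python
import time
from typing import Callable

# "complete" appears in the guide's sample response; "failed" is an assumed value.
TERMINAL_STATUSES = {"complete", "failed"}


def wait_for_result(
    job_id: str,
    fetch_job: Callable[[str], dict],
    max_attempts: int = 6,
    initial_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll a job until it reaches a terminal status or attempts run out.

    fetch_job is injected so the HTTP client stays out of this sketch; it
    should perform GET /v1/documents/{job_id} and return the decoded JSON.
    """
    delay = initial_delay
    for attempt in range(max_attempts):
        job = fetch_job(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if attempt < max_attempts - 1:
            sleep(delay)
            delay = min(delay * 2, 30.0)  # exponential backoff, capped at 30 s
    raise TimeoutError(f"job {job_id} still pending after {max_attempts} polls")
```

Injecting `fetch_job` and `sleep` keeps the loop easy to test and lets the same function run behind whichever HTTP library your backend already uses.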

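Webhook payloads are signed using HMAC-SHA256. Your server should recompute the HMAC-SHA256 digest of the raw payload body using your webhook secret, compare it with the X-CheckFile-Signature header in constant time, and reject any payload where the two do not match. This prevents spoofed or tampered callbacks. Signature verification alone does not stop a captured payload from being replayed, so your handler should also ignore deliveries for a job_id it has already processed.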
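
A minimal sketch of the signature check, using only the Python standard library. It assumes the X-CheckFile-Signature header carries a hex-encoded digest with no prefix; the guide does not specify the encoding, so confirm the exact format before relying on this.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, raw_body: bytes) -> str:
    """Return the hex-encoded HMAC-SHA256 digest of the raw request body."""
    return hmac.new(secret, raw_body, hashlib.sha256).hexdigest()


def is_valid_signature(secret: bytes, raw_body: bytes, header_value: str) -> bool:
    """Compare the expected digest with the header value in constant time.

    Assumption: the header holds a bare hex digest. Both sides are encoded
    to bytes so that non-ASCII header values fail cleanly instead of raising.
    """
    expected = sign_payload(secret, raw_body).encode("ascii")
    received = header_value.strip().lower().encode("utf-8")
    return hmac.compare_digest(expected, received)
```

The check must run against the bytes exactly as received, before any JSON parsing or re-serialisation, because even a whitespace change alters the digest.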

For a broader overview of how document verification APIs fit into multi-stage verification pipelines, see our document verification API developer integration guide and the automation verification guide, which covers orchestration patterns for end-to-end workflows.

Core API Endpoints and Request Format

The primary endpoint for document submission is POST /v1/documents/analyze, which accepts a multipart form upload or a base64-encoded payload and returns a structured analysis object. The table below summarises the key endpoints:

Endpoint	HTTP Method	Description	Typical Response Time
`/v1/documents/analyze`	POST	Submit document for forgery analysis	2–8 s (sync)
`/v1/documents/{job_id}`	GET	Poll result for async job	Immediate
`/v1/documents/{job_id}/report`	GET	Retrieve full forensic report PDF	1–3 s
`/v1/webhooks`	POST	Register or update webhook endpoint	Immediate
`/v1/webhooks/{id}`	DELETE	Remove a registered webhook	Immediate
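
For the base64 submission path, a request to POST /v1/documents/analyze might be assembled as below. The host is a placeholder and the JSON field names (`filename`, `content_base64`) are hypothetical, since this guide does not document the request schema; check the API reference for the real names.

```python
import base64
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host, not the real API domain


def build_analyze_request(
    file_bytes: bytes, filename: str, bearer_token: str
) -> urllib.request.Request:
    """Build a base64 JSON submission for the analyze endpoint."""
    payload = {
        "filename": filename,  # hypothetical field name
        "content_base64": base64.b64encode(file_bytes).decode("ascii"),  # hypothetical field name
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/documents/analyze",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
    )
```

Base64 inflates the payload by roughly a third, so multipart upload is usually the better choice for large scans.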

A typical synchronous response for a flagged document looks as follows:

{
  "job_id": "doc_8f3a2c1b9e47",
  "status": "complete",
  "document_type": "UK_PASSPORT",
  "jurisdiction": "GB",
  "confidence_score": 87,
  "risk_level": "critical",
  "signals": [
    {
      "signal_type": "metadata_inconsistency",
      "detail": "Document creation timestamp post-dates purported issue date by 14 months",
      "severity": "high"
    },
    {
      "signal_type": "font_anomaly",
      "detail": "Machine-readable zone uses non-standard character spacing inconsistent with ICAO 9303",
      "severity": "medium"
    }
  ],
  "ocr_language": "en",
  "processing_mode": "synchronous",
  "created_at": "2026-06-19T09:14:32Z"
}

The signals array itemises each detected anomaly with a severity rating, giving your review team a specific starting point for manual escalation rather than a binary pass/fail verdict. CheckFile's platform analyses over 3,200 document types across 32 jurisdictions, with OCR support for 24 languages, making it suitable for international onboarding workflows without separate regional configurations.
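
One way to consume this response is to parse it into typed records and order the signals so that reviewers see the most severe anomaly first. The sketch below uses only the field names shown in the sample response; the "low" severity value and the ordering rule are assumptions.

```python
import json
from dataclasses import dataclass

# "high" and "medium" appear in the sample response; "low" is an assumed value.
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}


@dataclass(frozen=True)
class Signal:
    signal_type: str
    detail: str
    severity: str


@dataclass(frozen=True)
class AnalysisResult:
    job_id: str
    status: str
    confidence_score: int
    risk_level: str
    signals: tuple


def parse_analysis(raw_json: str) -> AnalysisResult:
    """Parse an analysis response and sort its signals by severity."""
    data = json.loads(raw_json)
    signals = [
        Signal(s["signal_type"], s["detail"], s["severity"])
        for s in data.get("signals", [])
    ]
    # Unknown severities sort last rather than raising.
    signals.sort(key=lambda s: SEVERITY_ORDER.get(s.severity, len(SEVERITY_ORDER)))
    return AnalysisResult(
        job_id=data["job_id"],
        status=data["status"],
        confidence_score=int(data["confidence_score"]),
        risk_level=data["risk_level"],
        signals=tuple(signals),
    )
```

A review queue can then display `result.signals[0]` as the headline reason for escalation, with the remaining signals as supporting evidence.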

Interpreting Responses and Confidence Scores

A confidence score ranging from 0 to 100 indicates the platform's certainty that a detected anomaly represents genuine forgery rather than a benign artefact such as compression noise or scanning distortion. The higher the score, the stronger the evidence that the document has been manipulated. The risk_level field translates the raw score into an actionable category:

Score Range	Risk Level	Recommended Action
0–24	Low	Accept; retain record for audit trail
25–49	Medium	Flag for secondary automated check or rule-based review
50–74	High	Route to manual review queue; request re-submission if inconclusive
75–100	Critical	Reject automatically; log for compliance reporting; consider a Suspicious Activity Report (SAR) if applicable

It is important to contextualise scores within your specific use case. A score of 45 on a scanned bank statement may warrant different handling than the same score on a passport submitted for mortgage application, given the relative consequences of a false negative in each context. The platform's signals array provides the evidence base for that contextual judgement.
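
The bands in the table above, together with a per-document-type adjustment, can be sketched as a small routing function. The route names, the `MANUAL_REVIEW_FLOOR` override, and the `BANK_STATEMENT` identifier in the usage note are hypothetical; only the score bands and the `UK_PASSPORT` document type come from this guide.

```python
def risk_level_for_score(score: int) -> str:
    """Map a 0-100 confidence score to the risk bands in the table above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score >= 75:
        return "critical"
    if score >= 50:
        return "high"
    if score >= 25:
        return "medium"
    return "low"


# Hypothetical route names mirroring the recommended actions.
DEFAULT_ROUTES = {
    "low": "accept",
    "medium": "secondary_check",
    "high": "manual_review",
    "critical": "reject",
}

# Hypothetical override: scores at or above this floor go to a human for
# document types where a false negative is especially costly.
MANUAL_REVIEW_FLOOR = {"UK_PASSPORT": 25}


def route(score: int, document_type: str) -> str:
    """Choose a handling route from the score and the document type."""
    level = risk_level_for_score(score)
    floor = MANUAL_REVIEW_FLOOR.get(document_type)
    if floor is not None and score >= floor and level in ("low", "medium"):
        return "manual_review"
    return DEFAULT_ROUTES[level]
```

With these settings a score of 45 sends a passport to manual review, while the same score on a document type without an override, such as a hypothetical `BANK_STATEMENT`, goes to the secondary automated check.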

Your integration should not rely solely on risk_level for automated decisions; the score and the signals array carry the detail needed to justify an outcome. Storing the full response payload against the customer record helps meet audit expectations under FCA Consumer Duty and supports responses to downstream Subject Access Requests under UK GDPR. For more on the underlying detection methodology, see our AI document fraud detection techniques article.

Use Cases by Industry Sector

Document forgery detection APIs apply across any sector where identity or financial documents are submitted as part of an application, claim, or onboarding process. The table below maps common sectors to the documents they receive and the fraud signals the API is configured to detect:

Sector	Document Type	Primary Detectable Signal
Retail lending	Bank statements, payslips	Fabricated transaction history, altered employer details
Insurance claims	Medical certificates, repair invoices	Metadata tampering, font substitution
KYC onboarding	Passports, driving licences, utility bills	MRZ anomalies, security feature absence, AI-generated faces
Rental applications	Employment letters, P60s	Letterhead cloning, date manipulation
Mortgage underwriting	Companies House extracts, audited accounts	Structural document forgery, altered financial figures
HR and recruitment	Degree certificates, professional licences	Hologram removal artefacts, registry mismatch

For regulated firms, HMRC-related documents such as SA302 tax calculations and P60 end-of-year certificates are common fraud vectors in mortgage and rental applications. CheckFile's detection layer cross-references document structure against known HMRC template specifications for those document types, flagging deviations that a human reviewer would typically miss without specialist training.

Our KYC banking solution is built on the same detection infrastructure, providing a worked example of how these endpoints integrate into a regulated onboarding context.

Compliance: FCA, UK GDPR, and Regulatory Requirements

Article 6 of the EU AI Act, read with Annex III, classifies document verification systems as high-risk AI, which means organisations deploying such tools within the EU or against EU data subjects must maintain technical documentation, implement human oversight mechanisms, and register the system in the EU database for high-risk AI before deployment. This obligation applies regardless of where the API provider is based.

Under UK GDPR Article 32, controllers must implement technical and organisational measures appropriate to the risk of processing. Automated document analysis involves special category-adjacent personal data and generates outputs that may directly affect individuals' access to financial products or services. Your integration must therefore document the legal basis for processing (typically legitimate interests or contractual necessity), define a maximum data retention period for submitted documents and analysis results, and ensure that stored payloads are encrypted at rest and in transit.

The FCA's Consumer Duty, in force since July 2023, requires firms to be able to demonstrate that their verification processes are proportionate and produce good outcomes for consumers. An automated forgery detection layer supports this by creating a consistent, auditable decision record — but firms must also configure escalation paths that allow for human review of contested or borderline results. A fully automated rejection with no appeals mechanism would likely conflict with the spirit of the Duty.

Data minimisation is a practical as well as a legal concern. The API accepts document images but your integration should not transmit higher-resolution files than the analysis requires, and should not store raw document images on your own servers beyond the retention period specified in your privacy notice. Where possible, store the job_id and reference the CheckFile audit report rather than duplicating raw personal data across systems. Further detail on our security architecture and data handling practices is available at CheckFile Security.
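
The "store the job_id, not the image" approach can be sketched as a small audit record that carries its own deletion date. The record shape, the `customer_id` field, and the retention period are assumptions for illustration; the retention figure in particular must come from your own privacy notice.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AuditRecord:
    """Reference to an analysis result; holds no raw document image."""

    customer_id: str
    job_id: str
    risk_level: str
    confidence_score: int
    analysed_at: datetime
    retention: timedelta

    @property
    def delete_after(self) -> datetime:
        return self.analysed_at + self.retention

    def is_expired(self, now: datetime) -> bool:
        return now >= self.delete_after


def record_from_response(customer_id: str, response: dict, retention_days: int) -> AuditRecord:
    """Build an audit record from an analysis response."""
    # The sample response uses a trailing "Z"; normalise it for older Pythons.
    analysed_at = datetime.fromisoformat(response["created_at"].replace("Z", "+00:00"))
    return AuditRecord(
        customer_id=customer_id,
        job_id=response["job_id"],
        risk_level=response["risk_level"],
        confidence_score=int(response["confidence_score"]),
        analysed_at=analysed_at.astimezone(timezone.utc),
        retention=timedelta(days=retention_days),
    )
```

A scheduled job can then purge every record for which `is_expired(datetime.now(timezone.utc))` is true, keeping the retention rule in one place.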

To explore the full capability of the forgery detection platform including deepfake document and AI-generated identity detection, visit CheckFile's AI deepfake detection.

This article is for informational purposes only and does not constitute legal advice. Regulatory requirements vary by jurisdiction and sector. Organisations should seek independent legal and compliance counsel before implementing automated document verification systems in regulated contexts.

Frequently Asked Questions

What is a document forgery detection API

A document forgery detection API is a programmatic service that accepts document files and returns a structured analysis indicating whether the document shows signs of manipulation, fabrication, or AI generation. It works by combining OCR extraction, metadata analysis, structural template comparison, and machine learning models trained on known forgery patterns, then expressing its findings as a confidence score, risk level, and itemised list of detected signals.

How does a document forgery detection API differ from standard OCR

Standard OCR extracts text content from a document but makes no judgement about whether that content is genuine. A document forgery detection API goes further by comparing the extracted content and document structure against authoritative templates, checking metadata consistency, analysing font and typographic properties, and examining security features — returning a forensic verdict rather than simply a transcript of the document's text.

Is the API compliant with UK GDPR and the EU AI Act

CheckFile's API infrastructure is designed to support compliance with both frameworks, including data residency options, encrypted transmission, configurable retention policies, and audit log exports. However, regulatory compliance is a shared responsibility: the controller (your organisation) must establish a lawful basis for processing, maintain appropriate records of processing activities, and configure the integration in accordance with your own data protection policies. The EU AI Act's high-risk classification for document verification systems also places obligations on the deploying organisation, not only the API provider.

How long does it take to integrate a document forgery detection API

A basic integration using synchronous endpoints — covering authentication, document submission, and response handling — typically takes a backend developer two to four days to implement and test against the sandbox environment. Adding webhook handling for asynchronous workflows, configuring retry logic, and building a manual review queue for flagged documents extends this to approximately one to two weeks. Full production deployment including compliance documentation, staff training, and sign-off from your DPO typically takes four to six weeks depending on your organisation's change management processes.

What file formats does a document forgery detection API accept

The CheckFile API accepts JPEG, PNG, PDF, TIFF, and HEIC file formats. PDF submissions may contain multiple pages; the API analyses each page individually and returns signals at the page level where relevant. Maximum file size per submission is 20 MB. For mobile capture workflows, HEIC files from iOS devices are accepted without pre-conversion. Where documents are submitted as scanned PDFs, the platform's OCR layer extracts text prior to structural analysis, with support for 24 languages covering the majority of documents encountered in UK and European onboarding contexts.
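
Checking format and size before upload avoids a wasted round trip. The sketch below validates by file extension only and treats 20 MB as 20 × 1024 × 1024 bytes; both are assumptions, since the guide does not say how the API identifies formats or counts megabytes.

```python
from pathlib import PurePath

# Assumption: 20 MB is interpreted as binary megabytes.
MAX_BYTES = 20 * 1024 * 1024

# Common extensions for the five accepted formats (JPEG, PNG, PDF, TIFF, HEIC).
ACCEPTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf", ".tif", ".tiff", ".heic"}


def validate_upload(filename: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the file may be sent."""
    problems = []
    ext = PurePath(filename).suffix.lower()
    if ext not in ACCEPTED_EXTENSIONS:
        problems.append(f"unsupported file type: {ext or '(none)'}")
    if size_bytes <= 0:
        problems.append("file is empty")
    elif size_bytes > MAX_BYTES:
        problems.append(f"file exceeds {MAX_BYTES} bytes")
    return problems
```

Extension checks are only a first filter; the server remains the authority on whether the content matches the claimed format.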
