Document Verification API: Developer Integration Guide
Integrate document verification via REST API with OAuth 2.0, webhooks and SDKs. Endpoints, code samples, pricing tiers and compliance built-in.

A document verification API is a programmatic interface that lets developers submit identity documents, invoices, certificates or proof-of-address files and receive structured verification results (authenticity checks, data extraction, fraud signals) without building the underlying AI models themselves. CheckFile's REST API processes a single document in 4.2 seconds on average, returns 98.7% OCR accuracy across 24 languages, and handles 3,200+ document types across 32 jurisdictions.
The EU AI Act (Regulation (EU) 2024/1689, Art. 6 and Annex III) classifies AI systems used for identity document verification in financial services as high-risk, requiring providers to maintain technical documentation, risk management systems, and human oversight capabilities. Any API you integrate must satisfy these obligations, or you inherit the compliance gap.
This guide covers authentication, core endpoints, webhook configuration, error handling, SDK options, and pricing. It is written for backend engineers, DevOps teams and technical leads evaluating document verification APIs for production integration.
This article is for informational purposes only and does not constitute legal, financial, or regulatory advice.
Authentication and Security
The CheckFile API uses OAuth 2.0 client credentials for machine-to-machine authentication, following RFC 6749, Section 4.4. You exchange your client_id and client_secret for a short-lived bearer token (60-minute expiry), then include that token in the Authorization header of every subsequent request.
All API traffic is encrypted with TLS 1.3. Document payloads are encrypted at rest using AES-256, and PII is automatically redacted from logs, satisfying GDPR Article 32 requirements for appropriate technical measures (GDPR, Art. 32).
# Obtain access token
curl -X POST https://api.checkfile.ai/oauth/token \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "grant_type=client_credentials&client_id=YOUR_ID&client_secret=YOUR_SECRET"

# Response
{
  "access_token": "eyJhbGciOi...",
  "token_type": "Bearer",
  "expires_in": 3600
}
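Because the token expires after 60 minutes, a client would normally cache it and request a new one shortly before expiry rather than on every call. The following is a minimal sketch of that idea, not CheckFile's own client: `TokenCache`, `fetch_token` and the 60-second safety margin are hypothetical names and values, and `fetch_token` is assumed to perform the POST to /oauth/token and return the parsed JSON response shown above.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Caches a bearer token and refreshes it shortly before it expires."""

    def __init__(
        self,
        fetch_token: Callable[[], dict],
        clock: Callable[[], float] = time.monotonic,
        safety_margin_s: float = 60.0,
    ) -> None:
        # Hypothetical callable: performs the token request and returns the JSON body.
        self._fetch_token = fetch_token
        self._clock = clock
        self._safety_margin_s = safety_margin_s
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            response = self._fetch_token()
            self._token = response["access_token"]
            # Refresh slightly early so a token does not expire mid-request.
            self._expires_at = now + response["expires_in"] - self._safety_margin_s
        return self._token

    def auth_header(self) -> dict:
        return {"Authorization": f"Bearer {self.get()}"}
```

Injecting the clock and the fetch function keeps the cache independent of any particular HTTP library and makes the expiry logic easy to test.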
Key security features:
- IP allowlisting: restrict API access to known server IPs
- Rate limiting: configurable per plan (see pricing section)
- Webhook signatures: HMAC-SHA256 verification on every callback
- Audit log: every API call is logged with timestamp, client ID, document type, and result
Scopes and Permissions
| Scope | Permission | Use case |
|---|---|---|
| documents:write | Upload and submit documents | Standard verification flow |
| documents:read | Retrieve results and status | Polling-based integrations |
| webhooks:manage | Create and configure webhooks | Event-driven architectures |
| analytics:read | Access usage metrics | Monitoring dashboards |
| admin:manage | Manage API keys and team access | DevOps and administration |
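One way to apply the table is to check, before deployment, that a credential carries every scope an integration needs. The sketch below is illustrative only: the mapping from endpoints to scopes is inferred from the permission descriptions in the table and is an assumption, and `REQUIRED_SCOPE` and `missing_scopes` are hypothetical names.

```python
from typing import Iterable, Set, Tuple

Operation = Tuple[str, str]  # (HTTP method, path)

# ASSUMPTION: endpoint-to-scope mapping inferred from the scope table,
# not taken from an official reference.
REQUIRED_SCOPE = {
    ("POST", "/v1/documents/verify"): "documents:write",
    ("POST", "/v1/documents/verify/batch"): "documents:write",
    ("GET", "/v1/documents/{document_id}"): "documents:read",
    ("POST", "/v1/webhooks"): "webhooks:manage",
}


def missing_scopes(granted: Iterable[str], operations: Iterable[Operation]) -> Set[str]:
    """Return the scopes needed by `operations` that are not in `granted`."""
    needed = {REQUIRED_SCOPE[op] for op in operations}
    return needed - set(granted)
```

A polling integration, for example, would need both documents:write and documents:read, while a webhook-only integration could drop documents:read if this mapping holds.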
Core API Endpoints
The API follows RESTful conventions with JSON payloads. Base URL: https://api.checkfile.ai/v1.
Document Submission
POST /v1/documents/verify
Content-Type: multipart/form-data
Authorization: Bearer {token}
# Fields:
# file (required): document image or PDF (max 20 MB)
# document_type (optional): "passport", "id_card", "invoice", "proof_of_address"
# country (optional): ISO 3166-1 alpha-2 code
# webhook_url (optional): callback URL for async results
# reference_id (optional): your internal reference for correlation
Response (HTTP 202 Accepted):
{
  "document_id": "doc_8f3a2b1c",
  "status": "processing",
  "estimated_completion_seconds": 4,
  "created_at": "2026-03-19T10:15:00Z"
}
When document_type is omitted, the API uses its AI classification engine, which achieves 96.1% classification accuracy on our benchmark of 3,200+ document types, to detect the type automatically.
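Validating the form fields locally before upload avoids a round trip for requests that would be rejected anyway. The sketch below assembles the optional fields listed above and checks the two constraints the field list states (the 20 MB limit and the two-letter country code). `build_form_fields` is a hypothetical helper, "20 MB" is read as 20 MiB, and the country check only verifies the shape of the code, not that it is an assigned ISO 3166-1 value.

```python
import re
from typing import Optional

MAX_FILE_BYTES = 20 * 1024 * 1024  # ASSUMPTION: "20 MB" interpreted as 20 MiB


def build_form_fields(
    file_size: int,
    document_type: Optional[str] = None,
    country: Optional[str] = None,
    webhook_url: Optional[str] = None,
    reference_id: Optional[str] = None,
) -> dict:
    """Validate inputs and return the non-file form fields for the verify endpoint."""
    if file_size > MAX_FILE_BYTES:
        raise ValueError("file exceeds the 20 MB limit")
    if country is not None and not re.fullmatch(r"[A-Z]{2}", country):
        raise ValueError("country must be a two-letter uppercase code, e.g. 'GB'")
    fields = {
        "document_type": document_type,
        "country": country,
        "webhook_url": webhook_url,
        "reference_id": reference_id,
    }
    # Omit unset fields so the API can apply its defaults (e.g. auto-classification).
    return {name: value for name, value in fields.items() if value is not None}
```

The returned dictionary can be passed as the form data of a multipart request alongside the file part.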
Retrieve Results
GET /v1/documents/{document_id}
Authorization: Bearer {token}
Response (HTTP 200):
{
  "document_id": "doc_8f3a2b1c",
  "status": "completed",
  "document_type": "passport",
  "country": "GB",
  "verification": {
    "authentic": true,
    "confidence": 0.97,
    "fraud_signals": [],
    "checks": {
      "mrz_valid": true,
      "photo_tamper": false,
      "expiry_valid": true,
      "data_consistency": true
    }
  },
  "extracted_data": {
    "full_name": "Jane Smith",
    "date_of_birth": "1990-05-12",
    "document_number": "123456789",
    "expiry_date": "2031-05-11",
    "nationality": "GBR"
  },
  "processing_time_ms": 3840,
  "created_at": "2026-03-19T10:15:00Z",
  "completed_at": "2026-03-19T10:15:03.840Z"
}
Fields in the extracted_data object are populated with 94.3% field extraction accuracy on our internal benchmark, which covers structured fields across all supported document types.
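A caller usually reduces this response to a single decision. The sketch below shows one possible triage rule over the fields in the sample response. The rule itself and the 0.90 confidence threshold are assumptions chosen for illustration, not CheckFile's recommended policy, and `triage` is a hypothetical name.

```python
def triage(result: dict, min_confidence: float = 0.90) -> str:
    """Map a verification result to 'pending', 'reject', 'review' or 'accept'."""
    if result["status"] != "completed":
        return "pending"
    verification = result["verification"]
    # Any fraud signal or a failed authenticity check rejects the document.
    if verification["fraud_signals"] or not verification["authentic"]:
        return "reject"
    # ASSUMPTION: results below the threshold go to a human reviewer.
    if verification["confidence"] < min_confidence:
        return "review"
    return "accept"
```

With the sample response above (authentic, confidence 0.97, no fraud signals) this rule returns "accept".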
Batch Verification
For high-volume integrations, the batch endpoint accepts up to 50 documents per request:
POST /v1/documents/verify/batch
Content-Type: multipart/form-data
Authorization: Bearer {token}
# files[]: array of document files
# options: JSON object with shared settings
Batch requests return a batch_id and deliver results via webhook as each document completes.
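Since a batch request accepts at most 50 documents, larger jobs have to be split client-side. A minimal sketch of that split, with `chunk_batches` as a hypothetical helper name:

```python
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 50  # per-request limit stated for the batch endpoint


def chunk_batches(paths: Sequence[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` file paths."""
    if not 1 <= size <= MAX_BATCH_SIZE:
        raise ValueError(f"size must be between 1 and {MAX_BATCH_SIZE}")
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])
```

Each yielded group would then be sent as one batch request, producing one batch_id per group.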
Webhook Configuration
Event-driven architectures avoid polling overhead. Register a webhook endpoint to receive real-time notifications when verifications complete.
POST /v1/webhooks
Authorization: Bearer {token}
Content-Type: application/json
{
  "url": "https://your-app.com/webhooks/checkfile",
  "events": ["document.completed", "document.failed", "document.review_required"],
  "secret": "whsec_your_secret_key"
}
Every webhook delivery includes an X-CheckFile-Signature header containing an HMAC-SHA256 hash of the payload. Verify it before processing:
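import hmac
import hashlib


def verify_webhook(payload: bytes, signature: str, secret: str) -> bool:
    # Recompute the HMAC-SHA256 of the raw request body with the shared secret.
    expected = hmac.new(
        secret.encode(), payload, hashlib.sha256
    ).hexdigest()
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(f"sha256={expected}", signature)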
Webhook retry policy: 3 attempts with exponential backoff (5s, 30s, 300s). After 3 failures, the webhook is disabled and your team receives an email alert.
| Event | Trigger | Payload includes |
|---|---|---|
| document.completed | Verification finished successfully | Full result object |
| document.failed | Processing error (corrupt file, unsupported format) | Error code and message |
| document.review_required | Low-confidence result flagged for human review | Partial result + confidence score |
| batch.completed | All documents in a batch are processed | Summary with per-document statuses |
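A receiving endpoint typically verifies the signature first and then routes the event to a handler by name. The sketch below combines both steps using only the standard library. It assumes the event name arrives in an "event" field of the JSON body, which the guide does not specify, and `dispatch` and the handler table are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Callable, Dict

Handler = Callable[[dict], str]


def verify_signature(payload: bytes, signature: str, secret: str) -> bool:
    """Check the 'sha256=<hex>' signature against the raw request body."""
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(f"sha256={expected}", signature)


def dispatch(payload: bytes, signature: str, secret: str, handlers: Dict[str, Handler]) -> str:
    """Verify a delivery, then route it to the handler registered for its event."""
    if not verify_signature(payload, signature, secret):
        return "rejected"
    body = json.loads(payload)
    # ASSUMPTION: the event name is carried in an "event" field of the payload.
    handler = handlers.get(body.get("event"))
    if handler is None:
        return "ignored"
    return handler(body)
```

Verifying against the raw bytes, before any JSON parsing, matters because re-serialising the body can change whitespace or key order and invalidate the signature.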
SDK and Integration Options
While the REST API works from any language, the official SDKs handle the boilerplate and shorten integration time.
Available SDKs
| Language | Package | Install |
|---|---|---|
| Python | checkfile-sdk | pip install checkfile-sdk |
| Node.js | @checkfile/sdk | npm install @checkfile/sdk |
| Java | com.checkfile:sdk | Maven Central |
| Go | github.com/checkfile/sdk-go | go get |
Python Integration Example
from checkfile import CheckFileClient

client = CheckFileClient(
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET"
)

# Synchronous verification; the with-block closes the file after the call returns
with open("passport.pdf", "rb") as document:
    result = client.documents.verify(
        file=document,
        document_type="passport",
        country="GB"
    )

print(f"Authentic: {result.verification.authentic}")
print(f"Name: {result.extracted_data.full_name}")
print(f"Processing time: {result.processing_time_ms}ms")
Node.js Integration Example
import { CheckFileClient } from '@checkfile/sdk';
import { readFileSync } from 'fs';

const client = new CheckFileClient({
  clientId: process.env.CHECKFILE_CLIENT_ID,
  clientSecret: process.env.CHECKFILE_CLIENT_SECRET,
});

const result = await client.documents.verify({
  file: readFileSync('passport.pdf'),
  documentType: 'passport',
  country: 'GB',
});

console.log(`Authentic: ${result.verification.authentic}`);
console.log(`Confidence: ${result.verification.confidence}`);
SDKs handle token refresh, retries with exponential backoff, and webhook signature verification automatically. Our analysis shows that SDK-based integrations reduce median time-to-production from 8 days (raw REST) to 2 days.
Error Handling and Rate Limits
The API uses standard HTTP status codes with structured error bodies:
{
  "error": {
    "code": "DOCUMENT_UNREADABLE",
    "message": "The uploaded file could not be parsed. Ensure DPI >= 300.",
    "details": { "min_dpi": 300, "detected_dpi": 72 },
    "request_id": "req_9f2c4d1e"
  }
}
Common Error Codes
| HTTP Status | Error Code | Resolution |
|---|---|---|
| 400 | INVALID_FILE_FORMAT | Use PDF, JPEG, PNG or TIFF |
| 400 | DOCUMENT_UNREADABLE | Increase scan resolution to 300+ DPI |
| 401 | TOKEN_EXPIRED | Refresh your OAuth token |
| 413 | FILE_TOO_LARGE | Reduce file below 20 MB limit |
| 429 | RATE_LIMIT_EXCEEDED | Wait for Retry-After header duration |
| 503 | SERVICE_DEGRADED | Retry with exponential backoff |
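The structured error body can be parsed into a small value object so that calling code can branch on the error code rather than on message text. The sketch below is one way to do this. `ApiError` and `parse_error` are hypothetical names, and treating TOKEN_EXPIRED, RATE_LIMIT_EXCEEDED and SERVICE_DEGRADED as retryable is an interpretation of the resolutions in the table above.

```python
from dataclasses import dataclass, field

# ASSUMPTION: these codes can succeed on a later attempt without changing the file.
RETRYABLE_CODES = {"TOKEN_EXPIRED", "RATE_LIMIT_EXCEEDED", "SERVICE_DEGRADED"}


@dataclass(frozen=True)
class ApiError:
    status: int
    code: str
    message: str
    request_id: str = ""
    details: dict = field(default_factory=dict)

    @property
    def retryable(self) -> bool:
        return self.code in RETRYABLE_CODES


def parse_error(status: int, body: dict) -> ApiError:
    """Build an ApiError from an HTTP status and the decoded JSON error body."""
    err = body.get("error", {})
    return ApiError(
        status=status,
        code=err.get("code", "UNKNOWN"),
        message=err.get("message", ""),
        request_id=err.get("request_id", ""),
        details=err.get("details", {}),
    )
```

Logging the request_id alongside the code makes it easier to correlate a failure with a support request.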
Rate Limits by Plan
| Plan | Requests/minute | Burst | Concurrent uploads |
|---|---|---|---|
| Starter | 60 | 10 | 5 |
| Business | 500 | 50 | 25 |
| Enterprise | 2,000+ | 200 | 100 |
Rate limit headers (X-RateLimit-Remaining, X-RateLimit-Reset) are included in every response. Build your retry logic around these rather than hardcoding delays.
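A header-driven retry policy might look like the sketch below. It honours Retry-After on 429 responses, falls back to capped exponential backoff for 429 and 503, and pauses proactively when X-RateLimit-Remaining reaches zero. Several details are assumptions: Retry-After is read only when it is a whole number of seconds, X-RateLimit-Reset is treated as a Unix timestamp in seconds (the guide does not state its format), the base and cap values are arbitrary, and header lookups are case-sensitive because a plain mapping is used.

```python
from typing import Mapping, Optional


def retry_delay(
    status: int,
    headers: Mapping[str, str],
    attempt: int,
    base_s: float = 1.0,
    cap_s: float = 60.0,
) -> Optional[float]:
    """Seconds to wait before retrying, or None if the request should not be retried."""
    if status == 429:
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            return float(retry_after)
    if status in (429, 503):
        # Capped exponential backoff: base, 2*base, 4*base, ...
        return min(cap_s, base_s * (2 ** attempt))
    return None


def pause_before_next(headers: Mapping[str, str], now_epoch_s: float) -> float:
    """Seconds to wait before sending another request, based on rate-limit headers."""
    remaining = headers.get("X-RateLimit-Remaining")
    reset = headers.get("X-RateLimit-Reset")
    if remaining is None or reset is None or int(remaining) > 0:
        return 0.0
    # ASSUMPTION: X-RateLimit-Reset is a Unix timestamp in seconds.
    return max(0.0, float(reset) - now_epoch_s)
```

In production, adding random jitter to the backoff value helps avoid many clients retrying at the same instant.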
Compliance and Data Handling
Document verification touches PII across multiple jurisdictions. The API is designed with compliance as a first-class concern.
As of March 2026, GDPR (Regulation (EU) 2016/679, Art. 28) requires data controllers to use only processors providing sufficient guarantees of appropriate technical and organisational measures. CheckFile acts as a data processor, with a signed Data Processing Agreement (DPA) included in all Business and Enterprise plans.
Data handling guarantees:
- Retention: Documents are deleted after processing unless you explicitly request storage (configurable from 0 to 365 days)
- Residency: EU-based processing by default; US and APAC regions available on Enterprise plans
- Audit trail: Every API call generates an immutable audit record with document hash, timestamp, result, and client ID
- SOC 2 Type II certification covers the API infrastructure
- PCI DSS compliant document handling for financial documents
For integrations subject to FCA guidance on outsourcing to cloud and third-party IT services (FG 16/5), CheckFile provides the required third-party assurance documentation, business continuity testing results, and exit strategy terms.
Pricing Structure
CheckFile uses a per-document pricing model with volume discounts. All plans include full API access, webhooks, and audit logs.
| Plan | Monthly price | Included verifications | Extra verification | Support |
|---|---|---|---|---|
| Starter | Free | 100 | -- | Community |
| Business | From EUR 299/mo | 2,000 | EUR 0.12 | Priority email (< 4h) |
| Enterprise | Custom | Custom volume | Negotiated | Dedicated CSM + SLA |
See the full pricing page for details on volume tiers, annual billing discounts, and Enterprise SLA terms.
Our platform analysis shows that organisations switching from manual document checks to API-based verification reduce cost per dossier by 67% and processing time by 83%. The average payback period for Business plan customers is under 3 months when processing 500+ documents per month.
| Manual process | API-automated | Saving |
|---|---|---|
| 12 min/document | 4.2 seconds | 99.4% time reduction |
| EUR 4.80/document (labour) | EUR 0.12-0.15/document | 67-97% cost reduction |
| 89% accuracy (human error) | 98.7% OCR accuracy | Fewer re-checks |
| Business hours only | 99.94% uptime, 24/7 | No scheduling constraints |
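The figures in the two pricing tables can be checked with a short calculation. The sketch below works in euro cents to avoid floating-point rounding and uses only numbers from the tables, with two simplifications stated as assumptions: "From EUR 299/mo" is taken as exactly EUR 299, and the comparison covers labour cost only, ignoring integration and review effort.

```python
import math

BUSINESS_BASE_CENTS = 29_900   # ASSUMPTION: "From EUR 299/mo" taken as exactly EUR 299
INCLUDED_VERIFICATIONS = 2_000
EXTRA_CENTS = 12               # EUR 0.12 per extra verification
MANUAL_CENTS = 480             # EUR 4.80 labour per manually checked document


def api_cost_cents(documents: int) -> int:
    """Monthly Business-plan cost for a given document volume."""
    extra = max(0, documents - INCLUDED_VERIFICATIONS)
    return BUSINESS_BASE_CENTS + extra * EXTRA_CENTS


def manual_cost_cents(documents: int) -> int:
    return documents * MANUAL_CENTS


def saving_ratio(documents: int) -> float:
    """Fraction of manual labour cost saved at this volume."""
    return 1 - api_cost_cents(documents) / manual_cost_cents(documents)


# Smallest monthly volume at which the plan costs less than manual checking.
BREAK_EVEN_DOCUMENTS = math.ceil(BUSINESS_BASE_CENTS / MANUAL_CENTS)

# 12 minutes versus 4.2 seconds per document.
TIME_REDUCTION = 1 - 4.2 / (12 * 60)
```

At exactly 2,000 documents the plan works out to just under EUR 0.15 per document, which matches the upper end of the range in the comparison table, and the time reduction evaluates to roughly 99.4%.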
Integration Architecture Patterns
Pattern 1: Synchronous (Simple)
For low-volume integrations (< 60 req/min), submit and poll:
Client → POST /v1/documents/verify → 202 Accepted
Client → GET /v1/documents/{id} (poll every 2s) → 200 with results
Suitable for onboarding flows where the user waits for verification.
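The submit-and-poll flow can be sketched as a loop with a fixed interval and an overall timeout. In the sketch below, `fetch` is a hypothetical callable that performs the GET request and returns the decoded JSON. The set of terminal statuses is an assumption (the guide shows "processing" and "completed" and mentions a review_required status; "failed" is inferred from the webhook events), and the 30-second timeout is arbitrary.

```python
import time
from typing import Callable

# ASSUMPTION: any status in this set means no further polling is needed.
TERMINAL_STATUSES = {"completed", "failed", "review_required"}


def poll_result(
    fetch: Callable[[], dict],
    interval_s: float = 2.0,
    timeout_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call `fetch` every `interval_s` seconds until a terminal status or timeout."""
    deadline = clock() + timeout_s
    while True:
        result = fetch()
        if result["status"] in TERMINAL_STATUSES:
            return result
        # Stop if the next poll would happen after the deadline.
        if clock() + interval_s > deadline:
            raise TimeoutError("verification did not finish before the timeout")
        sleep(interval_s)
```

A bounded timeout keeps a stuck verification from blocking the user-facing request indefinitely.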
Pattern 2: Async with Webhooks (Recommended)
For production workloads, submit and receive results via webhook:
Client → POST /v1/documents/verify (with webhook_url) → 202 Accepted
CheckFile → POST webhook_url (signed payload) → Your handler processes result
Decouples submission from processing. Scales linearly with volume.
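Because failed deliveries are retried, a handler may see the same event more than once, for example when it processed the payload but its response was lost. Making the handler idempotent guards against double processing. The sketch below uses an in-memory set for illustration only; a real deployment would need a durable store. It assumes each payload carries "document_id" and "event" fields, and `DeliveryLog` and `handle_delivery` are hypothetical names.

```python
from typing import Callable, Set, Tuple


class DeliveryLog:
    """Remembers which (document_id, event) pairs have already been handled."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[str, str]] = set()

    def first_time(self, document_id: str, event: str) -> bool:
        key = (document_id, event)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


def handle_delivery(body: dict, log: DeliveryLog, process: Callable[[dict], None]) -> str:
    """Process a webhook payload once; report repeats as duplicates."""
    # ASSUMPTION: payloads include "document_id" and "event" fields.
    if not log.first_time(body["document_id"], body["event"]):
        return "duplicate"
    process(body)
    return "processed"
```

Returning a success response for duplicates as well tells the sender that no further retries are needed.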
Pattern 3: Batch Pipeline
For back-office processing (nightly KYC reviews, bulk compliance checks):
Client → POST /v1/documents/verify/batch (up to 50 files) → batch_id
CheckFile → POST webhook_url per document as each completes
CheckFile → POST webhook_url with batch.completed summary
Our platform processes over 180,000 documents per month using these patterns. The async webhook pattern handles 94% of production integrations.
Getting Started
Integration follows four steps:
- Create an account at checkfile.ai and generate API credentials from the dashboard
- Test in sandbox: the sandbox environment mirrors production with synthetic documents (no billing)
- Integrate using the SDK or direct REST calls, starting with the synchronous pattern
- Go live: switch to production credentials and configure webhooks
The API documentation includes an interactive playground for testing endpoints, and the pricing page details plan options for your expected volume.
For teams building automated document verification workflows, the API integrates directly with the patterns described in our workflow setup guide. If you are evaluating verification solutions more broadly, our automation verification guide covers the full landscape of document verification approaches.
Frequently Asked Questions
What document types does the API support?
The CheckFile API supports 3,200+ document types across 32 jurisdictions, including passports, national ID cards, driving licences, invoices, bank statements, proof of address, tax notices, payslips, and corporate registration documents. The AI classification engine identifies document types automatically with 96.1% accuracy when the document_type parameter is omitted.
How long does verification take?
Average processing time is 4.2 seconds per document. P95 latency is under 12 seconds for standard document types. Batch submissions process documents in parallel, so a 50-document batch typically completes within 30-60 seconds depending on document complexity.
Is the API GDPR-compliant?
Yes. CheckFile acts as a data processor under GDPR Article 28 with signed DPAs, EU-based processing infrastructure, configurable data retention (0-365 days), and automatic PII redaction from logs. SOC 2 Type II and ISO 27001 certifications cover the API infrastructure.
Can I test the API before committing to a paid plan?
The Starter plan includes 100 free verifications per month, and the sandbox environment allows unlimited testing with synthetic documents at no cost. No credit card is required to start.
What happens if the API cannot verify a document?
Documents that fall below the confidence threshold are flagged with review_required status and routed to your human review queue via webhook. The response includes the partial result with the confidence score, extracted data, and specific fraud signals that triggered the flag. This ensures no document falls through the cracks.