Practical Guide for Enterprise Buyers
AI Document Processing Without Losing Control of Your Data
AI document processing is moving quickly. Organisations are being asked to do more with contracts, records, forms, correspondence, policies, claims, invoices, employee files, and case documents — often without adding headcount or slowing down existing operations. The promise is real: faster classification, better search, automated metadata extraction, summarisation, risk flagging, and new ways to understand large document collections.
But for many enterprises, the first question is not "Can AI read our documents?" It is:
"Can we use AI without losing control of our data, governance model, compliance obligations, and audit trail?"
That concern is reasonable. Documents often contain sensitive business information, personal information, regulated records, legal obligations, financial data, employee data, customer communications, and operational history. Applying AI to that content should not mean copying it into opaque third-party systems, weakening access controls, or creating outputs that cannot be explained, reviewed, validated, or governed. This guide outlines a practical approach to AI document processing — using FormKiQ's architecture on AWS as a reference model — for organisations that want the benefits of AI while maintaining enterprise-grade control.
1. Start with the Business Problem, Not the AI Feature
AI document processing works best when it is tied to a specific operational problem. A common mistake is to begin with a broad question such as "How can we use AI on our documents?" That usually leads to unclear scope, inconsistent results, and stakeholder discomfort.
A better starting point is: Which document-heavy process is costly, slow, risky, or difficult to scale today?
| Problem Type | Example Business Problem | How AI Addresses It |
|---|---|---|
| Classification bottleneck | Documents arrive from multiple channels; staff manually sort and tag each one before it enters a workflow | AI identifies document type at the point of ingestion and applies classification metadata automatically |
| Data entry burden | Staff manually read invoices, forms, and applications to extract key data and enter it into business systems | OCR and AI extraction pull structured data from documents — names, dates, amounts, identifiers — and apply it as searchable metadata |
| Review volume | Reviewers process hundreds of documents daily and cannot read each one in full before making triage decisions | AI summarisation produces concise summaries for triage, allowing reviewers to prioritise and focus on documents that require detailed attention |
| Hidden obligations | Contracts contain obligations, milestones, and renewal dates buried in lengthy text; staff discover them late or not at all | AI analyses contract text and extracts obligations, deadlines, and key terms as structured, trackable metadata |
| Sensitivity exposure | Documents containing PII, PHI, or confidential information are uploaded without appropriate access controls because nobody classified them at intake | AI sensitivity classification identifies documents containing sensitive content and triggers appropriate access restrictions automatically |
| Governance gaps | Large document collections have incomplete metadata, inconsistent classification, or missing retention categories | AI analyses existing collections and suggests classifications, identifies missing metadata, and flags governance gaps for remediation |
Once the problem is clear, AI can be evaluated as one part of a controlled process — not as an open-ended experiment.
A legal operations team may not need "AI for contracts" in general. It may need to identify renewal dates, notice periods, governing law, assignment clauses, and termination rights across a specific set of supplier agreements. That narrower problem is easier to test, govern, validate, and improve. In FormKiQ, this translates to configuring the AI Processing and Analysis module — powered by Amazon Bedrock — to analyse supplier agreements and extract specific clause types as structured metadata, running within the organisation's own AWS account.
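The narrower the extraction target, the easier it is to validate. As a sketch of that idea, the snippet below builds an extraction prompt for a fixed set of clause types and parses the model's reply into structured metadata, discarding anything outside the approved schema. The clause list, prompt wording, and JSON shape are illustrative assumptions, not FormKiQ's actual extraction configuration.

```python
import json

# Clause types a legal operations team might track (illustrative list,
# not FormKiQ's schema)
CLAUSE_TYPES = ["renewal_date", "notice_period", "governing_law",
                "assignment", "termination_rights"]

def build_extraction_prompt(contract_text: str) -> str:
    """Build a prompt asking the model to return the clause data as JSON."""
    fields = ", ".join(CLAUSE_TYPES)
    return (
        "Extract the following fields from the supplier agreement below and "
        f"return a JSON object with exactly these keys: {fields}. "
        "Use null for any field the agreement does not contain.\n\n"
        + contract_text
    )

def parse_extraction(model_output: str) -> dict:
    """Keep only the approved clause keys so stray model output
    never becomes document metadata."""
    data = json.loads(model_output)
    return {key: data.get(key) for key in CLAUSE_TYPES}
```

Constraining the output to a fixed key set is what makes the result testable and governable: every extracted value maps to a known metadata field.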
2. Know What Data the AI Will Touch
Before applying AI to documents, organisations should understand the sensitivity of the content being processed. The goal is not to avoid AI entirely. The goal is to match the AI processing model to the risk level of the data.
| Data Sensitivity | Examples | AI Processing Considerations |
|---|---|---|
| Public / low sensitivity | Published policies, marketing materials, public reports | Lowest risk — AI classification and summarisation can proceed with minimal additional controls |
| Internal / business sensitive | Internal reports, project documentation, operational records | Moderate risk — AI processing should use approved infrastructure; outputs should be governed as internal metadata |
| Personal information | Employee records, customer data, applicant information | Higher risk — data residency, privacy legislation (GDPR, PIPEDA), and access control requirements apply to both the documents and the AI outputs |
| Regulated records | Health records (PHI), financial records, legal files | Highest risk — regulatory frameworks (HIPAA, SOX, SEC) impose specific requirements on how this content is processed, stored, and accessed |
| Legally privileged | Attorney-client communications, litigation strategy, investigation files | Highest risk — AI processing must not expose privileged content to unauthorised parties; outputs must respect the same privilege boundaries as the source documents |
FormKiQ's AI processing through Amazon Bedrock runs entirely within the organisation's AWS account, with inference region controls that keep document content within the selected geographic boundary. The risk assessment for AI processing is the same as the risk assessment for document storage — the data doesn't move to a new environment for AI processing.
An organisation might allow AI summarisation on published policies and public board materials with relatively low risk, while applying stricter controls — including mandatory human review of all AI outputs — to HR files, legal agreements, health-related records, or customer claims. In FormKiQ, this is configured per-document-type: AI processing rules are attached to document type definitions, so different document types receive different AI treatment automatically.
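The per-document-type pattern can be sketched as a lookup with a conservative default, so that an unclassified document type never receives more permissive AI treatment than a classified one. The rule names and structure below are hypothetical, chosen for illustration rather than taken from FormKiQ's configuration schema.

```python
# Hypothetical per-document-type AI rules; keys and flags are illustrative.
AI_RULES = {
    "public-policy":      {"summarise": True,  "human_review": False},
    "hr-file":            {"summarise": True,  "human_review": True},
    "supplier-agreement": {"summarise": True,  "human_review": True},
}

# Unknown document types fall back to the most conservative treatment:
# no AI summarisation, mandatory human review.
DEFAULT_RULE = {"summarise": False, "human_review": True}

def rules_for(document_type: str) -> dict:
    """Return the AI processing rules for a document type."""
    return AI_RULES.get(document_type, DEFAULT_RULE)
```

The important design choice is the default: risk is matched to classification, and the absence of classification is itself treated as high risk.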
3. Understand the AI Deployment Options
Enterprise buyers do not have only one way to apply AI to documents. The architecture matters because it affects data movement, security, auditability, cost, and operational complexity.
| Deployment Option | How It Works | Data Residency | Auditability | Enterprise Control |
|---|---|---|---|---|
| External AI application or hosted API | Document content sent to a vendor-controlled environment for processing | Vendor determines processing location; may involve cross-border transfer | Limited — depends on vendor logging and reporting | Weakest — data leaves your environment; vendor controls retention and access |
| Local model on workstation or server | Open-weight model runs on controlled hardware within the organisation | Data stays on-premises | Limited — depends on local logging infrastructure | Moderate — no external data movement, but difficult to scale, audit centrally, or maintain consistently |
| Cloud provider AI service in your approved cloud environment | Documents remain in your cloud account; AI processing through a managed service (e.g., Amazon Bedrock) | Processing occurs in your account, in the region you select | Strong — CloudTrail, Bedrock invocation logs, and application-level audit trails all in your account | Strongest for cloud-native — your account, your keys, your logs, your region |
| Hybrid orchestration | Combines OCR, rules, AI extraction, confidence thresholds, human review, and workflow automation | Depends on architecture — can be fully in-account if designed that way | Strong if the orchestration layer is governed | Strongest overall — AI is one component in a controlled pipeline, not the only decision point |
FormKiQ uses the third and fourth options in combination. OCR (Tesseract or Amazon Textract) handles text extraction. Amazon Bedrock handles AI classification, extraction, summarisation, and analysis. Workflows, rulesets, and human review queues provide the orchestration. Everything runs within the customer's AWS account — no document content is sent to external services.
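The three-layer stack described above can be sketched as composable steps: text extraction, AI analysis, and review orchestration, each swappable without changing the pipeline. The step functions here are injected stand-ins, not calls to Tesseract, Textract, or Bedrock.

```python
def run_pipeline(document: bytes, ocr, extract, needs_review) -> dict:
    """Orchestration sketch: OCR -> AI extraction -> review routing.
    `ocr`, `extract`, and `needs_review` are caller-supplied callables
    standing in for the real services."""
    text = ocr(document)        # OCR layer: raw bytes -> text
    result = extract(text)      # AI layer: text -> structured output
    # Orchestration layer: decide whether a human must confirm the output
    result["status"] = "pending-review" if needs_review(result) else "accepted"
    return result
```

Because the AI call is just one step among several, the pipeline can enforce confidence thresholds and review queues around it rather than trusting the model output directly.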
4. Keep Documents in Your Environment
For cautious buyers, one of the most important architectural questions is where document content goes during AI processing. In some AI tools, documents are uploaded to a vendor-controlled environment. That raises questions many enterprises struggle to answer satisfactorily.
| Concern | Question to Ask | FormKiQ's Approach |
|---|---|---|
| Data residency | Where is document content processed geographically? | Processing occurs in your AWS account, in the region you select, with Bedrock inference region controls specifying the processing region |
| Data retention by AI provider | Does the AI provider retain document content after processing? | Amazon Bedrock does not retain customer inputs or outputs; no document content is stored outside your account |
| Model training | Is document content used to train or improve AI models? | Amazon Bedrock does not use customer data for model training |
| Vendor access | Can the AI vendor's staff access document content? | No FormKiQ personnel access document content during normal operation; Bedrock processes within your account without external access |
| Subprocessor involvement | Are additional third parties involved in processing? | Processing occurs through AWS services within your account — no additional subprocessors for AI processing |
| Auditability | Can every AI processing event be logged and audited? | CloudTrail records API calls; Bedrock invocation logs capture processing events; FormKiQ's audit trail records document-level actions — all in your account |
| Cross-border transfer | Does AI processing involve moving data across jurisdictional boundaries? | Inference region controls ensure processing occurs within the same region as document storage |
| Incident response | If something goes wrong, who investigates and how? | Your security team investigates in your infrastructure using your CloudTrail, CloudWatch, and FormKiQ audit logs |
The preferred pattern for enterprise AI document processing is: documents stay in the customer-controlled environment. AI processing occurs within approved infrastructure. Outputs are stored, reviewed, and governed like any other system-generated metadata.
If a Canadian public-sector organisation has a requirement to store and process records in Canada, FormKiQ deploys to ca-central-1 (Montreal) or ca-west-1 (Calgary), and Bedrock inference region controls ensure AI processing occurs within the same Canadian region. The organisation doesn't need to evaluate whether an external AI service will honour its data residency requirements — the architecture enforces residency by design.
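Residency-by-design can be expressed as a simple invariant: the inference region must equal the storage region, and both must sit inside the approved boundary. The helper below is an illustrative sketch of that check; the region names are real AWS regions, but the function itself is not a FormKiQ or AWS API.

```python
# Regions satisfying a hypothetical Canadian residency requirement.
APPROVED_REGIONS = {"ca-central-1", "ca-west-1"}

def residency_config(region: str) -> dict:
    """Return a deployment configuration only if the region satisfies
    the residency requirement; otherwise refuse outright."""
    if region not in APPROVED_REGIONS:
        raise ValueError(f"{region} violates the data residency requirement")
    # Storage and Bedrock inference share one region, so documents and
    # AI processing stay within the same geographic boundary.
    return {"storage_region": region, "inference_region": region}
```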
5. Treat AI Output as Governed Information
AI outputs — classifications, extracted metadata, summaries, risk flags, obligation lists, sensitivity labels, confidence scores — should not be treated as casual notes if they influence business decisions. If an AI output affects a workflow, decision, record, or obligation, it should be governed.
| AI Output Type | Governance Question | FormKiQ Approach |
|---|---|---|
| Document classification | Does the classification drive access controls, retention, or workflow routing? | AI-generated classifications become document metadata — searchable, auditable, and editable by authorised users |
| Extracted metadata | Does the extracted value (date, amount, party name) trigger a business action? | Extracted values stored as structured metadata on the document record; low-confidence values routed to human review before becoming authoritative |
| Summaries | Is the summary used for triage, decision support, or distribution? | Summaries stored as document metadata; accessible through the same access controls as the source document |
| Sensitivity classification | Does the sensitivity label trigger access restrictions? | AI-detected sensitivity triggers ABAC policies automatically — restricting access based on the classified sensitivity level |
| Obligation extraction | Do extracted obligations drive milestone tracking, alerting, or compliance monitoring? | Obligations become structured metadata with due dates, responsible parties, and status tracking; configurable alerts notify stakeholders |
| Confidence scores | Does the confidence score determine whether human review is required? | Confidence thresholds configurable per document type; low-confidence outputs routed to review queues; all routing decisions audit-logged |
FormKiQ's architecture treats AI outputs as first-class metadata. They're stored on the document record, searchable through the same indexes as manually entered metadata, subject to the same access controls, and visible in the same audit trail. There is no separate "AI output" layer that operates outside the governance model.
An AI-generated contract summary may be acceptable as a convenience view. But an AI-extracted renewal date is different — if that date triggers a reminder, task, escalation, or commercial decision, FormKiQ tracks whether the value was reviewed, when it was approved, and who approved it. The renewal date is governed metadata, not a disposable AI output.
6. Use Human Review During Pilots, Then Evolve
Human review is essential during early AI pilots. It helps the organisation understand accuracy, failure patterns, edge cases, and user trust. But full manual review of every AI output may not remain practical at scale.
| Phase | Review Model | What It Achieves |
|---|---|---|
| Pilot | Human review of most or all AI outputs | Establishes accuracy baseline; identifies failure patterns; builds reviewer confidence; validates business fit |
| Controlled production | Human review for high-risk, low-confidence, or exception cases; automated acceptance for high-confidence, low-risk outputs | Reduces review burden while maintaining control over the outputs that matter most |
| Operationalised workflow | AI orchestration with validation rules, confidence thresholds, sampling, periodic quality audits, and continuous improvement | Sustainable production model; human judgement focused where it creates the most value |
FormKiQ supports all three phases within the same platform. During the pilot, every AI output can be routed to a review queue. As confidence grows, the confidence threshold is adjusted — high-confidence outputs are accepted automatically while low-confidence outputs continue to require review. In the operationalised phase, sampling rules and periodic quality reviews replace full review.
During a pilot, every AI-extracted clause from a sample of supplier contracts may be reviewed by legal operations. After enough results are analysed, the organisation may decide that certain simple fields — counterparty name, effective date, governing law — only need exception-based review, while termination rights, assignment restrictions, and indemnity clauses still require human confirmation. In FormKiQ, this is configured through per-field confidence thresholds and document-type-specific review rules.
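Per-field routing of this kind reduces to a threshold lookup where high-stakes fields carry a threshold no confidence score can reach. The field names and threshold values below are illustrative assumptions, not FormKiQ's review configuration.

```python
# Hypothetical per-field thresholds: simple fields auto-accept at high
# confidence; high-stakes clauses always require human confirmation.
FIELD_THRESHOLDS = {
    "counterparty_name":  0.90,
    "effective_date":     0.90,
    "governing_law":      0.90,
    "termination_rights": 1.01,  # > 1.0: review is always required
    "indemnity":          1.01,
}

def route(field: str, confidence: float) -> str:
    """Decide whether an extracted value is accepted or queued for review.
    Unknown fields default to mandatory review."""
    threshold = FIELD_THRESHOLDS.get(field, 1.01)
    return "auto-accept" if confidence >= threshold else "human-review"
```

Encoding "always review" as an unreachable threshold keeps the routing logic uniform: one rule governs every field, and the review model evolves by adjusting numbers rather than code.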
7. Design for Auditability from the Beginning
AI document processing should be explainable at the process level, even when the underlying model is probabilistic. Organisations should be able to answer: Which documents were processed? Which model was used? What output was produced? Who reviewed it? What changed after review?
| Audit Element | What Should Be Recorded | Where FormKiQ Records It |
|---|---|---|
| Document processed | Document identifier, document type, version | FormKiQ document audit trail — in your AWS account |
| AI service used | Model identifier, inference configuration, prompt template version | Bedrock invocation logs + FormKiQ processing metadata — in your AWS account |
| Output produced | Classification, extracted values, summary, sensitivity label, confidence score | Document metadata record — searchable and auditable alongside all other document metadata |
| Reviewer action | Who reviewed, when, what they changed, what they approved or rejected | FormKiQ workflow audit trail — approval decisions with reviewer identity, timestamp, and before/after state |
| Downstream effect | Did the output trigger a workflow, notification, access change, or retention action? | FormKiQ workflow and event logs — every triggered action recorded with the AI output that triggered it |
| Infrastructure events | API calls, authentication, service interactions | AWS CloudTrail — in your AWS account |
Because FormKiQ deploys into your own AWS account, all of this audit evidence is in infrastructure you own. Your compliance team queries it directly — they don't request reports from a vendor.
If an AI-generated obligation list is later questioned — during litigation, audit, or a contract dispute — the organisation can determine which document version was processed, which Bedrock model was used, what the model returned, what the reviewer changed, and which final obligations were accepted into the system of record. The complete processing history is in the organisation's own audit trail.
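Reconstructing that history amounts to filtering and ordering audit events for one document, the kind of query a compliance team might run over exported audit records. The event field names below are illustrative, not the exact CloudTrail or FormKiQ log schema.

```python
def reconstruct_history(events: list, document_id: str) -> list:
    """Return the time-ordered audit events for a single document.
    Assumes each event dict carries a `document_id` and an ISO-8601
    `timestamp` (which sorts correctly as a string)."""
    history = [e for e in events if e.get("document_id") == document_id]
    return sorted(history, key=lambda e: e["timestamp"])
```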
8. Separate Experimentation from Production
AI pilots often begin informally, but production document processing needs clear boundaries. A pilot can ask "Does this work?" A production process must ask "Can we operate this safely, consistently, and defensibly?"
| Dimension | Pilot | Production |
|---|---|---|
| Document scope | Small, controlled sample — often non-sensitive | Defined document types with known sensitivity levels |
| AI tasks | Exploratory — testing different prompts and models | Approved tasks with defined prompt templates and expected outputs |
| Review model | Full or near-full human review | Risk-based review with confidence thresholds, validation rules, and sampling |
| Data handling | May use isolated test environment | Must comply with production data handling, residency, and security requirements |
| Audit logging | Informal tracking of results | Formal audit trail for every processing event, review decision, and output acceptance |
| Error handling | Manual observation and correction | Defined exception queues, escalation paths, and retry procedures |
| Performance expectations | Learning and calibration | Defined accuracy targets, processing throughput, and SLA expectations |
A team might experiment with five different prompt formats for extracting lease obligations using a sample of 50 non-sensitive leases. In production, the organisation uses the validated prompt template, applies it to all incoming leases via a document action triggered at ingestion, routes low-confidence extractions to a legal review queue, and retains the full processing history in the audit trail.
9. Use Metadata as the Control Layer
AI becomes more useful when it is connected to a strong metadata model. Without metadata governance, AI outputs become inconsistent, difficult to search, and hard to trust. With metadata governance, AI becomes a powerful accelerator for document classification, routing, and lifecycle management.
| Metadata Function | Without Governance | With FormKiQ Metadata Governance |
|---|---|---|
| Document type | AI labels documents inconsistently — "supplier agreement," "vendor contract," "supply contract," "procurement agreement" | AI classifications map to a controlled document type taxonomy; the same document type label is applied consistently regardless of how the source document refers to itself |
| Retention category | AI suggests retention but there's no enforcement mechanism | AI-suggested retention categories map to configured retention policies with automatic enforcement |
| Sensitivity classification | AI detects sensitivity but the label doesn't drive any action | AI sensitivity classification triggers ABAC access restrictions automatically |
| Workflow routing | AI suggests a routing but the suggestion sits in a separate system | AI classification drives FormKiQ workflow routing — documents enter the appropriate workflow based on their AI-determined type |
| Search and discovery | AI extracts entities but they're not indexed or searchable | AI-extracted entities become structured metadata — searchable through OpenSearch alongside all other document metadata |
AI may identify a document as a "supplier agreement," but FormKiQ's document type taxonomy ensures that label maps to a specific set of governance rules: the document inherits the supplier agreement retention policy, enters the contract review workflow, and is classified for ABAC access at the procurement and legal level. The AI did the classification; the metadata architecture makes it actionable.
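The mapping from free-text AI labels onto a controlled taxonomy can be sketched as a synonym table with an explicit "unknown" outcome that routes to human triage rather than silently inventing a new type. The labels and canonical names here are illustrative.

```python
# Synonym map: many AI labels, one controlled document type.
CANONICAL = {
    "supplier agreement":    "supplier-agreement",
    "vendor contract":       "supplier-agreement",
    "supply contract":       "supplier-agreement",
    "procurement agreement": "supplier-agreement",
}

def normalise_type(ai_label: str):
    """Map a free-text AI label onto the controlled taxonomy.
    None means the label is unrecognised and needs human triage."""
    return CANONICAL.get(ai_label.strip().lower())
```

Returning `None` for unrecognised labels is the governance-preserving choice: the taxonomy only grows through deliberate configuration, never through model drift.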
10. Align AI Processing with Access Control
AI should not become a shortcut around permissions. If a user is not allowed to access a document, they should not be able to access its AI-generated summary, extracted metadata, or answers through a knowledge search interface. AI outputs can reveal sensitive document content even when the original file is protected.
| Access Control Concern | Risk | FormKiQ Approach |
|---|---|---|
| AI summaries expose restricted content | A user who can't open a document reads its AI summary instead | Summaries inherit the document's ABAC access controls — if you can't see the document, you can't see its summary |
| Extracted metadata reveals sensitive information | Metadata fields (patient names, compensation figures, legal exposure) visible to users without document access | Metadata visibility governed by the same ABAC policies as the document itself; sensitive metadata fields can be independently restricted |
| Knowledge search returns restricted content | A user queries the KnowledgeBase and receives answers drawn from documents they shouldn't be able to access | FormKiQ's KnowledgeBase respects document-level access controls — search results filtered based on the querying user's permissions |
| AI processing creates copies outside the governance model | Document content extracted by AI and stored in an ungoverned location | AI outputs stored as metadata on the governed document record — no separate, ungoverned copies of content |
A user who cannot open an HR investigation file should not be able to ask a KnowledgeBase query to summarise the case, list the people involved, or extract sensitive findings. In FormKiQ, the KnowledgeBase query respects the same ABAC boundaries as direct document access — the investigation file is invisible to unauthorised users regardless of whether they access it directly or through AI.
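The underlying check is the same whether access is direct or AI-mediated: a result is visible only if the user holds every attribute the document requires. A minimal sketch of that attribute-based filter, illustrative rather than FormKiQ's actual policy engine:

```python
def filter_results(results: list, user_attrs: set) -> list:
    """Drop any result whose required attributes the user lacks, so AI
    answers can never surface documents the user cannot open.
    A document with no required attributes is visible to everyone."""
    return [r for r in results
            if set(r.get("required_attrs", [])) <= set(user_attrs)]
```

Applying this filter to search results, summaries, and KnowledgeBase answers alike is what keeps AI from becoming a side door around the permission model.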
11. Choose Use Cases That Build Confidence
Cautious organisations do not need to begin with the most complex or sensitive AI use case. The goal is to build institutional trust through controlled success.
| Use Case Risk Level | Characteristics | Examples |
|---|---|---|
| Lower risk — good starting points | Clear inputs and outputs; human review during pilot; low downside if AI is wrong; measurable time savings; strong auditability | Draft summaries for internal review; suggested document classification; extraction of obvious fields from structured forms; identification of missing metadata; search enhancement; routing suggestions for incoming documents |
| Moderate risk — second phase | More complex documents; outputs influence workflows; may affect external parties; requires validation rules alongside AI | Contract metadata extraction with review; invoice processing with three-way match validation; application completeness checking; correspondence classification and routing |
| Higher risk — requires mature controls | Outputs drive significant decisions; legal, financial, or regulatory consequences; complex document types; requires confidence thresholds, exception handling, and ongoing quality monitoring | Automated retention categorisation; obligation extraction with milestone tracking; compliance analysis; risk flagging for contracts or regulatory submissions |
| Highest risk — requires strongest controls | Outcomes affect individuals' rights, legal proceedings, or regulatory standing; errors may not be easily reversible | Automated disposition decisions; legal interpretation; customer-impacting determinations; clinical or safety-critical assessments |
FormKiQ's per-document-type AI configuration supports this graduated approach. An organisation can enable AI classification for correspondence (lower risk) while keeping AI analysis for contracts (higher risk) in a separate pilot with full human review — both within the same deployment, governed by the same platform.
12. Evaluate Vendors on Control, Not Just Model Capability
Many AI discussions focus on model performance. That matters, but enterprise buyers should also evaluate the surrounding control environment. The right question is not only "How good is the AI?" It is: "Can this AI capability operate inside our governance, security, and compliance framework?"
| Evaluation Category | Questions to Ask | What FormKiQ Provides |
|---|---|---|
| Data handling | Where is content processed? Is it retained by the provider? Is it used for training? Can data residency be met? | Processing in your AWS account; Bedrock does not retain content or use it for training; 20 supported regions with inference region controls |
| Security and access | Does AI respect existing access controls? Are outputs protected? Is data encrypted? | ABAC applies to documents and AI outputs; KMS customer-managed encryption at rest and in transit; outputs inherit document access policies |
| Governance | Can outputs be reviewed before acceptance? Can users correct AI values? Is there an audit trail? | Confidence-based routing to review queues; reviewer corrections tracked; complete audit trail for every AI processing event |
| Operations | Can AI tasks be monitored? Can failures be handled? Can models be changed? Can use cases be expanded incrementally? | CloudWatch monitoring; exception queues; configurable model selection; per-document-type AI configuration |
| Compliance | Can the deployment align with records, retention, and privacy policies? Can processing history be reconstructed? | AI outputs governed as document metadata; retention policies apply to outputs; full processing history in your CloudTrail and FormKiQ audit trail |
Two vendors may both offer contract metadata extraction. One requires uploading contracts to a shared SaaS environment with limited audit detail. The other — FormKiQ — processes contracts within the customer's own AWS account, stores prompts and outputs as auditable records, respects document-level ABAC for all AI outputs, and routes low-confidence extractions to human review before metadata becomes authoritative. The demo output may look similar, but the control posture is fundamentally different.
13. Build a Phased Roadmap
A controlled AI document processing strategy develops in phases. Each phase builds on the confidence and infrastructure established in the previous one.
| Phase | Focus | Key Activities | FormKiQ Support |
|---|---|---|---|
| Phase 1: Assess and prioritise | Identify where AI can reduce manual effort or improve consistency | Rank use cases by business value, data sensitivity, and risk; select one or two where value is clear and risk is manageable | Pilot deployment on FormKiQ Core or Essentials; document collection and metadata schema design |
| Phase 2: Establish control requirements | Define the governance framework for AI processing | Set requirements for data residency, access control, audit logging, human review, metadata governance, and retention | FormKiQ Advanced deployment with ABAC, KMS encryption, audit trail, and workflow configuration |
| Phase 3: Pilot a narrow workflow | Test AI on a controlled document set with full human review | Measure accuracy, user acceptance, review effort, and operational fit; refine prompts and confidence thresholds | AI Processing and Analysis module configured for selected document types; review queues; accuracy tracking |
| Phase 4: Move to governed production | Formalise the AI workflow as an operational process | Define approved prompts, validation rules, review requirements, exception handling, sampling, and audit logging | Production workflow configuration with confidence thresholds, exception queues, validation rules, and sampling |
| Phase 5: Expand carefully | Add document types, departments, and AI tasks | Extend proven patterns to new use cases; maintain the governance framework while broadening scope | Additional document type configurations; new AI capabilities (summarisation, analysis); broader user access |
An organisation might begin by piloting AI classification on incoming correspondence (Phase 3), move to governed production with confidence-based routing (Phase 4), then extend the pattern to contract metadata extraction, invoice processing, and policy gap analysis (Phase 5) — each using the same governance infrastructure with use-case-specific configuration.
14. Practical Checklist for Cautious AI Adoption
Use this checklist before approving an AI document processing initiative:
| Category | Requirement |
|---|---|
| Business fit | Use case is clearly defined with measurable success criteria |
| Business fit | Process owner is identified and engaged |
| Data control | Document types and sensitivity levels are documented |
| Data control | Data residency requirements are understood and the AI architecture satisfies them |
| Data control | Data movement during AI processing is minimised and documented |
| Security | Access controls apply to both documents and AI outputs |
| Security | All data is encrypted at rest (KMS) and in transit (TLS) |
| Security | Administrative access to AI configuration is controlled and auditable |
| Governance | AI outputs are classified as temporary, reviewed, or governed |
| Governance | Human review is required during the pilot phase |
| Governance | Production review rules are defined (confidence thresholds, sampling, exceptions) |
| Governance | Corrections and approvals are tracked in the audit trail |
| Auditability | AI processing events are logged (model, prompt, input, output) |
| Auditability | Review and approval history is retained |
| Auditability | The organisation can reconstruct the processing history for any document |
| Operations | Failures and exceptions have defined handling procedures |
| Operations | Costs can be monitored by process and workload |
| Operations | The process can scale beyond the pilot |
| Operations | Validation rules and quality review processes are defined |
15. What Good Looks Like
A mature AI document processing environment allows an organisation to say:
| Maturity Indicator | What It Means in Practice |
|---|---|
| We know which documents are processed by AI | AI processing is configured per document type — not applied uniformly to everything |
| We know where the data goes | Processing occurs within our AWS account, in the region we selected, with no external data movement |
| We know which AI services and configurations are used | Model selection, prompt templates, and confidence thresholds are documented and versioned |
| We know who can access documents, outputs, and metadata | ABAC policies govern both documents and AI outputs; access is auditable |
| We know which outputs require human review | Confidence thresholds, document sensitivity, and output type determine the review model |
| We can audit prompts, outputs, approvals, and corrections | Complete processing history in CloudTrail, Bedrock logs, and FormKiQ audit trail |
| We can align AI with records, retention, and compliance | AI outputs are governed metadata — subject to the same retention, legal hold, and disposition rules as all other document metadata |
| We can expand without rebuilding governance | New AI use cases use the same infrastructure, the same governance model, and the same audit trail |
How FormKiQ Supports Controlled AI Document Processing
FormKiQ provides the architecture described throughout this guide — AI document processing within a governed document management platform, deployed into your own AWS account.
The AI Processing Stack
| Layer | Technology | What It Does | Availability |
|---|---|---|---|
| OCR | Tesseract | Extracts raw text from scanned documents and images | All editions |
| Structured extraction | Amazon Textract | Extracts tables, form fields, key-value pairs with layout understanding | Essentials+ |
| AI classification | Amazon Bedrock | Identifies document type and applies classification metadata | Advanced/Enterprise |
| AI extraction | Amazon Bedrock | Extracts entities from unstructured text as structured metadata | Advanced/Enterprise |
| AI sensitivity | Amazon Bedrock | Identifies documents containing PII, PHI, financial data, or privileged content | Advanced/Enterprise |
| AI summarisation | Amazon Bedrock | Generates concise summaries for triage, review, and discovery | Advanced/Enterprise |
| AI analysis | Amazon Bedrock | Analyses documents against criteria — compliance, risk, completeness, obligations | Advanced/Enterprise |
Key Architectural Principles
- Processing in your account — every AI processing step runs within your AWS account through Amazon Bedrock; documents never leave your environment
- Inference region controls — you specify which AWS region is used for AI processing, ensuring content stays within your data residency boundary
- No model training on your data — Amazon Bedrock does not use customer data for model training
- Outputs as governed metadata — AI outputs become structured metadata on the document record, subject to the same ABAC, retention, legal hold, and audit trail as all other metadata
- Confidence-based routing — low-confidence AI outputs are routed to human review queues; high-confidence outputs are accepted automatically based on configurable thresholds
- Selective processing — AI capabilities are configured per document type, per workflow, and per site in multi-tenant deployments
- Complete audit trail — every AI processing event logged in your CloudTrail, Bedrock invocation logs, and FormKiQ document audit trail
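The confidence-based routing principle above can be sketched as simple threshold logic. This is an illustrative sketch only, not FormKiQ's API — the threshold value, class, and field names are assumptions:

```python
from dataclasses import dataclass

# Assumed threshold; in practice this would be configurable per document type.
ACCEPT_THRESHOLD = 0.90

@dataclass
class AiOutput:
    field: str         # e.g. "invoice_number" (hypothetical metadata key)
    value: str
    confidence: float  # 0.0-1.0 score returned with the AI output

def route(output: AiOutput) -> str:
    """Accept high-confidence outputs automatically; queue the rest for review."""
    return "auto-accept" if output.confidence >= ACCEPT_THRESHOLD else "review-queue"

print(route(AiOutput("invoice_number", "INV-1042", 0.97)))  # auto-accept
print(route(AiOutput("po_number", "PO-10?", 0.41)))         # review-queue
```

In a real deployment the "review-queue" branch would enqueue the item into a workflow queue rather than return a string; the point is that acceptance is a policy decision driven by the score, not a property of the model itself.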
FormKiQ Editions
| Capability | Core | Essentials | Advanced | Enterprise |
|---|---|---|---|---|
| OCR — Tesseract | ✓ | ✓ | ✓ | ✓ |
| OCR & Structured Extraction — Textract | | ✓ | ✓ | ✓ |
| Custom Extraction Mappings | | ✓ | ✓ | ✓ |
| AI Classification (Bedrock) | | | ✓ | ✓ |
| AI Entity Extraction (Bedrock) | | | ✓ | ✓ |
| AI Sensitivity Classification (Bedrock) | | | ✓ | ✓ |
| AI Summarisation (Bedrock) | | | ✓ | ✓ |
| AI Document Analysis (Bedrock) | | | ✓ | ✓ |
| Inference Region Controls | | | ✓ | ✓ |
| Workflows, Queues & Rulesets | | ✓ | ✓ | ✓ |
| Encryption (KMS — in-transit & at-rest) | | ✓ | ✓ | ✓ |
| ABAC Access Controls | | ✓ | ✓ | ✓ |
| Multi-Instance & Multi-Region Licensing | | | ✓ | ✓ |
| Vendor-Managed & Hybrid Deployment | | | | ✓ |
| Support | Community | 2-business-day SLA | Private Slack + 40 hrs onboarding | 8-business-hour SLA + strategic support |
Deployment Models
| Model | Description | Availability |
|---|---|---|
| Customer-Managed AWS | Deploys directly into your AWS account via CloudFormation. Full control of infrastructure, networking, encryption keys, and operations. | All editions |
| Vendor-Managed | FormKiQ manages the AWS infrastructure on your behalf — deployment, updates, and operational support. | Enterprise |
| Hybrid | You retain control of specific components (encryption keys, network config) while delegating operational management to FormKiQ. | Enterprise |
Every deployment is a dedicated, isolated instance. FormKiQ does not operate a shared multi-tenant environment.
Getting Started
FormKiQ Core — including Tesseract OCR — can be deployed to your AWS account in fifteen to twenty minutes. Amazon Textract integration is available from Essentials onward. AI-powered classification, extraction, sensitivity detection, summarisation, and analysis are available on Advanced and Enterprise.
For organisations evaluating AI document processing on AWS, FormKiQ offers a Proof-of-Value program — a three-month deployment in a FormKiQ-managed AWS environment that provides full platform access in a non-production setting.
Frequently Asked Questions
Does document content leave my AWS account during AI processing?
No. Every layer of FormKiQ's AI processing — Tesseract OCR, Amazon Textract, Amazon Bedrock — runs within your own AWS account. Documents are never sent to external services. Inference region controls for Bedrock specify which AWS region is used for processing, ensuring content stays within your data residency boundary.
How does FormKiQ handle AI outputs that are wrong?
Every AI output includes a confidence score. Low-confidence outputs are routed to a human review queue for verification before metadata is finalised. Reviewers can accept, correct, or reject AI-generated values, and all corrections are tracked in the audit trail. High-confidence outputs that are later found to be incorrect can be corrected at any time, with the correction history preserved.
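The accept / correct / reject review model described above can be illustrated as an append-only audit record. The action names and record structure here are assumptions for illustration, not FormKiQ's audit schema:

```python
# Append-only log: review decisions are added, never overwritten.
audit_trail = []

def review(doc_id: str, field: str, ai_value: str, action: str,
           corrected: str = None, reviewer: str = "unknown"):
    """Record an accept / correct / reject decision and return the final value."""
    if action not in ("accept", "correct", "reject"):
        raise ValueError(f"unknown action: {action}")
    audit_trail.append({"doc": doc_id, "field": field, "ai_value": ai_value,
                        "action": action, "corrected": corrected, "by": reviewer})
    # The finalised metadata value reflects the reviewer's decision.
    if action == "correct":
        return corrected
    return ai_value if action == "accept" else None

final = review("doc-123", "vendor_name", "ACNE Corp", "correct",
               corrected="ACME Corp", reviewer="jsmith")
print(final)             # ACME Corp
print(len(audit_trail))  # 1
```

Because corrections are appended rather than applied in place, both the original AI value and the reviewer's change survive in the history — which is what makes later reconstruction possible.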
Can I start with basic OCR and add AI capabilities later?
Yes. FormKiQ's AI capabilities are layered and can be enabled independently. Deploy Core with Tesseract OCR, upgrade to Essentials for Textract structured extraction, and add AI classification, extraction, and analysis on Advanced — all within the same deployment. Each layer builds on the previous without requiring migration.
How does AI processing align with GDPR and data protection requirements?
FormKiQ's AI processing through Bedrock runs within your AWS account with inference region controls that keep personal data within your selected region. AI processing can be configured per document type and purpose, supporting GDPR's purpose limitation principle. AI outputs include confidence scores and can be routed to human review, supporting Article 22 transparency requirements for automated decision-making. Personal data is not shared with external services or used for model training.
Can AI outputs be placed under legal hold?
Yes. AI outputs are stored as metadata on the document record. When a document is placed under legal hold, the entire record — including AI-generated metadata — is protected from modification or deletion. The AI processing history (what was processed, what was produced, what was reviewed) is part of the audit trail and is preserved alongside the document.
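The legal-hold behaviour described above amounts to a mutation guard on the whole record. A minimal sketch, assuming a hypothetical record model (the class and field names are illustrative, not FormKiQ's schema):

```python
class DocumentRecord:
    """Hypothetical record: AI outputs live in the same metadata map as everything else."""
    def __init__(self):
        self.metadata = {}      # includes AI-generated metadata
        self.legal_hold = False

    def set_metadata(self, key: str, value: str) -> None:
        # Mutations are blocked while the record is under legal hold.
        if self.legal_hold:
            raise PermissionError("record is under legal hold")
        self.metadata[key] = value

doc = DocumentRecord()
doc.set_metadata("ai_summary", "Quarterly supplier agreement, auto-renewing.")
doc.legal_hold = True
try:
    doc.set_metadata("ai_summary", "edited")   # rejected: hold protects AI metadata too
except PermissionError as exc:
    print(exc)  # record is under legal hold
```

Because AI outputs are ordinary metadata on the record, the hold protects them automatically — there is no separate hold mechanism to configure for AI-generated values.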
How does FormKiQ prevent AI from exposing restricted documents?
AI outputs inherit the document's ABAC access controls. If a user cannot access a document, they cannot see its AI-generated summary, extracted metadata, or answers derived from it through KnowledgeBase queries. The access control model applies uniformly to documents and their AI-generated metadata — there is no separate AI output layer that bypasses document-level permissions.
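The inheritance described above means a single access check gates both the document and everything AI derived from it. An illustrative sketch — the attribute names and sensitivity levels are assumptions, not FormKiQ's ABAC policy model:

```python
# Ordered sensitivity levels (assumed for illustration).
LEVELS = ["public", "internal", "restricted"]

def can_access(user: dict, doc: dict) -> bool:
    """Grant access only when the user's clearance covers the document's sensitivity."""
    return LEVELS.index(user["clearance"]) >= LEVELS.index(doc["sensitivity"])

doc = {"sensitivity": "restricted", "ai_summary": "Contains privileged terms."}
user = {"clearance": "internal"}

# The same check gates the document AND its AI-generated metadata:
summary = doc["ai_summary"] if can_access(user, doc) else "access denied"
print(summary)  # access denied
```

The design point is that there is one policy evaluation path: the AI summary is never fetched through a route that skips the document-level check, so restricted content cannot leak through search results or KnowledgeBase answers.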