Updated for 2025
This post was originally published in June 2020 and has been updated to reflect the significant changes in cloud computing, serverless architecture, and AI integration over the past five years.
Amazon Web Services (AWS) has become even larger in 2025 than it was five years ago, despite the sunsetting of some services, including WorkDocs. Beyond the traditional challenge of creating servers in EC2 and managing static content in S3, AWS offers over 250 services spanning compute, storage, AI/ML, generative AI, IoT, and industry-specific solutions. New services and features appear regularly, though the pace of non-AI product launches may have slowed in recent months.
AI is where the action is for AWS, much as it is for the other cloud providers. Services like Amazon Bedrock for foundation models and SageMaker's continued evolution into SageMaker AI (the next generation, according to AWS) have fundamentally changed what's possible in the cloud. AWS is no longer just infrastructure; it's a platform for building intelligent applications. In many cases, organizations are not looking simply for an AWS Serverless Stack, but an AWS AI Stack.
There are so many AWS products that Amazon can't fit them onto a single page. Even broken up into categories, many of the 250+ services aren't easy to find. (an old view of AWS Products from 2020)
AWS is too large for any one person to fully understand. There are too many features and services, and while the AWS Well-Architected Framework provides high-level guidance, nearly every service comes with its own best practices and looming pitfalls.
There are usually multiple solutions to each architectural problem, and it is not always easy to know which solution is the best. And today's best choice may not be the best choice in six months, whether that's due to your use case evolving or the continuing progress in cloud computing.
Since 2020, when the first version of this post was written, serverless has evolved from an emerging concept to a production-standard approach used at least in part by 70% of AWS customers (according to Datadog research from 2023).
What started with AWS Lambda and its functions-as-a-service model has expanded into a comprehensive ecosystem of managed, event-driven services.
What many traditional small web applications look like: a pile of servers. Usually overkill (over-resourced), sometimes not enough to keep things running when usage grows. (icon by http://www.freepik.com/)
Serverless, at its core, is not just abstracting servers away so you no longer need to care what server is running your workload; it's about removing operational overhead entirely. Instead of lift-and-shift, organizations are re-architecting around event-driven patterns, microservices, and consumption-based pricing models that scale automatically with demand.
We're not living in a zero-ops world yet, but an architecture built on serverless components and managed services is the closest we've ever come.
Large Language Models have become integrated into business workflows since the first small steps in late 2022, creating new architectural requirements:
LLM-integrated applications often expose their capabilities through APIs, enabling seamless integration with other services. These APIs can use solutions like API Gateway or Lambda Function URLs to connect backend services, support custom domains, and facilitate integration with CloudFront for greater flexibility and streaming support.
This means that organizations must consider not only the choice of model provider and scaling strategy, but also how to manage variable traffic patterns and unpredictable usage spikes. The architecture must also handle a variety of response types from APIs, including streaming and real-time responses, to support LLM workloads.
Traditional architectures weren't designed for these workloads. LLM-integrated applications have variable traffic patterns, require connections with multiple model providers, and need flexible scaling to handle unpredictable demand. This is exactly where serverless excels—and where serverless stacks become essential.
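To make the streaming requirement concrete, here is a minimal sketch of a Lambda Function URL handler that streams tokens to the client as they arrive. It assumes the Node.js runtime with response streaming enabled; `generateTokens` is a hypothetical stand-in for a model provider's streaming API.

```typescript
// Minimal sketch: a Lambda Function URL handler that streams an LLM response
// token-by-token. Requires the Node.js runtime with response streaming
// enabled; `generateTokens` is a hypothetical stand-in for a model
// provider's streaming API (e.g. a Bedrock ConverseStream call).
declare const awslambda: {
  streamifyResponse(
    handler: (event: unknown, responseStream: NodeJS.WritableStream) => Promise<void>
  ): unknown;
};

// Placeholder generator; a real implementation would yield model tokens.
async function* generateTokens(prompt: string): AsyncGenerator<string> {
  for (const token of ["Streaming ", "response ", "for: ", prompt]) {
    yield token;
  }
}

export const handler = awslambda.streamifyResponse(
  async (event: unknown, responseStream: NodeJS.WritableStream) => {
    for await (const token of generateTokens("example prompt")) {
      responseStream.write(token); // flush each token to the client immediately
    }
    responseStream.end();
  }
);
```

Pairing a handler like this with CloudFront, as described above, adds custom domains and edge caching without changing the function itself.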
A serverless stack is a pre-built, Infrastructure-as-Code (IaC) architecture that solves a specific workflow or use case. A full stack boilerplate project can accelerate the deployment of serverless stacks by providing a modular, ready-to-use architecture for building complex AI-powered tools and applications. Instead of manually connecting dozens of services, you deploy a tested, optimized pattern with the compute, API, storage, and security components already wired together.
A serverless stack, in the wild. The FormKiQ Document Platform architectural diagram. (from https://docs.formkiq.com/)
Traditional workflows have become more efficient with mature serverless implementations. Serverless stacks simplify the deployment and management of both the backend and frontend components of an app, making it easier to build, deploy, and scale. Static assets and web frontends can be hosted efficiently with serverless services like S3 and CloudFront, a key consideration for keeping the app scalable and reliable for its users.
Pre-configured API Gateway with Lambda authorizers, rate limiting, and both caching and a CloudFront distribution if needed. CloudFront can be used to proxy API Gateway requests, enabling custom domain management and seamless API access. Different authentication methods, such as Lambda authorizers or custom JWT-based approaches, can be implemented for API Gateway to enhance security and flexibility, with user credentials or tokens verified as part of the API management workflow. And rather than requiring autoscaling groups or manual capacity tweaks, the API now scales automatically with request volume.
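As one illustration of the JWT-based approach mentioned above, here is a minimal sketch of a Lambda authorizer for an HTTP API using the simple response format. It assumes the `jsonwebtoken` package and a shared-secret setup; production deployments would more likely validate against a JWKS endpoint or use the built-in JWT authorizer.

```typescript
// Minimal sketch of a Lambda authorizer for API Gateway (HTTP API, "simple"
// response format). Assumes the `jsonwebtoken` package and a JWT_SECRET
// environment variable; validating against a JWKS endpoint is more common
// in production.
import { verify } from "jsonwebtoken";

interface AuthorizerEvent {
  headers?: Record<string, string | undefined>;
}

export const handler = async (event: AuthorizerEvent) => {
  // HTTP APIs lowercase incoming header names.
  const token = event.headers?.authorization?.replace(/^Bearer\s+/i, "");
  try {
    const claims = verify(token ?? "", process.env.JWT_SECRET!);
    // Allow the request and pass verified claims to the backend as context.
    return { isAuthorized: true, context: { claims: JSON.stringify(claims) } };
  } catch {
    return { isAuthorized: false };
  }
};
```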
Serverless stacks combine S3 and Lambda with machine learning services such as Textract and Comprehend to extract and analyze document content. The document processing logic is implemented as a Lambda application, enabling modular and scalable workflows, often with a shared package for cross-cutting concerns like validating JWTs. Depending on the outcome of the workflow, the document processing API returns extracted data, analysis results, or error messages. Organizations process millions of documents without managing infrastructure, using well-configured workflows to keep results consistent and costs manageable.
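A minimal sketch of the S3-to-Textract step of such a pipeline, using AWS SDK v3; the surrounding workflow is an assumption, and multi-page documents would use the asynchronous Textract APIs instead.

```typescript
// Minimal sketch of the S3 -> Lambda -> Textract step of a document
// pipeline, using AWS SDK v3. DetectDocumentText is the synchronous API;
// large or multi-page documents would use StartDocumentTextDetection.
import { S3Event } from "aws-lambda";
import { TextractClient, DetectDocumentTextCommand } from "@aws-sdk/client-textract";

const textract = new TextractClient({});

export const handler = async (event: S3Event): Promise<void> => {
  for (const record of event.Records) {
    // S3 event keys arrive URL-encoded.
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, " "));
    const result = await textract.send(
      new DetectDocumentTextCommand({
        Document: { S3Object: { Bucket: record.s3.bucket.name, Name: key } },
      })
    );
    // Collect detected lines; downstream steps might classify or index them.
    const lines = (result.Blocks ?? [])
      .filter((block) => block.BlockType === "LINE")
      .map((block) => block.Text);
    console.log(`Extracted ${lines.length} lines from ${key}`);
  }
};
```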
Kinesis, Lambda, and DynamoDB work together for streaming data analysis and visualization, handling traffic spikes without capacity planning. Because reliability and performance matter in real-time pipelines, it is important to run tests at each stage of the workflow, verifying data integrity and processing accuracy to maintain trust in your results. The output can be visualized in dashboards or used for further analysis to gain actionable insights, and all of this tooling can be implemented with serverless technologies.
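For illustration, a minimal sketch of the Lambda stage of such a pipeline: decode each Kinesis record and persist it to DynamoDB. The table name and payload shape (`id`, `value`) are hypothetical.

```typescript
// Minimal sketch of the Kinesis -> Lambda -> DynamoDB stage: decode each
// record and persist it for dashboards or further analysis. The table name
// and payload shape (`id`, `value`) are hypothetical.
import { KinesisStreamEvent } from "aws-lambda";
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, PutCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

export const handler = async (event: KinesisStreamEvent): Promise<void> => {
  for (const record of event.Records) {
    // Kinesis record data is base64-encoded.
    const payload = JSON.parse(
      Buffer.from(record.kinesis.data, "base64").toString("utf8")
    );
    await ddb.send(
      new PutCommand({
        TableName: process.env.EVENTS_TABLE!,
        Item: { id: payload.id, value: payload.value, receivedAt: Date.now() },
      })
    );
  }
};
```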
CloudFront, S3, and Lambda@Edge deliver content globally with automatic optimization and security. Hosting static websites and assets is a core use case for these serverless content delivery solutions, ensuring efficient and reliable deployment. Additionally, plugins can be used to automate and enhance the deployment of static content to CloudFront and S3, simplifying workflows and improving functionality. For instance, the FormKiQ website deploys from GitHub to S3 via a GitHub Action, which also invalidates the CloudFront cache.
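As a small example of edge logic, here is a sketch of a Lambda@Edge viewer-response function that adds security headers to content served from S3 through CloudFront; CloudFront response headers policies can achieve the same result without custom code.

```typescript
// Minimal sketch of a Lambda@Edge viewer-response function that adds
// security headers to content served from S3 via CloudFront. (CloudFront
// response headers policies can do the same without custom code.)
import { CloudFrontResponseEvent, CloudFrontResponseResult } from "aws-lambda";

export const handler = async (
  event: CloudFrontResponseEvent
): Promise<CloudFrontResponseResult> => {
  const response = event.Records[0].cf.response;
  response.headers["strict-transport-security"] = [
    { key: "Strict-Transport-Security", value: "max-age=63072000; includeSubDomains" },
  ];
  response.headers["x-content-type-options"] = [
    { key: "X-Content-Type-Options", value: "nosniff" },
  ];
  return response;
};
```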
New capabilities have emerged that benefit from serverless architecture:
Stacks that integrate document processing with Amazon Bedrock to extract, analyze, and generate insights from unstructured content, enabling automated classification, summarization, and metadata extraction.
Systems that combine vector storage, retrieval mechanisms, and LLM inference to answer questions based on organizational knowledge. This includes various approaches to semantic search and context-aware responses that may evolve significantly as best practices emerge.
Processes where LLMs assist with decision-making within structured workflows—not always autonomous AI making decisions independently, but human-defined workflows that leverage language models for classification, routing, and analysis steps. With proper configuration and validation, these non-deterministic AI models can provide efficiency gains without introducing new kinds of errors.
Request routing across different models within Amazon Bedrock with fallback handling and cost optimization, allowing organizations to choose the right model for each task (see the sketch after this list).
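A minimal sketch of that routing idea, using the Bedrock Converse API from AWS SDK v3. The model IDs and their ordering are illustrative; a production router might also weigh cost, latency, and task type per request.

```typescript
// Minimal sketch of model routing with fallback on Amazon Bedrock via the
// SDK v3 Converse API. Model IDs and ordering are illustrative; a real
// router might also weigh cost, latency, and task type per request.
import { BedrockRuntimeClient, ConverseCommand } from "@aws-sdk/client-bedrock-runtime";

const bedrock = new BedrockRuntimeClient({});
const MODEL_IDS = [
  "anthropic.claude-3-5-sonnet-20240620-v1:0",
  "amazon.titan-text-express-v1",
];

export async function converseWithFallback(prompt: string): Promise<string> {
  for (const modelId of MODEL_IDS) {
    try {
      const result = await bedrock.send(
        new ConverseCommand({
          modelId,
          messages: [{ role: "user", content: [{ text: prompt }] }],
        })
      );
      return result.output?.message?.content?.[0]?.text ?? "";
    } catch (err) {
      console.warn(`Model ${modelId} failed; trying the next one`, err);
    }
  }
  throw new Error("All configured models failed");
}
```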
Serverless pricing has become even more competitive since 2020.
The key to realizing these cost benefits is proper architecture. Organizations that lift-and-shift traditional applications into the cloud and add serverless components around them often find costs higher than expected. The real savings come from re-architecting to take advantage of serverless patterns: event-driven processing, consumption-based scaling, and eliminating idle resources.
When properly architected, serverless stacks that include cost monitoring, intelligent caching, and efficient service integration can reduce operational costs by 60-80% compared to traditional server-based approaches—while simultaneously improving scalability and reliability.
While write-ups occasionally appear about moving workloads back to on-premises servers, the details often reveal architectures that could have been better optimized for serverless and managed services, rather than a genuine case for retreating from the cloud entirely.
Modern serverless stacks are built using tools like AWS CDK, Terraform, or the Serverless Framework. This means the entire architecture is captured in version-controlled templates that can be reviewed, reused, and deployed consistently.
FormKiQ stacks deploy via CloudFormation, integrating seamlessly with these IaC approaches. For LLM-integrated workloads specifically, IaC allows you to rapidly experiment with different model combinations, adjust token limits, and optimize costs without manual configuration.
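For example, here is a minimal AWS CDK (v2) sketch of a Lambda-backed REST endpoint, where the model ID, memory, and timeout live in version control; the asset path and environment variable are illustrative.

```typescript
// Minimal AWS CDK (v2) sketch: a Lambda-backed REST API defined as code, so
// model IDs, memory, and timeouts live in version control instead of the
// console. The asset path and environment variable are illustrative.
import { Stack, StackProps, Duration } from "aws-cdk-lib";
import { Construct } from "constructs";
import * as lambda from "aws-cdk-lib/aws-lambda";
import * as apigateway from "aws-cdk-lib/aws-apigateway";

export class LlmApiStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const inferenceFn = new lambda.Function(this, "InferenceFn", {
      runtime: lambda.Runtime.NODEJS_20_X,
      handler: "index.handler",
      code: lambda.Code.fromAsset("dist/inference"),
      timeout: Duration.seconds(30),
      environment: { MODEL_ID: "anthropic.claude-3-5-sonnet-20240620-v1:0" },
    });

    // Fronts the function with API Gateway, proxying all routes to it.
    new apigateway.LambdaRestApi(this, "Api", { handler: inferenceFn });
  }
}
```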
Security and authentication are foundational pillars of any robust AWS Serverless or AWS AI Stack, especially when building serverless AI applications that interact with sensitive data and external model providers. By leveraging AWS Lambda and API Gateway, developers can implement secure authentication and authorization workflows that keep each app’s data separate and protected within a trusted AWS foundation.
For example, the application authentication process typically begins with verifying user credentials and issuing JSON Web Tokens (JWTs), which are then used to secure API requests throughout the application. API Gateway acts as the entry point for all web and API traffic, integrating seamlessly with Lambda functions to enforce authentication and authorization logic. This ensures that only authorized users can access specific resources, keeping your stack’s data isolated from unauthorized access, including by external model providers. While machine-to-machine connectivity might use trusted IAM access or API keys, the workflow is similar.
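A minimal sketch of the token-issuing side of that flow; `checkCredentials` is a hypothetical stand-in for your user store, and the shared-secret setup mirrors the authorizer sketch earlier.

```typescript
// Minimal sketch of the token-issuing side of the flow: verify credentials,
// then sign a short-lived JWT for subsequent API requests. `checkCredentials`
// is a hypothetical stand-in for your user store.
import { sign } from "jsonwebtoken";

declare function checkCredentials(
  username: string,
  password: string
): Promise<{ sub: string } | null>;

export async function login(username: string, password: string): Promise<string> {
  const identity = await checkCredentials(username, password);
  if (!identity) {
    throw new Error("Invalid credentials");
  }
  // Short-lived token; the authorizer sketched earlier verifies it per request.
  return sign({ sub: identity.sub }, process.env.JWT_SECRET!, { expiresIn: "1h" });
}
```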
To further enhance security, the AWS Serverless or AWS AI Stack supports custom domain names, which can be configured using AWS Certificate Manager (ACM) and Route53. This not only provides a professional web presence but also enables secure HTTPS communication for all API requests. A domain-oriented architecture allows developers to modularly add or customize security features in a way that suits the structure and branding of the organization, making it easy to adapt to evolving requirements or integrate with other tools.
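In CDK terms, wiring up such a custom domain might look like the following sketch; the domain names are illustrative, and the hosted zone is assumed to already exist.

```typescript
// Minimal CDK (v2) sketch: HTTPS on a custom API domain via ACM and Route53.
// The domain names are illustrative, and the hosted zone is assumed to exist.
import type { Stack } from "aws-cdk-lib";
import * as acm from "aws-cdk-lib/aws-certificatemanager";
import * as route53 from "aws-cdk-lib/aws-route53";
import * as targets from "aws-cdk-lib/aws-route53-targets";
import type * as apigateway from "aws-cdk-lib/aws-apigateway";

export function addCustomDomain(stack: Stack, api: apigateway.RestApi): void {
  const zone = route53.HostedZone.fromLookup(stack, "Zone", {
    domainName: "example.com",
  });
  // DNS-validated certificate for the API subdomain.
  const cert = new acm.Certificate(stack, "ApiCert", {
    domainName: "api.example.com",
    validation: acm.CertificateValidation.fromDns(zone),
  });
  const domain = api.addDomainName("ApiDomain", {
    domainName: "api.example.com",
    certificate: cert,
  });
  // Alias record pointing api.example.com at the API Gateway domain.
  new route53.ARecord(stack, "ApiAlias", {
    zone,
    recordName: "api",
    target: route53.RecordTarget.fromAlias(new targets.ApiGatewayDomain(domain)),
  });
}
```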
Infrastructure access management is handled through AWS IAM, giving developers fine-grained control over who can access which resources deployed within the stack. By defining roles and permissions, you can ensure that users and services only have the access they need—nothing more. This approach, combined with the event-driven and serverless nature of the stack, provides a secure, scalable, and flexible foundation for modern scalable architectures and AI apps.
Deploying an AWS Serverless or AI Stack is streamlined and efficient, thanks to IaC, using tools such as CloudFormation, CDK, Terraform, or the Serverless Framework. By using this approach, a blueprint for the stack’s configuration and resources is available; with a single command, developers can deploy all serverless services in the stack. This approach eliminates manual setup and ensures consistency across deployments.
The deployment process is further enhanced by integrating CI/CD tools like GitHub Actions. By setting up a workflow that triggers on code changes, developers can automate the entire pipeline: running tests, building the application, and deploying it directly to the AWS account. This not only accelerates development cycles but also reduces the risk of human error during deployment.
For teams with more complex requirements, other tools such as AWS CodePipeline and AWS CodeBuild can be used to create sophisticated CI/CD pipelines. These tools allow you to automate deployments across multiple environments—dev, staging, and production—ensuring that each environment is up-to-date and consistent with your latest codebase.
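As one option, here is a minimal sketch of a self-mutating CDK Pipeline (CodePipeline and CodeBuild under the hood) that redeploys on every push; the repository name and branch are illustrative, and GitHub access (e.g. a token in Secrets Manager) is assumed to be configured.

```typescript
// Minimal sketch of a self-mutating CDK Pipeline (CodePipeline + CodeBuild
// under the hood) that redeploys on every push. The repository and branch
// are illustrative; GitHub access (e.g. a token in Secrets Manager) is assumed.
import { Stack, StackProps } from "aws-cdk-lib";
import { Construct } from "constructs";
import { CodePipeline, CodePipelineSource, ShellStep } from "aws-cdk-lib/pipelines";

export class DeployPipelineStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    new CodePipeline(this, "Pipeline", {
      synth: new ShellStep("Synth", {
        input: CodePipelineSource.gitHub("my-org/my-app", "main"),
        commands: ["npm ci", "npm run build", "npx cdk synth"],
      }),
      // Stages for dev, staging, and production would be added here
      // with pipeline.addStage(...).
    });
  }
}
```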
By adopting these deployment practices, developers can focus on building features and business logic, confident that their serverless services are reliably and repeatably deployed. The combination of IaC tooling, template and configuration files, and automated workflows provides a modern, scalable approach to managing the full stack lifecycle in AWS.
Ensuring the performance and scalability of your AWS Serverless or AWS AI Stack requires robust monitoring and continuous optimization; unlike more traditional server-based deployments, serverless has more points of orchestration that require observability. AWS provides a suite of tools—such as CloudWatch, X-Ray, and CloudTrail—that give developers deep visibility into their serverless applications. These services allow you to track API requests, monitor Lambda function performance, and audit access to resources, making it easier to identify bottlenecks and optimize your code.
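One lightweight pattern worth noting: custom CloudWatch metrics can be emitted from Lambda via the Embedded Metric Format (EMF), a structured log line that CloudWatch converts into metrics with no extra API calls. The namespace and metric names below are illustrative.

```typescript
// Minimal sketch of a custom CloudWatch metric emitted from Lambda via the
// Embedded Metric Format (EMF): a structured log line that CloudWatch turns
// into a metric, with no extra API calls. Namespace and names are illustrative.
export function logTokenUsage(tokensUsed: number): void {
  console.log(
    JSON.stringify({
      _aws: {
        Timestamp: Date.now(),
        CloudWatchMetrics: [
          {
            Namespace: "LlmApp",
            Dimensions: [["Service"]],
            Metrics: [{ Name: "TokensUsed", Unit: "Count" }],
          },
        ],
      },
      Service: "inference",
      TokensUsed: tokensUsed,
    })
  );
}
```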
For AI applications that process large volumes of data or require real-time feedback, a serverless stack is equipped to support streaming responses. This capability enables your Lambda functions to handle and deliver data incrementally, improving responsiveness and user experience, especially in scenarios involving powerful LLMs or complex data processing.
AWS Lambda’s built-in support for concurrency and parallel processing allows your application to scale automatically, handling multiple requests simultaneously without manual intervention. Combined with the event-driven architecture of a serverless stack, this ensures that your serverless architecture and AI applications remain responsive and cost-effective, even under unpredictable workloads.
By leveraging these monitoring and optimization tools, developers can proactively address performance issues, fine-tune their serverless services, and ensure that their applications deliver a seamless experience to users. This approach is essential for building serverless applications that are not only scalable and efficient but also reliable and cost-effective in the long run.
The core principle remains unchanged, but is more relevant than ever:
Leverage proven patterns instead of building infrastructure from scratch.
Serverless architectures have matured to the point where well-tested patterns exist for most common workflows, and open source frameworks and reference implementations provide battle-tested approaches.
These frameworks demonstrate how to properly architect LLM-integrated applications with error handling, fallback strategies, and cost controls. They show how to configure API Gateway effectively, and how to connect services for document processing pipelines.
Your engineering time is better spent on the features, business logic, and user experience that differentiate your product.
The cloud is powerful and constantly evolving. LLM integration opens new possibilities. Architectural patterns are being established and refined by the community. Rather than rebuilding what others have already solved and documented, you can build on proven foundations.
Serverless stacks let you leverage collective expertise, deploy production-ready patterns quickly, and focus on what creates value for your customers. Whether you're building traditional web applications, processing documents at scale, or integrating language models into your workflows, serverless stacks provide tested architectural foundations, allowing you to focus your innovation where it matters most.
FormKiQ builds and maintains serverless document management stacks that integrate seamlessly with modern AI services, helping organizations focus on leveraging best-in-class document intelligence, rather than infrastructure management.
Contact us for more information on document processing, workflow automation, and information management.
Get started with FormKiQ through our core offering, which includes all core functionality and is free forever: Install Now

Get started with FormKiQ with a Proof-of-Value or Production Deployment of FormKiQ Essentials: Start Now

Find out how FormKiQ can help you build your Perfect Document Management or Enterprise Content Management System: Contact Us