AWS Lambda vs ECS: Serverless Functions vs Containers
Compare AWS Lambda and ECS Fargate across cold starts, pricing, scaling, and deployment to choose the right compute model for your workload.
Choosing between AWS Lambda and ECS is one of the most consequential infrastructure decisions in an AWS-based architecture. Both are "serverless" in the sense that you never manage EC2 instances directly (when using Fargate), but their execution models, cost structures, and operational characteristics are fundamentally different. Picking the wrong one leads to either overpaying for idle resources or fighting against platform constraints.
TL;DR
AWS Lambda runs stateless functions triggered by events, scales to zero automatically, and bills per invocation at millisecond granularity. ECS (with Fargate) runs Docker containers as persistent services with full control over runtime, networking, and resource allocation, billed per-second for provisioned compute. Lambda fits event-driven, bursty, short-lived workloads. ECS fits long-running services, steady traffic, and workloads needing custom runtimes or persistent connections. Most production architectures use both, routing each workload to the compute model that matches its characteristics.
Why This Matters
The difference between Lambda and ECS is not just a deployment choice — it shapes how you design your application code, handle state, manage connections, structure your CI/CD pipelines, and reason about costs. A REST API that handles a few thousand requests per day has very different economics on Lambda versus ECS. A WebSocket server that maintains persistent connections cannot run on Lambda at all. A data pipeline that processes files in bursts is wasteful on always-running containers.
Understanding the execution model of each service prevents you from fighting the platform. Teams that deploy a monolithic Express server to Lambda spend months working around timeout limits and cold starts. Teams that run simple webhook handlers on ECS pay for idle containers sitting at near-zero utilization. The goal is to match each workload to the compute primitive that fits its access pattern.
How It Works
Execution Model Differences
Lambda and ECS differ fundamentally in how they execute your code:
Lambda creates an execution environment on demand. When a request arrives and no warm environment exists, AWS initializes a new one (the cold start), runs your handler function, and then freezes the environment for potential reuse. Each environment handles exactly one request at a time; concurrent requests are served by additional environments. The environment is destroyed after a period of inactivity.
ECS Fargate runs your Docker container as a long-lived process. The container starts, runs continuously, and handles requests through whatever server framework you have inside it. You control how many container instances (tasks) run, and each task can handle multiple concurrent requests.
```typescript
// Lambda: stateless handler function
// Each invocation gets its own isolated context
import { APIGatewayProxyHandler } from "aws-lambda";

// Code at module scope runs once per cold start.
// Do NOT rely on it for persistent state across environments.
export const handler: APIGatewayProxyHandler = async (event) => {
  const body = JSON.parse(event.body || "{}");
  const result = await processRequest(body); // application logic defined elsewhere

  return {
    statusCode: 200,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(result),
  };
};
```

```typescript
// ECS: long-running server process
// Container stays alive, handles many requests
import express from "express";

const app = express();
app.use(express.json()); // parse JSON bodies so req.body is populated

// Shared state persists across requests within this container
type CacheEntry = { value: unknown; expiresAt: number };
const cache = new Map<string, CacheEntry>();

app.post("/process", async (req, res) => {
  const result = await processRequest(req.body); // application logic defined elsewhere
  res.json(result);
});

app.get("/health", (req, res) => {
  res.status(200).json({ status: "healthy" });
});

app.listen(8080, () => {
  console.log("Service running on port 8080");
});
```

Cold Starts and Latency
Cold starts are Lambda's most discussed tradeoff. When AWS needs to create a new execution environment, it must download your code, start the runtime, and run your initialization logic before handling the first request.
Cold start duration depends on several factors:
- Runtime: Node.js and Python typically initialize in under 100ms; Java and .NET can take several seconds
- Package size: Larger deployment packages take longer to download and extract
- VPC configuration: Functions inside a VPC previously added significant cold start latency, though AWS has reduced this substantially with Hyperplane ENIs
- Memory allocation: Higher memory settings allocate proportionally more CPU, speeding up initialization
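Initialization cost can also be amortized by doing expensive setup at module scope, once per cold start, rather than inside the handler. Here is a minimal sketch of that pattern; `loadConfig` is a hypothetical stand-in for real setup work such as fetching secrets or creating an SDK client:

```typescript
type Config = { tableName: string };

// Module scope survives across warm invocations within one environment.
let configPromise: Promise<Config> | null = null;

// Hypothetical expensive setup — in practice this might call Secrets
// Manager, parse configuration, or initialize an SDK client.
async function loadConfig(): Promise<Config> {
  return { tableName: "example-table" };
}

function getConfig(): Promise<Config> {
  // Memoize the promise so the load happens at most once per environment
  if (!configPromise) configPromise = loadConfig();
  return configPromise;
}

export const handler = async (event: { id: string }) => {
  const config = await getConfig(); // resolves instantly on warm starts
  return {
    statusCode: 200,
    body: JSON.stringify({ table: config.tableName, id: event.id }),
  };
};
```

Only the first invocation in a fresh environment pays the setup cost; every warm invocation reuses the memoized result.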
Provisioned concurrency pre-warms a specified number of execution environments, eliminating cold starts for those slots:
```yaml
# serverless.yml — provisioned concurrency configuration
functions:
  api:
    handler: src/handler.main
    memorySize: 1024
    timeout: 29
    provisionedConcurrency: 10
    events:
      - http:
          path: /api/{proxy+}
          method: ANY
```

ECS containers do not have cold starts in the same sense. The container is already running and accepting requests. However, ECS has its own startup latency when scaling out: launching a new Fargate task takes roughly 30 to 60 seconds as AWS provisions the underlying infrastructure, pulls the container image, and runs your startup logic. This is why ECS relies on scaling ahead of demand rather than reacting instantly.
Pricing Comparison
The cost models are structurally different, and which is cheaper depends entirely on your utilization pattern.
| Dimension | Lambda | ECS Fargate |
|---|---|---|
| Billing unit | Per request + per ms of compute | Per second of provisioned vCPU + memory |
| Idle cost | Zero (scales to zero) | You pay for running tasks even at zero load |
| Free tier | 1M requests + 400,000 GB-seconds/month | None |
| Minimum charge | 1ms per invocation | 1 minute per task |
| Network | Standard AWS data transfer | Standard AWS data transfer |
| Storage | 512MB–10GB ephemeral /tmp | EFS or EBS volumes, configurable |
For a sustained workload of 100 requests per second with 200ms average duration at 1GB memory, Lambda's per-millisecond billing adds up to several times the cost of a Fargate deployment sized for the same load. At low utilization the picture flips: Lambda is cheaper because idle time costs nothing, while Fargate charges for provisioned compute regardless of whether a request is actively executing.
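A back-of-the-envelope model makes the break-even intuition concrete. The rates below are assumptions based on published us-east-1 pricing and will drift over time, so treat this as a sketch for comparing your own workloads rather than an authoritative calculator:

```typescript
// Assumed us-east-1 rates — plug in current pricing before relying on this.
const LAMBDA_PER_REQUEST = 0.20 / 1_000_000; // $ per request
const LAMBDA_PER_GB_SECOND = 0.0000166667;   // $ per GB-second of compute
const FARGATE_PER_VCPU_HOUR = 0.04048;       // $ per vCPU-hour
const FARGATE_PER_GB_HOUR = 0.004445;        // $ per GB-hour of memory

const HOURS_PER_MONTH = 730;
const SECONDS_PER_MONTH = HOURS_PER_MONTH * 3600;

function lambdaMonthlyCost(rps: number, avgMs: number, memoryGb: number): number {
  const requests = rps * SECONDS_PER_MONTH;
  const gbSeconds = requests * (avgMs / 1000) * memoryGb;
  return requests * LAMBDA_PER_REQUEST + gbSeconds * LAMBDA_PER_GB_SECOND;
}

function fargateMonthlyCost(tasks: number, vcpuPerTask: number, memoryGbPerTask: number): number {
  const hourlyPerTask =
    vcpuPerTask * FARGATE_PER_VCPU_HOUR + memoryGbPerTask * FARGATE_PER_GB_HOUR;
  return tasks * hourlyPerTask * HOURS_PER_MONTH;
}

// Bursty workload: 2 rps average, 150ms, 512MB — Lambda is a few dollars a month
console.log(lambdaMonthlyCost(2, 150, 0.5).toFixed(2));
// The same service on two always-on 0.5 vCPU / 1GB tasks costs noticeably more
console.log(fargateMonthlyCost(2, 0.5, 1).toFixed(2));
```

Run the same comparison at 100 rps sustained and the ordering reverses: Lambda's compute charges dwarf a handful of appropriately sized Fargate tasks.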
Scaling Behavior
Lambda scales horizontally by creating more execution environments. Each new concurrent request gets its own environment. Scaling is nearly instant for bursts (subject to account-level concurrency limits, default 1,000 concurrent executions per region). You can request limit increases, and reserved concurrency guarantees capacity for critical functions.
ECS scales by launching more tasks based on CloudWatch metrics or Application Auto Scaling target tracking:
```yaml
# ECS Auto Scaling with target tracking
Resources:
  AutoScalingTarget:
    Type: AWS::ApplicationAutoScaling::ScalableTarget
    Properties:
      MaxCapacity: 20
      MinCapacity: 2
      ResourceId: !Sub "service/${ClusterName}/${ServiceName}"
      ScalableDimension: ecs:service:DesiredCount
      ServiceNamespace: ecs
  ScalingPolicy:
    Type: AWS::ApplicationAutoScaling::ScalingPolicy
    Properties:
      PolicyType: TargetTrackingScaling
      ScalingTargetId: !Ref AutoScalingTarget
      TargetTrackingScalingPolicyConfiguration:
        TargetValue: 70
        PredefinedMetricSpecification:
          PredefinedMetricType: ECSServiceAverageCPUUtilization
        ScaleInCooldown: 300
        ScaleOutCooldown: 60
```

The key difference: Lambda scales per-request with near-zero delay. ECS scales per-task with a delay of 30 to 60 seconds. For traffic that spikes abruptly, Lambda handles it without dropped requests. ECS needs either over-provisioned baseline capacity or predictive scaling.
Deployment Patterns
Lambda deployments typically use the Serverless Framework, AWS SAM, or CDK. Deployments are fast because you are uploading a zip file or container image, not replacing running infrastructure:
```typescript
// CDK: Lambda function deployment
import * as cdk from "aws-cdk-lib";
import * as lambda from "aws-cdk-lib/aws-lambda-nodejs";

const apiFunction = new lambda.NodejsFunction(this, "ApiHandler", {
  entry: "src/handler.ts",
  handler: "main",
  runtime: cdk.aws_lambda.Runtime.NODEJS_20_X,
  memorySize: 1024,
  timeout: cdk.Duration.seconds(29),
  environment: {
    TABLE_NAME: table.tableName,
    STAGE: props.stage,
  },
  bundling: {
    minify: true,
    sourceMap: true,
    externalModules: ["@aws-sdk/*"], // Use Lambda-provided SDK
  },
});
```

ECS deployments use rolling updates, blue/green, or canary strategies. They are slower but give you more control over rollback and traffic shifting:
```typescript
// CDK: ECS Fargate service deployment
import * as ecs from "aws-cdk-lib/aws-ecs";
import * as ecsPatterns from "aws-cdk-lib/aws-ecs-patterns";

const service = new ecsPatterns.ApplicationLoadBalancedFargateService(
  this,
  "ApiService",
  {
    cluster,
    taskImageOptions: {
      image: ecs.ContainerImage.fromEcrRepository(repo, "latest"),
      containerPort: 8080,
      environment: {
        NODE_ENV: "production",
        DB_HOST: dbEndpoint,
      },
    },
    cpu: 512,
    memoryLimitMiB: 1024,
    desiredCount: 2,
    circuitBreaker: { rollback: true },
    deploymentController: {
      type: ecs.DeploymentControllerType.ECS,
    },
  }
);
```

Monitoring Differences
Lambda provides built-in metrics per function: invocation count, duration, error rate, throttles, and concurrent executions. CloudWatch Logs automatically captures stdout/stderr from every invocation. X-Ray tracing can be enabled with a single configuration flag.
ECS monitoring requires more setup. You need to configure log drivers (typically awslogs) to send container output to CloudWatch, set up custom metrics for application-level observability, and configure health checks so the load balancer can route around unhealthy tasks. Container Insights provides CPU and memory utilization at the task and service level, but application-level metrics (request latency, error rates) require instrumentation in your code or a sidecar like the CloudWatch agent.
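As a sketch, wiring the awslogs driver into the Fargate service from the deployment example looks roughly like this in CDK (the stream prefix and construct names are illustrative):

```typescript
import * as ecs from "aws-cdk-lib/aws-ecs";
import * as ecsPatterns from "aws-cdk-lib/aws-ecs-patterns";

// Assumes an existing cluster and ECR repo, as in the deployment example
const service = new ecsPatterns.ApplicationLoadBalancedFargateService(
  this,
  "ApiService",
  {
    cluster,
    taskImageOptions: {
      image: ecs.ContainerImage.fromEcrRepository(repo, "latest"),
      containerPort: 8080,
      // Ship container stdout/stderr to CloudWatch Logs under a shared prefix
      logDriver: ecs.LogDrivers.awsLogs({ streamPrefix: "api" }),
    },
    cpu: 512,
    memoryLimitMiB: 1024,
  }
);
```

With Lambda you get the equivalent log plumbing for free; with ECS it is one of several pieces you wire up yourself.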
Practical Implementation
A hybrid architecture that uses both Lambda and ECS is common in production:
```
API Gateway
├── /api/users/*    → ECS Fargate (steady traffic, database connections)
├── /api/reports/*  → Lambda (bursty, CPU-intensive generation)
└── /api/webhooks/* → Lambda (unpredictable, event-driven)

S3 Bucket Events  → Lambda (file processing triggers)
EventBridge Rules → Lambda (scheduled tasks, cross-service events)
SQS Queues        → Lambda (async job processing)
```
Route workloads to the compute model that matches their profile. Use ECS for your core API that maintains database connection pools and handles steady request volume. Use Lambda for event-driven tasks where traffic is unpredictable and you do not want to pay for idle capacity.
Common Pitfalls
Running a full web framework on Lambda. Deploying Express or Fastify on Lambda works technically (using adapters like serverless-http), but you lose most benefits of the Lambda model. You pay cold start costs to initialize a framework designed for long-running processes. If your application already uses Express, ECS is a more natural fit.
Ignoring Lambda's 15-minute timeout. Any workload that might exceed 15 minutes cannot run on Lambda. Data migrations, large file processing, and ML inference jobs that run longer need ECS or Step Functions to orchestrate chunked execution.
Not managing database connections on Lambda. Each Lambda execution environment opens its own database connection. Under high concurrency, this can exhaust your database connection limit. Use RDS Proxy or a connection pooling layer to prevent this.
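Alongside RDS Proxy, a common mitigation is to create the client at module scope so each execution environment holds one connection rather than opening a new one per invocation. A minimal sketch, using an in-memory stand-in for a real driver such as pg or mysql2:

```typescript
// Sketch: one database client per execution environment, not per invocation.
// DbClient is an illustrative stand-in for a real driver's interface.
type DbClient = { query: (sql: string) => Promise<string> };

let connectionsOpened = 0;

function createClient(): DbClient {
  connectionsOpened += 1; // a real driver would open a TCP connection here
  return { query: async (sql) => `ok: ${sql}` };
}

// Module scope: created lazily once per cold start, reused while warm
let client: DbClient | null = null;

function getClient(): DbClient {
  if (!client) client = createClient();
  return client;
}

export const handler = async () => {
  const row = await getClient().query("SELECT 1");
  return { statusCode: 200, body: row };
};
```

Note that this caps connections at one per environment, not one total: at 1,000 concurrent executions you still hold 1,000 connections, which is exactly the situation RDS Proxy exists to absorb.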
Over-provisioning ECS for variable traffic. Running a fixed number of ECS tasks sized for peak traffic wastes money during off-peak hours. Configure auto-scaling with appropriate metrics and cooldown periods, and consider scheduled scaling if traffic patterns are predictable.
Ignoring Fargate task startup time in scaling policies. Since new Fargate tasks take 30 to 60 seconds to start, scale-out cooldowns that are too long can leave you under-provisioned during traffic ramps. Set scale-out cooldowns shorter than scale-in cooldowns.
When to Use (and When Not To)
Use Lambda when:
- Traffic is bursty or unpredictable with periods of zero activity
- Functions execute in under 15 minutes
- You want automatic scale-to-zero with no idle costs
- Workloads are event-driven (S3 triggers, SQS, EventBridge, API Gateway)
- You want minimal operational overhead and no container management
Use ECS Fargate when:
- Traffic is steady and predictable enough that idle costs are justified
- You need persistent connections (WebSockets, gRPC streams, database connection pools)
- Workloads run longer than 15 minutes
- You need full control over the runtime environment (custom binaries, specific OS packages)
- Your application is already containerized and uses patterns that assume long-running processes
Consider a hybrid approach when:
- Different endpoints or services have different traffic characteristics
- You have both synchronous API workloads and asynchronous event processing
- Cost optimization requires matching each workload to its best-fit compute model
FAQ
What is the main difference between AWS Lambda and ECS?
Lambda runs individual functions triggered by events with automatic scaling to zero and sub-second billing. ECS runs Docker containers as long-running services with more control over runtime, networking, and resource allocation. Lambda abstracts away all infrastructure; ECS gives you container-level control while still managing the underlying servers (when using Fargate).
When should I use Lambda over ECS Fargate?
Use Lambda for event-driven workloads like API endpoints with variable traffic, file processing triggers, scheduled jobs under 15 minutes, and webhook handlers. Lambda is ideal when requests are bursty, execution time is short, and you want zero infrastructure management with automatic scale-to-zero.
How do Lambda cold starts affect performance?
Cold starts occur when Lambda creates a new execution environment, adding latency ranging from under 100ms for Python and Node.js to several seconds for Java and .NET. Provisioned concurrency eliminates cold starts by keeping environments warm, but adds cost. For latency-sensitive workloads with consistent traffic, ECS avoids this problem entirely since containers stay running.
Can I use Lambda and ECS together?
Yes, hybrid architectures are common in production. A typical pattern uses ECS for core API services with steady traffic and Lambda for event-driven tasks like image processing, notifications, and scheduled jobs. API Gateway can route to both Lambda functions and ECS services through ALB integration.
Which is cheaper, Lambda or ECS Fargate?
It depends on utilization. Lambda is cheaper for sporadic, low-utilization workloads because you only pay per invocation. ECS Fargate becomes more cost-effective when utilization is consistently above roughly 20 to 30 percent, since you pay per-second for provisioned vCPU and memory regardless of whether requests are actively being processed.