- Published on
Complete Guide: Deploying Spring Boot API on AWS ECS
- Authors
- Name
- Motions Technologies
Introduction
Deploying a Spring Boot API to production takes planning on both the application side and the infrastructure side. This guide walks through deploying a Spring Boot API on Amazon ECS (Elastic Container Service), from containerizing the application to setting up networking, load balancing, the container registry, and the ECS cluster, with monitoring, scaling, and security in mind.
Why Choose AWS ECS for Spring Boot Applications?
Amazon ECS offers several compelling advantages for Spring Boot deployments:
Container Orchestration Benefits
- Automated container placement and scheduling
- Built-in service discovery
- Integrated load balancing
- Automated container recovery
- Easy rolling updates and rollbacks
AWS Integration Features
- Native CloudWatch integration for logs and metrics
- AWS Secrets Manager for sensitive data
- AWS Parameter Store for configuration
- IAM roles for fine-grained security
- VPC integration for network isolation
Cost Optimization Capabilities
- Pay-per-use pricing model
- Spot instance support for cost savings
- Right-sizing recommendations
- Resource utilization tracking
- Reserved capacity options
Operational Excellence
- Managed container infrastructure
- Automated patch management
- Built-in high availability
- Multiple deployment strategies
- Extensive monitoring capabilities
Prerequisites and Environment Setup
Before starting the deployment process, ensure you have:
- Development Environment
# Required software versions
java -version # Java 17 or later
mvn -version # Maven 3.8+ or Gradle 7+
docker --version # Docker 20.10+
aws --version # AWS CLI 2.0+
git --version # Git 2.0+
- AWS Account Configuration
# Configure AWS CLI
aws configure
AWS Access Key ID: YOUR_ACCESS_KEY
AWS Secret Access Key: YOUR_SECRET_KEY
Default region name: us-east-1
Default output format: json
# Verify configuration
aws sts get-caller-identity
- Required AWS Permissions
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ecs:*",
"ecr:*",
"elasticloadbalancing:*",
"cloudwatch:*",
"logs:*",
"ec2:*",
"iam:PassRole"
],
"Resource": "*"
}
]
}
1. Spring Boot Application Setup
1.1 Application Configuration
Basic Spring Boot Setup
- Project Structure
spring-api/
├── src/
│ ├── main/
│ │ ├── java/
│ │ │ └── com/example/api/
│ │ │ ├── Application.java
│ │ │ ├── config/
│ │ │ ├── controller/
│ │ │ ├── service/
│ │ │ └── model/
│ │ └── resources/
│ │ ├── application.yml
│ │ ├── application-prod.yml
│ │ └── logback-spring.xml
├── Dockerfile
├── docker-compose.yml
└── pom.xml
- Essential Dependencies (pom.xml)
<dependencies>
<!-- Spring Boot Starters -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-validation</artifactId>
</dependency>
<!-- Monitoring -->
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
<!-- AWS SDK -->
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk-secretsmanager</artifactId>
<version>1.12.261</version>
</dependency>
</dependencies>
- Application Properties (application.yml)
spring:
  application:
    name: spring-api
  profiles:
    active: ${SPRING_PROFILES_ACTIVE:local}
server:
  port: 8080
  shutdown: graceful
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
  endpoint:
    health:
      show-details: always
      probes:
        enabled: true
  health:
    livenessState:
      enabled: true
    readinessState:
      enabled: true
logging:
  pattern:
    console: "%d{yyyy-MM-dd HH:mm:ss} [%thread] %-5level %logger{36} - %msg%n"
  level:
    root: INFO
    com.example.api: DEBUG
app:
  cors:
    allowed-origins: ${CORS_ALLOWED_ORIGINS:*}
- Production Properties (application-prod.yml)
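spring:
  main:
    # Quoted so YAML does not read the value as boolean false
    banner-mode: "off"
server:
  tomcat:
    # server.tomcat.max-threads was replaced by server.tomcat.threads.max in Spring Boot 2.3
    threads:
      max: 200
    accept-count: 100
management:
  endpoints:
    web:
      exposure:
        include: health,prometheus
  endpoint:
    health:
      show-details: never
logging:
  pattern:
    console: "%d{ISO8601} [%X{traceId}/%X{spanId}] %-5level [%t] %C{40}: %msg%n"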
1.2 Containerization Setup
- Dockerfile (Multi-stage build)
# Build stage
FROM maven:3.8.4-openjdk-17-slim AS builder
WORKDIR /app
COPY pom.xml .
# Download dependencies first (cache layer)
RUN mvn dependency:go-offline -B
COPY src ./src
RUN mvn clean package -DskipTests
# Runtime stage
FROM eclipse-temurin:17-jre-focal
WORKDIR /app
# Add non-root user
RUN addgroup --system --gid 1001 appuser && \
adduser --system --uid 1001 --group appuser
# Install tools for debugging and health checks
RUN apt-get update && apt-get install -y --no-install-recommends \
curl \
jq \
&& rm -rf /var/lib/apt/lists/*
# Copy jar from builder stage
COPY --from=builder /app/target/*.jar app.jar
# Set permissions
RUN chown -R appuser:appuser /app
USER appuser
# Health check
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
CMD curl -f http://localhost:8080/actuator/health || exit 1
# Environment variables
ENV JAVA_OPTS="-XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0"
# Expose port
EXPOSE 8080
# Run application (exec makes java PID 1 so it receives SIGTERM for graceful shutdown)
ENTRYPOINT ["sh", "-c", "exec java $JAVA_OPTS -jar app.jar"]
- Docker Compose (for local testing)
version: '3.8'
services:
  api:
    build: .
    ports:
      - "8080:8080"
    environment:
      - SPRING_PROFILES_ACTIVE=local
      - JAVA_OPTS=-Xmx512m -Xms256m
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/actuator/health"]
      interval: 30s
      timeout: 3s
      retries: 3
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
1.3 Local Testing and Validation
- Build and Run Locally
# Build application
mvn clean package
# Build Docker image
docker build -t spring-api .
# Run container in the background, named so later docker commands can reference it
docker run -d --name spring-api -p 8080:8080 spring-api
# Test endpoints
curl http://localhost:8080/actuator/health
curl http://localhost:8080/actuator/info
- Performance Testing
# Install Apache Benchmark (Debian/Ubuntu)
sudo apt-get install -y apache2-utils
# Run load test
ab -n 1000 -c 10 http://localhost:8080/actuator/health
# Memory monitoring
docker stats spring-api
- Common Issues and Solutions
- Memory Issues
# Check container logs
docker logs spring-api
# Adjust memory settings
docker run -p 8080:8080 -e JAVA_OPTS="-Xmx1g -Xms512m" spring-api
- Connection Issues
# Check container networking
docker network inspect bridge
# Test container DNS (getent ships with the base image; nslookup is not installed)
docker exec spring-api getent hosts google.com
- Security Scanning
# Scan Docker image (docker scan has been retired in favor of Docker Scout)
docker scout cves spring-api
# Check for vulnerabilities in dependencies (full coordinates work without plugin configuration in pom.xml)
mvn org.owasp:dependency-check-maven:check
2. AWS Infrastructure Setup
2.1 Network Architecture Overview
2.2 Network Component Setup
- VPC Creation
# Create VPC
VPC_ID=$(aws ec2 create-vpc \
--cidr-block 10.0.0.0/16 \
--tag-specifications 'ResourceType=vpc,Tags=[{Key=Name,Value=spring-api-vpc}]' \
--query 'Vpc.VpcId' \
--output text)
# Enable DNS hostnames
aws ec2 modify-vpc-attribute \
--vpc-id $VPC_ID \
--enable-dns-hostnames
# Create Internet Gateway
IGW_ID=$(aws ec2 create-internet-gateway \
--query 'InternetGateway.InternetGatewayId' \
--output text)
# Attach Internet Gateway to VPC
aws ec2 attach-internet-gateway \
--vpc-id $VPC_ID \
--internet-gateway-id $IGW_ID
2.3 Subnet Architecture
- Create Subnets
# Public Subnets
PUBLIC_SUBNET_1=$(aws ec2 create-subnet \
--vpc-id $VPC_ID \
--cidr-block 10.0.1.0/24 \
--availability-zone us-east-1a \
--tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=public-1a}]' \
--query 'Subnet.SubnetId' \
--output text)
PUBLIC_SUBNET_2=$(aws ec2 create-subnet \
--vpc-id $VPC_ID \
--cidr-block 10.0.3.0/24 \
--availability-zone us-east-1b \
--tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=public-1b}]' \
--query 'Subnet.SubnetId' \
--output text)
# Private Subnets
PRIVATE_SUBNET_1=$(aws ec2 create-subnet \
--vpc-id $VPC_ID \
--cidr-block 10.0.2.0/24 \
--availability-zone us-east-1a \
--tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=private-1a}]' \
--query 'Subnet.SubnetId' \
--output text)
PRIVATE_SUBNET_2=$(aws ec2 create-subnet \
--vpc-id $VPC_ID \
--cidr-block 10.0.4.0/24 \
--availability-zone us-east-1b \
--tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=private-1b}]' \
--query 'Subnet.SubnetId' \
--output text)
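The subnets above are labelled public and private, but a subnet only behaves as public once its route table sends internet-bound traffic to the Internet Gateway. The following is a minimal sketch of that wiring, assuming VPC_ID, IGW_ID, PUBLIC_SUBNET_1 and PUBLIC_SUBNET_2 are still set from the earlier commands. The function name create_public_routes is illustrative and not part of the original setup.

```shell
# Sketch: route table for the public subnets (function name is hypothetical).
create_public_routes() {
  # Create a dedicated route table in the VPC and capture its ID
  rt_id=$(aws ec2 create-route-table \
    --vpc-id "$VPC_ID" \
    --query 'RouteTable.RouteTableId' \
    --output text) || return 1

  # Default route: send all non-local traffic to the Internet Gateway
  aws ec2 create-route \
    --route-table-id "$rt_id" \
    --destination-cidr-block 0.0.0.0/0 \
    --gateway-id "$IGW_ID" || return 1

  # Associate both public subnets with the new route table
  for subnet in "$PUBLIC_SUBNET_1" "$PUBLIC_SUBNET_2"; do
    aws ec2 associate-route-table \
      --route-table-id "$rt_id" \
      --subnet-id "$subnet" || return 1
  done
}
```

Call create_public_routes once after the subnets exist. Tasks in the private subnets would still need a NAT gateway or VPC endpoints to pull images from ECR; that part is not shown here.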
2.4 Security Group Architecture
- Create Security Groups
# ALB Security Group
ALB_SG_ID=$(aws ec2 create-security-group \
--group-name alb-sg \
--description "ALB Security Group" \
--vpc-id $VPC_ID \
--query 'GroupId' \
--output text)
# ECS Security Group
ECS_SG_ID=$(aws ec2 create-security-group \
--group-name ecs-sg \
--description "ECS Security Group" \
--vpc-id $VPC_ID \
--query 'GroupId' \
--output text)
# Configure Security Group Rules
aws ec2 authorize-security-group-ingress \
--group-id $ALB_SG_ID \
--protocol tcp \
--port 80 \
--cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress \
--group-id $ECS_SG_ID \
--protocol tcp \
--port 8080 \
--source-group $ALB_SG_ID
2.5 Load Balancer Architecture
When deploying Spring Boot applications in production, the Application Load Balancer (ALB) serves as the primary entry point for all traffic. Understanding its architecture is crucial for a reliable deployment.
Understanding Load Balancer Components
Listeners and Routing
- Port 80 (HTTP): Used for redirection to HTTPS
- Port 443 (HTTPS): Primary listener for secure traffic
- Rules determine how traffic is routed to targets
Target Groups
- Groups ECS tasks that can receive traffic
- Implements health checking
- Manages task registration/deregistration
- Handles load balancing algorithms
Health Checks
- Regular checks against /actuator/health endpoint
- Determines task availability
- Affects traffic routing decisions
Implementation Guide
- Create Application Load Balancer
# Create ALB
ALB_ARN=$(aws elbv2 create-load-balancer \
--name spring-api-alb \
--subnets $PUBLIC_SUBNET_1 $PUBLIC_SUBNET_2 \
--security-groups $ALB_SG_ID \
--scheme internet-facing \
--type application \
--query 'LoadBalancers[0].LoadBalancerArn' \
--output text)
# Create Target Group
TG_ARN=$(aws elbv2 create-target-group \
--name spring-api-tg \
--protocol HTTP \
--port 8080 \
--vpc-id $VPC_ID \
--target-type ip \
--health-check-path /actuator/health \
--health-check-interval-seconds 30 \
--health-check-timeout-seconds 5 \
--healthy-threshold-count 2 \
--unhealthy-threshold-count 3 \
--query 'TargetGroups[0].TargetGroupArn' \
--output text)
# Create Listener
aws elbv2 create-listener \
--load-balancer-arn $ALB_ARN \
--protocol HTTP \
--port 80 \
--default-actions Type=forward,TargetGroupArn=$TG_ARN
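The component overview lists port 80 as a redirect to HTTPS, whereas the listener created above forwards plain HTTP straight to the targets. A sketch of the HTTPS variant, to be used in place of that HTTP forward listener, might look like the following. CERT_ARN is a hypothetical placeholder for an ACM certificate you have already issued, and the function names are illustrative.

```shell
# Sketch: HTTPS listener plus HTTP-to-HTTPS redirect (function names are hypothetical).
# Assumes ALB_ARN, TG_ARN and ALB_SG_ID from the earlier steps, and CERT_ARN set by you.

# Open port 443 on the ALB security group
allow_https_ingress() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$ALB_SG_ID" \
    --protocol tcp \
    --port 443 \
    --cidr 0.0.0.0/0
}

# Primary listener: terminate TLS on 443 and forward to the target group
create_https_listener() {
  aws elbv2 create-listener \
    --load-balancer-arn "$ALB_ARN" \
    --protocol HTTPS \
    --port 443 \
    --certificates "CertificateArn=$CERT_ARN" \
    --default-actions "Type=forward,TargetGroupArn=$TG_ARN"
}

# Port 80 listener: answer every request with a permanent redirect to HTTPS
create_http_redirect_listener() {
  aws elbv2 create-listener \
    --load-balancer-arn "$ALB_ARN" \
    --protocol HTTP \
    --port 80 \
    --default-actions 'Type=redirect,RedirectConfig={Protocol=HTTPS,Port=443,StatusCode=HTTP_301}'
}
```

An ALB accepts only one listener per port, so create_http_redirect_listener is an alternative to the forward listener on port 80, not an addition to it.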
2.6 Infrastructure Validation
- Network Validation
# Test VPC
aws ec2 describe-vpcs --vpc-ids $VPC_ID
# Test Subnets
aws ec2 describe-subnets \
--filters "Name=vpc-id,Values=$VPC_ID" \
--query 'Subnets[].{ID:SubnetId,CIDR:CidrBlock,AZ:AvailabilityZone}'
# Test Security Groups
aws ec2 describe-security-groups \
--group-ids $ALB_SG_ID $ECS_SG_ID
- Load Balancer Validation
# Check ALB state (expect "active")
aws elbv2 describe-load-balancers \
--load-balancer-arns $ALB_ARN \
--query 'LoadBalancers[0].State.Code' \
--output text
# Check target group health
aws elbv2 describe-target-health \
--target-group-arn $TG_ARN
2.7 Common Infrastructure Issues and Solutions
- Connectivity Issues
# Check route tables
aws ec2 describe-route-tables \
--filters "Name=vpc-id,Values=$VPC_ID"
# Verify NAT Gateway status
aws ec2 describe-nat-gateways \
--filter "Name=vpc-id,Values=$VPC_ID"
# Check security group rules
aws ec2 describe-security-group-rules \
--filters "Name=group-id,Values=$ECS_SG_ID"
- Load Balancer Issues
# Check ALB access logs
aws s3 ls s3://your-alb-logs-bucket/AWSLogs/
# Enable access logging
aws elbv2 modify-load-balancer-attributes \
--load-balancer-arn $ALB_ARN \
--attributes Key=access_logs.s3.enabled,Value=true \
Key=access_logs.s3.bucket,Value=your-alb-logs-bucket
3. Container Registry and ECS Setup
3.1 Understanding Container Registry (ECR)
Amazon Elastic Container Registry (ECR) serves as the backbone of your container deployment strategy. Think of it as a secure, managed container image library that seamlessly integrates with ECS.
Why ECR is Critical for Production
Security:
- Private repository for your container images
- Integration with IAM for access control
- Automatic vulnerability scanning
- Encryption at rest using AWS KMS
Performance:
- Fast image pulls within AWS network
- Reduced latency for ECS deployments
- Regional replication for global deployments
Cost Efficiency:
- Pay only for storage and data transfer
- No infrastructure to manage
- Lifecycle policies for automatic cleanup
Here's how to set up your ECR repository with production-grade configurations:
# Create repository with security features enabled
REPO_URI=$(aws ecr create-repository \
--repository-name spring-api \
--image-scanning-configuration scanOnPush=true \
--encryption-configuration encryptionType=KMS \
--image-tag-mutability IMMUTABLE \
--query 'repository.repositoryUri' \
--output text)
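What this configuration does:
- Scans each image for known vulnerabilities when it is pushed
- Encrypts images at rest with KMS (the AWS managed key for ECR, since no kmsKey is specified)
- Makes tags immutable so an existing tag cannot be overwritten
- Stores the repository URI in REPO_URI for the tagging and push steps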
3.2 Image Lifecycle Management
In production, image management is a trade-off: keep enough images to roll back and audit, but not so many that storage costs grow unchecked.
Best Practices for Image Management:
Tagging Strategy:
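# Example tagging scheme
VERSION=$(git rev-parse --short HEAD)
ENVIRONMENT="prod"
BUILD_NUMBER=${CIRCLE_BUILD_NUM:-local}
# Build with the commit hash as the local tag so the docker tag commands below can find it
docker build -t spring-api:${VERSION} .
docker tag spring-api:${VERSION} ${REPO_URI}:${VERSION}
docker tag spring-api:${VERSION} ${REPO_URI}:${ENVIRONMENT}-${BUILD_NUMBER}
# With IMMUTABLE tags (section 3.1), a moving tag such as latest can only be pushed once
docker tag spring-api:${VERSION} ${REPO_URI}:latest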
Why this matters:
- Git hash provides traceability
- Environment tag helps in deployment management
- Build number enables rollback capabilities
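Tagging only labels the image locally; it still has to be pushed to ECR. A minimal sketch of the login and push step follows, assuming REPO_URI, VERSION, ENVIRONMENT and BUILD_NUMBER are set as in the tagging scheme. The helper names registry_host and push_image, and the AWS_REGION fallback, are illustrative.

```shell
# Sketch: log in to ECR and push the traceable tags (helper names are hypothetical).

# The registry host is the part of the repository URI before the first slash
registry_host() {
  printf '%s\n' "${1%%/*}"
}

# Args: repository URI, version tag, environment tag
push_image() {
  repo_uri=$1
  version=$2
  env_tag=$3
  # Exchange AWS credentials for a short-lived Docker login token
  aws ecr get-login-password --region "${AWS_REGION:-us-east-1}" |
    docker login --username AWS --password-stdin "$(registry_host "$repo_uri")" &&
    docker push "${repo_uri}:${version}" &&
    docker push "${repo_uri}:${env_tag}"
}
```

A call would look like push_image "$REPO_URI" "$VERSION" "${ENVIRONMENT}-${BUILD_NUMBER}".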
Lifecycle Policies:
{
"rules": [
{
"rulePriority": 1,
"description": "Keep last 30 production images",
"selection": {
"tagStatus": "tagged",
"tagPrefixList": ["prod"],
"countType": "imageCountMoreThan",
"countNumber": 30
},
"action": {
"type": "expire"
}
}
]
}
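This policy:
- Keeps the 30 most recent images whose tags start with prod
- Expires older prod-tagged images automatically
- Keeps storage costs in check as builds accumulate
- Does not touch untagged images or other tag prefixes; add further rules if those need cleanup as well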
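The policy document takes effect only once it is attached to the repository. A short sketch, assuming the JSON above has been saved to a file; the file name lifecycle-policy.json and the function name are assumptions made for illustration.

```shell
# Sketch: attach a lifecycle policy file to an ECR repository (function name is hypothetical).
# Args: repository name, path to the policy JSON file
apply_lifecycle_policy() {
  aws ecr put-lifecycle-policy \
    --repository-name "$1" \
    --lifecycle-policy-text "file://$2"
}
```

For the repository created earlier this would be apply_lifecycle_policy spring-api lifecycle-policy.json.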
3.3 ECS Cluster Architecture
Your ECS cluster design significantly impacts scalability, reliability, and cost efficiency.
Understanding Capacity Providers
When setting up your ECS cluster, choosing the right capacity provider strategy is crucial:
aws ecs create-cluster \
--cluster-name production \
--capacity-providers FARGATE FARGATE_SPOT \
--default-capacity-provider-strategy \
capacityProvider=FARGATE,weight=1,base=1 \
capacityProvider=FARGATE_SPOT,weight=3
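This configuration:
- Makes both Fargate and Fargate Spot available to the cluster
- Places the first task on regular Fargate (base=1), so at least one task is not subject to Spot interruption
- Splits the remaining tasks 1:3 between Fargate and Fargate Spot
- Applies by default to any service that does not set its own capacity provider strategy
Real-world tip: The 1:3 weighting puts about 75% of tasks beyond the base on Fargate Spot. AWS advertises Fargate Spot discounts of up to 70% compared with regular Fargate pricing, but Spot tasks can be interrupted with a two-minute warning, so keep enough regular Fargate capacity to carry critical traffic.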
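To see what base and weight mean for a concrete service, the split can be estimated with a few lines of shell arithmetic. This is a rough sketch: the function name split_tasks is hypothetical, and rounding the Spot share down is an assumption, so the actual ECS scheduler may place tasks slightly differently.

```shell
# Sketch: estimate task placement for a Fargate / Fargate Spot strategy.
# Args: desired count, base on Fargate, Fargate weight, Fargate Spot weight
split_tasks() (
  desired=$1 base=$2 w_fargate=$3 w_spot=$4
  # The base is filled first and cannot exceed the desired count
  if [ "$base" -gt "$desired" ]; then base=$desired; fi
  rest=$(( desired - base ))
  # Tasks beyond the base are shared by weight; integer division rounds Spot down
  spot=$(( rest * w_spot / (w_fargate + w_spot) ))
  echo "fargate=$(( desired - spot )) spot=${spot}"
)

split_tasks 9 1 1 3   # prints: fargate=3 spot=6
```

With nine desired tasks, one goes to the Fargate base and the remaining eight split 2 to 6, which is a 1:3 ratio rather than 30/70.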
3.4 Task Definition Deep Dive
Your task definition is the blueprint for how your application runs in ECS. Getting this right is crucial for production performance.
Memory and CPU Allocation
The container memory limit, the soft reservation, and the JVM heap setting need to agree with one another:
{
"containerDefinitions": [
{
"memory": 2048,
"memoryReservation": 1024,
"cpu": 1024,
"environment": [
{
"name": "JAVA_OPTS",
"value": "-XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0"
}
]
}
]
}
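Why these settings matter:
- memory is the hard limit in MiB; a container that exceeds it is killed
- memoryReservation is the soft limit that ECS reserves for the container
- cpu is expressed in CPU units, where 1024 units equal one vCPU; size server thread pools with that budget in mind
- -XX:MaxRAMPercentage=75.0 caps the heap at 75% of the container memory limit, leaving the remainder for metaspace, thread stacks, and other off-heap memory
- -XX:+UseContainerSupport (enabled by default on current JDKs) makes the JVM read its limits from the container rather than the host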
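The relationship between the container limit and the JVM heap is simple arithmetic. The sketch below uses a hypothetical helper, max_heap_mib, and assumes the JVM sizes its heap from the container's hard memory limit.

```shell
# Sketch: maximum heap implied by a container limit and MaxRAMPercentage.
# Args: container memory limit in MiB, MaxRAMPercentage as a whole number
max_heap_mib() (
  container_mib=$1 percent=$2
  echo $(( container_mib * percent / 100 ))
)

max_heap_mib 2048 75   # prints: 1536
```

With the values from the task definition, the heap can grow to 1536 MiB, which leaves 512 MiB of the 2048 MiB limit for everything outside the heap.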
Health Check Configuration
A container health check lets ECS replace a task whose application has stopped responding:
"healthCheck": {
"command": [
"CMD-SHELL",
"curl -f http://localhost:8080/actuator/health || exit 1"
],
"interval": 30,
"timeout": 5,
"retries": 3,
"startPeriod": 60
}
Real-world considerations:
- Start period accommodates Java warm-up time
- Interval balances responsiveness and overhead
- Timeout prevents hanging health checks
- Retries prevent false negatives
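These four values together determine how long a broken container can linger before it is marked unhealthy. A back-of-the-envelope upper bound can be computed as follows, assuming Docker-style semantics in which failures during the start period are not counted and each failing check may run for the full timeout. The function name unhealthy_after is hypothetical.

```shell
# Sketch: rough upper bound, in seconds, before a never-healthy container is marked unhealthy.
# Args: startPeriod, interval, retries, timeout (all as in the healthCheck block)
unhealthy_after() (
  start_period=$1 interval=$2 retries=$3 timeout=$4
  # After the grace period, each of the consecutive failures costs one interval plus one timeout
  echo $(( start_period + retries * (interval + timeout) ))
)

unhealthy_after 60 30 3 5   # prints: 165
```

If that window is too long for your service, shortening interval or retries tightens it, at the cost of more frequent checks and less tolerance for a single slow response.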