AeroBuild | Engineering Blog & Services

Production readiness is not about perfection; it’s about visibility, resilience, and fast recovery.
This guide walks through the essential practices for running Spring Boot applications reliably in production.

Introduction: Why Production Readiness Matters

In development, the primary question is: “Does it work?”
In production, the question becomes: “Can we keep it working under pressure?”

Production readiness is not just about clean code. It is about:

Observability when things go wrong
Safe configuration management
Operational confidence at scale
Faster incident response

Spring Boot provides excellent tooling out of the box. When used correctly, it enables applications that are observable, resilient, and maintainable.
This article focuses on four pillars of production readiness:

Health checks
Metrics
Logging
Configuration hygiene

1. Health Checks: Your Application’s Vital Signs

Health checks are the foundation of reliable production systems. They tell platforms, orchestrators, and humans whether your service is alive and ready to serve traffic.

Actuator: The Foundation

Spring Boot Actuator exposes production-grade endpoints for monitoring and management.

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

Expose only what you need:

management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics
  endpoint:
    health:
      show-details: when-authorized
      probes:
        enabled: true

💡 Tip: Never expose all actuator endpoints publicly in production.

Built-in Health Indicators

When the corresponding dependency is on the classpath, Spring Boot auto-configures health indicators for:

Database connectivity
Disk space
Messaging brokers
Caches

These checks surface under /actuator/health.

Custom Health Indicators

For application-specific dependencies, create custom indicators:

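import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

// Package names as in Spring Boot 2.x and 3.x.
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
public class DatabaseHealthIndicator implements HealthIndicator {

    private final DataSource dataSource;

    // Constructor injection: the final field must be assigned here.
    public DatabaseHealthIndicator(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public Health health() {
        try (Connection connection = dataSource.getConnection()) {
            // isValid takes its timeout in seconds, not milliseconds.
            if (connection.isValid(1)) {
                return Health.up()
                    .withDetail("database", "Available")
                    .build();
            }
        } catch (SQLException e) {
            return Health.down(e)
                .withDetail("database", "Unavailable")
                .build();
        }
        // The connection opened but did not validate within the timeout.
        return Health.down().build();
    }
}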

🚨 Rule of thumb: A health check should fail fast and never block.
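One way to keep a check from blocking is to put a hard deadline around it. The following is a minimal, framework-free sketch of that idea using only the JDK. The class name `BoundedHealthCheck`, its `Status` enum, and the timeout values are illustrative assumptions, not part of Spring Boot.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical helper: runs a dependency check but stops waiting after a deadline. */
final class BoundedHealthCheck {

    enum Status { UP, DOWN }

    /**
     * Runs the check on a background thread and waits at most timeoutMillis.
     * A timeout, an exception, or a non-true result all map to DOWN.
     */
    static Status check(Supplier<Boolean> dependencyCheck, long timeoutMillis) {
        CompletableFuture<Boolean> future = CompletableFuture.supplyAsync(dependencyCheck);
        try {
            Boolean healthy = future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return Boolean.TRUE.equals(healthy) ? Status.UP : Status.DOWN;
        } catch (TimeoutException e) {
            // Only the wait is abandoned; the underlying check may still be running.
            future.cancel(true);
            return Status.DOWN;
        } catch (ExecutionException e) {
            // The check itself threw an exception.
            return Status.DOWN;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Status.DOWN;
        }
    }
}
```

A caller might wrap a slow dependency probe as `BoundedHealthCheck.check(() -> pingPaymentGateway(), 500)`, where `pingPaymentGateway` is a placeholder for your own check. The trade-off is that a timed-out check can keep occupying a background thread, so the probe itself should also carry its own client-side timeout.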

Liveness vs Readiness (Kubernetes)

For containerized workloads:

management:
  health:
    livenessstate:
      enabled: true
    readinessstate:
      enabled: true

Liveness: Should this container be restarted?
Readiness: Should this instance receive traffic?
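The distinction can be stated as a tiny decision model. This is a conceptual sketch only, not how Spring Boot computes its availability states: `ProbeModel` and its parameters are hypothetical names, and whether external dependencies belong in readiness at all is a judgment call for each service.

```java
/** Hypothetical model of the two probe questions. */
final class ProbeModel {

    /** Liveness: only broken internal state counts, because a restart can fix it. */
    static boolean live(boolean internalStateCorrupt) {
        return !internalStateCorrupt;
    }

    /** Readiness: a live, fully started instance whose required dependencies respond. */
    static boolean ready(boolean internalStateCorrupt,
                         boolean startupComplete,
                         boolean dependenciesReachable) {
        return live(internalStateCorrupt) && startupComplete && dependenciesReachable;
    }
}
```

Under this model a database outage makes an instance not ready while leaving it live, so the orchestrator stops routing traffic to it without restarting a process that a restart would not fix.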

2. Metrics: Quantitative Insight into Behavior

Metrics help you answer questions like:

Are users experiencing latency?
Is memory usage trending upward?
Did error rates spike after deployment?

Micrometer Integration

Spring Boot uses Micrometer as a vendor-neutral metrics facade.

<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

Metrics You Should Always Track

System Metrics

JVM memory & GC
Thread usage
CPU consumption

Application Metrics

HTTP request count & latency
Database connection pool usage
Cache hit/miss ratios

Business Metrics

Orders placed
Payments processed
Domain-specific KPIs

Custom Business Metrics

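import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import org.springframework.stereotype.Component;

// Order and processOrder(...) are application code not shown here.
@Component
public class OrderMetrics {

    private final Counter orderCounter;
    private final Timer orderProcessingTimer;

    public OrderMetrics(MeterRegistry meterRegistry) {
        // Monotonic count of orders seen since startup.
        this.orderCounter = Counter.builder("orders.total")
            .description("Total number of orders")
            .register(meterRegistry);

        // Publishes p50, p95, and p99 of order processing time.
        this.orderProcessingTimer = Timer.builder("orders.processing.time")
            .description("Time spent processing an order")
            .publishPercentiles(0.5, 0.95, 0.99)
            .register(meterRegistry);
    }

    public void recordOrder(Order order) {
        orderCounter.increment();
        // Timer.record runs the callback and records how long it took.
        orderProcessingTimer.record(() -> processOrder(order));
    }
}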

📊 Best practice: Prefer histograms and percentiles over averages.
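A small worked example shows why. The sketch below uses plain Java and made-up latency samples (the class name `LatencyStats` and the numbers are illustrative): 98 requests at 20 ms and 2 requests at 2000 ms.

```java
import java.util.Arrays;

/** Hypothetical helper comparing a mean with nearest-rank percentiles. */
final class LatencyStats {

    /** Arithmetic mean of the samples, or 0.0 for an empty array. */
    static double mean(long[] samplesMs) {
        return Arrays.stream(samplesMs).average().orElse(0.0);
    }

    /**
     * Nearest-rank percentile: the smallest sample such that at least
     * p percent of all samples are less than or equal to it.
     */
    static long percentile(long[] samplesMs, double p) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

For those samples the mean is 59.6 ms and the median is 20 ms, while the 99th percentile is 2000 ms. The mean describes a latency that no request actually experienced, and only the percentile exposes the two very slow requests.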

Prometheus Configuration

management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
  metrics:
    distribution:
      percentiles-histogram:
        http.server.requests: true
    tags:
      application: ${spring.application.name}
      environment: ${ENVIRONMENT:development}

3. Logging: Structured, Searchable, Actionable

Metrics tell you how often something happens; logs tell the story of what happened.

Logback for Production

Use logback-spring.xml for environment-aware logging.

Key practices:

JSON logs for aggregation systems
Rolling files with size limits
Environment metadata in every log

⚠️ Never log secrets, tokens, or personal data.
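As a last line of defense, some teams also mask obvious secrets before a message reaches the log. The following is a rough sketch of that idea in plain Java. `LogSanitizer` and the list of key names are assumptions to adapt, and pattern-based masking will miss secrets that do not follow a `key=value` shape, so it complements rather than replaces not logging them in the first place.

```java
import java.util.regex.Pattern;

/** Hypothetical helper that masks key=value secrets in a log message. */
final class LogSanitizer {

    // Illustrative key names only; extend this list for your own payloads.
    private static final Pattern SECRET_PAIR = Pattern.compile(
        "(?i)(password|token|secret|api[_-]?key)=([^\\s&,]+)");

    /** Replaces the value of each recognized key with three asterisks. */
    static String mask(String message) {
        return SECRET_PAIR.matcher(message).replaceAll("$1=***");
    }
}
```

The key name is kept so that the log line still shows a credential was present, which helps when debugging without revealing the value.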

Structured Logging with MDC

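import java.io.IOException;
import java.util.UUID;

import jakarta.servlet.FilterChain; // javax.servlet.* on Spring Boot 2.x
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

@Component
public class RequestLoggingFilter extends OncePerRequestFilter {

    private static final Logger log =
        LoggerFactory.getLogger(RequestLoggingFilter.class);

    @Override
    protected void doFilterInternal(HttpServletRequest request,
                                   HttpServletResponse response,
                                   FilterChain filterChain)
            throws ServletException, IOException {

        MDC.put("requestId", UUID.randomUUID().toString());
        MDC.put("clientIp", request.getRemoteAddr());

        long start = System.currentTimeMillis();
        try {
            filterChain.doFilter(request, response);
        } finally {
            MDC.put("durationMs",
                String.valueOf(System.currentTimeMillis() - start));
            // Log while durationMs is still in the MDC, then clear it so the
            // pooled thread does not carry context into the next request.
            log.info("Request completed: {} {}",
                request.getMethod(), request.getRequestURI());
            MDC.clear();
        }
    }
}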

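🔍 Why MDC matters: It stamps every log line written while handling a request with the context of that request, so a single request can be followed through a service. MDC is thread-local and per-process; to trace across services, forward the request ID in a header and restore it on the receiving side.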
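To correlate logs across services, the request ID usually has to travel with the call. A common approach is to reuse an ID supplied by the caller and generate one only when none arrives. The sketch below shows that decision in plain Java. The header name `X-Request-Id`, the class `RequestIds`, and the accepted character set are assumptions, not a standard.

```java
import java.util.UUID;
import java.util.regex.Pattern;

/** Hypothetical helper deciding which request ID to put into the MDC. */
final class RequestIds {

    // Assumed header name; pick one convention and use it in every service.
    static final String HEADER = "X-Request-Id";

    // Accept only short, log-safe values so callers cannot inject newlines.
    private static final Pattern SAFE = Pattern.compile("[A-Za-z0-9._-]{1,64}");

    /** Returns the incoming ID if it is present and safe, otherwise a fresh UUID. */
    static String resolve(String incomingHeaderValue) {
        if (incomingHeaderValue != null && SAFE.matcher(incomingHeaderValue).matches()) {
            return incomingHeaderValue;
        }
        return UUID.randomUUID().toString();
    }
}
```

In a servlet filter this would be called as `RequestIds.resolve(request.getHeader(RequestIds.HEADER))`, and the same value would be set on outgoing calls so that downstream services log the same ID.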

Runtime Log Level Changes

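The loggers endpoint is enabled by default but not exposed over HTTP. Expose it to change log levels at runtime, and keep it behind authentication because it accepts writes:

management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,loggers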

This avoids redeployments during incidents.
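With the loggers endpoint exposed, a level change is a POST to `/actuator/loggers/{logger-name}` with a JSON body containing `configuredLevel`. The sketch below builds such a request with the JDK HTTP client (Java 11 or later). The class `LogLevelRequests`, the base URL, and the logger name in the usage note are placeholders, and authentication is omitted.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Set;

/** Hypothetical helper that builds a log-level change request for Actuator. */
final class LogLevelRequests {

    private static final Set<String> LEVELS =
        Set.of("TRACE", "DEBUG", "INFO", "WARN", "ERROR", "OFF");

    /** Builds (but does not send) the POST that sets a logger's level. */
    static HttpRequest setLevel(String baseUrl, String loggerName, String level) {
        if (!LEVELS.contains(level)) {
            throw new IllegalArgumentException("Unknown log level: " + level);
        }
        String body = "{\"configuredLevel\":\"" + level + "\"}";
        return HttpRequest.newBuilder(
                URI.create(baseUrl + "/actuator/loggers/" + loggerName))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }
}
```

For example, `LogLevelRequests.setLevel("http://localhost:8080", "com.example.orders", "DEBUG")` produces a request that can be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.discarding())`. Remember to set the level back once the incident is over.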

4. Configuration Hygiene: Secure and Predictable

Externalize Everything

Never hardcode configuration values.

spring:
  application:
    name: order-service
  profiles:
    active: ${ENVIRONMENT:development}

Production profile:

spring:
  config:
    activate:
      on-profile: production
  datasource:
    url: ${DATABASE_URL}
    username: ${DATABASE_USERNAME}
    password: ${DATABASE_PASSWORD}

Validate Configuration at Startup

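// Requires spring-boot-starter-validation; javax.validation.* on Spring Boot 2.x.
import jakarta.validation.constraints.Max;
import jakarta.validation.constraints.Min;
import jakarta.validation.constraints.NotBlank;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.stereotype.Component;
import org.springframework.validation.annotation.Validated;

@ConfigurationProperties(prefix = "app")
@Validated
@Component
public class ApplicationProperties {

    @NotBlank
    private String externalServiceUrl;

    @Min(1)
    @Max(100)
    private int maxConnections;

    // Setters are required for property binding; getters expose the values.
    public String getExternalServiceUrl() { return externalServiceUrl; }
    public void setExternalServiceUrl(String url) { this.externalServiceUrl = url; }
    public int getMaxConnections() { return maxConnections; }
    public void setMaxConnections(int max) { this.maxConnections = max; }
}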

✅ Fail fast if configuration is invalid.
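Stripped of framework support, failing fast amounts to checking every required value once at startup and refusing to start if any check fails. The following plain-Java sketch mirrors the two constraints shown above. `StartupConfigCheck` is a hypothetical class, and the property names in the messages assume the usual kebab-case form of the fields.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical startup check mirroring the constraints on ApplicationProperties. */
final class StartupConfigCheck {

    /** Collects every violation so one failed start reports all problems at once. */
    static List<String> violations(String externalServiceUrl, int maxConnections) {
        List<String> problems = new ArrayList<>();
        if (externalServiceUrl == null || externalServiceUrl.isBlank()) {
            problems.add("app.external-service-url must be set");
        }
        if (maxConnections < 1 || maxConnections > 100) {
            problems.add("app.max-connections must be between 1 and 100");
        }
        return problems;
    }

    /** Throws if the configuration is invalid, which aborts startup. */
    static void requireValid(String externalServiceUrl, int maxConnections) {
        List<String> problems = violations(externalServiceUrl, maxConnections);
        if (!problems.isEmpty()) {
            throw new IllegalStateException(
                "Invalid configuration: " + String.join("; ", problems));
        }
    }
}
```

Reporting all violations together saves a deploy-fix-redeploy loop for each missing value, which is the main practical benefit over failing on the first problem found.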

5. Containers and Kubernetes

Use slim JRE images
Run as non-root
Tune JVM for containers
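A quick way to confirm that JVM tuning took effect is to log what the JVM believes its limits are at startup. The sketch below uses only `java.lang.Runtime`; the class name `JvmResources` is illustrative. Recent JDKs derive these values from the container's CPU and memory limits, and flags such as `-XX:MaxRAMPercentage` change how much of the memory limit becomes heap.

```java
/** Hypothetical helper reporting the resources the JVM thinks it has. */
final class JvmResources {

    /** Returns the visible processor count and the maximum heap size in MiB. */
    static String describe() {
        Runtime runtime = Runtime.getRuntime();
        long maxHeapMiB = runtime.maxMemory() / (1024 * 1024);
        return "processors=" + runtime.availableProcessors()
            + " maxHeapMiB=" + maxHeapMiB;
    }
}
```

If the reported values match the host rather than the container limits, the JVM is not picking up the limits and may be killed by the orchestrator when it exceeds its memory allowance.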

Kubernetes Probes

Liveness: /actuator/health/liveness
Readiness: /actuator/health/readiness

🚀 Properly configured probes keep traffic away from instances that cannot serve it, which helps prevent cascading failures.

6. Monitoring and Alerting Strategy

Alerts That Actually Matter

Error rate > 5%
P99 latency above SLO
Memory usage > 80%
Instance health check failures
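Thresholds like these are normally evaluated by the monitoring system, but the logic is worth spelling out. The sketch below shows an error-rate check in plain Java. `ErrorRateAlert` is a hypothetical class, and the minimum request count is an added assumption to stop a single failure on a quiet service from paging anyone.

```java
/** Hypothetical evaluation of an error-rate alert over one time window. */
final class ErrorRateAlert {

    /**
     * Fires when the error percentage in the window exceeds the threshold,
     * but only if the window saw enough traffic to make the rate meaningful.
     */
    static boolean shouldFire(long errors, long total,
                              double thresholdPercent, long minRequests) {
        if (total < minRequests || total == 0) {
            return false;
        }
        return 100.0 * errors / total > thresholdPercent;
    }
}
```

With a 5% threshold and a minimum of 50 requests, 6 errors out of 100 requests fires the alert, exactly 5 out of 100 does not, and 1 error out of 2 requests is ignored as too little data.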

Dashboards Should Show

Traffic, errors, latency
JVM memory & GC
Database pool usage
Business KPIs

Conclusion: Production Excellence Is a Practice

Production readiness is not a checkbox; it's a continuous discipline.

Key takeaways:

Start with Actuator and build observability early
Prefer structured logs and meaningful metrics
Validate and externalize all configuration
Monitor what can fail and alert only when it matters
Test your failure scenarios before users do

Spring Boot gives you the tools. Operational excellence comes from using them deliberately.

Spring Boot Production Readiness