Your JVM Is an AWS Architecture Decision – Which Most Teams Miss

Smart Summary

When Java workloads on AWS misbehave, the instinct is to look at the infrastructure: instance sizing, autoscaling, network config. But in many cases, the JVM is the layer that deserves closer attention first. This post explains why the Java runtime belongs in every AWS architecture review, and how treating it as a genuine architectural decision changes performance, cost, and operational outcomes. 

In this post you will learn: 

  • Why Java performance and stability issues on AWS are often misdiagnosed as infrastructure problems 
  • How JVM behaviour affects throughput, tail latency, warmup, CPU efficiency, and operational consistency 
  • Where Azul Prime has the most measurable impact for AWS Java workloads 
  • Why the JVM belongs in the AWS Well-Architected Framework discussion 

Architects running Java on AWS spend a lot of time on the right things. Instance families. Autoscaling policies. Graviton migration. EKS configurations. Cost governance. These are real levers and they deserve real attention. But in most Java architecture reviews, one layer is quietly missing from the conversation: the JVM itself. 

That omission is understandable. The runtime feels like background technology, something the framework handles, something that was settled long ago. In practice, the JVM shapes throughput, tail latency, warmup behaviour, CPU efficiency, memory footprint, and the quality of diagnostics available when things go wrong. For Java workloads on AWS, treating the runtime as a fixed given means the architecture review is incomplete before it starts. 

Does your Java problem actually live in the infrastructure?

A pattern that comes up repeatedly in Java environments is this: a service starts showing latency spikes or unexpected CPU pressure. The team opens the AWS console, reviews instance sizing, checks autoscaling behaviour, and examines network configuration. Sometimes they find the answer there. Often they do not, because the problem originated inside the runtime, not around it. 

JVM behaviour can look a lot like infrastructure problems from the outside. A long GC pause looks like a network hiccup. Warmup overhead looks like instance underprovisioning. JIT compilation activity looks like CPU saturation. When teams investigate only at the infrastructure layer, they spend considerable time ruling out causes that were never the real issue. 
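One way to check whether the runtime is worth a closer look is to ask the JVM itself how much time it has spent collecting garbage and compiling code. The sketch below uses the standard `java.lang.management` MXBeans; the `RuntimeSnapshot` class and its method names are hypothetical, chosen only for illustration. Note that the collection time reported by a concurrent collector is not necessarily pause time, so treat the numbers as a first signal rather than a diagnosis.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: the class and method names are illustrative only.
class RuntimeSnapshot {
    final long gcCount;        // collections since JVM start, summed over all collectors
    final long gcTimeMillis;   // accumulated collection time reported by the collectors
    final long jitTimeMillis;  // accumulated JIT compilation time, or -1 if not monitored
    final long uptimeMillis;   // JVM uptime

    RuntimeSnapshot(long gcCount, long gcTimeMillis, long jitTimeMillis, long uptimeMillis) {
        this.gcCount = gcCount;
        this.gcTimeMillis = gcTimeMillis;
        this.jitTimeMillis = jitTimeMillis;
        this.uptimeMillis = uptimeMillis;
    }

    static RuntimeSnapshot capture() {
        long count = 0;
        long time = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not report the value.
            count += Math.max(0, gc.getCollectionCount());
            time += Math.max(0, gc.getCollectionTime());
        }
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        long jitTime = (jit != null && jit.isCompilationTimeMonitoringSupported())
                ? jit.getTotalCompilationTime()
                : -1;
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        return new RuntimeSnapshot(count, time, jitTime, uptime);
    }

    // Time relative to uptime, in percent; -1 when either input is unknown.
    // JIT time is summed across compiler threads, so it can exceed wall-clock time.
    static double percentOfUptime(long millis, long uptimeMillis) {
        if (millis < 0 || uptimeMillis <= 0) {
            return -1;
        }
        return 100.0 * millis / uptimeMillis;
    }

    public static void main(String[] args) {
        RuntimeSnapshot s = capture();
        System.out.printf("GC:  %d collections, %d ms (%.2f%% of uptime)%n",
                s.gcCount, s.gcTimeMillis, percentOfUptime(s.gcTimeMillis, s.uptimeMillis));
        System.out.printf("JIT: %d ms (%.2f%% of uptime)%n",
                s.jitTimeMillis, percentOfUptime(s.jitTimeMillis, s.uptimeMillis));
    }
}
```

Logging a snapshot like this alongside infrastructure metrics makes it easier to see whether a CPU or latency spike lines up with runtime activity before anyone resizes an instance.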

For Java workloads on AWS, the runtime is part of the architecture, not a footnote to it. 

This is one of the most important shifts in how Java teams should approach AWS reviews. The JVM is not plumbing. It is one of the most consequential moving parts in a Java system. The JIT compiler influences steady-state throughput. The garbage collector influences latency consistency. Warmup behaviour influences how quickly services stabilise after a cold start or scale-out event. Runtime observability influences how quickly engineers can isolate root cause in production. All of those are architectural outcomes, not implementation details. 

What changes when the JVM is part of the architecture review?

Once the runtime is on the table, several questions that were previously invisible become answerable. How efficiently is the JVM converting CPU cycles into business throughput? How much of the fleet’s compute spend is absorbed by repeated JIT compilation after restarts and scale-out events? How quickly do fresh pods reach optimised performance? What happens to tail latency when heap pressure increases? 

These are not exotic edge cases. They are routine concerns in any Java deployment on elastic AWS infrastructure where pods are rescheduled, deployments happen frequently, and autoscaling introduces new JVM instances under load. In that environment, the path a workload takes to reach optimised performance matters as much as the peak it eventually achieves. 
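To make the warmup question concrete, a rough probe can time identical batches of work on a fresh JVM and report when batch times settle. The sketch below is deliberately crude: `WarmupProbe`, its stand-in `workload`, and the "within a tolerance of the final-quarter median" rule are all assumptions made for illustration, not a benchmarking methodology. A proper harness such as JMH is the better tool for real measurements.

```java
import java.util.Arrays;

// Hypothetical probe: names, workload, and the steady-state rule are illustrative only.
class WarmupProbe {
    // Stand-in workload; in practice this would be a hot request-handling path.
    static long workload(int n) {
        long acc = 0;
        for (int i = 1; i <= n; i++) {
            acc += (acc ^ i) % 7 + Integer.bitCount(i);
        }
        return acc;
    }

    // Runs `batches` batches of `callsPerBatch` calls and returns elapsed nanoseconds per batch.
    static long[] timeBatches(int batches, int callsPerBatch, int workSize) {
        long[] nanos = new long[batches];
        long sink = 0;
        for (int b = 0; b < batches; b++) {
            long start = System.nanoTime();
            for (int c = 0; c < callsPerBatch; c++) {
                sink += workload(workSize);
            }
            nanos[b] = System.nanoTime() - start;
        }
        // Use the result so the JIT cannot discard the work as dead code.
        if (sink == 42) {
            System.out.println("unlikely sink value");
        }
        return nanos;
    }

    // Index of the first batch whose time is within `tolerance` (e.g. 0.1 = 10%)
    // of the median of the final quarter of batches.
    static int firstSteadyBatch(long[] nanos, double tolerance) {
        int tailStart = nanos.length - Math.max(1, nanos.length / 4);
        long[] tail = Arrays.copyOfRange(nanos, tailStart, nanos.length);
        Arrays.sort(tail);
        long steady = tail[tail.length / 2];
        for (int i = 0; i < nanos.length; i++) {
            if (nanos[i] <= steady * (1.0 + tolerance)) {
                return i;
            }
        }
        return nanos.length - 1;
    }

    public static void main(String[] args) {
        long[] nanos = timeBatches(40, 2_000, 1_000);
        int steadyAt = firstSteadyBatch(nanos, 0.10);
        System.out.printf("first batch: %d us, last batch: %d us, steady from batch %d%n",
                nanos[0] / 1_000, nanos[nanos.length - 1] / 1_000, steadyAt);
    }
}
```

On an elastic fleet, every rescheduled pod or scaled-out instance repeats whatever gap this probe shows between the first batch and the steady state, which is why the path to optimised performance belongs in the review alongside the peak.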

Azul Prime addresses this across three areas that matter most for AWS Java workloads: 

Performance & Tail Latency 

Azul Prime pairs the Falcon JIT compiler with the C4 garbage collector. C4 eliminates the stop-the-world pauses that generate tail latency and complicate SLA commitments, particularly under large heap sizes and sustained load. It is designed to maintain high throughput and low latency regardless of heap size, removing the class of reliability incident that begins as a GC pause and surfaces as a failed request or a retry storm. 

For example, LMAX Group relies on Azul Prime for low-latency trade execution while processing sustained volumes of over 100,000 orders/second, with bursts of almost a million orders/second per exchange at peak. 

Cloud Cost Efficiency 

In horizontally scaled AWS environments, one of the hidden taxes on cloud spend is repeated JIT compilation work. Every time a pod is rescheduled or a new instance spins up under autoscaling pressure, that JVM starts over, rediscovering profiles, recompiling hot methods, working toward a state the rest of the fleet may have reached hours earlier. 

Azul’s Optimizer Hub breaks that pattern by sharing compilation work and warmup profiles across the fleet. That changes the economics of horizontal scale: compiled knowledge stops being a disposable local artifact and becomes a reusable fleet-level asset. 

For example, a recent Forrester Total Economic Impact™ (TEI) study of Azul Prime found that numerous organizations reduced cloud costs, data center spend, and the engineering effort spent on performance tuning, for a 129% ROI. As one VP of digital and data platforms at a financial services company said: “Through more efficient utilization of our Java applications, we’ve been able to reduce the number of physical and virtual servers in use, resulting in a 15% to 20% reduction in cloud costs.” 

Operational Visibility 

Many of the hardest production issues in Java systems do not show up cleanly at the application framework level. They show up as CPU spikes, lock contention, allocation pressure, or latency outliers that cannot be explained from infrastructure metrics alone. When the platform team sees CPU, the SRE team sees latency, and the developer team sees no obvious bug, the runtime is often the blind spot in the middle. 

Azul Prime’s support for Java Flight Recorder, Azul Mission Control, and async-profiler gives teams the runtime observability needed to move from symptoms to root cause rather than guessing whether an issue is instance sizing or GC behaviour. 
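As an illustration of what that runtime observability looks like in code, the sketch below uses the standard JDK Flight Recorder API (`jdk.jfr`) to record garbage collection events around a piece of work and report the longest pause seen. The `GcPauseRecorder` class is hypothetical, and the event name `jdk.GarbageCollection` with its `longestPause` field is taken from OpenJDK; whether a given runtime emits the same event and fields is an assumption, which is why the code checks for the field before reading it.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.util.List;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

// Hypothetical helper: the class and method names are illustrative only.
class GcPauseRecorder {
    // Records GC events while `work` runs and returns every event captured.
    static List<RecordedEvent> recordGcEvents(Runnable work) throws Exception {
        Path file = Files.createTempFile("gc-probe", ".jfr");
        try (Recording recording = new Recording()) {
            // Event name as defined in OpenJDK; assumed to be available on the runtime in use.
            recording.enable("jdk.GarbageCollection").withoutThreshold();
            recording.start();
            work.run();
            recording.stop();
            recording.dump(file);
            return RecordingFile.readAllEvents(file);
        } finally {
            Files.deleteIfExists(file);
        }
    }

    // Longest single pause across the recorded GC events, or zero if none report one.
    static Duration longestPause(List<RecordedEvent> events) {
        Duration longest = Duration.ZERO;
        for (RecordedEvent event : events) {
            if (!"jdk.GarbageCollection".equals(event.getEventType().getName())) {
                continue;
            }
            if (!event.hasField("longestPause")) {
                continue; // field not present on this runtime
            }
            Duration pause = event.getDuration("longestPause");
            if (pause.compareTo(longest) > 0) {
                longest = pause;
            }
        }
        return longest;
    }

    public static void main(String[] args) throws Exception {
        List<RecordedEvent> events = recordGcEvents(() -> {
            // Allocate some short-lived garbage, then request a collection.
            for (int i = 0; i < 200; i++) {
                byte[] junk = new byte[1 << 20];
                junk[0] = (byte) i;
            }
            System.gc();
        });
        System.out.printf("GC events: %d, longest pause: %d ms%n",
                events.size(), longestPause(events).toMillis());
    }
}
```

In production, a continuous recording started with JVM flags and inspected in a tool such as Mission Control is the more usual route; the programmatic form is shown here only because it fits in a few lines.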

For example, with Azul Prime, Mastercard went from nearly 10,000 full garbage collection pauses a day to virtually none. The result was much flatter, more consistent performance, which is essential for real-time fraud detection during enormous transaction volumes. 

Another example: Azul Prime helped Workday eliminate 95% of operational issues, freeing up over 42,000 developer-hours across an 18-month period that would have been spent on performance tuning and troubleshooting. 

How does this connect to the AWS Well-Architected Framework?

AWS’s Well-Architected Framework covers six pillars: performance efficiency, cost optimisation, operational excellence, reliability, security, and sustainability. The JVM touches all six. Performance efficiency and cost optimisation are the obvious ones; the runtime directly influences throughput, CPU usage, and how efficiently the workload converts compute spend into business value. 

But reliability matters here, too. Most reliability incidents in Java systems begin as runtime instability — latency spikes, warmup inconsistency, or erratic GC behaviour — before they surface as failed requests or breached SLAs. And operational excellence depends on having a runtime that is observable enough to explain itself when something goes wrong. 

The question that should sit at the back of every Java architecture review on AWS is not whether the JVM matters. It clearly does. The question is why it is still absent from so many of those reviews, and what gets found when teams finally decide to look. 

Together with Azul Senior Product Manager Jiří Holuša, I will be hosting a webinar on 25 June to explore this topic in greater depth and hands-on detail. Please join us: 

“Unlocking Java Performance: The Hidden Impact of JVM Choices on AWS”

Thursday, 25 June 2026  ·  1:00 PM BST / 2:00 PM CEST  ·  ~60 minutes including Q&A 
Hosted by Jiří Holuša (Senior Product Manager) and Daniel Witkowski (Principal Sales Engineer), Azul