Understanding JVM Latency with jHiccup

Latency is how long an application takes to respond to a request. For Java applications, latency can come from the application’s own performance or from how the JVM delivers a managed runtime environment. This tutorial demonstrates how a simple, non-intrusive tool, jHiccup, can isolate and measure latency that is not directly introduced by the application. With this information, JVM tuning becomes much more manageable.

Video Transcript

Hey, it’s Carl Dea here. I’m a developer advocate at azul.com. Today, I’m going to show you how to compare performance metrics between a standard OpenJDK distribution and Azul Platform Prime’s JDK. The focus of this tutorial is application response times and latency when simulating real-world workloads. To understand the impact of JVM pauses, we have to determine whether the latencies come from the JVM runtime or from OS events. Once you rule out the OS events, you can measure the JVM pauses using the well-known tool called jHiccup. In this tutorial, jHiccup is used as an agent that is attached to your application when you run it.

There’s some required software to begin this tutorial. Go to azul.com/products/components/jhiccup, click the download button, and download it onto the server or remote machine where your application is located. Then go to the DaCapo benchmark suite site and click its download button. This is the application that we will run with jHiccup attached as an agent. It downloads as a single jar file, just a standard Java application. A quick note: jHiccup is an agent that helps you determine latency over time and percentile distributions such as P99 and Pmax.

To begin, assuming you’ve downloaded the two files, here I’ve already downloaded them into a directory called compare_perf, and there I have them listed. Actually, when you download jHiccup, it’s the project itself in a zip file. Once you decompress it, it creates a directory called jHiccup. It’s currently at version 2.0.10. The agent jar file is in the root of that directory.
Before you run the DaCapo benchmark, I just want to make sure that I’m running a standard OpenJDK distribution. In this case, I’m using Azul’s Zulu Community Edition, which is just standard OpenJDK. Another thing to note about the DaCapo benchmark suite: one of the benchmarks in particular is called the h2 benchmark, which simulates banking transactions against an in-memory database.

So in this case, you type in java and dash javaagent colon, followed by the path to the agent jar. I’m going into the local jHiccup directory, and the jar is at the top level of that directory. Then you can set the maximum memory to eight gigs. In this case, I have a small JVM on an AWS instance, and it maxes out at eight gigs of RAM. Then you have dash, capital X, X, colon, plus DisableExplicitGC. That’s a command-line switch that makes the JVM ignore explicit System.gc() calls, so no code can trigger a garbage collection itself. Then, just like any Java jar file that you run, you say dash jar and then the DaCapo benchmark jar, and there are parameters at the end. Dash n is the number of iterations that you want to run, so that you can warm up the JVM and let JIT compilation happen. And then these parameters here, dash s, huge, h2, select the huge workload size and pick the h2 benchmark.

When you run it, it starts to generate the transactions. Here I’m using a two-core processor, or two vCores. This takes a very long time, so expect it to run for possibly an hour. Since I’ve already run this before, I’m just going to fast forward.

So we’re back from our benchmark run, which took a very long time. If you look at the output, there is the jHiccup file, named hiccup dot, some date and timestamp, and then the extension dot hlog on the end. That file will be visualized in another tool that we will talk about later. But now what you want to do is switch to Azul Platform Prime, run jHiccup against that, and generate another jHiccup log so that we can look at the latencies and compare them.
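The launch command described above can be sketched roughly as follows. The flags (-javaagent, -Xmx8g, -XX:+DisableExplicitGC, -jar, -n, -s huge, h2) come from the walkthrough; the jar paths, the iteration count, and the variable names are assumptions for illustration, so substitute the file names from your own downloads.

```shell
#!/bin/sh
# Sketch of the benchmark launch from the walkthrough.
# ASSUMPTIONS (not stated in the video): jar locations and iteration count.
JAVA_BIN="${JAVA_BIN:-java}"                       # point at Zulu or Azul Platform Prime
JHICCUP_JAR="${JHICCUP_JAR:-jHiccup/jHiccup.jar}"  # assumed path inside the unzipped project
DACAPO_JAR="${DACAPO_JAR:-dacapo.jar}"             # placeholder name for the DaCapo jar
ITERATIONS="${ITERATIONS:-5}"                      # assumed warm-up iteration count

# Paths are assumed to contain no spaces, since the command is kept as a string.
CMD="$JAVA_BIN -javaagent:$JHICCUP_JAR -Xmx8g -XX:+DisableExplicitGC -jar $DACAPO_JAR -n $ITERATIONS -s huge h2"

# Always show the command; only execute it when RUN=1 is set.
echo "$CMD"
if [ "${RUN:-0}" = "1" ]; then
  "$JAVA_BIN" -version   # confirm which JDK is about to be measured
  $CMD                   # writes a hiccup.<timestamp>.hlog file when finished
fi
```

Running the script once with JAVA_BIN pointing at the standard OpenJDK and once with it pointing at Azul Platform Prime should leave two hlog files to compare, as in the video.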
I just want to mention what it looks like when it’s finished running the benchmark: you should see something similar to this. Now that we’ve run the benchmark, I’ll just fast forward; I have the log files already downloaded. I also renamed them so that we can see what we have. I ran the test on a two-core processor using Zulu, which is a standard OpenJDK distribution, and Zing, which is the former name of Azul Platform Prime’s JDK. I also ran it on a four-core processor, and I have those log files available.

To view these hlog files, you use a visualization tool called Histogram Log Analyzer. Go to GitHub and go to HdrHistogram slash HistogramLogAnalyzer. What you’ll have to do is clone the application, which we’ll do at this point. I created a directory called Azul Prime pilot, as if we’re piloting an actual application in production. There’s nothing here except the log files. So what you want to do is git clone the application. Assuming you have Maven installed, go into the directory and do a Maven clean package. This builds the jar file for the desktop application. Once it’s built, it’s really simple: just run it like you normally run a desktop application. The artifact is in the target directory, and you just run the jar file.

Once you launch the application, you should see an empty screen. Go ahead and click the Open button on the toolbar. Then, from the Azul Prime pilot directory where I copied and renamed the logs, we can look at the files. You can pick one or you can pick multiple, but in this case I just want to pick the one for the two-core processor with the standard OpenJDK. I just want to load that one. And here you can see latencies within the JVM when it’s doing a garbage collection.
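The clone, build, and launch steps for Histogram Log Analyzer might look something like this. The repository location follows the GitHub path named in the walkthrough; the function names are made up for this sketch, and the name of the jar that Maven produces is not given in the video, so it is passed in as an argument rather than guessed.

```shell
#!/bin/sh
# Sketch of building and launching Histogram Log Analyzer.
# Function names are hypothetical; git, Maven, and a JDK are assumed to be installed.
REPO_URL="https://github.com/HdrHistogram/HistogramLogAnalyzer.git"
REPO_DIR="HistogramLogAnalyzer"

# Clone the project and build the desktop application jar into target/.
build_analyzer() {
  git clone "$REPO_URL" "$REPO_DIR" &&
    (cd "$REPO_DIR" && mvn clean package)
}

# Launch the desktop application; $1 is the jar found under target/.
launch_analyzer() {
  java -jar "$1"
}

# Only touch the network when RUN=1 is set.
if [ "${RUN:-0}" = "1" ]; then
  build_analyzer && ls "$REPO_DIR"/target/*.jar
fi
```

After the build, pick the application jar from the listing and pass its path to launch_analyzer, then open the hlog files from the toolbar as described.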
Here, at the height of this, you can see it’s over 270 milliseconds, which is quite a long time. P99s are quite important for SLAs, or service level agreements, and response times for many companies. For instance, at the 99.99th percentile it’s just above the 225 millisecond mark. So that’s not so great. Next we’ll load the comparison log for the two-core run using Zing. What you want to do is put it in the current tab, because this allows you to compare the two side by side. On the left you see a standard OpenJDK, just like any other vendor’s distribution of standard OpenJDK. On the right you have Azul Platform Prime, which has latencies that are many times lower. Here, for instance, you can see that most of these numbers are under the 10 millisecond range for latency duration. This is over time. Over here at the bottom, you see latency by percentile distribution, which of course is very important to many companies. What’s great about this is that at the 99.99th percentile, it’s still at probably around six milliseconds of latency duration. Even if you go further out from three nines, all the way to Pmax, it’s under 24 or 25 milliseconds.

But even better, I ran it on the four-core processors. Let’s look at the standard OpenJDK first; we’ll open it in a new tab. Over here, you can see it looks very similar to the other graph that we saw, going up to 275, or at least into the 200 range. Now we’ll load the Azul Platform Prime version of the jHiccup log in the current tab, and it’ll show them side by side. Here you see that in the percentile comparison, on Azul Platform Prime, you see things under five milliseconds, which is crazy. Or, I’m sorry, for the P99s: two nines is under 10 milliseconds, three nines is under 15 milliseconds, and Pmax is still well under 17.5 milliseconds of latency.
So this is great; under 20 milliseconds is amazing. If you’re simulating a production system that’s doing a lot of JDBC calls to a database, whether it’s in memory or on disk, you can use jHiccup to characterize the performance, and you can see that Azul Platform Prime has far lower pause latencies than a standard OpenJDK. If you’re interested in more about jHiccup and how it analyzes performance, go over to docs.azul.com/prime and search for jHiccup. It should be the very first option, and there you can learn more about jHiccup and what it actually does under the covers. So there you have it, a quick tutorial on how to compare performance metrics between a standard OpenJDK distribution and Azul Platform Prime.
