Developing Java applications with a microservice architecture and deploying them to the cloud is both flexible and cost-effective. The drawback to having many different JVMs running services is that there is no sharing of compiled code between multiple service instances, and no record of code that has been compiled before. This video explains how these problems can be addressed, and cloud costs reduced, by using Azul’s Cloud Native Compiler, which delivers JIT-as-a-service.
Hello and welcome to this introduction to the Azul Cloud Native Compiler. My name is Simon Ritter. I’m the Deputy CTO at Azul. The idea behind the Cloud Native Compiler is to look at how people deploy applications into the cloud and think about how we can improve the efficiency of JVM applications in that environment. If we look at the idea of a microservice architecture, we often see something like this. We’ve broken up a monolithic application into a number of different services, which then communicate with each other in order to deliver the functionality of the overall application. Now, this is a very good approach to designing applications because it gives us a lot of flexibility. We can have different services developed independently. We can have different teams work on them. We can have people with domain expertise work on the specific services they understand. We can also deliver updates for a particular service independently of other services, so we’re not tied to delivering everything in one go. But one of the really big benefits of this kind of architecture shows up in an example like this one, where I’ve got five different services and service five, in the middle, is being used by all of the other services. That could get to a point where, because of the load being placed on service five, it starts to become a bottleneck: service five is not able to respond as quickly as required, and that slows down the rest of the application. If we were using a monolithic application, we’d be very limited in how we could address that. We would literally have to say to ourselves, right, we need to provide a bigger machine, more CPU cores, more memory for the whole application, just to address this one particular part of it. When we use a microservice architecture, we have greater flexibility, especially if we’re working in the cloud.
What we can do in this case is design it so that service five can be duplicated. That way we can start up another instance of service five, do load balancing amongst the services that are making use of it, and reduce the load so that it no longer becomes a bottleneck. That’s very flexible. It also means we can do this dynamically: as the load goes up, we can spin up new instances of service five, and as the load decreases and we don’t need so many instances, we can shut some of them down, which reduces our cloud infrastructure bill. So this is very flexible and works really well for this kind of application. However, if we look at this, we can see that all of the services are JVM based; they’re using a JVM to execute the code of that particular service. All of the services are coupled in the sense that they know about each other and have well-defined interfaces for how they pass messages and so on. But if we look at the JVMs, they are all completely independent. They have no idea about other JVMs in the system or what’s happening in them; all they’re aware of is the code they’re executing for a particular service. This is limiting, because if we have multiple instances of a particular service, there’s no ability to share information amongst the JVMs running that service. The other thing we will find is that if we look at the performance characteristics of a JVM-based application, we’re going to see something like this graph. This is a very typical graph. We see here the idea of warm-up time associated with our application. And that’s because the JVM is a virtual machine, the Java Virtual Machine. The instructions we pass to it are not native instructions for the platform we’re running on: not Windows or Linux, Intel or Arm.
What we’re actually doing is passing bytecodes that need to be converted into those instructions for whichever platform we’re using. Now, that takes time, and initially we run in what’s called interpreted mode, where we take each bytecode and convert it to those instructions as we see it. To improve on that, we use a just-in-time compiler, or JIT compiler, which takes methods that are being used frequently and compiles them into native instructions so we don’t have to interpret each bytecode. That improves performance, but it takes time to find out which methods are being used frequently. We also have two different compilers that we use. The first of these, called C1, is a very fast compiler that generates code as quickly as possible. We then profile how the application uses that code and recompile it a second time using C2, or in the case of Azul Platform Prime, Falcon. That produces much more heavily optimized code for even better performance. So we end up with a period of time that it takes to get to the optimum level of performance for our application. Again, if we look at running this in the cloud across multiple services, the problem is that the JVM has no knowledge, not only of other JVMs in the system, but also of previous runs of the application. We run our application, our service, the first time, and we go through the process of identifying which code to compile, compiling it and recompiling it. When we run the service a second time, we have to do the same thing again: identify the same code, compile and recompile it again. When we run it a third time, we do exactly the same thing, identifying the same methods, compiling them, recompiling them. Exactly the same work every time, which is very wasteful. If we look at the architecture of the JVM, there are a number of different parts in there.
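As a side note, the warm-up behaviour described above is easy to observe on a standard HotSpot JVM. The sketch below (class name and loop counts are my own, purely illustrative) makes a small method hot; running it with the standard -XX:+PrintCompilation flag prints a log line each time the method is compiled, first at the C1 tiers and later by C2.

```java
// Illustrative warm-up demo. Run with:
//   java -XX:+PrintCompilation WarmupDemo
// to watch HotSpot compile sum() with C1 first and C2 later.
public class WarmupDemo {
    // A small, frequently called method -- a typical JIT compilation candidate.
    static long sum(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        long result = 0;
        // Call the method many times so the JVM's invocation counters
        // identify it as hot and trigger tiered compilation.
        for (int i = 0; i < 100_000; i++) {
            result = sum(1_000);
        }
        System.out.println(result); // 499500 (sum of 0..999)
    }
}
```

The first log entries for sum() show low tier numbers (C1); once profiling data has accumulated, a tier 4 (C2) entry appears for the same method, which is exactly the two-compiler warm-up just described.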
So there’s obviously the idea of loading classes and verifying them, making sure they do the things they should and not the things they shouldn’t. We have the runtime data area, which includes things like the heap, the stack and various other areas used internally by the JVM. And then we’ve got the actual execution engine, which deals with either interpreting the bytecodes or running the JIT compiler to generate native code that can be used in place of the bytecodes. Looking at that, what we decided was: okay, this JIT compiler is the area we can focus on, especially in a cloud environment. We can take advantage of understanding how it works and maybe make some changes that will improve the performance of our systems. And that’s effectively what we’ve done. We said, why don’t we take the JIT compiler out of the JVM and make it into a centralized service? Whether we’re running in the cloud or in our own data center, we can put a centralized service in our system so that each of the JVMs running our services, rather than having to do the compilation work itself, can pass that work to our JIT-as-a-service, the Cloud Native Compiler, have it execute there and return the result to the JVM. That service can be shared by many JVMs running different services, or even multiple instances of the same service, and take advantage of all the benefits of how it works. The idea is that because the Cloud Native Compiler runs as a service in the background, it’s running all the time. So when we start a service, restart a service, or start up multiple instances of the same service, a centralized service that’s running all the time can have a memory of what’s happened in the past.
What that leads to is a number of quite significant advantages. The first is that, as I say, we’re effectively moving compilation to a centralized service. The important thing about that is it immediately reduces the load on the individual JVMs. Because if you think about it, as the code is being compiled, as we warm up, that’s reducing the resources available to the application code itself. If you’re running in a container, let’s say you’ve provisioned two vCores to that container, then what you’ll find is one of those vCores ends up just doing compilation. You’ve effectively halved the compute resources available to your application, which will reduce the throughput. Not just because it’s interpreting bytecodes or using C1-compiled code, but because it’s also having to do compilation at the same time. So it has a significant impact on performance. By shifting that work to a centralized service, we avoid that problem. Now we’ve got both vCores in that container available to run our application code. The other thing we can do is cache the code that we compile. Rather than just throwing it away, we keep it in our centralized service, and that way, when the same service starts up a second time and requests a particular method from the Cloud Native Compiler, rather than having to compile that code again we simply take it out of the cache and return it straight away. That improves the speed at which we can start up our application, because now we’re getting that code very quickly, and it reduces the workload on the Cloud Native Compiler, which means we’re not using as many resources. So it’s sharing resources, not just amongst multiple JVMs, but in fact amongst multiple runs of the same code.
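The caching idea just described can be sketched in a few lines. This is a minimal, hypothetical model, not Azul’s actual implementation or API: the first request for a method pays the compilation cost, and identical requests from later runs or other instances are served straight from the cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hedged sketch of the code cache behind a centralized JIT service.
// All names here are illustrative only.
public class CompileCacheSketch {
    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();
    final AtomicInteger compilations = new AtomicInteger();

    // Stand-in for the expensive optimizing-compilation step.
    private byte[] compile(String methodId) {
        compilations.incrementAndGet();
        return ("native:" + methodId).getBytes();
    }

    // Compiles on a cache miss; returns the cached code on a hit.
    public byte[] getCompiledCode(String methodId) {
        return cache.computeIfAbsent(methodId, this::compile);
    }

    public static void main(String[] args) {
        CompileCacheSketch service = new CompileCacheSketch();
        service.getCompiledCode("OrderService::total"); // first run: compiles
        service.getCompiledCode("OrderService::total"); // restart: cache hit
        System.out.println(service.compilations.get()); // prints 1
    }
}
```

Two requests, one compilation: that one saved compilation is exactly what a restarted service, or a second instance of the same service, gains from the shared cache.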
The other thing we can do, by having more resources available to our Cloud Native Compiler, is have it work on more complex optimizations, and we can do that in the background. We can take a method with a set of profiling information and say, right, let’s compile and optimize it in one particular way so we can deliver it to the service straight away. But let’s also, while we’ve got some spare resources, have the compiler take different approaches to optimizing that code, which takes longer but delivers better levels of performance. Then, the next time the service needs that particular piece of code, we can return even more heavily optimized code for better performance. And potentially, if we wanted to, we could also make the JVM aware that we’ve got a better version of that code, and it could download it if it wanted to. What that delivers is application warm-up that is much faster than we would see running all these individual JVMs, and greater throughput from the start for all of these different services. So it’s a win in terms of performance, by reducing the amount of resources we need within each JVM and also sharing the results through a centralized JIT compiler service. So that’s a quick overview of the Azul Cloud Native Compiler.
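That background re-optimization can be modelled in the same hypothetical style as before. Assuming a simple two-tier scheme (fast code first, optimized code later; the names, tiers and queue are all my own, not Azul’s design), the service answers the first request immediately and upgrades the cached code when spare capacity allows:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hedged sketch of background re-optimization in a centralized JIT service.
// All names and tiers here are illustrative only.
public class ReoptimizingServiceSketch {
    enum Tier { FAST, OPTIMIZED }
    record Code(Tier tier, byte[] nativeCode) {}

    private final Map<String, Code> cache = new HashMap<>();
    private final Deque<String> reoptQueue = new ArrayDeque<>();

    // First request: return quickly compiled code straight away and queue
    // the method for heavier optimization; later requests see the cache.
    public Code requestCode(String methodId) {
        Code code = cache.get(methodId);
        if (code == null) {
            code = new Code(Tier.FAST, ("fast:" + methodId).getBytes());
            cache.put(methodId, code);
            reoptQueue.add(methodId);
        }
        return code;
    }

    // Stand-in for the background worker that re-optimizes queued methods
    // using the service's spare capacity.
    public void runBackgroundOptimizer() {
        String methodId;
        while ((methodId = reoptQueue.poll()) != null) {
            cache.put(methodId,
                new Code(Tier.OPTIMIZED, ("opt:" + methodId).getBytes()));
        }
    }

    public static void main(String[] args) {
        ReoptimizingServiceSketch svc = new ReoptimizingServiceSketch();
        System.out.println(svc.requestCode("m").tier()); // FAST
        svc.runBackgroundOptimizer();
        System.out.println(svc.requestCode("m").tier()); // OPTIMIZED
    }
}
```

The JVM never waits on the slower compilation path; it just receives a better version of the same method the next time it asks, which is the warm-up win described above.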