Wednesday, 22 June 2016

The Fault in Our JARs: Why We Stopped Building Fat JARs

The Fault in Our JARs: Why We Stopped Building Fat JARs: "The Fault in Our JARs: Why We Stopped Building Fat JARs
JUN 16, 2016 / BY JONATHAN HABER

inShare
10
HubSpot’s backend services are almost all written in Java. We have over 1,000 microservices constantly being built and deployed. When it comes time to deploy and run one of our Java applications, its dependencies must be present on the classpath for it to work. Previously, we handled this by using the maven-shade-plugin to build a fat JAR. This takes the application and all of its dependencies and bundles them into one massive JAR. This JAR is immutable and has no external dependencies, which makes it easy to deploy and run. For years this is how we packaged all of our Java applications and it worked pretty well, but it had some serious drawbacks.

 Fat JAR Woes

 The first issue we hit is that JARs are not meant to be aggregated like this. There can be files with the same path present in multiple JARs and by default the shade plugin includes the first file in the fat JAR and discards the rest. This caused some really frustrating bugs until we figured out what was going on (for example, Jersey uses META-INF/services files to autodiscover providers and this was causing some providers to not get registered). Luckily, the shade plugin supports resource transformers that allow you to define a merge strategy when it encounters duplicate files so we were able to work around this issue. However, it’s still an extra "gotcha" that all of our developers need to be conscious of.

The other, bigger issue we ran into is that this process is slow and inefficient. Using one of our applications as an example, it contains 70 class files totalling 210KB when packaged as a JAR. But after running the shade plugin to bundle its dependencies, we end up with a fat JAR containing 101,481 files and weighing in at 158MB. Combining 100,000 tiny files into a single archive is slow. Uploading this JAR to S3 at the end of the build is slow. Downloading this JAR at deploy time is slow (and can saturate the network cards on our application servers if we have a lot of concurrent deploys)."



'via Blog this'

No comments:

Post a Comment