Published March 28, 2023

Using Gradle’s “application” plugin to track a growing project’s dependencies

 

When we build a car, we don’t always want to (or know how to, or have time to) reinvent each wheel from scratch. Similarly, when we create software, we reuse pieces of code (libraries) created by others. Great so far, but such libraries also depend on other libraries. And so on and so on.

The growing number of these dependencies is a pain point in most actively developed software products. An even greater annoyance is that we often don’t realise what we bring into our product when we include a new library. With this lack of knowledge comes a growing risk of bugs, security holes, compatibility issues, performance drops and, sometimes, licensing or legal problems.

There are tools like Dependabot which help keep track of these dependencies and sometimes even inform you about new (or old) known issues in them. But these tools have some limitations and don’t always support the build system we’re using. For example, Gradle is one of the most widely used build systems for JVM-based applications, and its build configurations are tough to analyse statically: they are programs, and only by running them can we obtain the list of included libraries.

So how do we deal with this?

Analysing dependencies

Gradle provides us with the dependencies task (for example, ./gradlew dependencies --configuration runtimeClasspath), which can list the classpath of the built artifacts as a dependency tree.

We could parse this output and store it in a report after each build, then compare it with the previous release of the app (or the last build or merged pull request). The output provides a lot of information: how the libraries depend on each other, where they came from, what their GAV (Group:Artifact:Version) coordinates are and so on. The task is also quick and does not actually build the project (although it still needs to resolve the dependencies, which means downloading at least their metadata).

This solution has one drawback: the output is supposed to be human-readable, and there is no official specification in this regard. Parsing can be tricky, and we can only hope that all Gradle versions we have used so far (and will be using in the future) will format that list the same way.
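To illustrate how fragile such parsing is, here is a minimal sketch of a parser for a single line of that output. It assumes that dependency lines look like `+--- group:artifact:version`, optionally followed by `-> resolvedVersion` and markers such as `(*)`. That shape is an observation of typical output, not a specification, and `Gav` and `parseTreeLine` are hypothetical names.

```kotlin
// Hypothetical helper: extracts GAV coordinates from one line of the
// human-readable `dependencies` output. The line format is an assumption
// based on typical output, not on any official specification.
data class Gav(val group: String, val artifact: String, val version: String)

// Optional tree indentation ("|" and spaces), a branch marker ("+---" or "\---"),
// then group:artifact:version and an optional " -> resolvedVersion".
private val treeLine =
    Regex("""^[| ]*[+\\]--- ([^:\s]+):([^:\s]+):([^\s]+)(?: -> ([^\s]+))?""")

fun parseTreeLine(line: String): Gav? {
    val match = treeLine.find(line) ?: return null
    val (group, artifact, requested, resolved) = match.destructured
    // When a version conflict is resolved, the line reads "requested -> resolved";
    // an unmatched optional group is returned as an empty string.
    return Gav(group, artifact, resolved.ifEmpty { requested })
}
```

Even this small sketch already misses cases, for instance lines where no version is declared before the arrow, which is exactly the kind of surprise that makes this approach risky.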

There is another solution. We can take advantage of the “application” plugin, which most JVM-based applications built with Gradle already apply. It provides a few vital tasks:

  • ./gradlew run for running the app locally (with optional --args="something something")
  • ./gradlew distTar for creating a deployment structure with all the libraries and a launcher. There’s a second version, ./gradlew distZip, that does basically the same, but the output is a zip file, not a tar. The main drawback of a zip file is the loss of file modes, and we really want these “rwxr-xr-x” permissions for our launcher. Both distTar and distZip are called during a regular ./gradlew build, but we can call them separately as well; then they will only build our application without running the tests, static analysis or other checks. IMPORTANT: Normally, the filename has the form <module-name>.tar, but if we set the project.version property, the version will also be added to the filename: <module-name>-<version>.tar

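We will take a look at distTar today. It creates a tar file in the build/distributions directory that contains a bin directory with the launcher scripts and a lib directory with the application jar and all the libraries it depends on.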

And this is indeed great, as we can now unpack it on our destination machine or, with three lines of Dockerfile, create a runnable image.

No longer do we have to create any fat-jars, shadow-jars or anything artificial. Gradle takes care of it for us; we just run a convenient shell script.

That is the intended use case for this plugin. Now, let’s see how we can take advantage of it to track our dependencies.

Let’s inspect such a tar file and list its lib directory to find the actual dependencies. This way we lose the information on GAV coordinates and on how the libraries depend on each other, but we can live without it for now.

Of course, we don’t want to do it manually each time; we’re programmers, and our objective is to make a computer do it for us. That means we will write a program that builds two different versions of our application and compares the contents of the resulting tar files. And since we are JVM developers, we will write a Kotlin program for that.

Launching Gradle from JVM

Gradle itself is an open-source JVM application, so there should be a way to launch it from our program directly. That has one showstopper, though: we don’t know which Gradle version the application we are analysing uses. We will be building the application from various points in its history, and most probably the Gradle version used within the project has changed along the way. So instead, we need to run the ./gradlew wrapper that should be in our repository.

The basic code that would do this for us looks something like this:
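A minimal sketch of such a launcher follows. Only the Launcher name and the use of ProcessBuilder come from this article; the `LaunchResult` type and the `run` signature are assumptions made for illustration.

```kotlin
import java.io.File
import kotlin.concurrent.thread

// Result of a finished process: its exit code plus everything it printed.
data class LaunchResult(val exitCode: Int, val stdout: String, val stderr: String)

// Hypothetical helper; deliberately simple and not a general-purpose process runner.
class Launcher(private val workDir: File) {

    fun run(vararg command: String): LaunchResult {
        val process = ProcessBuilder(*command)
            .directory(workDir)
            .start()

        // STDERR is drained on a separate thread so that a full pipe buffer on
        // one stream cannot block the process while we are reading the other.
        var stderr = ""
        val stderrReader = thread {
            stderr = process.errorStream.bufferedReader().readText()
        }
        val stdout = process.inputStream.bufferedReader().readText()
        stderrReader.join()

        return LaunchResult(process.waitFor(), stdout, stderr)
    }
}
```

With this in place, building the distribution could look like `Launcher(repoDir).run(File(repoDir, "gradlew").absolutePath, "distTar")`, where `repoDir` is the directory holding the cloned repository. Passing an absolute path to the wrapper avoids relying on how a relative command is resolved on a given platform.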


ProcessBuilder is a standard java.lang class, so it comes for free. It’s worth mentioning that the code above is not general-purpose (reading two streams, STDOUT and STDERR, concurrently is the tricky part), but in our specific case it will be good enough.

Reading tar file content

We have our application built; we got our tar file. Now let’s list its content. We could reuse the Launcher class we created earlier and call the shell program tar -tvf build/distributions/..., but then we would need to parse its output, and that’s not the most convenient way. Instead, we can use some standard utility libraries like org.apache.commons:commons-compress. This way, we also prove the point that even the simplest of programs like to grow with dependencies on third-party libraries 🙂

The following code will look into all tar files built within our project and its subprojects (there can be multiple applications inside one repository, after all) and then read the entries of each such file.
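A minimal sketch of that step, assuming org.apache.commons:commons-compress is on the classpath. The function names `findDistributionTars` and `listLibraries` are hypothetical, and so is the choice to identify libraries as the .jar entries stored under a lib directory.

```kotlin
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream
import java.io.File

// Finds every tar file located in a build/distributions directory anywhere
// below the repository root, which also covers subprojects.
fun findDistributionTars(repoRoot: File): List<File> =
    repoRoot.walkTopDown()
        .filter { it.isFile && it.extension == "tar" }
        .filter {
            it.parentFile.name == "distributions" &&
                it.parentFile.parentFile?.name == "build"
        }
        .toList()

// Returns the file names of all jars stored under a lib directory of the archive.
fun listLibraries(tar: File): Set<String> =
    TarArchiveInputStream(tar.inputStream().buffered()).use { input ->
        // nextEntry returns null once the end of the archive is reached.
        generateSequence { input.nextEntry }
            .filter { !it.isDirectory }
            .map { it.name }
            .filter { "/lib/" in it && it.endsWith(".jar") }
            .map { it.substringAfterLast('/') }
            .toSet()
    }
```

The sequence is consumed inside `use`, so the stream is still open while the entries are read and is closed right after.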

Checking out specific commits and/or branches to build

Here we need to make some assumptions that may vary from project to project, such as the version control system in use. The most common one today is Git, and it’s safe to assume that each of us has it installed (and configured with proper credentials and whatnot) on our development machines. We could use a JVM Git client like JGit from Eclipse, but since we already have the code to launch (and wait for) Gradle, we can reuse it for git as well:
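A minimal sketch of the checkout step. To keep the example self-contained it calls ProcessBuilder directly instead of reusing the launcher; the function names and the exact choice of git options are assumptions.

```kotlin
import java.io.File

// Builds the git command lines that put a clone at the given ref.
// `ref` is a commit id, or anything else git can resolve in the clone
// (for a branch that exists only on the remote, that would be origin/<branch>).
fun gitCommandsFor(ref: String): List<List<String>> = listOf(
    listOf("git", "fetch", "--all", "--quiet"),
    // --detach keeps the clone off any local branch;
    // --force discards local changes left over from a previous build.
    listOf("git", "checkout", "--force", "--detach", ref),
)

// Runs the commands in the cloned repository and fails on the first error.
fun checkout(repoDir: File, ref: String) {
    for (command in gitCommandsFor(ref)) {
        val exitCode = ProcessBuilder(command)
            .directory(repoDir)
            .inheritIO() // let git print directly to our console
            .start()
            .waitFor()
        check(exitCode == 0) { "${command.joinToString(" ")} failed with exit code $exitCode" }
    }
}
```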

It is advisable not to work on our regular directory, where we perform development work, but instead have the repository cloned to some temporary directory (or, e.g. something in $USER_HOME/.dependencies-analyzer/). That way, our regular work won’t be affected by these builds.

The last building block is selecting the two versions we want to compare. This depends on the service we use and the commit conventions in our organisation. If it is GitHub or GitHub Enterprise, and we are using the pull requests feature, I’d suggest the gh command line tool, which can format its output as JSON. Something like this:
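A minimal sketch of that call, assuming gh is installed and authenticated. The function names are hypothetical; the requested JSON fields are the ones this article uses later (mergeCommit and headRefName) plus number and title.

```kotlin
import java.io.File

// Builds the gh invocation that lists pull requests as JSON.
fun ghPrListCommand(state: String = "merged", limit: Int = 1000): List<String> = listOf(
    "gh", "pr", "list",
    "--state", state,
    "--limit", limit.toString(),
    "--json", "number,title,mergeCommit,headRefName",
)

// Runs gh inside the cloned repository and returns the raw JSON it prints.
fun listPullRequestsJson(repoDir: File, state: String = "merged"): String {
    val process = ProcessBuilder(ghPrListCommand(state))
        .directory(repoDir)
        .redirectError(ProcessBuilder.Redirect.INHERIT) // show gh errors on our console
        .start()
    val json = process.inputStream.bufferedReader().readText()
    check(process.waitFor() == 0) { "gh pr list failed" }
    return json
}
```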

With its limit raised to 1000 (the default is 30), the gh pr list command will list up to the last 1000 pull requests, which should be good enough for starters.

Then, we can easily map it to some data object (using Kotlin Serialization):
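A minimal sketch of such a mapping. It assumes the Kotlin Serialization compiler plugin and the kotlinx-serialization-json library are set up in the build, and that gh reports the merge commit as an object with an `oid` field (null for pull requests that are not merged). The class and function names are hypothetical.

```kotlin
import kotlinx.serialization.Serializable
import kotlinx.serialization.decodeFromString
import kotlinx.serialization.json.Json

// Assumed shape of the mergeCommit field in the gh output.
@Serializable
data class MergeCommit(val oid: String)

@Serializable
data class PullRequest(
    val number: Int,
    val title: String,
    val headRefName: String,
    // Absent or null for pull requests that have not been merged.
    val mergeCommit: MergeCommit? = null,
)

// Tolerate extra fields in case more of them are requested from gh later.
private val json = Json { ignoreUnknownKeys = true }

fun parsePullRequests(text: String): List<PullRequest> = json.decodeFromString(text)
```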

Once we have such references (the commit id from the mergeCommit field for merged PRs and the branch name from headRefName for open ones), we can run the sequence of previously created functions for each of them:

(It’s a smart move to store such reports for future use, especially if the builds take a long time.)
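One way to store the reports is a small file cache keyed by commit id. The sketch below is an assumption about how that could look, not the only option: `ReportStore`, the `Report` alias and the tab-separated file format are all hypothetical, and the `generate` parameter stands in for the checkout, ./gradlew distTar and tar-reading steps.

```kotlin
import java.io.File

// Tar file path mapped to the jar file names found in its lib directory.
typealias Report = Map<String, Set<String>>

class ReportStore(private val cacheDir: File) {

    // Returns the stored report for a commit, or generates and stores it.
    fun reportFor(commit: String, generate: (String) -> Report): Report {
        val cached = File(cacheDir, "$commit.txt")
        if (cached.isFile) return load(cached)
        return generate(commit).also { save(cached, it) }
    }

    private fun save(file: File, report: Report) {
        file.parentFile.mkdirs()
        // One line per jar: "<tar path>\t<jar name>".
        // A tar file without any jars is not recorded in this simple format.
        val lines = report.flatMap { (tar, jars) -> jars.map { "$tar\t$it" } }
        file.writeText(lines.joinToString("\n"))
    }

    private fun load(file: File): Report =
        file.readLines()
            .filter { it.isNotBlank() }
            .map { it.split('\t', limit = 2) }
            .groupBy({ it[0] }, { it[1] })
            .mapValues { it.value.toSet() }
}
```

A second call with the same commit id then reads the file instead of triggering another build.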

We are now ready to generate the reports for all the merged pull requests:

Comparing two versions

We now have all the functionality needed to build all previous versions of our application, and we can compare them to see which dependencies have been introduced and which have been removed:
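A minimal sketch of the comparison, assuming each version is represented as a map from tar file path to the set of jar names inside it. `TarDiff` and `compareReports` are hypothetical names.

```kotlin
// What changed inside one tar file between two versions.
data class TarDiff(val removed: Set<String>, val added: Set<String>)

// Compares two reports and keeps only the tar files whose content differs.
fun compareReports(
    old: Map<String, Set<String>>,
    new: Map<String, Set<String>>,
): Map<String, TarDiff> =
    (old.keys + new.keys)
        .associateWith { tar ->
            // A tar file missing on one side is treated as an empty set of jars.
            val before = old[tar].orEmpty()
            val after = new[tar].orEmpty()
            TarDiff(removed = before - after, added = after - before)
        }
        .filterValues { it.removed.isNotEmpty() || it.added.isNotEmpty() }
```

Because the jar file names include the version, an upgraded library shows up as one removal and one addition.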

Such code will output something like:

That is not too readable yet, so the last part is formatting it as Markdown:
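A minimal sketch of such a formatter. The heading level and the bullet markers are assumptions; the wording of the lines follows the report shown in this article, and `TarDiff` and `formatMarkdown` are hypothetical names.

```kotlin
// What changed inside one tar file between two versions.
data class TarDiff(val removed: Set<String>, val added: Set<String>)

// Renders the changes introduced by one pull request as a Markdown section.
fun formatMarkdown(prNumber: Int, prTitle: String, diffs: Map<String, TarDiff>): String =
    buildString {
        appendLine("## Dependencies change for PR $prNumber ($prTitle)")
        appendLine()
        for ((tar, diff) in diffs) {
            appendSection("Removed from", tar, diff.removed)
            appendSection("Added to", tar, diff.added)
        }
    }

// Writes one top-level bullet with a nested bullet per jar; skipped when empty.
private fun StringBuilder.appendSection(label: String, tar: String, jars: Set<String>) {
    if (jars.isEmpty()) return
    appendLine("* $label $tar:")
    jars.forEach { appendLine("  * $it") }
}
```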

This gives us a nicely formatted report of the application’s history:

Dependencies change for PR 2 (Add parsing of args)

    • Added to build/distributions/dependencies-analyzer.tar:
      • kotlinx-cli-jvm-0.3.5.jar

Dependencies change for PR 3 (Reading tarfile)

    • Added to build/distributions/dependencies-analyzer.tar:
      • commons-compress-1.22.jar

Dependencies change for PR 5 (Calling cli)

    • Removed from build/distributions/dependencies-analyzer.tar:
      • kotlin-stdlib-jdk8-1.7.20.jar
      • kotlin-stdlib-jdk7-1.7.20.jar
      • kotlin-stdlib-1.7.20.jar
      • kotlin-stdlib-common-1.7.20.jar
    • Added to build/distributions/dependencies-analyzer.tar:
      • kotlinx-serialization-json-jvm-1.4.1.jar
      • kotlinx-serialization-core-jvm-1.4.1.jar
      • kotlin-stdlib-jdk8-1.7.21.jar
      • kotlin-stdlib-jdk7-1.7.21.jar
      • kotlin-stdlib-1.7.21.jar
      • kotlin-stdlib-common-1.7.21.jar

Dependencies change for PR 10 (Upgrade kotlin and gradle)

    • Removed from build/distributions/dependencies-analyzer.tar:
      • kotlin-stdlib-jdk8-1.7.21.jar
      • kotlin-stdlib-jdk7-1.7.21.jar
      • kotlin-stdlib-1.7.21.jar
      • kotlin-stdlib-common-1.7.21.jar
    • Added to build/distributions/dependencies-analyzer.tar:
      • kotlin-stdlib-jdk8-1.8.10.jar
      • kotlin-stdlib-jdk7-1.8.10.jar
      • kotlin-stdlib-1.8.10.jar
      • kotlin-stdlib-common-1.8.10.jar

And that’s a good place to stop today. As an exercise, we can now perform such comparisons not only for MERGED but also for OPEN pull requests. And post these results as comments to these PRs. Have fun hacking!