Do some puzzles sometimes

Wait, I here you say “This guy writing there, didn’t he write about not having time and energy for site projects?”:

Well, yes, sadly that’s the case and I did cancel a couple of things for good (BTW, I am looking for someone kind to take over the leadership of Aachens Java User Group EuregJUG, maybe there will be a time again for meetings).

On the other hand, I try not to go completely nuts and the last couple of weeks have not made this easy. It’s grey, cold (which I don’t even mind), but wet, wet, wet, muddy, muddy and then some: Running and especially cycling is a bit hard. Normally that would keep me on the track.

Instead, I started puzzle on Advent of Code again. I have a dedicated repository for my solutions, find it at michael-simons/aoc. Why more coding?

What I really like about the puzzles is the fact that they absolutely have nothing todo with frameworks, annotations, microservices, DDD platforms, build systems or moving JSON between endpoints. Just plain, logical puzzles to tinker with. A bit like doing a crossword thing each day.

In my repository, I tackled every puzzle with what’s available in a language, no libraries. Setup in a way that someone who want’s to run needs only to install one thing, the language. The repository contains Java (of course), Kotlin, Rust, Ruby, Go, SQL, Cypher, PHP, Scala, Typescript and some other things. I guess I managed to be more idiomatic in some languages than others, though. In most of the puzzles you’ll realize that you need various ideas, mathematical concepts, algorithms again and again. It’s helpful to compare how easy or hard is to implement those in various languages.

Up until this week I did run some circles around languages like Clojure or Lisp in general or things like Haskell. I find them intimidating at first sight and I don’t have a university background with knowledge about their theoretical concepts.

I started to read up on Clojure a bit and while I do not yet understand everything, I was able to create the following script

(def input (clojure.string/trim-newline (slurp "input.txt")))
 
(def freq (frequencies input))
(def starOne
    (- (get freq \() (get freq \))))
(println starOne)
 
(def starTwo
    (count
        (take-while (fn [p] (not= p -1))
            (reductions (fn [sum num] (+ sum num)) 0 (map {\( 1 \) -1} input)))))
(println starTwo)

It reads an input file into memory and uses the frequencies function to compute the frequencies of different characters in it. It assigns the difference of occurrences of `(` and `)` to the a variable named starOne and prints it.

The second part counts the number of iterations needed to map all opening brackets to 1 and closing brackets to -1 and summing them up (in several reductions) until one of them hits -1.

Many important things are already in there: Reading files, working with maps and lists, applying functions to every item in a list, calling functions and defining anonymous functions. I can work with that.

Fast forward a couple of days, having a look at I Was Told There Would Be No Math. Well, the math is actual super simple in that. Read a file with lines like 2x3x4. They give you the dimension of a parcel (length, width, height).

Compute according to some rules paper and ribbon needed to wrap those parcels or presents. Paper area is given by “find the surface area of the box, which is 2*l*w + 2*w*h + 2*h*l. The elves also need a little extra paper for each present: the area of the smallest side.”

The good object oriented person and the happy Java 15 user with preview feature I am I started to create a record to model that thing:

record Present(int l, int w, int h) {
    int surfaceArea() {
        return 2 * (l * w + w * h + h * l);
    }
 
    int slack() {
        return Math.min(l * w, Math.min(w * h, h * l));
    }
 
    int volume() {
        return l * w * h;
    }
 
}

I mean, basic math, right? Solution to the first question is just summing surface area and slack up like var starOne = presents.stream().mapToInt(p -> p.surfaceArea() + p.slack()).sum();. Easy, right?

Always thinking in objects primes you to things. Here to length, width, height. I should have realized what I am doing when I computed the smallest area (computed 3 areas and chose the smallest one): It doesn’t matter which value is assigned to length, width and height: I can just sort them, take the two smallest and multiple them. I realized that when I computed the smallest perimeter of the parcel:

int smallestPerimeter() {
    return Stream.of(l, w, h).sorted().limit(2).mapToInt(v -> 2 * v).sum();
}

Enter the Clojure solution. It felt very clumsy to define a type just for that.

Here’s what I came up with instead

(use '[clojure.string :only (trim-newline split-lines split)])
 
(def input 
    "I use again slurp to read the file and two library functions
     to trim the newlines and split the whole thing into a list. 
     An anonymous function is used on each line to split line by 
     the letter `x` and map the values to an int. Those ints are than
     sorted and the variable `input` will be a lazy list of int arrays."
    (map (fn [v] (sort (map bigint (split v #"x"))))
    (split-lines (trim-newline (slurp "input.txt")))))
 
(defn paper
    "As I know that the array is sorted, I can deconstruct it into the 
     3 values contained. The riddle for the paper is that the smallest area 
     is in there 3 and not 2 times. As the smallest area is defined by the first
     two elements, we multiple them 3 instead of 2 times like the rest."
  [dimensions]
  (let [[l w h] dimensions]
    (+ (* 3 l w) (* 2 l h) (* 2 w h))))
 
(defn ribbon  
    "Same idea as above: The smallest perimeter is defined by the 2 smallest values.
     The volume is of course the product of all 3."
  [dimensions]
  (let [[l w h] dimensions]
    (+ (* 2 l) (* 2 w) (* l w h))))
 
(println (reduce + (map paper input)))
(println (reduce + (map ribbon input)))

Find my prose inside the program.

I do like the approach not using to many data structures.

In the end: It is good to have a look outside the box sometimes and reset your brain with fresh ideas.

Thanks to Tim, Stefan and Jan for the multiple times in which you brought Clojure into my bubble.

Title picture by Bannon Morrissy on Unsplash

| Comments (1) »

03-Feb-21


Minecraft terminology for Java developers

My eldest kid – 11 at the time of writing – has been into Minecraft for some time now. I tried to motivate him on various ages to do some kinda programming with me. We tried Scratch, Lego Mindstorms (one of the few sets that is gathering dust) and a couple of other things. We didn’t have much fun with any of those. The nicest thing done together have been actually a couple of the Advent of Code challenges in which I tried to reduce to setup of things to a bare minimum (aka Texteditor).

I don’t blame the kid for failure at all… I am not a good teacher but what is worse, I have not much interest myself in tinkering with a game, never had (basically the same with regards to cycle, I prefer doing the thing actually), so I was quite happy that the kid himself wants to do stuff.

A bit of terminology

I was a bit surprised how many different kinds of Minecraft client and servers are out there:

First of all, there’s the Java based, “vanilla” server you can download here: original, “vanilla” Minecraft Java server. This edition of the server does not support custom plugins. It does however support Minecraft Forge. Forge is a modification loader for the vanilla server. You would program against the Forge API and that API encapsulates away the interaction with the server code.

Only the Minecraft Java edition client can connect against the Java server. This is a bit sad, as many friends of my kid would only have access to a gaming console. The Minecraft edition available on Switch, X-Box or Playstation does not support the Java server. They only connect to something called “Bedrock” edition.

Custom mods with MCreator

Back to Forge: Forge as the modloader must be installed into your Java Minecraft server via an installer that fits the server version. You get the installers here. After that you can pick and choose from a plethora of already existing mods at curseforge.

As far as I understood this, mods can also be installed client side only, but I don’t see the point of that when playing with multiple people on a server.

Anyway, we want to create our own mods. Mods can change many aspects of the games. They can include new commands, new recipes, blocks, bioms, creates, enchantments and more.

Adam Ness pointed me to MCreator. From the site: “MCreator is open source software used to make Minecraft Java Edition mods, Minecraft Bedrock Edition Add-Ons, and data packs using an intuitive easy-to-learn interface or with an integrated code editor.” This was exactly what I was looking for. A full fledged IDE but dedicated to Minecraft and the environment.

After download MCreator will setup a workspace for you. It uses Gradle behind the scenes to setup everything, including a development client and server. The nice thing here: You or your kid won’t need a Minecraft client license at that point. The documentation of the thing is stellar, it has an exhaustive Wiki and a good Knowledgebase.

MCreator contains tooling to package the mod and distribute it.

Custom plugins for the Spigot Minecraft server

I could stop here, as MCreator and the ability of mods to change a vanilla server is everything my kid was looking for, but for completeness and for fellow Java developers, here’s more.

There once was a fork of the Minecraft server called “Bukkit” but a little fluster cluck happened and now there’s Spigot. Spigot is something like a fork of the Minecraft server, but depends on the original binary (read the first link about to understand way). It does everything that the vanilla server does, but more. Especially it supports full fledged plugins.

To get started with developing plugins of any kind – btw, I know of one that uses Neo4j and Neo4j-OGM, yes, I helped the plugin author as part of my day job already – you first need to build the Spigot server. They don’t offer prebuild binaries due to the license mess (I found binaries at getbukkit.org but I am unsure about that being legit).

Building Spigot is pretty much trivial when you have a recent JDK installed on your machine. Grab the Spigot BuildTools.jar, save it somewhere and run in that directory java -jar BuildTools.jar to get a new server. I am unsure if you need to have Git installed for your machine (I have on all machines), but the build tools use it to clone a bunch of repositories.

After a while, you’ll find spigot-1.x.y.jar inside the same directory (at the time of writing, 1.16.4). This is your fresh Minecraft server supporting custom plugins.

How to write plugins? You can start of with a basic Maven project. The Spigot Plugin API lives under the following coordinates in the provided scope: org.spigotmc:spigot-api:1.16.4-R0.1-SNAPSHOT. They are not on central, so you would need to add the Spigot-Repo, too:

<repositories>
    <repository>
        <id>spigot-repo</id>
        <url>https://hub.spigotmc.org/nexus/content/repositories/snapshots/</url>
    </repository>
</repositories>
 
<dependencies>
    <dependency>
        <groupId>org.spigotmc:spigot-api:1.16.4-R0.1-SNAPSHOT</version>
        <scope>provided</scope>
    </dependency>
</dependencies>

The Spigot wiki gives more details.

The server version and naming a is a mess, not to speak about the licensing issues. But apart from that, I found the API pleasant to use: Spigot Java API docs. It gives full control about basically everything you can do in the game and I was able to create a simple plugin very fast.

To build the whole thing, you would need of course Maven or Gradle and some idea to set this up. I did install the Plugman plugin. This allows to load / unload / reload other plugins without restarting the server every time which cuts down on the feedback loop.

Summary

To sum this up: The various forks, versions, unclear naming of things regarded Minecraft is intimidating. People without much knowledge will eventually stumble from one YouTube “tutorial” to the next and download all sort of things in various qualities… After managing that initial step, things are not that bad and one can do neat things.

MCreator gives modders of all age great tooling without too much diversions in terms of installing things to express their minds about custom interactions in Minecraft. In a classroom situation or a scenario where a non-developer modern wants to try out things, I would recommend that.

Java developers will probably enjoy the Spigot API more. There’s even an IntelliJ plugin that creates full projects for you with the required setup.

| Comments (2) »

03-Jan-21


Music 2020. Wrapped.

Everyone and their dog is posting their Spotify Wrapped thing. It’s 2020, i still don’t have Spotify, but despite my increasing age, I still listen to a ton of music.

When I started to work remotely back in 2018, one of the biggest perks for me was – apart from not having to commute – to be able to listen to whatever thing I currently like as loud as I want without headphones. Well, that changed a bit during the course of the COVID-19 pandemic as my wife is now working remotely as well, but alas, it turned out, the volume knob is still working.

So, no Spotify for me. But let’s see what the MariaDB – the database powering the scrobble engine running for dailyfratze.de is up to. How do I fill this data? I have a custom iTunes script written ages ago that calls a REST endpoint with the stuff I’m listening. Pretty basic, actually.

I am working for Neo4j now since 2.5 years and I honestly love the company for manifold reasons. However, it seems that it is considered rude to post SQL in the company Slack and we should prefer to use only Cypher 😉 Well, this post will contain a lot of SQL and use the scheme I had a couple of times in this SQL talk of mine.

Interested in Cypher? Cypher is a language for querying Neo4j, the Graph database by the vendor of the same name,

You can do awesome stuff with Cypher and you’ll find talks by me as well about that topic, but today I’ll keep it to a 1990’s joke: MATCH (n) RETURN n SKIP $no LIMIT /* no */ $ /* no */ limit 😉

General database stats

39497 tracks by 9661 artists and 161141 played tracks by 9 different users. First plays stored April 27 in 2005.

We will make use of the rank function to compute the exact position of things we are interested in.

Top 10 tracks in 2020

A simple approach without rank would be something like this:

SELECT a.artist,
       t.name,
       COUNT(*)
FROM plays p
JOIN tracks t ON t.id = p.track_id
JOIN artists a ON a.id = t.artist_id
WHERE p.user_id = 1
AND YEAR(p.played_on) = 2020
GROUP BY a.artist, t.name
ORDER BY COUNT(*) DESC
LIMIT 10

but that would fill already several places with tracks that have the same absolute count. This is where the rank() and dense_rank() functions come into play.

Both functions assign rows in a row set a rank based on their given order. Both functions can do this over partitions or windows over the whole data. Therefore these analytics functions are often called window functions. Both variants of rank functions assigns the same rank to rows having the same value. Thus, two tracks that have been played the same number of time will receive the same rank. However, rank will skip n ranks if there are 1 + n items in a rank wheres the dense_rank function will not. I want consecutive ranks, that is: All tracks played the most n times will be 1 first place, the next rank second and so forth.

Let’s give it a shot. We see that the a query creating a window function over the dense rank of count gives us 8 tracks in total when I ask for the top 5 places:



And what can I say: German Hip-Hop/Punk-Band Antilopen Gang is on my radar now for 2 years, but in 2020, they have become my meds and therapy. If you would have ever told me, that I would totally fall in love with German Hip-Hop in my early 40ties, I would have said you’re mad, but there we are: “Wünsch Dir nix”, so fitting for 2020:

Also in the top 5, Patientenkollektiv. Such goose bumps:

We will see the Antilopen later on. I was a bit surprised by S&M2, a new album in 2020. A retake of Metallicas symphonic metal approach with the San Francisco Symphony orchestra. That version of The Unforgiven III is not something I would have ever expected by James. An incredible performance:

Last but not least: Ozzy Osborne. This guy has reached Lemmy Kilmister undying level.

Back to SQL. There are no partitions in the above query. The partitions would come in handy if would like to see my top 1 track over the last years in one query. Let’s give it a try:



It’s basically the same query but notice now how I create the rank: dense_rank() OVER (partition by year(played_on) ORDER BY count(*) DESC) AS rank. The rank is computed now for each year in which I played tracks, separately.

But wait, 2019. What did I drink?

Albums

Let’s be safe and let us aggregate that stuff. Yes, I do still listen to whole albums. It is basically the same query, but group by album, not by single tracks. And I excluded compilations. Apart from that, the query is hardly different:



Antilopen Gang with 3 albums. Holy crap. But yes, they did release two albums in 2020, “Abbruch Abbruch” and “Adrenochrom”. The later a reply to some people in the music circus going lunatic and believing a lot of shit. I haven’t heard one album in the last 10 years so often like this. It is streamable on all major platforms.

Let’s play “Dinge” from Deichkinds “Wer sagt denn das?”:

For me 2020 proved that I don’t need too many things. Some stuff is essential for me: Feeling secure and snug with my family, working in a good company that also makes me feel safe. Ok, bicycles are my personal issue, but that’s a different topic…

Back to Deichkind: Electropunk. Punk is a good keyword, but I only spot “5, 6, 7, 8 Bullenstaat” by Die Ärzte in the above list. 2020 had some more punk. Let’s filter the above list to albums that have been released in 2020:



And we will see Madsen, Ferris MC and Slime. Madsen, a “Deutsch Rock” band released the Punk Rock Album of 2020 (which is 200% more Punk than Die Ärzte these days), and Ferris MC, who played for a decade with Deichkind, joined forces with Swiss und die anderen and dropped an incredible Rap-Punk-Rock piece. Slime are Slime. The subtitle of this blog, “120 Dezibel” are a quote from “Missglückte Asimetrie” and one reason I will have always music in my life:

Ich dreh auf und die Erde steht still bei 120 Dezibel
Alles was ich brauch und will sind 120 Dezibel
Ich kann euch alle nicht mehr hören bei 120 Dezibel
Nichts was mich noch stört bei 120 Dezibel
120 Dezibel
120 Dezibel

Let’s have a look at Madsen. They got some stand-ins for “Alte weiße Männer”:

And one of our favorite songs among the adults and kids, “Quarantäne für immer”

Listen to “Sorry, kein Sorry” by Ferris. After that, give it a go with Slime. That band is as old as me:

And back to the database and the

Artists

What have been my preferred artists in 2020? I expect no surprises here. The query will be very similar to before, only the grouping changes again (it becomes simpler):



We see again Antilopen Gang and on the next two ranks – if I didn’t restrict to top 10 – we would have seen Juse Ju and Fatoni, two more German rapper who are also somewhere near Antilopen Gang. A graph database like Neo4j would show this connection and probably discover the “Anti Alles Aktion” on it’s own. Want to learn how? Have a look at Going from relational databases to databases with relations with Neo4j.

People who know me a bit longer know that I have been to more than one Heavy Metal festival and certainly to more than one Grindcore gig. That much German hip-hop in my playlists? I would have never thought. Ok, there’s still the usual suspects like Motörhead (I love running with Motörhead in my ears), the mighty Black Sabbath and even Body Count had a decent album out this year.

Can my database answer the change in artist or preference as well? Hmm, we would need the current years rank of something and the previous one. I think we can do this.

But in the meantime, enjoy Faith Alone 2020 by Bad Religion:

So be prepared for the with-with clause or “Common Table Expression”. The keen eye did already see that I used subqueries in my queries above. Why? To filter on the rank (top 5 or top 10 only). I cannot do this in the same select as in which the rank is computed. Therefore I nested the query and made it accessible that way.

The subquery works, but is kinda hard to read and cannot be reused. A relation inside a with clause is somewhat like a named subquery or a view that only exists during that query. Fun fact: CTEs can refer to themselves, thus become recursive.

Anyway, I think they read nice and reminds me a lot of the with clause in Cypher which is used to stick together multiple segments of a query to one pipeline.

But show me the code:

WITH rank_per_year AS (
  SELECT YEAR(p.played_on)  AS YEAR,
         a.artist,
         dense_rank() OVER (partition BY YEAR(played_on) ORDER BY COUNT(*) DESC) AS rank
  FROM plays p
  JOIN tracks t ON t.id = p.track_id
  JOIN artists a ON a.id = t.artist_id
  WHERE p.user_id = 1
  AND YEAR(p.played_on) BETWEEN 2015 AND  2020
  AND t.compilation = 'f'
  GROUP BY YEAR(p.played_on), a.artist
) 
SELECT YEAR, artist, rank, 
       ifnull(
         lag(rank) OVER (partition BY artist ORDER BY YEAR ASC) - rank, 
         'new'
       )  AS `change`
FROM rank_per_year
WHERE rank <= 5
ORDER BY YEAR ASC, rank ASC;

We do compute the rank per year and artist (“group by year and artist”) for years 2015 to 2020, partitioned by the year and give this whole thing a name (“rank_per_year”). This is a new relation that can now be used in a select clause, like we do.

In that select clause, we do find lag. lag is a window function that can go n rows backward over a partition that is ordered. The partition here is defined by the artist and in that, ordered by year. lag picks the value of the rank of the previous year. rank is a variable in that case, coming from the CTE named “rank_per_year”, not from the window function of the same name!

From that lagged value we subtract the current and get the change from the previous to this year. As one artist can be under the top 5 artist for the first time in a year, we need to check whether the previous rank is null. That’s what ifnull is for. Neat, isn’t it? And the result? Here we go (I added some blank lines manually):



I hope you enjoyed this a bit. At least I did. Was nice doing some SQL again and digging through the stuff I have been listening in 2020. In total I listend to about 9847 tracks so far in 2020 with a duration of roughly 27 days.

I leave you with a track that captures my mood in 2020 all to perfect. Danger Dan mit Nudeln und Klopapier:

The screenshots of the code have been created with Carbon. I was too lazy to fiddle around with something else that would have fit both the queries and the output. I generated the query format with window chrome and the output without and appended both files than with ImageMagick like this: convert carbon.png carbon\(1\).png -append tracks-2020.png.

| Comments (0) »

05-Dec-20


About the tooling available to create native GraalVM images.

A couple of days ago I sent out this tweet here:

This tweet caused quite some reaction and with the following piece I want to clarify my train of thoughts behind it. First of all, let’s pick the components here and explain what they are.

The complete source code for all examples is on GitHub: michael-simons/native-story. Title image is from Alexander. Cheers, hugs and thank you to Gerrit, Michael and Gunnar for your reviews, proofreading and feedback.

GraalVM

GraalVM is a high-performance runtime that provides significant improvements in application performance and efficiency which is ideal for microservices. Quoted from the GraalVM website:

“GraalVM is a high-performance runtime that provides significant improvements in application performance and efficiency which is ideal for microservices. It is designed for applications written in Java, JavaScript, LLVM-based languages.”

One benefit of the GraalVM is a new just-in-time compilation mechanism, which makes many scenarios running on GraalVM faster than running on a comparable JDK. However, there is more. Also quoting from the above intro: “For existing Java applications, GraalVM can provide benefits by […] creating ahead-of-time compiled native images.”

SubstrateVM

The SubstrateVM is the part of GraalVM that is responsible for running the native image. The readme states:

(A native image) does not run on the Java VM, but includes necessary components like memory management and thread scheduling from a different virtual machine, called “Substrate VM”. Substrate VM is the name for the runtime components (like the deoptimizer, garbage collector, thread scheduling etc.). The resulting program has faster startup time and lower runtime memory overhead compared to a Java VM.

The GraalVM team has a couple of benchmarks showing the benefits from running microservices as native images.
Those numbers are impressing, no doubt, and they will have a positive effect for many applications.

I wrote my sentiment not as an author of applications, but as an author and contributor of database drivers supporting encrypted connections to servers as well as an object mapping framework that takes arbitrary domain objects (in form of whatever classes people can think of) and creates instances of those dynamically from database queries and vice versa.

This text is not an exhaustive take on the GraalVM and its fantastic tooling. It’s a collections of things I learned during making the Neo4j Java Driver, the Quarkus Neo4j extension, Spring Data Neo4j 6 and some GraalVM polyglot examples native image compatible.

Since my first ever encounter with GraalVM back in 2017 at JCrete, things have become rather easy for application developers. There is the native-image tool that takes classes or a whole jar containing a main class or the corresponding Maven plugin and produces a native executable.

There is a great getting started Install GraalVM which you can follow as an application developer step by step. Make sure you install the native-image tool, too.

Giving the following trivial program – which can be run as a single source file java trivial/src/main/java/ac/simons/native_story/trivial/Application.java Michael producing Hello, Michael

package ac.simons.native_story.trivial;
 
public class Application {
 
	public static void main(String... args) {
 
		System.out.println("Hello, " + (args.length == 0 ? "User" : args[0]));
	}
}

Compile this with first with javac and after that, run native-image like this:

javac trivial/src/main/java/ac/simons/native_story/trivial/Application.java 
native-image -cp trivial/src/main/java ac.simons.native_story.trivial.Application app

It will produce some output like this

Build on Server(pid: 21148, port: 50583)
[ac.simons.native_story.trivial.application:21148]    classlist:      71.34 ms,  4.55 GB
[ac.simons.native_story.trivial.application:21148]        (cap):   1,663.79 ms,  4.55 GB
[ac.simons.native_story.trivial.application:21148]        setup:   1,850.67 ms,  4.55 GB
[ac.simons.native_story.trivial.application:21148]     (clinit):     107.06 ms,  4.55 GB
[ac.simons.native_story.trivial.application:21148]   (typeflow):   2,620.63 ms,  4.55 GB
[ac.simons.native_story.trivial.application:21148]    (objects):   3,051.08 ms,  4.55 GB
[ac.simons.native_story.trivial.application:21148]   (features):      83.23 ms,  4.55 GB
[ac.simons.native_story.trivial.application:21148]     analysis:   5,962.31 ms,  4.55 GB
[ac.simons.native_story.trivial.application:21148]     universe:     112.18 ms,  4.55 GB
[ac.simons.native_story.trivial.application:21148]      (parse):     218.57 ms,  4.55 GB
[ac.simons.native_story.trivial.application:21148]     (inline):     494.42 ms,  4.55 GB
[ac.simons.native_story.trivial.application:21148]    (compile):     912.43 ms,  4.43 GB
[ac.simons.native_story.trivial.application:21148]      compile:   1,828.57 ms,  4.43 GB
[ac.simons.native_story.trivial.application:21148]        image:     465.08 ms,  4.43 GB
[ac.simons.native_story.trivial.application:21148]        write:     135.90 ms,  4.43 GB
[ac.simons.native_story.trivial.application:21148]      [total]:  10,465.92 ms,  4.43 GB

and eventually you can run a native executable like this ./app Michael. Adding the corresponding Maven plugins to the project makes that part of the build. Pretty neat.

So far, so good and done? From this application, of course. But having framework needs is a bit more elaborated.

A fictive “framework”

Let’s take this simple “hello-world” application and turn it into something artificially complicated. Imagine we are writing a complex application, having some framework like traits. So, the “greeting” must be turned into an interface based service:

public interface Service {
 
	String sayHelloTo(String name);
 
	String getGreetingFromResource();
}

Of course, we need a factory to get instances of that service

public class ServiceFactory {
 
	public Service getService() {
		Class<Service> aClass;
		try {
			aClass = (Class<Service>) Class.forName(ServiceImpl.class.getName());
			return aClass.getConstructor().newInstance();
		} catch (Exception e) {
			throw new RuntimeException(\\_(ツ)_/¯", e);
		}
	}
}

The implementation of the service should look something like this

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.util.stream.Collectors;
 
public class ServiceImpl implements Service {
 
	private final TimeService timeService = new TimeService();
 
	@Override
	public String sayHelloTo(String name) {
		return "Hello " + name + " from ServiceImpl at " + timeService.getStartupTime();
	}
 
	@Override
	public String getGreetingFromResource() {
		try (BufferedReader reader = new BufferedReader(
			new InputStreamReader(this.getClass().getResourceAsStream("/content/greeting.txt")))) {
 
			return reader.lines()
				.collect(Collectors.joining(System.lineSeparator()));
		} catch (IOException e) {
			throw new UncheckedIOException(e);
		}
	}
}

That looks actually rather simple. As an added bonus, it includes a TimeService that returns the start of the application. That service is implemented in a super naive way:

import java.time.Instant;
 
public class TimeService {
 
	private final static Instant STARTED_AT = Instant.now();
 
	public Instant getStartupTime() {
		return STARTED_AT;
	}
}

It’s problematic on its own, but that shall not be the point here. Last but not least, let’s blow up the application itself a bit:

import java.lang.reflect.Method;
 
public class Application {
 
	public static void main(String... a) {
 
		Service service = new ServiceFactory().getService();
		System.out.println(service.sayHelloTo("GraalVM"));
 
		System.out.println(invokeGreetingFromResource(service, "getGreetingFromResource"));
	}
 
	static String invokeGreetingFromResource(Service service, String theName) {
 
		try {
			Method method = Service.class.getMethod(theName);
			return (String) method.invoke(service);
		} catch (Exception e) {
			throw new RuntimeException(e);
		}
	}
}

I tried to make up some examples that need to be addressed due to limitations of Graals ahead of time compilation described here.

What do we have?

  • A factory producing an instance based on a dynamic class name (non compile time constant), the ServiceFactory
  • A dynamic method call (could be a field call or whatever through java.lang.reflect in Application)
  • A service that uses some resources (getGreetingFromResource).
  • Another service that uses a static field initialized during class initialization containing a sensible value dependent on the current time (TimeService)

When I package this application as a jar file, containing a manifest entry pointing to the main class, I can run it like this:

java -jar only-on-jvm/target/only-on-jvm-1.0-SNAPSHOT.jar 
Hello GraalVM from ServiceImpl at 2020-09-15T09:37:37.832141Z
Hello, from a resource.

However, pointing native-image to it, now results in a couple of warnings

native-image -jar only-on-jvm/target/only-on-jvm-1.0-SNAPSHOT.jar 
...
Warning: Reflection method java.lang.Class.forName invoked at ac.simons.native_story.ServiceFactory.getService(ServiceFactory.java:8)
Warning: Reflection method java.lang.Class.getMethod invoked at ac.simons.native_story.Application.invokeGreetingFromResource(Application.java:18)
Warning: Reflection method java.lang.Class.getConstructor invoked at ac.simons.native_story.ServiceFactory.getService(ServiceFactory.java:9)
Warning: Aborting stand-alone image build due to reflection use without configuration.
Warning: Use -H:+ReportExceptionStackTraces to print stacktrace of underlying exception
Build on Server(pid: 26437, port: 61293)
...
Warning: Image 'only-on-jvm-1.0-SNAPSHOT' is a fallback image that requires a JDK for execution (use --no-fallback to suppress fallback image generation and to print more detailed information why a fallback image was necessary).

A fallback image that requires a JDK means that the resulting image – however not being much smaller or larger than a non-fallback – requires the JDK to be present at runtime. If you remove the JDK from your path and try to execute it, it will greet you with:

./only-on-jvm-1.0-SNAPSHOT 
Error: No bin/java and no environment variable JAVA_HOME

What tools are available to address the issues? Let’s first tackle the first two, both dynamic class loading and Java reflection. We have two options:

We can enumerate what classes need to be present in the native image and what methods as well and to which methods reflection based access should be available. Or we can substitute classes or methods when run on GraalVM.

Enumerating things present in the native image

The GraalVM analysis intercepts calls like the one to Class.forName and tries to reduce their arguments to a compile time constant. If this succeeds, the class in question is added to the image. The above example is contrived so that the analysis cannot do this. This is where the “reflection config” can come into place. The native-image tool takes -H:ReflectionConfigurationFiles as arguments which points to JSON files containing something like this:

[
  {
    "name" : "ac.simons.native_story.ServiceImpl",
    "allPublicConstructors" : true
  },
  {
    "name" : "ac.simons.native_story.Service",
    "allPublicMethods" : true
  }
]

Here we declare that we want allow reflective access to all public constructors of ServiceImpl so that we can get an instance of it and allow access to all public methods of the services interface.

There are more options as described here.

One way to make native-image use that config is to pass it as
-H:ReflectionConfigurationFiles=/path/to/reflectconfig, but I prefer having one
native-image.properties in META-INF/native-image/GROUP_ID/ARTIFACT_ID which is picked up by the native-image tool.

That native-image.properties contains so far the following:

Args = -H:ReflectionConfigurationResources=${.}/reflection-config.json

Pointing to the above config.

This will compile the image just nicely. However, it will still fail with a NullPointerException: The greeting.txt resource has not been included in the image.

This can be fixed with a resources-config.json like this

{
  "resources": [
    {
      "pattern": ".*greeting.txt$"
    }
  ]
}

The appropriate stanza needs to be added to the image properties, so that we have now:

Args = -H:ReflectionConfigurationResources=${.}/reflection-config.json \
       -H:ResourceConfigurationResources=${.}/resources-config.json

Note The arguments for specifying configuration in form of some JSON “things” come in two options: As XXXConfigurationResources and XXXConfigurationFiles which I learned in this issue (which is great example of fantastic communication from an OSS project). The resources-form is for everything inside your artifact, the files-form is for external files. The wildcard ${.} resolves accordingly. All the options to specify can be retrieved with something like this: native-image --expert-options | grep Configuration

Now the image runs without errors:

 ./reflection-config-1.0-SNAPSHOT                                                                                          
Hello GraalVM from ServiceImpl at 2020-09-15T15:02:47.572800Z
Hello, from a resource.

But does it run without bugs? Well not exactly. I wrote a bit more text, time went on and when I run it again, it prints the same date. Look back at the TimeService. It holds an instance of private final static Instant STARTED_AT = Instant.now();. It must be initialized before the time service is used.

I’m actually unsure why the native image tool considers the TimeService class as “safe” (described here) and choses to initialize it at build time (which also contradicts Runtime vs Build-Time Initialization stating “Since GraalVM 19.0 all class-initialization code (static initializers and static field initialization)”. At first I thought that happens as I “hide” the TimeServices usage behind my reflection based code, but I can reproduce it without it, too.

At the time of writing, I asked for this on the GraalVM slack and we see how it will be answered. Until then, I’m happy to have a somewhat contrived example. The TimeService must be of course initialized at runtime, it is not safe. This is done via --initialize-at-run-time arguments to the native image tool.

So now we have:

Args = -H:ReflectionConfigurationResources=${.}/reflection-config.json \
       -H:ResourceConfigurationResources=${.}/resources-config.json \
       --initialize-at-run-time=ac.simons.native_story.TimeService

And a correctly working, native binary.

Substitutions

Working on making the Neo4j driver natively compilable was much more effort. We used Netty underneath for SSL connections. A couple of things need to be enabled on the native image tool to get the groundworks running (like having those -H:EnableURLProtocols=http,https --enable-all-security-services -H:+JNI options which can be added in the same manner like we did above).

A couple of other things needed active substitutions.

With the “SVM” project the GraalVM provides a way to substitute whole classes or methods during the image build:

<dependency>
	<groupId>org.graalvm.nativeimage</groupId>
	<artifactId>svm</artifactId>
	<version>${native-image-maven-plugin.version}</version>
	<!-- Provided scope as it is only needed for compiling the SVM substitution classes -->
	<scope>provided</scope>
</dependency>

Now we can provide them like this in a package private class like CustomSubstitutions.java hidden away.

import ac.simons.native_story.Service;
import ac.simons.native_story.ServiceImpl;
 
import com.oracle.svm.core.annotate.Substitute;
import com.oracle.svm.core.annotate.TargetClass;
 
@TargetClass(className = "ac.simons.native_story.ServiceFactory")
final class Target_ac_simons_native_story_ServiceFactory {
 
	@Substitute
	private Service getService() {
		return new ServiceImpl();
	}
}
 
@TargetClass(className = "ac.simons.native_story.Application")
final class Target_ac_simons_native_story_Application {
 
	@Substitute
	private static String invokeGreetingFromResource(Service service, String theName) {
 
		return "#" + theName + " on " + service + " should have been called.";
	}
}
 
 
class CustomSubstitutions {
}

The names of the classes don’t matter, the target classes do of course.

With that, -H:ReflectionConfigurationResources=${.}/reflection-config.json can go away (in our case). You can do a lot of stuff in the substitutions. Have a look at what we do in Neo4j Java driver.

The tracing agent

Thanks to Gunnar I learned about GraalVMs Reflection tracing agent. It can discover most of things described above for you.

Running the only-on-jvm example from the beginning with the agent enabled, it generates the full configuration for us. For this to work, you must of course be running the OpenJDK version of the GraalVM already:

java --version
openjdk 11.0.7 2020-04-14
OpenJDK Runtime Environment GraalVM CE 20.1.0 (build 11.0.7+10-jvmci-20.1-b02)
OpenJDK 64-Bit Server VM GraalVM CE 20.1.0 (build 11.0.7+10-jvmci-20.1-b02, mixed mode, sharing)
 
java  -agentlib:native-image-agent=config-output-dir=only-on-jvm/target/generated-config -jar only-on-jvm/target/only-on-jvm-1.0-SNAPSHOT.jar
Hello GraalVM from ServiceImpl at 2020-09-16T07:12:27.194185Z
Hello, from a resource.

The result looks like this:

dir only-on-jvm/target/generated-config 
total 32
14417465 0 drwxr-xr-x  6 msimons  staff  192 16 Sep 09:12 .
14396074 0 drwxr-xr-x  8 msimons  staff  256 16 Sep 09:12 ..
14417471 8 -rw-r--r--  1 msimons  staff  278 16 Sep 09:12 jni-config.json
14417468 8 -rw-r--r--  1 msimons  staff    4 16 Sep 09:12 proxy-config.json
14417470 8 -rw-r--r--  1 msimons  staff  226 16 Sep 09:12 reflect-config.json
14417469 8 -rw-r--r--  1 msimons  staff   77 16 Sep 09:12 resource-config.json

Looking into the reflect-config.json we find a less coarse version of what I used above:

[
{
  "name":"ac.simons.native_story.Service",
  "methods":[{"name":"getGreetingFromResource","parameterTypes":[] }]
},
{
  "name":"ac.simons.native_story.ServiceImpl",
  "methods":[{"name":"<init>","parameterTypes":[] }]
}
]

The configuration is in fact complete in my example, as none of the dynamic method calls depend on input. If input varies the method calls, the agent has ways of merging the generated config.

In anyway, the agent is a fantastic tool to get you up and running with a base configuration for your libraries native config.

Quintessence

Without much effort I can make up a framework or program that is not exactly a good fit for a native binary. Of course, those examples here are contrived but I am pretty sure a couple of things I did here are to be found in many many applications still written today.

Also, reflection is used a lot in frameworks like Spring-Core, Hibernate ORM and of course Neo4j-OGM and Spring Data. For DI related frameworks, reflections make it easy to create injectors and wire dependencies. Object mappers don’t have an idea of what people are gonna throw at them.

Some of the things can be solved very elegantly with compile-time processors and resolve annotations and injections into byte code. This is what Micronaut does for example. Or with prebuilt indexes for domain classes like the Hibernate extensions in Quarkus do.

Older frameworks like Spring that also integrate over a lot of other things don’t have that luxury right now.

Either way, the tooling on the framework sides is improving a lot. Quarkus has several annotations and config options that generates the appropriate parameters and things as described above and a nice extension mechanism I described here. Spring will provide similar things through the spring-graalvm-native project. For Spring Data Neo4j the hints will probably look similar to this. In the end: Those solutions will translate to what I described above eventually.

Also bear in mind that there’s more that needs configuration: I addressed only reflection and resources but not JNI or proxies. There are shims and actuators to make them work as well.

I think that all the tooling around GraalVM native images is great and well documented. However, as you can see in my contrived example, there can be some pitfalls, even with applications that may seem trivial. Just pointing the native-image command against your class or jar file is not enough. Your test scenarios for services running native must be rather strict. If they spot errors, there is a plethora of utilities to help you with edge cases.

If you want to have more information, I really like this talk given at Spring One 2020 by Sébastien Deleuze and Andy Clement called “The Path Towards Spring Boot Native Applications” and I think it has a couple of takeaways that are applicable to other frameworks and applications, too:

In the long run, the work we as library authors put into making this things possible will surely pay out. But the benefit that a native image provides for many scenarios is not a free lunch.

| Comments (5) »

15-Sep-20


Hacking with Spring Boot 2.3

A couple of weeks ago, I received a paper copy of “Hacking with Spring Boot 2.3 – Reactive Edition” by Greg L. Turnquist. Greg sent this copy to me free of charge for a review. Thanks for that!

I’m happy about the opportunity to read a new Spring Boot book after I published one myself nearly two years ago (See Spring Boot Buch) about Spring Boot 2.0.

I also have only high regards of Greg. While I work at Neo4j, Inc. and Greg at VMWare we happen to have very similar tasks: We both work at various Spring Data modules and meet regularly at the shared standup.

About the print

Greg chose to self-publish his new book and I can totally relate. My copy came as “print on demand” version from Amazon. The quality is really nice and in no way worse than Greg’s previous book.

I may prefer a more black typesetting, but it’s still readable well enough. Apart from that, you clearly see it’s AsciiDoc based sources 🙂

Lexicon or hands on?

Hacking with Spring Boot 2.3 is definitely a hands on book. After a quick introduction to Spring Boot itself, Greg jumps right away from it 🙂 The focus lies on introduction reactive programming concepts and a very simple kitchen domain.

All the time, it really feels like you are sitting with Greg together on the keyboard, pair programming.

What I like about not doing Spring and Spring Boot related stuff at the very beginning is the mere fact that this is what Spring should be about: Providing the plumbing for your domain around it, not inside it. Having the start designed this way is excellent.

Of course, Greg needs to introduce core Spring Boot concepts at the beginning and we get a good overview about Spring Boot starters, auto configuration, metadata and more.

On data access

The book is about hacking with Spring Boot, not with Spring Data. But I agree 100% with Greg that an application without data is somewhat meaningless. As the book targets reactive programming especially, Greg has to pick a database that supports real reactive drivers, not something wrapped in a thread pool. At the time of writing, MongoDB was the predominant one. I would have of course loved to see SDN/RX and Neo4j, but I guess I cannot have everything.

In the meantime, you can go for Neo4j or with R2DBC and several SQL-Databases (amongst them, PostgresQL).

Anyway, I do think that Greg manages to cover standard idioms and best practices to work with Spring Data based data access code a like. It’s an exhausting topic, but the overview is just right.

On developer tools

Solid content on the “standard” Spring Boot developer tools, like restarting mechanisms and caching. Complete and nothing much to say about. The new logging groups are mentioned.

The information about Project Reactors debug mode, the logging mechanism for reactive assemblies and also the reference to the BlockHound a very valuable, even for a seasoned library developer like myself.

On testing

Testing with Spring Boot is a solid introduction to the testing support. It recaps the differences between unit, integration and end-to-end-tests. It clarifies that Spring Boot (and Spring) always strived for testability, in contrast to other believes.

While I would have probably chose different examples for testing services – for example not firing up the Spring Context at all as long as possible, especially with everything constructor injected as one should do – I super like the stance on reactive testing support. Starting about Project Reactors step verifiers and some more gems:

I learned about to things: Mono#hide respectively the corresponding fact that Project reactor may optimizes empty monos away and about Blockhounds JUnit 5 integration (Remember: Blockhound detects blocking calls in reactive code):

<dependency>
	<groupId>io.projectreactor.tools</groupId>
	<artifactId>blockhound-junit-platform</artifactId>
	<version>1.0.4.RELEASE</version>
	<scope>test</scope>
</dependency>

The above dependency brings in a test execution listener that instruments all running tests. Sweet and short.

Operations with Spring Boot

So must of the Docker and container related stuff in that chapter is brand new Spring Boot 2.3. Spring Boot Actuator has been there forever, but it’s helpful of course for people being new to Boot. Also: Thanks for the reminder to epxose actuator endpoints consciously.

Anyway, I myself found the explanation of how to use Spring Boot 2.3s new layered Jar with layered Docker images and WITHOUT the also possible Buildpack approach very usable.

Rest? Not for the RESTFul!

I wouldn’t be happy having a book by Greg without a good chapter on REST. But of course, I get one: Greg evolves a standard REST controller (Just returning plain JSON) to support HATEOAS to support affordances, all in a reactive way and all accompanied by Spring REST docs test.

A reader must do some deeper dive in some of the junction points afterwards, but they clearly have an idea what is possible.

Subliminal Messages

The most valuable piece of information for me here is the very nice explanation of Project Reactors publishOn and subscribeOn operators as well as the different schedulers revisited.

RSocket 🚀

This chapter makes me want use this link to start.spring.io, add our SDN/RX to it and jump into developing an RSocket proxy to reactive Neo4j repositories.

Especially on the topic of RSocket of which I don’t know yet much, Greg’s approach showing what is possible works very well for me. It gives you an idea and let’s you build from there on.

I personally will do some more reading afterwards on the backgrounds, though and this would of course also be my recommendation (the very same recommendation I have for people after many conference talks: Check the techs background and ideas, before you take conference driven development home and just go with the flow).

“Spring security is a beast”

Well, chapters about security: They are a can of worms. Writing them exposes the same problems that Spring Boot 1.x has had by making educated guesses of how Spring Security should be configured (they changed that later on, nowadays Spring Boot uses Spring Security’s defaults and backs of entirely when one configures a single aspect of it).

Anyway: There’s only so much depth an author can go, and so Greg shows only the basic concepts of Spring Security (for reactive applications).

I guess that in the real world, that chapter will leave the most open questions, for example how to integrate with Keycloak, Kerberos, LDAP or whatever.

As Greg writes about Spring Security’s OAuth2 support, my approach would be applying what you learn there and delegate everything to some IDAM, for example bespoke Keycloak (in contrast of running your own OAuth server based on a Spring Boot Spring Security application).

The main takeaway of that chapter for me is actually some good ideas what I can do once I have an authenticated user in the context.

Verdict

As I said already on twitter:

Greg’s book gives a rock solid overview, both about core Spring and Spring Boot ideas as well as reactive programming paradigms. The later was super valuable for me.

If you want to go more into technical details – why auto configuration or Spring data repositories work – you need to look for a different resource, most possible the great documentations and talks from the Spring team (this one from Madhura) or my own talks or book (given you can read German).

While I work for Neo4j and I find the lack of Neo4j examples in Greg’s book disturbing, I’m gonna recommend it anyway to my colleagues in the field teams. After reading, they will have good ideas what’s in the box and where to dig further.

| Comments (0) »

03-Jul-20