How to become an Open Source committer?

A couple of days ago I was asked the question above: “How to become an Open Source committer?” I think the answer might be interesting to others too, so I am sharing it here.

At some point I wrote a GitHub README.md, which you can read over there: github.com/michael-simons. I think it gives a pretty good idea of what I am up to.

While I would probably fail all of the FAANG assessment tests for getting a job, and I am not the person to look to for implementing low-level optimizations and stuff like that, I think that I am a decently good generalist with deep knowledge in a couple of important things, such as databases and the bigger (server-side) frameworks, but also on a language level. My aim is to keep an overview of how things work together and to go rather deep into the topic that I am working on in a project.

Anyway, I created a GitHub account back in 2010. My first contribution was a pull request to a jQuery add-on I liked but that was missing a piece I wanted. I needed that for my photo project Dailyfratze.de. At that time I had already been working for nearly 8 years at a small German company in Aachen, mostly doing Oracle Database based stuff… On the frontend with Oracle Forms Client/Server: Not exactly the stuff that is popular on GitHub. However, I was often blogging about work-related stuff right here, for example about bulk operations in Oracle, an article that still gets hits, or about integrating Hibernate of old with Oracle Spatial. Many of these old posts have been accompanied by issues in the trackers of the respective projects.

And yes, I do absolutely think that good issue reports are valuable Open Source contributions in their own right, very much like contributions to documentation and heck, just fixing typos. So from my perspective, a proper first step can be reporting things. If you work in a company that uses Open Source projects and you run into issues, take the time to create a reproducer. Don’t just go to a project and yell “this doesn’t work”, but pay back with something that is runnable and shows the error. Most maintainers I know are happy to work with that. If your employer doesn’t permit this, I would rethink my time there: Your employer is saving money by using Open Source. While contributing features or fixes is actually not always legally possible, taking the time for proper issues is. And if your own project is that secretive, create a dummy project that shows the problem.

My personal journey continues with a lot of blogging about Rails. I really used it a lot and I like it to this day. For whatever reason, it just worked back then and I found everything I wanted in there. Some posts might be more angry and ranty, but that usually boiled down to gems (those are projects you can pull in as libraries) that required native counterparts on the system and well, let’s say that ecosystem was worse on macOS around 2010 than it is today.

Sometime around 2011 I rewrote the aforementioned photo project from Ruby on Rails to Java with the Spring Framework, which really got me deep into Java. We started to experiment with Spring in the company and replaced a couple of old applications for the customer with new ones, with great success. In 2013, Spring Boot appeared and I have been using it ever since.



The application that I went public with is biking.michael-simons.eu, still maintained and used (by me). Back in the early days I provided tons and tons of feedback to Boot and, at some point, also small features and bug fixes. In the end, it’s about trust.

At the same time, I had the great opportunity to attend iSAQB trainings and met Gernot Starke, a great personal inspiration. What an honor that he asked me to contribute to arc42byExample. Of course I took this chance.

Up until here the important part to understand is this: I am diligent at my work. Sometimes I overwork, but not too much. I have other interests than spending all day and night coding (More about this here). I was blessed that I started my professional life in a company that fostered and supported its employees: With a budget for trainings and a budget to take time for exploring and learning things. I cannot repeat this often enough: Which trajectory you take and what expectations you allow yourself to develop depends so much on the first impression you get of work life. If you are a junior and you find yourself in a place that is all about the grind and hustle: Get out. Find something sustainable. I honestly don’t believe in the “I hustle until my 30s and then I stop” thing. Find a place that enables growth on a mutual level.

Anyway, I didn’t grind away with OSS contributions either. I started to experiment with giving talks, first at local meetups and (Java) user groups, with good success. I somewhat like it, but it stresses the hell out of me. I am more of a writer. But with some persistence it gave me a kinda good name, which is of course valuable, both for contributing and for growing a network.

Talks these days are a so-so topic. I personally find it really hard to justify traveling around the world and giving talks at every possible place (and I am not only eyeing the pandemic here). There’s no reason for me to fly to Brazil or so to give a talk about Spring Boot or Quarkus. There are excellent developers everywhere in the world and I would rather coach a person from anywhere in the world to talk about Neo4j than fly ten thousand miles to do it myself just for a day off.

Back to open source: Make yourself a bit of a name. Report issues. Reach out over appropriate channels (tickets, Gitter, Slack; no unsolicited private messages). If you want to contribute code: Look for projects that have “for contribution” labels.

When you’re excited about a new idea and you might already have implemented it in some way or form and you think “hey, let’s just submit it!”, think again: Imagine the people at the receiving end. Is it a new feature? How will it be maintained in the future? Is it just something for a very small use case? Does it fit the rest? Who will own it? It is often safer to open up an issue and discuss whether someone wants a new feature in their project or not. This will save everybody’s time in the end.

Twitter is a good place for some discussions as well, and a question like this one can turn into a great learning experience for several people: Fix potential exponential backtracking in ReflectionUtils array parsing.

If it’s possible, go out to local meetups or conferences. Don’t just sit in there and spend your time passively, meet people. Talk with them. Listen. Build a network and contribute. In the end it’s a lot about trust and people, as I wrote already in 2017.

I personally was lucky: My close work with Spring and Spring Data people brought me into conversations with two people I would call both inspirational and friends these days: Oliver Drotbohm and later Michael Hunger. With a slight detour trying out consultancy at the good company INNOQ, I ended up at Neo4j. At Neo4j I maintain one of our Open Source projects, Spring Data Neo4j, together with Gerrit Meier. Several other modules have been spun off from there.

This month, I celebrated my 4th anniversary:

Of course, Neo4j doesn’t earn money by paying me and the teams that work on pure Open Source modules (such as connectors and drivers). We do need them however to facilitate the usage of our main products, such as Neo4j Enterprise and Neo4j AuraDB.

For me it was a once-in-a-lifetime opportunity. I get to learn from so many smart people in a world-class database company and at the same time do my work in the open. As said, don’t grind yourself mindlessly away, but also don’t let open doors pass.

Last but not least: Nobody is a worse developer if they don’t do Open Source. It is useful, educational and a lot of fun most of the time, but it’s not required at all to be good at a job.

Update: I had a short conversation with Tim about when a good time to enter a project is: With rather young or with mature projects.

If you follow that thread, you’ll see I also spoke about whether small equals insignificant or not (Hint: Small does not mean insignificant for me… Heck, you might even start something small that YOU need yourself and maybe attract contributors on your own).

And while I was thinking about that topic, I remembered the initiative from iJUG last year, explicitly sponsoring new people to get into Open Source projects. Markus Karg wrote about this here (in German). While I am personally deeply into Spring and Quarkus these days, the Jakarta EE and Adoption projects are really valuable to the whole Java ecosystem.

And last but not least, sometimes things just don’t work out. Have a look at that small 5 Minute video:

Markus and Andres are both well known in the Java ecosystem, both avid Open Source contributors and committers. And even though they followed the recommended approach, asking before contributing a new (small) feature, they weren’t able to get it in. This is super frustrating, but it happens and it happens to experienced people as well. Don’t let it get to you if it happens to you.

Title photo by Peter Herrmann on Unsplash


03-Jul-22


Winding down 2021

It’s late December and I am winding down 2021, which was pretty much 2020 too, while looking skeptically at the actual 2022.

I will come up with a personal review after I am done with the #Rapha500 and will focus here on what I found out to be great in 2021 work wise (aka programming Java and database related things).

Spring Data Neo4j 6

Spring Data Neo4j 6.0.0 was actually released in October 2020, superseding SDN5+OGM. The project started out as early as 2019 as SDN/RX and we at Neo4j had big ambitions to create a worthy successor. We in this case are Gerrit Meier and I.

I think we did succeed in many terms: We managed to get on the Reactive-Hypetrain with SDN 6. Something that would not have been possible with Neo4j-OGM, which basically tries to recreate a subgraph from the Neo4j database on the client side just before mapping. That subgraph creation did not play nicely with a reactive flow, so we needed to come up with something else and focussed on individual records to be mapped.

And that came with a couple of issues: We thought we knew everything that customers and users had been throwing at Neo4j-OGM over the years, but boy… You’ll never stop learning. And adding insult to injury: While we had a really long beta period with SDN/RX, a long enough warning that SDN 6 would be a migration and not an upgrade and also had betas there, 2021 started with… surprised users. Of course.

Until now we have shipped 26 releases of Spring Data Neo4j 6 this year: 7 patches for 6.0, 5 milestones and 1 RC for 6.1, 6.1 itself followed by 7 patches again, 3 milestones of 6.2, one RC and eventually 6.2 last month.

A big thank you to Mark Paluch who not only gave us so much invaluable feedback in that time, but also ran most of the releases.

No more zoom talks…

I tried to give a talk with the following abstract twice this year:

In 2014, the Reactive Manifesto was written: A pledge to make systems responsive, resilient, elastic and message driven. Two years later, in 2016, reactive programming started to go mainstream, with the rise of Spring WebFlux and Project Reactor. Another two years later, many NoSQL databases provide reactive database access. Neo4j, a transactional graph database, didn’t have a solution back then, but we started a big cross-team effort to provide reactive database access to Neo4j. It landed in 2019: A reactive stack, right from the query execution engine, to the transport layer and the driver (client connection) at the other end.

But that left the team working on Neo4j-OGM and Spring Data Neo4j out: What to do with an object mapper that had been deeply inspired by Hibernate and was working on a fully hydrated subgraph client side?

Well, we did what many developers did: Just let us rewrite the thing, take some inspiration from Spring Data JDBC and also from modern approaches to querying databases like jOOQ and be done with it.

While we did manage to make a lot of new users happy, we didn’t expect so many tickets from old users. They were not complaining about changed annotations or configuration, but more about the fact that we removed things we considered drawbacks of the old system but that had been features people actually used.

If in-person conferences should ever happen again, I am inclined to actually do this at some point. However, not remote. I am done with Zoom talks, I just can’t any more…

TL;DR

Right now, SDN 6.2 is in an excellent shape. We have been able to iron out all outstanding big issues, made our ideas clearer and also added brand new things: Like GraphQL support via Query-DSL integration, a much improved Cypher-DSL, the most feature rich projection mechanism of all Spring Data projects (which got even back-ported into Spring Data Commons) and all that by only being in the same room once in 2021 (albeit, for BBQ).

I am really thankful to have colleagues and friends like Gerrit. It’s great when you can not only dream up things together, but also take care of issues later on.

Cypher-DSL

In mid 2020, Michael Hunger gave us the neo4j-contrib/cypher-dsl repository and coordinates to be used for our rebooted Cypher-DSL. We extracted the Cypher-DSL from SDN/RX before it became SDN 6 as we thought tooling like this is valuable beyond an object mapping framework and we guessed right.

In 2021 we pushed out 23 releases: In the latest incarnation we support building most statements that Neo4j 4.4 supports, both in a fluent fashion and in a more imperative way. The Cypher-DSL now provides a static type generator based on SDN 6 annotations as well as a Cypher parser. The parser module is probably the thing I am most proud of in the project: It utilizes Neo4j’s own JavaCC based parser to read Cypher strings into the Cypher-DSL AST, ready to be changed, validated or transformed.

The Cypher-DSL is used in SDN 6 as the main tooling to build queries, but it is also used in neo4j-graphql-java, a JVM based stack for transforming GraphQL queries into Cypher. I have written about that here. In addition to that, I hear rumors that GraphAware is using it, too. Well, they can be happy, we just removed all experimental warnings and released 2022.0.0 already.

I appreciate the feedback from Andreas Berger and Christophe Willemsen on that topic a lot. Thank you for being with me on that topic in 2021.

Quarkus and Neo4j

I do wear my “I made Quarkus 1.0” shirt with pride, and I am happy that the Neo4j extension was part of Quarkus from release 1 up to 2.5.3. It won’t be in 2.6.0 directly.

What?! I hear you scream… Be not afraid, it’s one of the first citizens in the Quarkiverse Hub. You’ll find it in quarkiverse/quarkus-neo4j and of course on code.quarkus.io and it learned so many new tricks this year, especially the Quarkus Dev Services support for which I even created a video:

I fully support the decision to move extensions to a separate org while retaining the close connection to the parent project, both via the orgs and the code generator. The discussion and the arguments are nothing but stellar, have a look at Quarkus #16870, Moving extensions outside of core repository.

My life as a maintainer of that extension is much easier in that form. A big shoutout to the people in the above discussion, especially to Guillaume Smet, George Gastaldi and also to Galder Zamarreño. It’s a pleasure working with you.

Neo4j-Migrations

My database refactoring toolkit for Neo4j, Neo4j-Migrations completely escalated in 2021. While it started off as a small shim to integrate SDN 6 into JHipster (btw, I’m still super happy about every encounter with Frederik and Matt), it now does a ton of things:

  • Has a CLI
  • Is distributed as native packages for macOS, Linux and Windows
  • Has a fully automated release process
  • Has a Quarkus extension
  • Supports lifecycle callbacks

and more… which just got released as 1.2.2. The biggest impact on that project and my motivation has been made by Andres Almiray and JReleaser. Andres not just reached out to me to teach about JReleaser, he picked up my project, played with it, came up with a suggested workflow and we hacked together the missing pieces in an afternoon. Stunning.

If you find either my Neo4j-Migrations tooling or JReleaser useful, leave a star, or support Andres in a form that suits you.

More things

Similar to the way we created Neo4j support in Quarkus for a nice OOTB experience, Dmitry Alexandrov and I started writing a similar extension in Oracle’s Helidon.io project. I really, really appreciate that companies can work together in a positive way despite the fact that they are competitors in other areas.

Speaking about Oracle: Every single interaction with their GraalVM team has been just splendid. Thanks Alina Yurenko and team!

Thanks to Kevin Wittek we have been able to participate in the beta-testing of AtomicJar’s Testcontainers Cloud, an absolutely lovely experience. I do see a bright future for the journey that Richard North and Sergei Egorov started.

There are many more people whose input and feedback I appreciate a lot, not only this year, but in previous and upcoming ones as well. Here are just a couple of them: Gunnar Morling, knowledgeable in so many ways and always fun to talk with, Samuel Nitsche for input way beyond “just the tech” and surely Markus Eisele for always having an open ear.

Of course, there are even more. Remember, you all are valid. And more often than not, you do influence people, in some way or the other. I’m grateful to have a lot of excellent people in my life.

And with that, I sincerely hope that my first statement in this article will be just a bad pun and that 2022 will not be 2020, too and we can eventually safely meet in person again. Until then, stay safe and do create cool things. I still think that not all is fucked up, actually.

(Title image by Vidar Nordli-Mathisen.)


20-Dec-21


GraalVM and proxy bindings with embedded languages.

A while ago I had the opportunity to publish a post on GraalVM’s Medium blog titled The many ways of polyglot programming with GraalVM, which is still accurate.

A year later, GraalVM just got better in many dimensions: It is faster, supports JDK 17 and I think its documentation is now quite stellar. Have a look at both the polyglot programming guide and the one on embedding languages. The latter is what we are referring to in this post.

The documentation has excellent examples about how to access host objects (in my example Java objects) from the embedded language and vice versa. First, here’s how to access a host object (an instance of MyClass) from embedded JavaScript:

public static class MyClass {
    public int               id    = 42;
    public String            text  = "42";
    public int[]             arr   = new int[]{1, 42, 3};
    public Callable<Integer> ret42 = () -> 42;
}
 
public static void main(String[] args) {
    try (Context context = Context.newBuilder()
                               .allowAllAccess(true)
                           .build()) {
        context.getBindings("js").putMember("javaObj", new MyClass());
        boolean valid = context.eval("js",
               "    javaObj.id         == 42"          +
               " && javaObj.text       == '42'"        +
               " && javaObj.arr[1]     == 42"          +
               " && javaObj.ret42()    == 42")
           .asBoolean();
        assert valid == true;
    }
}

This is the essential part: context.getBindings("js").putMember("javaObj", new MyClass());. The instance is added to the bindings of JavaScript variables in the polyglot context. In the following eval block, a boolean expression is defined and returned, checking if all the values are as expected.

Vice versa, accessing JavaScript members of the embedded language from the Java host looks like this:

try (Context context = Context.create()) {
    Value result = context.eval("js", 
                    "({ "                   +
                        "id   : 42, "       +
                        "text : '42', "     +
                        "arr  : [1,42,3] "  +
                    "})");
    assert result.hasMembers();
 
    int id = result.getMember("id").asInt();
    assert id == 42;
 
    String text = result.getMember("text").asString();
    assert text.equals("42");
 
    Value array = result.getMember("arr");
    assert array.hasArrayElements();
    assert array.getArraySize() == 3;
    assert array.getArrayElement(1).asInt() == 42;
}

This time, a result is defined directly in the JavaScript context. The result is a JavaScript object like structure and its values are asserted. So far, so (actually) exciting.

There is a great API that controls which members and methods can be accessed from the embedding (read more here) and we find a plethora of further options (how to scope parameters, how to allow access to iterables and more).

The documentation is however a bit sparse on how to use org.graalvm.polyglot.proxy.Proxy. We do find however a good clue inside the JavaDoc of the aforementioned class:

Proxy interfaces allow to mimic guest language objects, arrays, executables, primitives and native objects in Graal languages. Every Graal language will treat instances of proxies like an object of that particular language.

So that interface essentially allows you to stuff a host object into the guest and there it behaves like the native thing. GraalVM actually comes with a couple of specializations for it:

  • ProxyArray to mimic arrays
  • ProxyObject to mimic objects with members
  • ProxyExecutable to mimic objects that can be executed
  • ProxyNativeObject to mimic native objects
  • ProxyDate to mimic date objects
  • ProxyTime to mimic time objects
  • ProxyTimeZone to mimic timezone objects
  • ProxyDuration to mimic duration objects
  • ProxyInstant to mimic timestamp objects
  • ProxyIterable to mimic iterable objects
  • ProxyIterator to mimic iterator objects
  • ProxyHashMap to mimic map objects

Many of them provide static factory methods to get you an instance of a proxy that can be passed to the polyglot instance as in the first example above. The documentation itself has an example about array proxies. The question that reached my desk was about date-related proxies, in this case a ProxyInstant, something that mimics things representing timestamps in the guest. To not confuse Java programmers more than necessary, JavaScript has the same mess with its Date object as we Java programmers have with java.util.Date: A thing to represent it all. Modern Java is much clearer these days and calls it what it is: A java.time.Instant (an instantaneous point on the time-line).
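To make the “one thing to represent it all” point a bit more tangible, here is a minimal sketch that runs on a stock JDK. The class and method names (DateVersusInstant, toInstant, hourIn) are made up for illustration and are not part of the original program. It only shows that a java.util.Date is a wrapper around epoch milliseconds and that calendar fields such as the hour exist only once a time zone is applied:

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.Date;

// Hypothetical demo class, not taken from the original post.
final class DateVersusInstant {

    // A java.util.Date carries nothing but epoch milliseconds,
    // so it converts to an Instant without any loss.
    static Instant toInstant(Date date) {
        return date.toInstant();
    }

    // An hour of day only exists after a time zone has been chosen explicitly.
    static int hourIn(Instant instant, String zoneId) {
        return ZonedDateTime.ofInstant(instant, ZoneId.of(zoneId)).getHour();
    }

    public static void main(String... args) {
        Date legacy = new Date(0L); // the epoch as a legacy date
        Instant instant = toInstant(legacy);
        System.out.println(instant);
        System.out.println(hourIn(instant, "UTC"));
        System.out.println(hourIn(instant, "Europe/Berlin"));
    }
}
```

The same instant yields different hours depending on the zone passed in, which is exactly the information the getHours style accessors on both Date types hide behind an implicit default time zone.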

So what does ProxyInstant do? ProxyInstant.from(Instant.now()) gives you an object that, when passed to embedded JavaScript, behaves in many situations like JavaScript’s Date. For example: It will compare correctly, but that’s pretty much exactly how far it goes.

Methods like getTime or setTime on the proxy inside the guest (at least in JavaScript) won’t work. Why is that? The proxy does not map all those methods to the members of the JavaScript object and it actually has no clue how: The proxy can be defined on a Java instant, a date or none of those at all and just use a long internally…

So how to solve that? Proxies in the host can be combined and we add ProxyObject:

public static class DateProxy implements ProxyObject, ProxyInstant {
}

ProxyObject comes with getMember, putMember, hasMember and getMemberKeys. In JavaScript, both attributes and methods of an object are referred to as members, so that is exactly what we are looking for to make for example getTime work. One possible proxy object to make Java’s instant or date work as a JavaScript date inside embedded JS on GraalVM therefore looks like this:

@SuppressWarnings("deprecation")
public static class DateProxy implements ProxyObject, ProxyInstant {
 
  private static final Set<String> PROTOTYPE_FUNCTIONS = Set.of(
    "getTime",
    "getDate",
    "getHours",
    "getMinutes",
    "getSeconds",
    "setDate",
    "setHours",
    "toString"
  );
 
  private final Date delegate;
 
  public DateProxy(Date delegate) {
    this.delegate = delegate;
  }
 
  public DateProxy(Instant delegate) {
    this(Date.from(delegate));
  }
 
  @Override
  public Object getMember(String key) {
    return switch (key) {
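      case "getTime" -> (ProxyExecutable) arguments -> delegate.getTime();
      case "getDate" -> (ProxyExecutable) arguments -> delegate.getDate();
      // The next three are advertised in PROTOTYPE_FUNCTIONS, so they must be dispatched, too
      case "getHours" -> (ProxyExecutable) arguments -> delegate.getHours();
      case "getMinutes" -> (ProxyExecutable) arguments -> delegate.getMinutes();
      case "getSeconds" -> (ProxyExecutable) arguments -> delegate.getSeconds();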
      case "setHours" -> (ProxyExecutable) arguments -> {
        delegate.setHours(arguments[0].asInt());
        return delegate.getTime();
      };
      case "setDate" -> (ProxyExecutable) arguments -> {
        delegate.setDate(arguments[0].asInt());
        return delegate.getTime();
      };
      case "toString" -> (ProxyExecutable) arguments -> delegate.toString();
      default -> throw new UnsupportedOperationException("This date does not support: " + key);
    };
  }
 
  @Override
  public Object getMemberKeys() {
    return PROTOTYPE_FUNCTIONS.toArray();
  }
 
  @Override
  public boolean hasMember(String key) {
    return PROTOTYPE_FUNCTIONS.contains(key);
  }
 
  @Override
  public void putMember(String key, Value value) {
    throw new UnsupportedOperationException("This date does not support adding new properties/functions.");
  }
 
  @Override
  public Instant asInstant() {
    return delegate.toInstant();
  }
}

Most of the logic is in hasMember and the actual dispatch in getMember: Everything that a member can represent can be returned! So either concrete values that are representable inside the embedded language or proxy objects again. As we want to represent methods on that JavaScript object we return ProxyExecutable! Execution will actually be deferred until called in the guest. What happens in the call is of course up to you. I have added examples for just getting values from the delegate but also for manipulating it. Because of the latter I found it sensible to use a java.util.Date as delegate, but an immutable Instant on a mutable attribute of the proxy object would have been possible as well.
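For the alternative mentioned last, an immutable Instant held in a mutable attribute, a rough sketch could look like the following. To keep it runnable on a stock JDK, the GraalVM proxy interfaces are left out on purpose; the class name InstantBackedDate and the explicit ZoneId parameter are assumptions made for this illustration and are not part of the original program:

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical sketch: holds an immutable Instant in a mutable field.
// In the real proxy, these methods would be exposed via getMember as ProxyExecutable.
final class InstantBackedDate {

    private final ZoneId zone;
    private Instant value; // immutable value, reassigned on every "mutation"

    InstantBackedDate(Instant value, ZoneId zone) {
        this.value = value;
        this.zone = zone;
    }

    long getTime() {
        return value.toEpochMilli();
    }

    int getDate() {
        return zoned().getDayOfMonth();
    }

    // Unlike JavaScript's setHours, withHour rejects values outside 0..23.
    long setHours(int hours) {
        value = zoned().withHour(hours).toInstant();
        return getTime();
    }

    // Rolls over into the following month like JavaScript's setDate does,
    // e.g. day 33 in November ends up as the 3rd of December.
    long setDate(int dayOfMonth) {
        value = zoned().withDayOfMonth(1).plusDays(dayOfMonth - 1L).toInstant();
        return getTime();
    }

    Instant asInstant() {
        return value;
    }

    private ZonedDateTime zoned() {
        return value.atZone(zone);
    }

    public static void main(String... args) {
        ZoneId berlin = ZoneId.of("Europe/Berlin");
        InstantBackedDate date = new InstantBackedDate(Instant.now(), berlin);
        date.setHours(12);
        date.setDate(date.getDate() + 7);
        System.out.println(date.asInstant());
    }
}
```

One difference in design is worth noting: The deprecated setters on java.util.Date work in the default time zone of the JVM, whereas this variant makes the zone an explicit choice. Whether that is preferable depends on what the guest script expects.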

Of course there are methods left out, but I think the idea is clear. The proxy object works as expected:

public class Application {
 
  public static void main(String... a) {
 
    try (var context = Context.newBuilder("js").build()) {
 
      var today = LocalDate.now();
      var bindings = context.getBindings("js");
      bindings.putMember("javaInstant", new DateProxy(today.atStartOfDay().atZone(ZoneId.of("Europe/Berlin")).toInstant()));
      bindings.putMember("yesterday", new DateProxy(today.minusDays(1).atStartOfDay().atZone(ZoneId.of("Europe/Berlin")).toInstant()));
      var result = context.eval("js",
        """
          var nativeDate = new Date(new Date().toLocaleString("en-US", {timeZone: "Europe/Berlin"}));
          nativeDate.setHours(12);
          nativeDate.setMinutes(0);
          nativeDate.setSeconds(0);
          nativeDate.setMilliseconds(0);
 
          javaInstant.setHours(12);
          ({
            nativeDate   : nativeDate,
            nativeTimeFromNativeDate : nativeDate.getTime(),
            javaInstant: javaInstant,
            diff: nativeDate.getTime() - javaInstant.getTime(),
            isBefore: yesterday < nativeDate,
            nextWeek: new Date(javaInstant.setDate(javaInstant.getDate() + 7))
          })
          """);
 
 
      assertThat(result.getMember("nativeDate").asDate()).isEqualTo(today);
      assertThat(result.getMember("diff").asLong()).isZero();
      assertThat(result.getMember("isBefore").asBoolean()).isTrue();
      assertThat(result.getMember("nextWeek").asDate()).isEqualTo(today.plusWeeks(1));
    }
  }
}

As always, there are two or more sides to solutions: With the one above, you are in full control of what is possible or not. On the other hand, you are in full control of what is possible or not. There will probably be edge cases if you pass such a proxy to an embedded program which in turn calls things on it you didn’t foresee. On the other hand, it is rather straightforward and most likely performant without too many context switches.

The other option would be pulling a JavaScript date from the embedded into the Java host like so var javaScriptDate = context.eval("js", "new Date()"); and manipulate it there.

Either way, I found it quite interesting to dig into GraalVM polyglot again thanks to one of our partners asking great questions and I hope you find that insight here useful as well. A full version of that program is available as a runnable JBang script:

jbang https://gist.github.com/michael-simons/556dd49744aae99a72aa466bd3b832a0

As you might have noticed in the snippets above, I am on Java 17. The script runs best on GraalVM 21.3.0 JDK 17 but will also be happy (more or less) on stock JDK 17.


26-Nov-21


Testing in a modular world

I am not making a secret out of it: I am a fan of the Java module system and I think it can benefit library developers the same way it benefits the maintainers and developers of the JDK themselves.

If you are interested in a great overview, have a look at this comprehensive post about Java modules by Jakob Jenkov. No fuss, just straight to the matter. Also, read what Christian publishes: sormuras.github.io. It is no coincidence that my post in 2021 has the same name as his in 2018.

At the beginning of this month, I wrote about a small tool I wrote for myself, scrobbles4j. I want the client to be able to run on the module path and the module path alone. Why am I doing this? Because I am convinced that modularization of libraries will play a bigger role in Java’s future and I am responsible for Spring Data Neo4j (not yet modularized) and the Cypher-DSL (published as a Multi-Release-Jar, with module support on JDK 11+ and the module path), I advise on a couple of things for the Neo4j Java driver, and I just want to know upfront what I have to deal with.

The Java module system starts to be a bit painful when you have to deal with open- and closed-box testing.

Goal: Create a tool that runs on the module path, is unit-testable without hassle in any IDE (i.e. does not need additional plugins, config or conventions) and can be integration tested. The tool in my case (the Scrobbles4j application linked above) is a runnable command line tool depending on various service implementations defined by modules. A Java module does not need to export or open a package to be executable, which will be important to notice later on!

Christian starts his post above with the “suggestion” to add the (unit) test classes right next to the classes under test… Like it was ages ago. Christian’s blog post is from 2018, but honestly, that resembles my feeling all too well when I kicked this off: It seems to be the easiest solution and I wonder if this is how the JDK team works.

I prefer not to do this as I am happy with the convention of src/main and src/test.

As I write this, most things work out pretty well with Maven (3.8 and Surefire 3.0.0-M5) and the need for extra config has vanished.

Have a look at this repository: michael-simons/modulartesting. The project’s pom.xml has everything needed to successfully compile and test Java 17 code (read: The minimum required plugin versions necessary to teach Maven about JDK 17). The project has the following structure:

.
├── app
│   ├── pom.xml
│   └── src
│       ├── main
│       │   └── java
│       │       ├── app
│       │       │   └── Main.java
│       │       └── module-info.java
│       └── test
│           └── java
│               └── app
│                   └── MainTest.java
├── greeter
│   ├── pom.xml
│   └── src
│       ├── main
│       │   └── java
│       │       ├── greeter
│       │       │   └── Greeter.java
│       │       └── module-info.java
│       └── test
│           └── java
├── greeter-it
│   ├── pom.xml
│   └── src
│       └── test
│           └── java
│               ├── greeter
│               │   └── it
│               │       └── GreeterIT.java
│               └── module-info.java
└── pom.xml

The example consists of a greeter module that creates a greeting and an app module using that greeter. The greeter requires a non-null and non-blank argument. The app module has some tooling to assert its arguments. I have already prepared a closed test for the greeter module.

The whole setup can be compiled and run like this (without using any tooling apart from what the JDK provides). First, compile the greeter and app modules. The --module-source-path option can be specified multiple times; the --module argument takes a list of modules:

javac -d out --module-source-path greeter=greeter/src/main/java --module-source-path app=app/src/main/java --module greeter,app

It’s runnable on the module path like this:

java --module-path out --module app/app.Main world
> Hello world.

As said before, the app module doesn’t export or open anything. cat app/src/main/java/module-info.java gives you:

module app {
 
	requires greeter;
}

Open testing

Why is this important? Because we want to unit-test or open-test this module (also called in-module testing, as opposed to extra-module testing).
In-module testing will allow us to test package-private API as before; extra-module testing will use the modular API as-is, the way other modules will, and hence maps to integration tests.

The main class is dead stupid:

package app;
 
import greeter.Greeter;
 
public class Main {
 
	public static void main(String... var0) {
 
		if (!hasArgument(var0)) {
			throw new IllegalArgumentException("Missing name argument.");
		}
 
		System.out.println((new Greeter()).hello(var0[0]));
	}
 
	static boolean hasArgument(String... args) {
		return args.length > 0 && !isNullOrBlank(args[0]);
	}
 
	static boolean isNullOrBlank(String value) {
		return value == null || value.isBlank();
	}
}

It has some utility methods that I want to work as intended, so I subject them to a unit test. I test package-private methods here, so this is an open test, and a test based on JUnit 5 might look like this:

package app;
 
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;
 
import org.junit.jupiter.api.Test;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.ValueSource;
 
class MainTest {
 
	@Test
	void isNullOrBlankShouldDetectNullString() {
		assertTrue(Main.isNullOrBlank(null));
	}
 
	@ParameterizedTest
	@ValueSource(strings = { "", " ", "  \t " })
	void isNullOrBlankShouldDetectBlankStrings(String value) {
		assertTrue(Main.isNullOrBlank(value));
	}
 
	@ParameterizedTest
	@ValueSource(strings = { "bar", "  foo \t " })
	void isNullOrBlankShouldWorkWithNonBlankStrings(String value) {
		assertFalse(Main.isNullOrBlank(value));
	}
}

It lives in the same package (app) and in the same module but under a different source path (app/src/test). When I hit the run button in my IDE (here IDEA), it just works.

But what happens if I just run ./mvnw clean verify? Things fail:

[INFO] --- maven-surefire-plugin:3.0.0-M5:test (default-test) @ app ---
[INFO] 
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running app.MainTest
[ERROR] Tests run: 6, Failures: 0, Errors: 6, Skipped: 0, Time elapsed: 0.05 s <<< FAILURE! - in app.MainTest
[ERROR] app.MainTest.isNullOrBlankShouldDetectNullString  Time elapsed: 0.003 s  <<< ERROR!
java.lang.reflect.InaccessibleObjectException: Unable to make app.MainTest() accessible: module app does not "opens app" to unnamed module @7880cdf3
 
[ERROR] app.MainTest.isNullOrBlankShouldWorkWithNonBlankStrings(String)[1]  Time elapsed: 0 s  <<< ERROR!
java.lang.reflect.InaccessibleObjectException: Unable to make app.MainTest() accessible: module app does not "opens app" to unnamed module @7880cdf3
 
[ERROR] app.MainTest.isNullOrBlankShouldWorkWithNonBlankStrings(String)[2]  Time elapsed: 0 s  <<< ERROR!
java.lang.reflect.InaccessibleObjectException: Unable to make app.MainTest() accessible: module app does not "opens app" to unnamed module @7880cdf3
 
[ERROR] app.MainTest.isNullOrBlankShouldDetectBlankStrings(String)[1]  Time elapsed: 0.001 s  <<< ERROR!
java.lang.reflect.InaccessibleObjectException: Unable to make app.MainTest() accessible: module app does not "opens app" to unnamed module @7880cdf3
 
[ERROR] app.MainTest.isNullOrBlankShouldDetectBlankStrings(String)[2]  Time elapsed: 0.001 s  <<< ERROR!
java.lang.reflect.InaccessibleObjectException: Unable to make app.MainTest() accessible: module app does not "opens app" to unnamed module @7880cdf3
 
[ERROR] app.MainTest.isNullOrBlankShouldDetectBlankStrings(String)[3]  Time elapsed: 0 s  <<< ERROR!
java.lang.reflect.InaccessibleObjectException: Unable to make app.MainTest() accessible: module app does not "opens app" to unnamed module @7880cdf3
 
[INFO] 
[INFO] Results:
[INFO] 
[ERROR] Errors: 
[ERROR]   MainTest.isNullOrBlankShouldDetectBlankStrings(String)[1] » InaccessibleObject
[ERROR]   MainTest.isNullOrBlankShouldDetectBlankStrings(String)[2] » InaccessibleObject
[ERROR]   MainTest.isNullOrBlankShouldDetectBlankStrings(String)[3] » InaccessibleObject
[ERROR]   MainTest.isNullOrBlankShouldDetectNullString » InaccessibleObject Unable to ma...

To understand what’s happening here, we have to look at what command is run by the IDE. I have abbreviated the command a bit and kept the important bits:

/Library/Java/JavaVirtualMachines/jdk-17.jdk/Contents/Home/bin/java -ea \
--patch-module app=/Users/msimons/Projects/modulartesting/app/target/test-classes \
--add-reads app=ALL-UNNAMED \
--add-opens app/app=ALL-UNNAMED \
--add-modules app \
// Something something JUnit app.MainTest

Note: Why does it say app/app? The first app is the name of the module, the second the name of the package to be opened (which are in this case the same).

First, --patch-module: “Patching modules” teaches us that one can patch sources and resources into a module. This is what’s happening here: The IDE adds my test classes to the app module, so they are subject to the one and only module descriptor allowed in that module.
Then --add-reads: This patches the module descriptor itself and basically makes the module read another module (here, simplified: everything on the class path).
The most important bit to successfully test things is --add-opens: It opens the app package of the app module to all code on the class path (and especially to JUnit). It is not that JUnit needs direct access to the classes under test, but to the test classes, which are part of the module due to --patch-module.
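
What a missing opens means at runtime can be reproduced without Maven or JUnit. The following is a minimal sketch and not taken from the modulartesting repository: the class name OpensDemo is made up, and it assumes JDK 17 or later and a class started from the class path. It attempts deep reflection into java.lang, a package that java.base exports but does not open, which is roughly the situation JUnit is in when it tries to instantiate app.MainTest.

```java
import java.lang.reflect.Field;
import java.lang.reflect.InaccessibleObjectException;

// Hypothetical demo class, not part of the modulartesting repository.
// Assumes JDK 17 or later and execution from the class path.
class OpensDemo {

	// Attempts deep reflection on a private field of java.lang.String.
	// Returns the failure message, or null if access was granted, for example
	// when started with --add-opens java.base/java.lang=ALL-UNNAMED.
	static String tryDeepReflection() {
		try {
			Field value = String.class.getDeclaredField("value");
			value.setAccessible(true);
			return null;
		} catch (InaccessibleObjectException e) {
			return e.getMessage();
		} catch (NoSuchFieldException e) {
			// The field name is a JDK implementation detail and may change.
			throw new IllegalStateException(e);
		}
	}

	public static void main(String... args) {
		String failure = tryDeepReflection();
		System.out.println(failure == null ? "Access granted." : failure);
	}
}
```

Without further options, the printed message should have the same shape as the one in the Surefire log above, only with java.base and java.lang in place of app.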

Let’s compare what Maven/Surefire does with ./mvnw -X clean verify:

[DEBUG] Path to args file: /Users/msimons/Projects/modulartesting/app/target/surefire/surefireargs17684515751207543064
[DEBUG] args file content:
--module-path
"/Users/msimons/Projects/modulartesting/app/target/classes:/Users/msimons/Projects/modulartesting/greeter/target/greeter-1.0-SNAPSHOT.jar"
--class-path
"damn long class path"
--patch-module
app="/Users/msimons/Projects/modulartesting/app/target/test-classes"
--add-exports
app/app=ALL-UNNAMED
--add-modules
app
--add-reads
app=ALL-UNNAMED
org.apache.maven.surefire.booter.ForkedBooter

It doesn’t have the --add-opens part! Remember when I wrote that the app module has no opens or exports declaration? If it opened the app package, the --add-opens option would not have been necessary and my plain Maven execution would work. But adding it to my module is completely against what I want to achieve.
And as much as I appreciate Christian and his knowledge, I didn’t get any of the solutions from his blog post above to work for me. What does work is just adding the necessary opens to Surefire like this:

<build>
	<plugins>
		<plugin>
			<artifactId>maven-surefire-plugin</artifactId>
			<configuration combine.self="append">
				<argLine>--add-opens app/app=ALL-UNNAMED</argLine>
			</configuration>
		</plugin>
	</plugins>
</build>

There is actually an open ticket in the Maven tracker about this exact topic: SUREFIRE-1909 – Support JUnit 5 reflection access by changing add-exports to add-opens (Thanks Oliver for finding it!). I would love to see this fixed in Surefire. I mean, the likelihood that someone using Surefire also wants to use JUnit 5 accessing their module is pretty high.
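
The ticket hinges on the difference between exporting and opening a package. As a small sketch (the class name ExportsVersusOpens is made up for illustration), the java.lang.Module API can be asked about both for a package of the JDK itself:

```java
// Hypothetical demo class; uses only the java.lang.Module API.
class ExportsVersusOpens {

	private static final Module JAVA_BASE = String.class.getModule();

	// True if java.base exports the package to all modules:
	// public types and members are then usable from the outside.
	static boolean exportedToEveryone(String packageName) {
		return JAVA_BASE.isExported(packageName);
	}

	// True if java.base opens the package to all modules:
	// only then is deep reflection on non-public members allowed for everyone.
	static boolean openToEveryone(String packageName) {
		return JAVA_BASE.isOpen(packageName);
	}

	public static void main(String... args) {
		System.out.println("exports java.lang: " + exportedToEveryone("java.lang"));
		System.out.println("opens java.lang:   " + openToEveryone("java.lang"));
	}
}
```

The package is exported but not opened. An export alone, which is what --add-exports provides, therefore does not permit the reflective access to non-public members that a test framework may need.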

You might rightfully ask why I open the app module to “all-unnamed” and not to org.junit.platform.commons, because the latter is what should access the test classes. The tests don’t run on the module path alone but also on the class path, and code on the class path accessing modules is perfectly valid, as explained by Nicolai. We have a dependency from the class path on the module path here, and we must make sure that the dependent is allowed to read the dependency.
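
Why the target is ALL-UNNAMED can be made visible with a few lines. This is a sketch with a made-up class name; it assumes the class is compiled and started from the class path, just like JUnit in the Surefire run above:

```java
// Hypothetical demo class; assumes it is loaded from the class path.
class UnnamedModuleDemo {

	// Describes which module a class belongs to.
	static String describe(Class<?> type) {
		Module module = type.getModule();
		String owner = module.isNamed() ? "module " + module.getName() : "unnamed module";
		return type.getName() + " -> " + owner;
	}

	public static void main(String... args) {
		// A JDK class lives in a named module.
		System.out.println(describe(String.class));
		// Code loaded from the class path ends up in an unnamed module.
		System.out.println(describe(UnnamedModuleDemo.class));
	}
}
```

Everything loaded from the class path reports an unnamed module, so there is no module name that --add-opens could address other than ALL-UNNAMED.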

Now to

Closed or integration testing

From my point of view, closed testing in the modular world should actually be done with separate modules. Oliver and I agreed that this thinking probably comes from a TCK based working approach, such as applied in the JDK itself or in the Jakarta EE world, but it’s not a bad approach, quite the contrary. What you get here is an ideal separation of really different types of tests, without fiddling with test names and everything.

An integration test for the greeter could look like this:

package greeter.it;
 
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;
 
import greeter.Greeter;
 
import org.junit.jupiter.api.Test;
 
class GreeterIT {
 
	@Test
	void greeterShouldWork() {
		assertEquals("Hello, modules.", new Greeter().hello("modules"));
	}
 
	@Test
	void greeterShouldNotGreetNothing() {
 
		assertThrows(NullPointerException.class, () -> new Greeter().hello(null));
	}
}

As it lives in a different module, the package is different (greeter.it) and so is the module name. Therefore, we can happily specify a module descriptor for the test itself, living in src/test:

module greeter.it {
 
	requires greeter;
	requires org.junit.jupiter;
 
	opens greeter.it to org.junit.platform.commons;
}

The module descriptor makes it obvious: This test will run on the module path alone! I can clearly define what is required and what I need to open up for usage of reflection. Notice: I open up the integration test module (greeter.it), not the greeter module itself!
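
To inspect what such a descriptor declares, the java.lang.module API can help. The sketch below does not read the compiled module-info.class of the project; as an assumption for illustration, it rebuilds an equivalent descriptor in memory (the class name GreeterItDescriptor is made up) and prints its exports and opens:

```java
import java.lang.module.ModuleDescriptor;
import java.util.Set;

// Hypothetical demo class, not part of the modulartesting repository.
class GreeterItDescriptor {

	// Builds an in-memory equivalent of the module-info.java shown above.
	static ModuleDescriptor build() {
		return ModuleDescriptor.newModule("greeter.it")
			.requires("greeter")
			.requires("org.junit.jupiter")
			.opens("greeter.it", Set.of("org.junit.platform.commons"))
			.build();
	}

	public static void main(String... args) {
		ModuleDescriptor descriptor = build();
		// Nothing is exported, and the only opened package is opened
		// to a single named module instead of to everyone.
		System.out.println("exports: " + descriptor.exports());
		descriptor.opens().forEach(opens ->
			System.out.println("opens " + opens.source() + " to " + opens.targets()));
	}
}
```

The qualified opens is what separates this setup from the unit test case: the target is a named module on the module path, not ALL-UNNAMED.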

Verdict

Testing in the modular world requires some rethinking. You need to learn about the various options of javac and java with regard to the module system.
I found The javac Command particularly helpful. Pain points are usually to be found where class path and module path meet. Sadly, this is often the case in simple unit or open tests. In a pure class path world, they are easier to handle.

However, Maven and its plugins are getting there. I haven’t checked Gradle, but I guess that ecosystem is moving along as well. For integration tests, the world actually looks neat, at least in terms of the test setup itself. Everything needed can be expressed through module descriptors. The testing effort for something like the Spring Framework itself to provide real module descriptors is of course something else, and I am curious what solution the smart people over there will come up with in time for Spring 6.

If you find this post interesting, feel free to tweet it and don’t forget to check out the accompanying project: https://github.com/michael-simons/modulartesting.


19-Oct-21


Yet another incarnation of my ongoing scrobbles

These days my computer work is mostly concerned with all things Neo4j, be it Spring Data Neo4j, the Cypher-DSL, our integration with other frameworks and tooling, or general internal things, among them being part of the Cypher language group.

In contrast to a couple of years ago, I don’t spend that much time around a computer in non-working hours anymore. My sleep got way better and I feel better in general. For reference, see the slides of a talk I wanted to give in 2020.

And I have to be honest: I feel distanced and tired of a couple of things I used to enjoy more a while back.

Last week however Hendrik Schreiber published japlscript and a collection of derived work: Java libraries that allow scripting of Apple applications on OS X and macOS.

As it happens, I have – as part of DailyFratze.de – a service that receives everything I play on iTunes and logs it. I have been doing this since 2005. The data accumulated with that service led to several variations of this article and talk about database centric applications with Spring Boot and jOOQ. Here is the latest English version of it (in the speaker deck are more variations).

The sources behind that talk are in my repository bootiful-database. The repo started off with an Oracle database which I eventually replaced with Postgres. In both databases I can use modern SQL, for example window functions, all kinds of analytics, common table expressions and so on.

The actual live data is still in Daily Fratze. What is that? See DailyFratze: Ein tägliches Foto Projekt oder wie es zu “Macht was mit SQL und Spring” kam. However, there’s a caveat: The database of the service has been an older MySQL version for way too long. While it has charts visible to my users, the queries are not as nice as the ones in the talks.

When I wrote this post at the end of 2020, I had a look at MariaDB 10.5. It was super easy to import my old MySQL data from 5.6 into it and to my positive surprise, all SQL features I used from the talk could be applied.

So last week, the first step of building something new was migrating from MySQL 5.6 to MariaDB latest and to my positive (and big) surprise again: It was a walk in the park. Basically replacing repositories and updating the mysql-server package and accepting the new implementation. Hardly any downtime, even the old JDBC connector I use here and there can be reused. That’s developer friendly. My daily picture project just keeps running as is.

Now what?

  • Scrobbling with Hendrik’s library. Ideally in a modular way, having separate sources and sinks and an orchestrating application. It basically screams Java modules.
  • Finally putting the queries and stuff I talked about so often into a live application

I created scrobbles4j. The idea is to have the model represented in both SQL and Java records, a client module and a fresh server.

My goal is to keep the client dependency-free (apart from the modules integrating with Apple programs) and later use the fantastic JDK 11+ HTTP Client. For the server I picked Quarkus. Why? It has been a breath of fresh air since 2019, a project and community that are really pleasant to work with. I was able to contribute several Neo4j things to it (and they even sent me shirts for that, how awesome!), but I never had the chance to really use it.

Java modules

Once you get the idea of modules, they help a lot on the scale of libraries and small applications like the one I want to build to keep things organized. Have a look at the sources api. Its main export is this and implementations like the one for Apple Music can provide it like that. You see in the last line that the package scrobbles4j.client.sources.apple.music is not exported, so the service implementation can just stay public and is still not accessible. Neat. The client of these services needs to know only the APIs: requiring service apis and loading them.
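
The consuming side of such a service setup can be sketched with java.util.ServiceLoader. The names below (SourceLoader, Source, name()) are made up for illustration and are not the actual scrobbles4j API. In a modular build, the consumer would declare uses for the service interface in its module-info.java and each implementation module would declare provides … with …; in this single-file form no provider is registered, so the loader simply finds none.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical consumer; only knows the service interface, never an implementation.
class SourceLoader {

	// Collects the names of all providers the ServiceLoader can find.
	static List<String> availableSources() {
		List<String> names = new ArrayList<>();
		for (Source source : ServiceLoader.load(Source.class)) {
			names.add(source.name());
		}
		return names;
	}

	public static void main(String... args) {
		System.out.println("Available sources: " + availableSources());
	}
}

// Hypothetical service interface, standing in for the API a module would export.
interface Source {

	String name();
}
```

The implementation classes never appear in the consumer code, which is why their packages do not need to be exported.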

Things you can explore: How I added a “bundle” project, combining launcher and all sources and sinks and using jlink to create a custom runtime image.

Testing in a modular world is still sadly problematic. Maven / Surefire will stuff everything in test on the module path, IDEA on the class path. The latter is easier, the former next to impossible without patching the module path (if you don’t want to have JUnit and friends in your main dependencies). Why? There’s only one module-info.java per artifact. As main and test sources are eventually combined, having an open module in test is forbidden.

There are a couple of posts like this, but tbh, I wasn’t able to make this fly. Eventually, I just opened my module explicitly in the surefire setup, which works for me with pure Maven and IDEs. Probably another approach like having explicit test modules is the way but this I find overkill for white box testing (aka plain unit tests).

Quarkus

One fact that is maybe not immediately obvious: Quarkus projects don’t have a default parent pom. They require you to import their dependency management and configure a custom Maven plugin. Nothing hard, see for yourself. You probably won’t even notice it when you create a new project at code.quarkus.io. However, it really helps you in a multi-module setup such as scrobbles4j. Much easier than one module wanting to have a separate parent.

I went the imperative way. Mainly I want to use Flyway for database migrations without additional fuss. As I wanted to focus on queries and the results and also because I like it: Server side rendering it is. Here I picked Qute.

And about that SQL: To jOOQ or not to jOOQ? Well: I have only a handful of queries, nothing too dynamic, and I just want to have some fun and keep it simple. So no jOOQ, no building this time. And also, JDK 17 text blocks. I love them.
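
To illustrate why text blocks are pleasant for a handful of static queries, here is a minimal sketch. The class, table and column names (TopArtistsQuery, plays, artist) are made up and do not reflect the actual scrobbles4j schema:

```java
// Hypothetical query holder; table and column names are for illustration only.
class TopArtistsQuery {

	// A text block keeps the statement readable without string concatenation.
	// Indentation shared by all lines is stripped by the compiler.
	static final String SQL = """
		SELECT artist, count(*) AS cnt
		FROM plays
		GROUP BY artist
		ORDER BY cnt DESC
		LIMIT 10
		""";

	public static void main(String... args) {
		System.out.print(SQL);
	}
}
```

The resulting string starts directly with SELECT and can be passed to whatever executes the statement.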

What about executing that SQL? If I had chosen the reactive way, I could have used the Quarkus reactive client. I haven’t. But nobody in their right mind will work with JDBC directly. Hmm… Jdbi to the rescue, the core module alone. I didn’t want mapping. In the Spring world, I would have picked JdbcTemplate.

Deploying with fun

One good decision in the Quarkus world is that they don’t create fat jars by default, but a runnable jar in a file structure that feels like a fat jar. Fat jars had their time; they were necessary and they solved issues. This solution, which you can just rsync somewhere in seconds, together with the quick restart times, makes it feel like you’re editing PHP files again.

I was only half joking here:

It is actually what I am doing: I created the application in such a way that it is runnable on a fresh schema and usable by other people too. But I configured Flyway in such a way that it will baseline an existing schema and hold back any version up to a given one (see application.properties), so I can use the app with my original schema.

However, I am not stupid. I am not gonna share the schema between two applications directly. I created another database user with read rights only (apart from Flyway’s schema history) and a dedicated schema, and just created views in that schema for the original tables in the other one. The views do some prepping as well and are basically an API contract. Jordi here states:

I think he’s right. This setup is an implementation of the Command Query Responsibility Segregation (CQRS) pattern. Some non-database folks will argue that this is maybe the Wish variant, but I am actually a fan and I like it that way.

Takeaway

I needed that: A simple to deploy, non-over engineered project with actually some fun data.
There’s tons of things I want to explore in JDK 17 and now I have something to entertain me with in the coming winter evenings without much cycling.

As always, a sane mix is important. I wasn’t in the mood to build something for a while now, but these days I am ok with that and I can totally accept that my life consists of a broad range of topics and not only programming, IT and speaking about those topics. It doesn’t make anyone a bad developer if they don’t work day and night. The nice nudge, together with JDK 17 having just been released, really kicked me.

If you’re interested in the actual application: I am running it here: charts.michael-simons.eu. Nope, it isn’t styled and yes, I am not gonna change it in the foreseeable future. That’s not part of my current focus.

Next steps will be replacing my old AppleScript based scrobbler with a source/sink pair. And eventually, I will add a writing endpoint to the new application.


03-Oct-21