Using Thymeleaf inline JavaScript with D3.js

I have given this talk several times now, and every time I get asked how I created this visualization, which clusters the genres I’m listening to by decade and country:



I just did a super spontaneous video for that. Key takeaway: Just use server-side rendered HTML through Thymeleaf, obviously running on Spring Boot, render the content model as inlined JavaScript and be good to go. Want an API later on?

Structure your Spring MVC controller and your service or repository layer in a way that makes it actually reusable.

Anyway, here’s the short clip and below the relevant code snippets:

Spring MVC controller, GenresController.java, either returning a ModelAndView that names the template to be rendered, or the model itself as the response body:

@GetMapping(value = { "", "/" }, produces = MediaType.TEXT_HTML_VALUE)
public ModelAndView genres() {
 
	var model = genresModel();
	return new ModelAndView("genres", model);
}
 
@GetMapping(value = { "", "/" }, produces = MediaType.APPLICATION_JSON_UTF8_VALUE)
@ResponseBody
public Map<String, Object> genresModel() {
 
	var genres = this.genreRepository.findAll(Sort.by("name").ascending());
	var allSubgrenes = genreRepository.findAllSubgrenes();
	var top10Subgenres = allSubgrenes.stream().sorted(comparingLong(Subgenre::getFrequency).reversed()).limit(10).collect(toList());
	return Map.of(
			"genres", genres,
			"subgenres", allSubgrenes,
			"top10Subgenres", top10Subgenres
	);
}
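The top-10 selection in genresModel() relies on static imports of Comparator.comparingLong and Collectors.toList. A minimal, self-contained sketch of just that step could look like the following. The Subgenre class here is a hypothetical stand-in for the real projection; its fields are assumptions.

```java
import static java.util.Comparator.comparingLong;
import static java.util.stream.Collectors.toList;

import java.util.List;

// Hypothetical stand-in for the projection returned by the repository.
class Subgenre {
    private final String name;
    private final long frequency;

    Subgenre(String name, long frequency) {
        this.name = name;
        this.frequency = frequency;
    }

    String getName() { return name; }

    long getFrequency() { return frequency; }
}

class TopSubgenres {

    // Returns at most `limit` subgenres, most frequent first.
    static List<Subgenre> top(List<Subgenre> all, int limit) {
        return all.stream()
            .sorted(comparingLong(Subgenre::getFrequency).reversed())
            .limit(limit)
            .collect(toList());
    }

    public static void main(String[] args) {
        var all = List.of(
            new Subgenre("Doom", 3), new Subgenre("Thrash", 12), new Subgenre("Sludge", 7));
        top(all, 2).forEach(s -> System.out.println(s.getName() + ": " + s.getFrequency()));
    }
}
```

Sorting in the application is fine for a handful of subgenres; for large result sets one would probably push the ordering and the limit into the query instead.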

The Thymeleaf template genres.html. The standard controller above renders it when / is requested, accepting text/html.

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:layout="http://www.ultraq.net.nz/thymeleaf/layout"
      xmlns:th="http://www.thymeleaf.org"
      layout:decorate="~{layout}">
<head>
    <title th:text="#{genres}">Placeholder</title>
    <script th:inline="javascript">
        var subgenres = /*[[${subgenres}]]*/ {};
    </script>
</head>
<body>
<!-- Some more content -->
</body>
</html>

Calling this thing application.js is way too much. It’s a simple function that reads the global subgenres variable and displays it as a nice D3.js chart.

Remember: This is also no silver bullet and I absolutely don’t have anything against today’s JavaScript development. I just feel more at home near a database.


31-Jan-19


Documentation as code, code as documentation

This morning, I read this kind tweet by Raumzeitfalle:

This refers to arc42 by example. There will be a second edition of this book soon, with new contributions from Gernot and, as a new contributor, Ralf.

My example in this book, biking2, is still up to date. I have used it several times at customers now, and every time I also recommended my approach of generating the documentation as part of the build process. You see the live output here.

What makes those special are the following facts:

  • The docs are written in Asciidoctor, see: src/docs.
  • I’m using the Asciidoctor Maven plugin. That means they are an essential part of the build and aren’t left to die in a Confluence or whatever.
  • Even more, I use maven-resources-plugin to copy them into the output dir from which they are put into the final artifact.
  • While the documentation is strictly structured by the ideas of arc42, here come the fun parts:

    • I’m using jQAssistant to continuously verify my personal quality goals with that project: jQAssistant is a QA tool, which allows the definition and validation of project specific rules on a structural level. It is built upon the graph database Neo4j.
      jQAssistant also integrates into your build process. It first analyzes your sources and dependencies and writes all the information it can discover through various plugins into an embedded Neo4j instance. That is the scan step. In the following analyze step, it verifies the data against built-in or custom rules. You write custom rules as Cypher queries. Now the interesting fact: Those concepts and rules can be written in Asciidoctor as well. This one, concepts_structure.adoc, declares my concept of a config and support package. Those concepts are executed in Neo4j and add labels to certain nodes. The labels are then used in this rule (structure.adoc):

      MATCH (a:Main:Artifact)
      MATCH (a) -[:CONTAINS]-> (p1:Package) -[:DEPENDS_ON]-> (p2:Package) <-[:CONTAINS]- (a)
      WHERE not p1:Config
        and not (p1) -[:CONTAINS]-> (p2)
        and not p2:Support
        and not p1.fqn = 'ac.simons.biking2.summary'
      RETURN p1, p2

      “Give me all the packages of the main artifact but the config and the summary package that depend on other packages that are not contained in the package itself.” If that query returns an entry, the rule is violated. I use that rule to enforce horizontal, domain specific slices.

      Now, the very same, executable rule becomes part of my documentation (see building block view) by just including it. Isn’t that great?

    • I also include specific information from my class files in the docs. Asciidoctor allows including dedicated parts of any text file. For example, I use package-info.java files like this directly in the docs: “bikes”. I did this to a much bigger extent in this repo, find the linked articles there. I love Asciidoctor.
    • Last but not least, I use Spring REST docs in my unit tests. Spring REST docs combines hand-written documentation written with Asciidoctor and auto-generated snippets produced with Spring MVC Test. A simple unit test like this gets instrumented by a call to document as shown in this block. This describes expectations about parameters and returned values. If they aren’t there in the request or response, the test fails. So one cannot change the API without adapting the documentation. You might have guessed: The generated Asciidoctor snippet is included again in the final docs, you find it here.
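To get a feeling for what the structural rule from the jQAssistant item above checks, it can be emulated in plain Java over a tiny in-memory dependency map. This is only a sketch under assumptions: the class SliceRule, its parameters and the package names are hypothetical, and direct containment is approximated by a name prefix check. The real rule runs as Cypher inside Neo4j.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SliceRule {

    // Returns "p1 -> p2" for every dependency that violates the rule.
    // `dependencies` maps a package name to the packages it depends on,
    // `config` and `support` hold the packages carrying those concepts,
    // `exempt` plays the role of the summary package in the Cypher rule.
    static List<String> violations(Map<String, Set<String>> dependencies,
                                   Set<String> config, Set<String> support, String exempt) {
        List<String> result = new ArrayList<>();
        dependencies.forEach((p1, targets) -> {
            if (config.contains(p1) || p1.equals(exempt)) {
                return; // config packages and the exempt package may depend on anything
            }
            for (String p2 : targets) {
                // Assumption: a name prefix stands in for (p1)-[:CONTAINS]->(p2)
                boolean subPackage = p2.startsWith(p1 + ".");
                if (!subPackage && !support.contains(p2)) {
                    result.add(p1 + " -> " + p2);
                }
            }
        });
        Collections.sort(result); // deterministic order, independent of the map implementation
        return result;
    }
}
```

As with the Cypher version, an empty result means the rule holds; every returned pair is a dependency that cuts across two horizontal slices.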

I started working on the documentation in early 2016, after a great workshop with Peter Hruschka and Gernot Starke. This is now 2 years ago and the tooling just got better and better.

Whether you are writing a monolithic application like my example, microservices or desktop applications: There’s no reason not to document your application.

Here are some more interesting projects, that have similar goals:

Also, as an added bonus, there’s DocGist, that creates rendered Asciidoctor files for you to share. Thanks Michael for the tip.


05-Dec-18


Passion.

In “my” industry, people speak a lot about doing things “with passion or not at all.” I wonder: are we aware of what passion means?

In both Greek and Latin, the word itself means “to suffer”, which fits very well with some of the crazy work ethics out there.

I for myself try to stay with the “good feelings” that arose from the Stoics and might add to them: Engagement instead of passion. I’ll always do my best, but I will not suffer. Or at least, I try not to.

Just a late night thought.


30-Nov-18


Modeling a domain with Spring Data Neo4j and OGM

This is the fourth post in this series and I want to keep it short and simple.

A domain can be modeled in many ways and so can databases. For as long as I have dealt with them, I have always preferred the approach: Database (model) first. Usually, data is around much longer than applications and I don’t want my first application instance or version to define the model for all eternity.

Using an Object-Graph-Mapper or Object-Relational-Mapper can be slightly dangerous. One tends to write down some class hierarchy and just let the tool do its magic. In the end, there are schemas that are sometimes very hard for humans to read. The danger might be a bit smaller with an OGM as hierarchies and connections map quite nicely onto a graph, but still, I don’t want that to be the default.

My domain can be summarized with a few sentences:

  • There are Artists, that might be highlighted as Bands or Solo Artists
  • Bands have Members
  • Bands are founded in and solo artists are born in countries
  • Sometimes Artists are associated with other Artists
  • Albums are released by Artists in a year which is part of a decade
  • Albums contain multiple tracks that have been played several times in a month of a given year

I spare you the logical ER-diagram for a relational database here and jump straight to the nodes. I highlighted all the important “classes” and their relations.

Modelling data with Neo4j feels a lot like modeling on a whiteboard. And actually, it really is: The whiteboard model ends up being the physical model in the end with Neo4j.

Neo4j is a property graph database. It stores Nodes with one or more Labels, their Relationships with a type among each other and properties for both nodes and relationships.

A label starts with a colon and is usually written with an initial upper case letter, e.g. :Artist and :Album. The type of a relationship is written with a colon and then all uppercase, e.g. :RELEASED_BY, and properties are written in camel case without a colon, e.g. name and firstName. The above list translates in my application to a model like this:



I really find it fascinating how that model reads: Pretty much the same as my verbal description.

How to model this with Neo4j-OGM and Spring Data Neo4j? You might want to recap the previous post to get an idea of the moving parts.

Value objects that happen to be persisted

In my domain the most simple objects are probably the year and the decade of the year. Those objects are value objects, they don’t have an identity. A year with a given value is as good as another instance with the same value.

@NodeEntity("Decade")
public class DecadeEntity {
 
    @Id @GeneratedValue
    private Long id;
 
    @Index(unique = true)
    private long value;
 
    public DecadeEntity(long value) {
        this.value = value;
    }
}
 
@NodeEntity(label = "Year")
public class YearEntity {
 
    @Id @GeneratedValue
    private Long id;
 
    @Index(unique = true)
    private long value;
 
    @Relationship("PART_OF")
    private DecadeEntity decade;
 
    YearEntity(final DecadeEntity decade, final long value) {
        this.decade = decade;
        this.value = value;
    }
}

I did model both of them to use them further on in aggregates, for example in the relationship RELEASED_IN, but I don’t see a need to provide dedicated repositories for them. They only have a meaning in connection with other nodes. Things to notice here are: I followed a naming convention: All classes that are mapped to something inside the graph end in “Entity”. Thus I have to use the label attribute of @NodeEntity (or the default attribute) to specify a “nice” label, e.g. @NodeEntity(label = "Year"). I use @Index for completeness. One can configure Neo4j-OGM to automatically create indexes, but TBH, I prefer to create them by hand. There’s also one outgoing relationship from year to decade. A year is part of a decade: @Relationship("PART_OF").

You also notice that I didn’t model any of the other outgoing relationships from the year: Like all the albums released in this year, all the months with play counts in that year or the foundation years of bands.

While all of Neo4j, Neo4j-OGM and Spring Data Neo4j could map relationships many levels deep, I don’t think it’s wise from an application performance point of view. I’d rather explicitly select the stuff I need.
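The value-object semantics described at the start of this section (“a year with a given value is as good as another instance with the same value”) can be sketched without any mapping framework. The class YearValue below is hypothetical and not part of the application; deriving the decade arithmetically is an assumption made for the sake of the example.

```java
// A value object: no identity of its own, equality is defined by the value alone.
final class YearValue {
    private final long value;

    YearValue(long value) {
        this.value = value;
    }

    long getValue() { return value; }

    // Assumption: the decade is derived from the value, so 1987 belongs to 1980.
    long getDecade() { return value - value % 10; }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof YearValue)) {
            return false;
        }
        return value == ((YearValue) o).value;
    }

    @Override
    public int hashCode() { return Long.hashCode(value); }
}
```

The persisted entities above still carry a technical id, because the node has to be found again in the graph. The unique index on value is what keeps two nodes for the same year from existing side by side.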

A common base class for aggregates

I’m known for my dislike of having common base entities:

For this project however, I included one for several reasons: To not pester every other entity with the technical id, to audit interesting entities and, plain and simple, to have an example that actually uses our (Spring Data Neo4j) support of Spring Data’s auditing in inheritance scenarios. This is what the class looks like:

public abstract class AbstractAuditableBaseEntity {
    @Id @GeneratedValue // 1
    private Long id;
 
    @CreatedDate // 2
    @Convert(NoOpLocalDateTimeConversion.class) // 3
    private LocalDateTime createdAt;
 
    @LastModifiedDate // 2
    @Convert(NoOpLocalDateTimeConversion.class)
    private LocalDateTime updatedAt;
}
  1. Maps Neo4j’s technical (internal) id to an attribute
  2. Records the creation date and the modification date of an entity, respectively
  3. Due to some issues with the transport, we have to force Neo4j-OGM here to not convert anything. Neo4j 3.4 supports Java 8 time types natively

Simple aggregates or entities

I see something like the Genre as a simple entity:

@NodeEntity("Genre")
public class GenreEntity extends AbstractAuditableBaseEntity {
    @Index(unique = true)
    private String name;
 
    public GenreEntity(String name) {
        this.name = name;
    }
}

It is useful to maintain on its own, has an identity but no mapped relationships. I usually declare a repository for such an entity:

import org.springframework.data.neo4j.repository.Neo4jRepository;
 
public interface GenreRepository extends Neo4jRepository<GenreEntity, Long> {
}

While it’s often preferable not to extend such a repository interface from the concrete store, I do think it’s better to have the concrete store at hand in the case of Neo4j. While the concrete implementation brings a lot of CRUD methods one doesn’t need all the time, it also brings in overloaded versions of them that take the depth into account as well.

To mitigate the large surface of repository methods, it’s often a good idea to reduce the repository’s visibility to a minimum.
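What reducing the visibility buys can be shown in plain Java, independent of Spring Data. In this sketch all names (GenreStore, InMemoryGenreStore, GenreCatalog) are hypothetical: the wide, CRUD-style interface stays package-private, and only a narrow facade would be visible to other packages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Package-private on purpose: the wide, CRUD-style surface stays inside the package.
interface GenreStore {
    String save(String name);
    Optional<String> findByName(String name);
    List<String> findAll();
    void deleteAll();
}

// In-memory stand-in so that the sketch runs without a database.
class InMemoryGenreStore implements GenreStore {
    private final List<String> names = new ArrayList<>();

    public String save(String name) {
        names.add(name);
        return name;
    }

    public Optional<String> findByName(String name) {
        return names.stream().filter(name::equals).findFirst();
    }

    public List<String> findAll() { return new ArrayList<>(names); }

    public void deleteAll() { names.clear(); }
}

// The narrow facade. In a real project this would be the only public type of the package.
class GenreCatalog {
    private final GenreStore store;

    GenreCatalog(GenreStore store) { this.store = store; }

    // Registers a genre only once, no matter how often it is requested.
    String register(String name) {
        return store.findByName(name).orElseGet(() -> store.save(name));
    }

    List<String> names() { return List.copyOf(store.findAll()); }
}
```

Callers outside the package can neither reach deleteAll() nor any other method the facade does not expose.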

I don’t see a problem using such an entity and repository directly from a controller, for example like this:

@Controller
@RequestMapping("/genres")
public class GenreController {
    private final GenreRepository genreRepository;
 
    public GenreController(GenreRepository genreRepository) {
        this.genreRepository = genreRepository;
    }
 
    @GetMapping(value = { "", "/" }, produces = MediaType.TEXT_HTML_VALUE)
    public ModelAndView genres() {
 
        var genres = this.genreRepository.findAll(Sort.by("name").ascending());
        return new ModelAndView("genres", Map.of("genres", genres));
    }
 
    @GetMapping(value = "/{genreId}", produces = MediaType.TEXT_HTML_VALUE)
    public ModelAndView genre(@PathVariable final Long genreId) {
 
        var genre = this.genreRepository.findById(genreId)
            .orElseThrow(() -> new NodeNotFoundException(GenreEntity.class, genreId));
 
        var model = Map.of(
            "genreCmd", new GenreCmd(genre),
            "genre", genre
        );
        return new ModelAndView("genre", model);
    }
 
    @PostMapping(value = { "", "/" }, produces = MediaType.TEXT_HTML_VALUE)
    public String genre(@Valid final GenreCmd genreForm, final BindingResult genreBindingResult) {
        if (genreBindingResult.hasErrors()) {
            return "genre";
        }
 
        var genre = Optional.ofNullable(genreForm.getId())
            .flatMap(genreRepository::findById)
            .map(existingGenre -> {
                existingGenre.setName(genreForm.getName());
                return existingGenre;
            }).orElseGet(() -> new GenreEntity(genreForm.getName()));
        genre = this.genreRepository.save(genre);
 
        return String.format("redirect:/genres/%d", genre.getId());
    }
 
    static class GenreCmd {
 
        private Long id;
 
        @NotBlank
        private String name;
 
        GenreCmd(GenreEntity genreEntity) {
            this.id = genreEntity.getId();
            this.name = genreEntity.getName();
        }
 
        public GenreCmd() {
        }
 
        public Long getId() {
            return this.id;
        }
 
        public String getName() {
            return this.name;
        }
 
        public void setId(Long id) {
            this.id = id;
        }
 
        public void setName(String name) {
            this.name = name;
        }
    }
}

Find the sources here: GenreEntity, GenreRepository and the GenreController.

One use case is definitely “give me all the albums having a specific main genre.” To implement such a case, I’d rather access that sub collection from the owning side of the relationship. Here, the albums:

@NodeEntity("Album")
public class AlbumEntity extends AbstractAuditableBaseEntity {
 
    @Relationship("RELEASED_BY")
    private ArtistEntity artist;
 
    private String name;
 
    @Relationship("RELEASED_IN")
    private YearEntity releasedIn;
 
    @Relationship("HAS")
    private GenreEntity genre;
 
    private boolean live = false;
 
    public AlbumEntity(ArtistEntity artist, String name, YearEntity releasedIn) {
        this.artist = artist;
        this.name = name;
        this.releasedIn = releasedIn;
    }
}

The repository for that entity declares a method like this:

interface AlbumRepository extends Neo4jRepository<AlbumEntity, Long> {
    List<AlbumEntity> findAllByGenreNameOrderByName(String name, @Depth int depth);
}

It makes use of Spring Data Neo4j’s traversal feature: It traverses the relationship from album to genre and asks for all albums having a relationship of type “HAS” to a genre with the given name.

That’s also a method that I’d wrap in a service, as the method name is super technical and I don’t want that:

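@Service
public class AlbumService {
    private final AlbumRepository albumRepository;

    // Constructor injection, so that the final field gets initialized
    public AlbumService(AlbumRepository albumRepository) {
        this.albumRepository = albumRepository;
    }

    public List<AlbumEntity> findAllAlbumsWithGenre(GenreEntity genre) {
        // Depth 1: load the albums together with their directly related nodes
        return albumRepository.findAllByGenreNameOrderByName(genre.getName(), 1);
    }
}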

Complex aggregates

The Artist is a complex thing. It exists in three different forms: as an unspecified artist, as a solo artist and as a band.

While Neo4j-OGM allows you to add a list of labels to your domain and thus allowing one entity to be mapped to several labels, I don’t like that approach.

Bands and solo artists have quite different attributes, as you can see in the sources linked above and I don’t want them to mix up.

By declaring the Artist class with @NodeEntity("Artist") and the band, which extends from it, with @NodeEntity("Band") and the solo artist accordingly, bands and solo artists are stored with these two labels. Polymorphic queries work to some extent with a repository for the base entity, but as Neo4j-OGM applies schema based loading, stuff can be missing from the result.

While polymorphic queries are not a daily use case, this one is:

A band has one or more members:

@RelationshipEntity("HAS_MEMBER")
public static class Member {
    @Id @GeneratedValue
    private Long memberId;
 
    @StartNode
    private BandEntity band;
 
    @EndNode
    private SoloArtistEntity artist;
 
    @Convert(YearConverter.class)
    private Year joinedIn;
 
    @Convert(YearConverter.class)
    private Year leftIn;
 
    Member(final BandEntity band, final SoloArtistEntity artist, final Year joinedIn, final Year leftIn) {
        this.band = band;
        this.artist = artist;
        this.joinedIn = joinedIn;
        this.leftIn = leftIn;
 
    }
}

And in BandEntity:

@NodeEntity("Band")
public class BandEntity extends ArtistEntity {
 
    @Relationship("FOUNDED_IN")
    private CountryEntity foundedIn;
 
    @Relationship("ACTIVE_SINCE")
    private YearEntity activeSince;
 
    @Relationship("HAS_MEMBER")
    private List<Member> member = new ArrayList<>();
 
    public BandEntity(String name, @Nullable String wikidataEntityId, @Nullable CountryEntity foundedIn) {
        super(name, wikidataEntityId);
        this.foundedIn = foundedIn;
    }
 
    BandEntity addMember(final SoloArtistEntity soloArtist, final Year joinedIn, final Year leftIn) {
        Optional<Member> existingMember = this.member.stream()
            .filter(m -> m.getArtist().equals(soloArtist) && m.getJoinedIn().equals(joinedIn)).findFirst();
        existingMember.ifPresentOrElse(m -> m.setLeftIn(leftIn), () -> {
            this.member.add(new Member(this, soloArtist, joinedIn, leftIn));
        });
 
        return this;
    }
 
    public List<Member> getMember() {
 
        return Collections.unmodifiableList(this.member);
    }
}

As you see: No setters for the member and a getter that returns an unmodifiable list. Thus adding (and removing) members goes only through the band. The modified band is then returned. As Neo4j-OGM and Spring Data Neo4j don’t do dirty tracking and don’t save things automatically at the end of a transaction, we have to take care here. Again, I recommend a service layer:

@Service
public class ArtistService {
    @Transactional
    public BandEntity addMember(final BandEntity band, final SoloArtistEntity newMember, final Year joinedIn, @Nullable final Year leftIn) {
        return this.bandRepository.save(band.addMember(newMember, joinedIn, leftIn));
    }
}

This is again from ArtistService.
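The encapsulation used in BandEntity (no setter, an unmodifiable view, all changes through the owner) can be tried out in isolation. The following is a stripped-down sketch with hypothetical classes Band and Membership that use plain strings and integers instead of the real entities:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Membership {
    final String artist;
    final int joinedIn;
    Integer leftIn; // null while the artist is still a member

    Membership(String artist, int joinedIn, Integer leftIn) {
        this.artist = artist;
        this.joinedIn = joinedIn;
        this.leftIn = leftIn;
    }
}

class Band {
    private final List<Membership> members = new ArrayList<>();

    // Updates the leaving year of an existing membership or adds a new one.
    Band addMember(String artist, int joinedIn, Integer leftIn) {
        for (Membership m : members) {
            if (m.artist.equals(artist) && m.joinedIn == joinedIn) {
                m.leftIn = leftIn;
                return this;
            }
        }
        members.add(new Membership(artist, joinedIn, leftIn));
        return this;
    }

    // Read-only view: callers cannot add or remove memberships behind the band's back.
    List<Membership> getMembers() {
        return Collections.unmodifiableList(members);
    }
}
```

Trying to call add or remove on the returned list throws an UnsupportedOperationException, so the band stays the only place where the rules for memberships live.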

To close this up, one final example: When entities are being deleted through Neo4j-OGM, it deletes only the relationships along with them, not the target nodes of the relationships. You have to decide whether you want “dangling” nodes in your database or not. Sometimes this is ok, sometimes not. As of today, Neo4j itself has no foreign key constraint on relationships. And why should it? It’s completely ok for a node to exist on its own.

In my domain here however, albums without an artist and tracks without albums serve no purpose. To delete them when I delete an artist, I do this again through a service. The session in the following snippet is the autowired OGM session. It’s completely ok to access it. Spring Data Neo4j takes care that it participates in ongoing transactions:

@Service
public class ArtistService {
    private final Session session;
 
    @Transactional
    public void deleteArtist(Long id) {
        this.findArtistById(id).ifPresent(a -> {
                session.delete(a);
                session.query("MATCH (a:Album) WHERE size((a)-[:RELEASED_BY]->(:Artist))=0 DETACH DELETE a", Map.of());
                session.query("MATCH (t:Track) WHERE size((:Album)-[:CONTAINS]->(t))=0 DETACH DELETE t", Map.of());
                session.query("MATCH (w:WikipediaArticle) WHERE size((:Artist)-[:HAS_LINK_TO]->(w))=0 DETACH DELETE w", Map.of());
        });
    }
}

Yes, there is Cypher hidden away in a class. Sometimes there are compromises to be made, and this one is a compromise that’s ok for me. There’s also JCypher, maybe that would be something to try out in the future.

With all the things here in this post, it’s easy to write a nice application, that deals not only with CRUD, but already presents all the interesting associations:





The complete application is available on GitHub as “bootiful music”. It has some rough edges, also ops wise, but the repository along with the posts of this series should help to get you started.

I’d like to thank Michael a lot for the idea of this query, which results in nice micro genres or categories:




02-Nov-18


No silver bullets here: Accessing data stored in Neo4j on the JVM

In the previous post I presented various ways to get data into Neo4j. Now that you have a lot of connected data and its attributes, how do you access, manipulate, add to and delete them?

I’ve been working with and in the Spring ecosystem for quite a while now and for me the straight answer is – without much surprise – just use the Spring Data Neo4j module if you work inside the Spring ecosystem. But to the surprise of some, there’s more out there than just Spring.

In this blog post I walk you through

  • Using the Neo4j Java-Driver directly
  • Creating an application based on Micronaut, which went 1.0 GA these days, the Neo4j Java-Driver and Neo4j-OGM
  • A full blown Spring Boot application using Spring Data Neo4j

Before we jump right into some of the options you as an application developer have to access data inside Neo4j, we have to get a clear idea of some of the building blocks and moving parts involved. Let’s get started with those.

Building blocks and moving parts

Neo4j Java-Driver

The most important building block for accessing Neo4j on the JVM is possibly the Neo4j Java Driver. The Java driver is open source and is available on GitHub under the Apache License. This driver uses the binary “Bolt” protocol.

You can think of that driver as analogous to a JDBC driver that is available for a relational database. Neo4j also offers drivers for different languages based on the Bolt protocol.

As with Java’s JDBC driver, there’s a bit of ceremony involved when working with this driver. First you have to acquire a driver instance and then open a session from which you can query the database:

try (
    Driver driver = GraphDatabase.driver( uri, AuthTokens.basic( user, password ) );
    Session session = driver.session()
) {
    List<String> artistNames =
        session
            .readTransaction(tx -> tx.run("MATCH (a:Artist) RETURN a", Map.of()))
            .list(record -> record.get("a").get("name").asString());
}

With that code, one connects to the database and retrieves the names of all the artists I imported in my previous post. What I omitted here is the fact that the driver does connection pooling and one should not open and close it immediately. Instead, you would have to write some boilerplate code to handle this.
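What such boilerplate could look like is sketched below in a generic form: create the expensive, pooled resource once, hand out the same instance to every caller and close it exactly once on shutdown. The class SharedResource is hypothetical and works with any AutoCloseable, so it runs without the driver on the classpath.

```java
import java.util.function.Supplier;

// Creates a pooled resource lazily, shares the single instance and closes it once.
final class SharedResource<T extends AutoCloseable> implements AutoCloseable {
    private final Supplier<T> factory;
    private T instance;
    private boolean closed;

    SharedResource(Supplier<T> factory) {
        this.factory = factory;
    }

    synchronized T get() {
        if (closed) {
            throw new IllegalStateException("already closed");
        }
        if (instance == null) {
            instance = factory.get(); // created on first use only
        }
        return instance;
    }

    @Override
    public synchronized void close() {
        closed = true;
        if (instance != null) {
            try {
                instance.close();
            } catch (Exception e) {
                throw new IllegalStateException("could not close resource", e);
            } finally {
                instance = null; // a second close() is a no-op
            }
        }
    }
}
```

In an application, the supplier would be something like () -> GraphDatabase.driver(uri, AuthTokens.basic(user, password)), sessions would be opened from get() per unit of work, and close() would be called when the application shuts down. Frameworks like Micronaut or Spring Boot take over exactly this kind of lifecycle handling.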

There are some important things to notice here: The code speaks of a driver. That is org.neo4j.driver.v1.Driver. The session is also from the same package: org.neo4j.driver.v1.Session. Both are types from the driver itself. You have to know these things, because those terms will pop up again later. Neo4j-OGM, the object graph mapper, also speaks about drivers and sessions, but those are completely different things.

The Java driver has a nice type system (see The Cypher type system) and gets you quite far.

Most of the time however, people in the Java ecosystem prefer nominal typing over structural typing and want to map “all the things database” to objects of some kind. Let’s not get into bikeshedding here but just accept things as they are. Given a database model where a musical artist has multiple links to different Wikipedia sites, represented like this (I omitted getters and setters for clarity):

public class WikipediaArticleEntity implements Comparable<WikipediaArticleEntity> {
 
    private Long id;
 
    private String site;
 
    private String title;
 
    private String url;
 
    public WikipediaArticleEntity(String site, String title, String url) {
        this.site = site;
        this.title = title;
        this.url = url;
    }
}
 
public class ArtistEntity {
 
    private String name;
 
    private String wikidataEntityId;
 
    private Set<WikipediaArticleEntity> wikipediaArticles = new TreeSet<>();
 
    public ArtistEntity(String name, String wikidataEntityId, Set<WikipediaArticleEntity> wikipediaArticles) {
        this.name = name;
        this.wikidataEntityId = wikidataEntityId;
        this.wikipediaArticles = wikipediaArticles;
    }
}

To fill such a model directly by interacting purely with the driver, you’ll have to do something like this: A driver session gets opened, then we write a query in Cypher, Neo4j’s declarative graph query language, execute it and map all the returned records and nodes:

public List<ArtistEntity> findByName(String name) {
    try (Session s = driver.session()) {
        String statement
            = " MATCH (a:Artist) "
            + " WHERE a.name contains $name "
            + " WITH a "
            + " OPTIONAL MATCH (a) - [:HAS_LINK_TO] -> (w:WikipediaArticle)"
            + " RETURN a, collect(w) as wikipediaArticles";
 
        return s.readTransaction(tx -> tx.run(statement, Collections.singletonMap("name", name)))
            .list(record -> {
                final Value artistNode = record.get("a");
                final List<WikipediaArticleEntity> wikipediaArticles = record.get("wikipediaArticles")
                    .asList(node -> new WikipediaArticleEntity(
                        node.get("site").asString(), node.get("title").asString(), node.get("url").asString()));
 
                return new ArtistEntity(
                    artistNode.get("name").asString(),
                    artistNode.get("wikidataEntityId").asString(),
                    new HashSet<>(wikipediaArticles)
                );
            });
    }
}

(This code is part of my example how to interact with Neo4j from a Micronaut application, find its source here and the whole application here.)

While this works, it’s quite an effort: For a simple thing (one root aggregate, the artist, with some attributes), we need a query that is not that simple anymore and a lot of manual mapping. The query makes good use of a standardized multiset (the collect-statement) to avoid having n+1 queries or deduplication of things on the client side, but all this mapping is kinda annoying for a simple READ operation.
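To see what collect saves, consider what the client would have to do if the query returned one flat row per artist and article instead. The sketch below groups such rows by hand; the class FlatRowGrouping and the row layout (artist name, article reference) are assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class FlatRowGrouping {

    // Each row is {artistName, article}. The article is null when the
    // OPTIONAL MATCH found nothing for that artist.
    static Map<String, Set<String>> group(List<String[]> rows) {
        Map<String, Set<String>> byArtist = new LinkedHashMap<>();
        for (String[] row : rows) {
            // The artist appears once per article, so it has to be deduplicated here
            Set<String> articles = byArtist.computeIfAbsent(row[0], k -> new LinkedHashSet<>());
            if (row[1] != null) {
                articles.add(row[1]);
            }
        }
        return byArtist;
    }
}
```

With collect(w) in the RETURN clause, the database hands back exactly one record per artist and this grouping step disappears from the client.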

Enter

Neo4j-OGM

Neo4j-OGM stands for Object-Graph-Mapper. It’s on the same level of abstraction as JPA/Hibernate are for relational databases. There’s extensive documentation: Neo4j-OGM – An Object Graph Mapping Library for Neo4j. An OGM maps nodes and relationships in the graph to objects and references in a domain model. Object instances are mapped to nodes while object references are mapped using relationships, or serialized to properties. JVM primitives are mapped to node or relationship properties.

Given the example from above, we only have to add a handful of simple annotations to make our domain usable with Neo4j-OGM:

@NodeEntity("WikipediaArticle")
public class WikipediaArticleEntity implements Comparable<WikipediaArticleEntity> {
 
    @Id
    @GeneratedValue
    private Long id;
 
    private String site;
 
    private String title;
 
    private String url;
 
    WikipediaArticleEntity() {
    }
 
    public WikipediaArticleEntity(String site, String title, String url) {
        this.site = site;
        this.title = title;
        this.url = url;
    }
}
 
 
@NodeEntity("Artist")
public class ArtistEntity {
 
    @Id
    @GeneratedValue
    private Long id;
 
    private String name;
 
    private String wikidataEntityId;
 
    @Relationship("HAS_LINK_TO")
    private Set<WikipediaArticleEntity> wikipediaArticles = new TreeSet<>();
 
    ArtistEntity() {
    }
 
    public ArtistEntity(String name, String wikidataEntityId, Set<WikipediaArticleEntity> wikipediaArticles) {
        this.name = name;
        this.wikidataEntityId = wikidataEntityId;
        this.wikipediaArticles = wikipediaArticles;
    }
}

Notice @NodeEntity on the classes, @Relationship on the attribute wikipediaArticles of the ArtistEntity-class and some technical details, mainly @Id @GeneratedValue, needed to map Neo4j's internal, technical ids to instances of the classes and vice-versa.

@NodeEntity and @Relationship are used not only to mark the classes and attributes as something to store in the graph, but also to specify labels to be used for the nodes and names for the relationship.

The whole query then folds together into something like this:

public Iterable<ArtistEntity> findByName(String name) {
    return this.session
        .loadAll(ArtistEntity.class, new Filter("name", ComparisonOperator.CONTAINING, name), 1);
}

Quite a difference, right? Dealing with the driver, the driver's session and Cypher has been abstracted away. Take note that the above session attribute is not a driver's session, but OGM's session. This is a bit confusing when you start using those things.

Again, this code is part of my example how to interact with Neo4j from a Micronaut application. The complete source of the above is here and the whole application here.

To be fair, Neo4j-OGM needs to be configured as well. In its simplest form, this is done with a driver instance and a list of packages that contain domain entities as described above, for example like this:

public SessionFactory createSessionFactory(Driver driver) {
    return new org.neo4j.ogm.session.SessionFactory(
        new BoltDriver(driver), "ac.simons.music.micronaut.domain");
}

The driver instance in the example above is instantiated by Micronaut. Without Micronaut's configuration support, it would have been manually configured as in the very first example.

In a Spring Boot application, Spring Boot takes care of the driver and Spring Data Neo4j creates the OGM session and deals with transactions, among other things:

Spring Data Neo4j

Let's start with quoting Spring Data:

Spring Data’s mission is to provide a familiar and consistent, Spring-based programming model for data access while still retaining the special traits of the underlying data store. It makes it easy to use data access technologies, relational and non-relational databases, map-reduce frameworks, and cloud-based data services.

That goes so far that Craig Walls is fairly correct when he says that many stores "are mostly the same from a Spring Data perspective".

Spring Data Neo4j has some specialities, but on a superficial level, the above statement is correct.

Spring Data depends on the Spring Framework and given that, it's kinda hard to get it to work in environments other than Spring. If, however, you're already using the Spring Framework, I wouldn't think twice about adding Spring Data to the mix, regardless of whether I have to deal with a relational database or Neo4j.

Given the entity ArtistEntity above, one can just declare a repository like this:

interface ArtistRepository extends Neo4jRepository<ArtistEntity, Long> {
    List<ArtistEntity> findByNameContaining(String name);
}

There is no need to add an implementation for that interface; this is done by Spring Data. Spring Data also wires up a Neo4j-OGM session that is aware of Spring transactions.

From an application developer's point of view, you no longer have to deal with mapping, opening and closing sessions and transactions, but only with one single "repository" as an abstraction over a set of given entities.

Please be aware that the idea behind Spring Data and its repository concept is not having a repository for each entity there is, but only for the aggregate roots. To quote Jens Schauder: "Repositories persist and load aggregates. An aggregate is a cluster of objects that form a unit, which should always be consistent. Also, it should always get persisted (and loaded) together. It has a single object, called the aggregate root, which is the only thing allowed to touch or reference the internals of the aggregate." (see Spring Data JDBC, References, and Aggregates).

In my "music" example, I deal with albums released in a given year. The release year is an integral part of the album and it would be weird having an additional repository for it.
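
The aggregate idea can be sketched in plain Java, without any Spring Data involved. Everything in the following block (the class names, the in-memory repository, the sample album) is hypothetical and only serves to illustrate the point: there is a repository for the album, the aggregate root, and none for the release year, which is only ever reached through its album.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified domain; not the classes of the example project.
final class AggregateSketch {

    // Part of the album aggregate, never loaded or saved on its own.
    static final class ReleaseYear {
        final int value;

        ReleaseYear(int value) {
            this.value = value;
        }
    }

    // The aggregate root: the only entry point to the aggregate.
    static final class Album {
        final String name;
        final ReleaseYear releasedIn;

        Album(String name, ReleaseYear releasedIn) {
            this.name = name;
            this.releasedIn = releasedIn;
        }
    }

    // One repository per aggregate root; there is deliberately no ReleaseYearRepository.
    interface AlbumRepository {
        Album save(Album album);

        Optional<Album> findByName(String name);
    }

    // Stand-in for what a real store would do: the whole aggregate is stored and loaded together.
    static final class InMemoryAlbumRepository implements AlbumRepository {
        private final Map<String, Album> albums = new LinkedHashMap<>();

        @Override
        public Album save(Album album) {
            albums.put(album.name, album);
            return album;
        }

        @Override
        public Optional<Album> findByName(String name) {
            return Optional.ofNullable(albums.get(name));
        }
    }

    public static void main(String[] args) {
        AlbumRepository albums = new InMemoryAlbumRepository();
        albums.save(new Album("Example Album", new ReleaseYear(2018)));
        // The year is reached through the album, not through a repository of its own.
        albums.findByName("Example Album")
              .ifPresent(album -> System.out.println(album.name + " was released in " + album.releasedIn.value));
    }
}
```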

So what are the specialities of Spring Data Neo4j? First of all, in the pure Neo4j-OGM example you might have noticed the single, lone "1". That specifies the depth to which entities should be loaded. Depending on how entities are modeled, you could run into the problem that you fetch your whole graph with one single query. Specifying the depth means specifying how many relationship hops away from the matched entity should be fetched. The repository method can be declared analogously:

interface ArtistRepository extends Neo4jRepository<ArtistEntity, Long> {
    List<ArtistEntity> findByNameContaining(String name, @Depth int depth);
}
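
To get a feeling for what such a depth bounds, here is a toy, in-memory sketch in plain Java. It is not how Neo4j-OGM loads entities, and all names in it (FetchDepthSketch, Node, load, the sample graph) are made up for illustration. It only shows the general idea: starting at one node, follow relationships for at most the given number of hops and stop there.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy in-memory graph, NOT Neo4j-OGM code. All names are hypothetical.
final class FetchDepthSketch {

    static final class Node {
        final String name;
        final List<Node> related = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        // Relationships are followed in both directions in this sketch.
        Node relateTo(Node other) {
            this.related.add(other);
            other.related.add(this);
            return this;
        }
    }

    // Returns the names of all nodes reachable from start within at most `depth` hops.
    static Set<String> load(Node start, int depth) {
        Set<Node> visited = new LinkedHashSet<>();
        visited.add(start);
        List<Node> frontier = new ArrayList<>();
        frontier.add(start);
        // Breadth-first: each iteration follows relationships one hop further.
        for (int hop = 0; hop < depth && !frontier.isEmpty(); hop++) {
            List<Node> next = new ArrayList<>();
            for (Node node : frontier) {
                for (Node neighbour : node.related) {
                    if (visited.add(neighbour)) {
                        next.add(neighbour);
                    }
                }
            }
            frontier = next;
        }
        Set<String> names = new LinkedHashSet<>();
        for (Node node : visited) {
            names.add(node.name);
        }
        return names;
    }

    // artist -- article, artist -- album, album -- year
    static Node sampleGraph() {
        Node artist = new Node("artist");
        Node album = new Node("album");
        artist.relateTo(new Node("article"));
        artist.relateTo(album);
        album.relateTo(new Node("year"));
        return artist;
    }

    public static void main(String[] args) {
        Node artist = sampleGraph();
        System.out.println(load(artist, 0)); // [artist]
        System.out.println(load(artist, 1)); // [artist, article, album]
        System.out.println(load(artist, 2)); // [artist, article, album, year]
    }
}
```

With a depth of 1, the artist comes back with its directly related nodes only; the year, two hops away, is left out. In a densely connected graph, every additional hop can pull in a lot more data.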

People familiar with Spring Data know that derived query methods like findByNameContaining can be much more complicated. You could even write down

interface ArtistRepository extends Neo4jRepository<ArtistEntity, Long> {
    List<ArtistEntity> findByNameContainingOrWikipediaArticlesTitleIs(String name, String title, @Depth int depth);
}

and so on. I have seen some interesting finder methods here and there. While this is technically possible, I would recommend using the @Query annotation on the method, writing down the query yourself and choosing a method name that corresponds to the business.

Different abstraction levels

At this point it should be clear that the Neo4j Java driver, Neo4j-OGM and Spring Data act on different abstraction levels.



In your application, you have to decide which level of abstraction you need. You can come a long way with direct interaction with the driver, especially for all kinds of queries that use your database for more than simple CRUD operations. However, I don't think that you want to deal with all the cruft of CRUD yourself throughout your application.

When to use what?

All three abstractions can execute all kinds of Cypher queries. If you want to deal with result sets and records yourself and don't mind mapping stuff as you go along, use the Java driver. It has the least overhead. Not mapping stuff to fixed objects has the advantage that you can freely traverse relationships in your queries and use the results as needed.

As soon as you find yourself mapping nodes with the same labels and their relationships to other nodes more often than not, you should consider Neo4j-OGM. It takes the "boring" mapping code away from you and helps you to concentrate on your domain. Also, Neo4j-OGM is not tied to Spring. I haven't written an application outside the Spring ecosystem for quite a while now. For this post, I needed an example where I don't have Spring, so I came up with the Micronaut demo, which uses both plain Java driver access and OGM access. Depending on what you want to achieve, you can combine both approaches: mapping the boring stuff with Neo4j-OGM, handling "special" results yourself.

If you're writing an application in the Spring ecosystem and have decided on OGM, please also add Spring Data Neo4j to the mix. While it doesn't put any further abstraction layer on the mapping itself and thus doesn't slow things down, it takes the burden of dealing with the session and transactions away from you.

I do firmly believe that Spring Data Neo4j is the most flexible solution.

  1. Start with a simple repository, relying on the CRUD methods
  2. If necessary, declare your queries with @Query
  3. To differentiate between write and read models, execute writes through mapped @NodeEntities and reads through read-only @QueryResults
  4. Write a custom repository extension and interact directly with the Neo4j-OGM or Neo4j Java-Driver session

To complete this post, I'll show you options 2 and 3, given my AlbumEntity, TrackEntity and an AlbumRepository.

First of all I want a query that retrieves all the albums containing one specific track. That is pretty easy to write in Cypher:

interface AlbumRepository extends Neo4jRepository<AlbumEntity, Long> {
    @Query(value
        = " MATCH (album:Album) - [:CONTAINS] -> (track:Track)"
        + " MATCH p=(album) - [*1] - ()"
        + " WHERE id(track) = $trackId"
        + "   AND ALL(relationship IN relationships(p) WHERE type(relationship) <> 'CONTAINS')"
        + " RETURN p"
    )
    List<AlbumEntity> findAllByTrack(Long trackId);
}

By declaring this additional method on the repository, I now have mapped a simple Cypher query that does complex things to my entity: it matches all albums that contain a specific track, together with all the relationships of those albums, and returns all of that apart from the other tracks. I benefit from SDN's mapping and have all the queries in one place.

In my domain, I didn't model the track as part of the album. Those tracks should be explicitly read and not all the time. I therefore added an additional class, called AlbumTrack. Again, accessors omitted for brevity:

@QueryResult
public class AlbumTrack {
 
	private Long id;
 
	private String name;
 
	private Long discNumber;
 
	private Long trackNumber;
}

Notice the @QueryResult annotation. This is special to Spring Data Neo4j. It marks this as a class that is instantiated from an arbitrary query result but doesn't have a lifecycle. It can then be used in a declarative query method, similar to the first one:

interface AlbumRepository extends Neo4jRepository<AlbumEntity, Long> {
    @Query(value
        = " MATCH (album:Album) - [c:CONTAINS] -> (track:Track) WHERE id(album) = $albumId"
        + " RETURN id(track) AS id, track.name AS name, c.discNumber AS discNumber, c.trackNumber AS trackNumber"
        + " ORDER BY c.discNumber ASC, c.trackNumber ASC"
    )
    List<AlbumTrack> findAllAlbumTracks(Long albumId);
}

While this query is indeed much simpler than the first one, it's important to be able to do such things when designing an application that performs well. Think about it: is it really necessary to have all the relations to all other possible nodes at hand all the time?
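
Note how the aliases in the RETURN clause (id, name, discNumber, trackNumber) line up with the attribute names of AlbumTrack. Spring Data Neo4j does that mapping for you. The following plain-Java sketch does the same thing by hand, just to make the idea of a read-only projection tangible. It is not SDN's implementation: the rows are faked as simple maps, and QueryResultSketch, row and mapAll are hypothetical helpers.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hand-written illustration of a read-only projection; NOT how SDN maps @QueryResult classes.
final class QueryResultSketch {

    // Same attributes as AlbumTrack above, immutable and without any lifecycle.
    static final class AlbumTrack {
        final Long id;
        final String name;
        final Long discNumber;
        final Long trackNumber;

        AlbumTrack(Long id, String name, Long discNumber, Long trackNumber) {
            this.id = id;
            this.name = name;
            this.discNumber = discNumber;
            this.trackNumber = trackNumber;
        }
    }

    // Fakes one result row, keyed by the column aliases used in the RETURN clause.
    static Map<String, Object> row(long id, String name, long discNumber, long trackNumber) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", id);
        row.put("name", name);
        row.put("discNumber", discNumber);
        row.put("trackNumber", trackNumber);
        return row;
    }

    // Each column alias is matched to the attribute of the same name.
    static AlbumTrack toAlbumTrack(Map<String, Object> row) {
        return new AlbumTrack(
            (Long) row.get("id"),
            (String) row.get("name"),
            (Long) row.get("discNumber"),
            (Long) row.get("trackNumber"));
    }

    static List<AlbumTrack> mapAll(List<Map<String, Object>> rows) {
        List<AlbumTrack> tracks = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            tracks.add(toAlbumTrack(row));
        }
        return tracks;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row(42L, "First track", 1L, 1L));
        rows.add(row(43L, "Second track", 1L, 2L));
        for (AlbumTrack track : mapAll(rows)) {
            System.out.println(track.discNumber + "." + track.trackNumber + " " + track.name);
        }
    }
}
```

The projection carries exactly the four values the view needs and nothing of the surrounding graph, which is the whole point of a separate read model.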

In the end, you might have guessed it: there are no silver bullets. There are situations where an approach close to the database is more appropriate; sometimes a higher abstraction level is better. Whatever you choose, try not to be dogmatic.

All the examples are part of my bootiful music project, more specifically, the "knowledge" submodule. With the building blocks described here, you can develop a web application that is used for reading and writing data.

The example application uses a simple, server-side rendered approach for the frontend, but Spring Data Neo4j plays well with Spring Data REST and that makes many different approaches possible.

In the next installment of this series, we'll have a look at concrete domain modeling with Spring Data Neo4j.


29-Oct-18