What’s in a name: Basic stuff

Some days ago a tweet by Oracle's Bruno Borges caught my eye:

It's a really good question and one that has been on my mind for quite some time now. I discussed the flood of emerging rock star programmers with members of my JUG as well as the question of reading a "beginners" book about programming:

Self evident – or just a filter bubble?

So, what's important regarding basic stuff? I'd expect people visiting a Java user group, or any other user group, to have an understanding and idea of the topic discussed: A JUG may not be the right place to learn basic Java syntax or to explain the difference between a class and an object. It might however be a place to debate questions like having public class variables or having a copy constructor due to some rule. I wrote this article about creational patterns with Java 8 some weeks ago and I wasn't quite sure how it would be received: In the filter bubble I'm living in right now it may be basic stuff, but judging from the feedback I got, it actually matters: People still have a hard time understanding the difference between a factory and a builder, or even understanding the need for something like this.

It's important to discuss questions like these because they will ultimately form a new software system and are the fundamentals of an architecture.

Basic or fundamental?

Regarding the book "Weniger schlecht programmieren" ("Less bad programming") by Kathrin Passig and Johannes Jander: One might think that this book is targeted at absolute beginners, but stating that would mean not having understood one of the first chapters about the four stages of competence: Stuff like how to ask the right question, listening to answers and being aware of what you don't know is not only basic but fundamental. That stuff has hardly anything to do with knowing a certain language or framework. Learning this "basic" stuff by heart is something that will last when the latest hype has long gone.

Soft skills aren't basic but fundamental things. If a JUG can be a safe room where those can be trained: Awesome, mission accomplished. Regardless of how hard one may wish otherwise, creating software that is used and may even earn you money isn't about programming a computer, but about listening, questioning and then programming.

A fool with a tool…

It's not enough to know a tool and to have used it once. There is a horrendous post going around titled "How Hibernate Almost Ruined My Career" at a place for "Top coders". I'm not gonna link that, you can search for it yourself. Ten years ago I might have written such a post myself: Having worked my way through tutorials, maybe even sources, and still the hammer won't tighten my damn screws. A post like this may be good enough for letting off steam, but not for much else.

Learning how to choose the right tool for a certain task and being open to new tools is basic stuff.

A JUG should be a place open for presenting tools and discussing them. I'm not going to user groups or conferences to come back with a bag of new tools that I have to throw at my current problem right away, but to meet people and hopefully discuss what the basic advantage of a new tool, framework or architectural pattern actually is.

Room for improvement

Looking back over the past 10 years my life as a programmer has become really "easy". Creating a microservice in 30 minutes while talking about it? Fun! Effortless integration testing with enterprise databases? Done. Making JPA / Hibernate a tool effortless to use? Throw Spring Data JPA in the mix. The list could go on…

Instead of configuring a Spring context, an application server or installing throwaway databases for testing, I now have much more time at my disposal for thinking about basic stuff: How to slice my architecture, naming things, discussing features and more.

Use a JUG to educate users about the tools that make a developer's life easier. Those tools may be advanced, but when their usage is basic knowledge, there is room for really fundamental improvement.

Good is good enough

Last but not least, don't forget that not everybody wants to be a rock star programmer who lives and breathes coding the way you do. More often people just want to get their task done and go home, to live a life outside the bubble. And that's fine too, and sometimes even healthier. There have been discussions about 10x programmers in other places, but if we were all 10x programmers, we would be back at square one. So don't make your JUG a place where only top notch, highly sophisticated stuff is presented and accepted. For example, presenting things an IDE can do which the presenter thinks are self evident may be totally new for many others. Why not spare them the hard road of finding the features?


29-Sep-16


Running Hibernate Search with Elasticsearch on Pivotal CF

This post has been featured on This Week in Spring – September 20, 2016 and on the Hibernate Community Newsletter 19/2016.

Two weeks ago, I wrote a post on how to use Hibernate Search with Spring Boot. The post got featured on the Hibernate community newsletter as well as on Thorben’s blog Thoughts on Java.

I ended the post saying that a downside for a fully cloud based application is the fact that the default index provider is directory based.

Well.

There’s a solution for that, too: In the upcoming Hibernate Search 5.6 there’s an integration with Elasticsearch.

I didn’t try this out with my Tweet Archive, but with the site of my JUG, which runs happily on Pivotal CF.

Goal

  • Use local, directory based Lucene index during development
  • Use the Elasticsearch integration when deployed to Pivotal CF (“The cloud”)

Steps taken

First of all, you have to add the dependencies

with hibernate-search.version being 5.6.0.Beta2 at the moment.

The annotations at entity level are exactly the same as in my previous post, but for your convenience, here’s the post entity, which I wanted to make searchable:

Again, I have configured a language discriminator at entity level with @AnalyzerDiscriminator(impl = PostLanguageDiscriminator.class), but we’ll come to this later.

To make Hibernate Search use the Elastic Search integration, you have to change the index manager. In a Spring Boot application this can be done by setting the following property:

spring.jpa.properties.hibernate.search.default.indexmanager = elasticsearch

And that’s exactly all there is to switching from a directory based, local Lucene index to Elasticsearch. If you have a local instance running, for example in Docker, everything works as before, the indexing as well as the querying.

The default host is http://127.0.0.1:9200, but we’re not gonna use that in the cloud. Pivotal IO offers Searchly at their marketplace, providing Elasticsearch. If you add this to your application, you’ll get the credentials via a URL. The endpoint can then be configured like this in Spring’s application.properties:

spring.jpa.properties.hibernate.search.default.elasticsearch.host = ${vcap.services.search.credentials.sslUri}

Here I am making use of the fact that environment variables are evaluated in properties. The vcap property is automatically added by the Pivotal infrastructure and contains the mentioned secure URL. And that’s it. I have added a simple search-by-keyword method to my Post repository, but I already covered that in my other post:

The actual frontend accessible through http://www.euregjug.eu/archive is nothing special, you can just browse the sources or drop me a line if you have any questions.

More interesting is the language discriminator for the posts. It looks like this:

It returns the name of the post’s language. Elasticsearch offers built-in language specific analyzers; “english” and “german” are both available.

What if I want to use a local index for testing and Elasticsearch only on deployment? I would have to define those analyzers in that profile. The right way to do it is a Hibernate @Factory like this:

and an application-default.properties containing

spring.jpa.properties.hibernate.search.model_mapping = eu.euregjug.site.config.DefaultSearchMapping

Recap

To use Hibernate Search with your JPA entities, basically follow the steps described here.

If you want to use named analyzers from Elasticsearch that aren’t available for local Lucene, add analyzers with the same name (and maybe a similar functionality as well) through a Hibernate @Factory and configure them in application-default.properties. While you’re at it, you may want to configure the index path to point to a directory which is excluded from your repo:

Relevant part of application-default.properties:

spring.jpa.properties.hibernate.search.default.indexBase = ${user.dir}/var/default/index/
spring.jpa.properties.hibernate.search.model_mapping = eu.euregjug.site.config.DefaultSearchMapping

In your prod properties, or in my case in application-cloud.properties, switch from the default index manager to “elasticsearch” and also configure the endpoint:

Relevant part of application-cloud.properties:

spring.jpa.properties.hibernate.search.default.indexmanager = elasticsearch
spring.jpa.properties.hibernate.search.default.elasticsearch.host = ${vcap.services.search.credentials.sslUri}
spring.jpa.properties.hibernate.search.default.elasticsearch.index_schema_management_strategy = MERGE
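To illustrate what “evaluated in properties” means for the `${vcap.services.search.credentials.sslUri}` value, here is a minimal sketch of placeholder substitution in plain Java. It is not Spring’s implementation (which is considerably more capable); the class name `PlaceholderSketch`, the `resolve` method and the example URL are made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, only meant to show the idea of ${...} substitution.
final class PlaceholderSketch {

    // Matches ${some.key} and captures the key between the braces.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces every ${key} in value with the matching entry of environment.
    static String resolve(String value, Map<String, String> environment) {
        Matcher matcher = PLACEHOLDER.matcher(value);
        StringBuffer resolved = new StringBuffer();
        while (matcher.find()) {
            String key = matcher.group(1);
            String replacement = environment.get(key);
            if (replacement == null) {
                throw new IllegalArgumentException("Unresolved placeholder: " + key);
            }
            // quoteReplacement keeps '$' and '\' in the replacement literal.
            matcher.appendReplacement(resolved, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(resolved);
        return resolved.toString();
    }

    public static void main(String[] args) {
        // Stand-in for what the platform would provide; the URL is invented.
        Map<String, String> environment = new HashMap<>();
        environment.put("vcap.services.search.credentials.sslUri",
                "https://search.example.invalid:443");

        System.out.println(resolve("${vcap.services.search.credentials.sslUri}", environment));
    }
}
```

The point of the indirection is that the same properties file works in every space the application is deployed to: the file names the key, the platform supplies the value.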

Happy searching and finding 🙂


20-Sep-16


NetBeans, Maven and Spring Boot… more fun together

At the 1st NetBeans Day Cologne I gave a talk about why I think that the combination of NetBeans, Maven and Spring Boot is more fun together.

Also speaking was Michael Müller, who talked about the upcoming support for Java 9’s JShell in NetBeans. I’m totally curious how that will work with my Spring projects; I imagine it will be really useful.

And of course Geertjan Wielenga from Oracle was there and spoke a little bit about NetBeans’ background and the upcoming features. His second talk was about OracleJET and how NetBeans supports that concept of enterprise JavaScript programming.

So, my talk: I have a fully working demo right here: github.com/michael-simons/NetBeansEveningCologne and if you walk through the slides and code, you’ll even find a coupon for my book.

That said, the demo is centered around a super simple REST application that registers people for the NetBeans day. If you are familiar with Spring Boot and Spring Data, the stuff probably isn’t too new, but you can still learn about the NB-SpringBoot plugin, which does a lot of the stuff in NetBeans that STS or IntelliJ do for Spring Boot.

The second part of my talk is about two great libraries, or rather Maven plugins, Project Lombok and JaCoCo:



Project Lombok wants to remove some of Java’s necessary boilerplate code, that is: You can replace getters and setters with annotations, as well as constructors, equals/hashCode and more. Lombok is an annotation processor working on your source code, and I always thought that I would have a hard time using it in a sane way in an IDE. That’s actually not the case; on the contrary, you have instant IDE support, as I tweeted before. People were surprised how easily Spring Boot can be used inside NetBeans, but the integration of Lombok and JaCoCo was really eye opening for some.
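To make the boilerplate argument concrete, here is the kind of code that has to be written by hand for a tiny immutable value class. The `Registration` class is hypothetical and not taken from the demo repository. With Lombok on the classpath, the constructor, the getters and equals/hashCode below could be generated from annotations such as `@Getter`, `@AllArgsConstructor` and `@EqualsAndHashCode` (or `@Value` for the whole set), leaving little more than the two field declarations.

```java
import java.util.Objects;

// Hypothetical example class: everything below the fields is boilerplate
// that Lombok annotations could generate instead.
final class Registration {

    private final String name;
    private final String email;

    Registration(String name, String email) {
        this.name = name;
        this.email = email;
    }

    public String getName() {
        return name;
    }

    public String getEmail() {
        return email;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Registration)) {
            return false;
        }
        Registration other = (Registration) o;
        return Objects.equals(name, other.name) && Objects.equals(email, other.email);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, email);
    }

    public static void main(String[] args) {
        Registration first = new Registration("Ada", "ada@example.invalid");
        Registration second = new Registration("Ada", "ada@example.invalid");
        // Two registrations with the same values are considered equal.
        System.out.println(first.equals(second));
    }
}
```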

I’m gonna try something new here and show you what I did as a little screencast. The video references the repository above and uses the commit fa22a87. It’s the first time I recorded something like this, so sorry if it’s a bit rough:

For anyone who doesn’t want to watch a video, Geertjan took some pictures:

After using NetBeans for 2 years now, following many years of Eclipse, I’m really convinced that it deserves a voice. It’s a great tool and most of the time, it just works. And the best part: It’s free and open source. I think the new title of Geertjan’s slides is even better than the jigsaws:

Ever seen kids playing with Lego? Sometimes the result doesn’t look as polished as the sets, but often it works equally well. NetBeans may not be as polished as other, much more expensive IDEs, but that actually doesn’t matter much to me.

For the evening a big thank you to Faktorzehn for providing a great place and great food and drinks, much appreciated.

If you have a JUG or a company who wants to learn more about that stuff, drop me a line, we probably can arrange something. I can give this talk in German as well as in English and can extend all topics, Spring Boot with NetBeans, Maven or Docker.


10-Sep-16


Hibernate Search and Spring Boot: Simple yet powerful archiving

This post has been featured in the Hibernate Community Newsletter 18/2016.

Before my summer holidays I mentioned my personal twitter archive on Twitter again….

This time, Vlad from Hibernate reacted on my tweet:

More reactions came from Sanne and Emmanuel and here we go:

Content

  1. Source
  2. Background
  3. Features
  4. Tools used
  5. Application
  6. Database schema
  7. The Tweet entity
  8. Storing new entities
  9. Querying entities
  10. Conclusion
  11. Try it out yourself

Source

The whole project, which has already grown into more than a tech demo, is on GitHub: michael-simons/tweetarchive.

What I skipped is a fancy GUI. So far, it only has a REST interface. But it can be run as a Docker image with local, persistent storage. Check it out, star it, maybe even add stuff to it… Feel free!

Background

I have been running my archive for several years now, from Daily Fratze. Daily Fratze contains a home grown crawler that checks my user time line and stores my tweets in a MySQL database. I’m using JPA with Hibernate as my database access tool, so Hibernate Search fits nicely and is really easy to implement. Hibernate Search is a super easy way to add an Apache Lucene full text index to your entities.

For large scale applications, Elasticsearch or similar may be more fitting, but I’m really content with my “small” (at the end of last year ~50 MB) search index and its performance. It doesn’t add much (if any) overhead to development or to production.

For the demo, I’ve taken my entities but not the parser. For parsing in the demo I use Twitter4J. Twitter4J is apparently not made for parsing static tweets, so there are some ugly constructs for getting a Twitter archive into the app, but that should not be the point here. The entities have been adapted and refreshed according to my current skills. Some things I created years ago should never see the light of day.

Features

  • I want to be able to search my tweets. With keywords and with full blown Lucene queries
  • The application should track new tweets
  • The original JSON content should be stored as well

Tools used

In order:

Application

The application is a standard Spring Boot application. It’s 2016, you should find several really good guides out there, and also on this blog, on how such an application is built.

I also assume that you have an idea what Apache Lucene is about.

Database schema

My migrations are inside src/main/resources/db/migration/ where Flyway automatically finds them. Flyway itself is recognized by Spring Boot if it is on the classpath.

I have this PostgreSQL cast

that allows me to store a Java String attribute inside a JSONB column without a bunch of custom converters and without explicitly casting it, but with type checks.

The table definition for tweets looks like this:

Nothing fancy here except the raw_data column, which contains the tweet’s original source. You can use PostgreSQL’s JSON operators to query it, if you like.

The Tweet entity

You’ll find the Tweet entity here: src/main/java/ac/simons/tweetarchive/tweets/TweetEntity.java. Basically, it is a standard JPA entity. I use Project Lombok to get rid of boilerplate code, so you’ll find no getters and setters.

For the following stuff, I assume you know JPA, because I’m not gonna cover that.

To make Hibernate Search aware of an entity that should be indexed, you have to annotate the entity:

That is already all there is!

Next step: Add a simple field, for example the screen name, just annotate it with @Field:

That actually reads: Index that field and store the value with the index so that it can be searched without hitting the database, but don’t do further analysis.

If you read through the entity, you’ll find several such fields.

Next: Analyzing fields. I want to search for similar words in the content of the tweet. While receiving the tweet, the application resolves URLs and stuff and replaces the short urls, see TweetStorageService.

The entity takes this one step further. The content field is annotated with:

Here the @Field annotation says: Index the content, don’t store it, but analyze it. It also says, through @AnalyzerDiscriminator, with which analyzer.

I have defined my analyzers right with the entity, but they can be defined elsewhere, on a package for example, too:

I have 3 analyzers in place: An English analyzer, which tokenizes the input, lower cases it and then does English word stemming. The same for German and, last but not least, an analyzer that just tokenizes and filters the content.
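What such an analysis chain does to a piece of text can be shown with a toy version in plain Java. This is not Lucene and the “stemming” is only a crude suffix stripper, nowhere near a real English or German stemmer; the class `ToyAnalyzer` and its rules are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy stand-in for an analyzer chain: tokenize, lower case, "stem".
final class ToyAnalyzer {

    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        // Tokenize: split on anything that is neither a letter nor a digit.
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue;
            }
            // Lower case filter.
            String token = raw.toLowerCase(Locale.ROOT);
            // Very naive stemming filter.
            terms.add(stripSuffix(token));
        }
        return terms;
    }

    // Crude suffix stripping; a real stemmer knows far more rules.
    static String stripSuffix(String token) {
        if (token.length() > 5 && token.endsWith("ing")) {
            return token.substring(0, token.length() - 3);
        }
        if (token.length() > 3 && token.endsWith("s")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    public static void main(String[] args) {
        // prints [search, tweet, find, thing]
        System.out.println(analyze("Searching Tweets, finding things"));
    }
}
```

Because the same chain is applied to the indexed text and to the query, a search for “tweet” also finds content that only contains “Tweets”.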

The analyzer itself can be dynamically inferred with a discriminator, which looks like this:

Read: If the language of the tweet is available and supported, use the fitting analyzer, otherwise use the default analyzer for undefined languages.
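The selection rule just described can be sketched in plain Java. This is not the discriminator from the repository: the class name, the set of supported languages and the analyzer names “en”, “de” and “und” are assumptions made for this example. In Hibernate Search 5 the equivalent logic would sit in an implementation of `org.hibernate.search.analyzer.Discriminator`, whose `getAnalyzerDefinitionName(Object value, Object entity, String field)` method returns the name of the analyzer definition to use.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of choosing an analyzer name by language.
final class LanguageAnalyzerChooser {

    // Assumed name of the fallback analyzer for undefined languages.
    static final String FALLBACK = "und";

    // Assumed analyzer names, one per supported language.
    private static final Set<String> SUPPORTED = new HashSet<>(Arrays.asList("en", "de"));

    // Returns the analyzer name for the given language code.
    static String analyzerNameFor(String language) {
        if (language == null) {
            return FALLBACK;
        }
        String normalized = language.trim().toLowerCase(Locale.ROOT);
        return SUPPORTED.contains(normalized) ? normalized : FALLBACK;
    }

    public static void main(String[] args) {
        System.out.println(analyzerNameFor("de"));
        System.out.println(analyzerNameFor("fr"));
        System.out.println(analyzerNameFor(null));
    }
}
```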

Hibernate Search allows spatial queries. You can annotate the whole class or an attribute, that returns Coordinates:

Also nested entities are supported. My example: The information regarding a reply. I have InReplyTo as an @Embeddable class and an attribute inReplyTo

This reads: Please index the embedded class, add a prefix “reply.” to all fields and otherwise, check for @Field annotations in the embedded class.

So far: Not much!

Storing new entities

If you use Spring Boot together with Hibernate and Spring Data JPA, you have nothing to take care of except configuring the database (and you can even skip this, if you use an in memory database).

This is all the configuration it takes, to get Hibernate Search up and running with that setup, if you add org.springframework.boot:spring-boot-starter-data-jpa, org.postgresql:postgresql and org.hibernate:hibernate-search-orm to the classpath:

spring.datasource.platform = postgresql
spring.datasource.driver-class-name = org.postgresql.Driver
spring.datasource.url = jdbc:postgresql://localhost:5432/tweetArchive
spring.datasource.username = tweetArchive
spring.datasource.password = tweetArchive
 
spring.jpa.hibernate.ddl-auto = validate
 
spring.jpa.properties.hibernate.search.default.directory_provider = filesystem
spring.jpa.properties.hibernate.search.default.indexBase = ${user.dir}/var/index/default

Just go ahead and define a repository for the TweetEntity:

This is an interface with no implementation in my application. It inherits from org.springframework.data.repository.Repository, so Spring Data already provides the means to access entities. I chose the simplest form of repository so that I don’t clutter my application with methods I wouldn’t need. If I had inherited from CrudRepository instead, I wouldn’t have to declare the save or delete methods myself.

Calling the save or delete method from my tweet storage service already updates my search index.

Querying entities

But take good note that this interface also inherits from TweetRepositoryExt. This is the way recommended by Spring Data JPA to add custom behavior. This interface defines two search methods which I actually have to implement. This is done in TweetRepositoryImpl and I’m gonna walk you through the search method:

First I retrieve a new FullTextEntityManager inside the declarative transaction and instantiate a query builder. The query builder exposes a nice, fluent interface to define my Lucene query. You’ll see how I add a keyword query on one specific field and also, if the user provided a date range, I add some range queries to an enclosing boolean condition.

The FullTextEntityManager is then used again to instantiate a JPA query from the full text query and retrieve the result.
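The shape of that query, one mandatory keyword clause plus range clauses that are only added when the user supplied a date, can be illustrated without Lucene. The sketch below uses plain `java.util.function.Predicate` composition over an in-memory list instead of the Hibernate Search query builder; the `Tweet` class, its fields and the `search` signature are hypothetical stand-ins and not the code from TweetRepositoryImpl.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical in-memory analogue of "keyword AND optional date range".
final class SearchSketch {

    static final class Tweet {
        final String content;
        final LocalDate createdAt;

        Tweet(String content, LocalDate createdAt) {
            this.content = content;
            this.createdAt = createdAt;
        }
    }

    static List<Tweet> search(List<Tweet> tweets, String keyword, LocalDate from, LocalDate to) {
        String needle = keyword.toLowerCase(Locale.ROOT);

        // Mandatory clause: the keyword has to occur in the content.
        Predicate<Tweet> query = t -> t.content.toLowerCase(Locale.ROOT).contains(needle);

        // Optional clauses: only added when a bound was provided.
        if (from != null) {
            query = query.and(t -> !t.createdAt.isBefore(from));
        }
        if (to != null) {
            query = query.and(t -> !t.createdAt.isAfter(to));
        }

        return tweets.stream().filter(query).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Tweet> tweets = Arrays.asList(
                new Tweet("Hibernate Search is fun", LocalDate.of(2016, 9, 6)),
                new Tweet("NetBeans Day Cologne", LocalDate.of(2016, 9, 10)),
                new Tweet("More Hibernate", LocalDate.of(2015, 1, 1)));

        System.out.println(search(tweets, "hibernate", LocalDate.of(2016, 1, 1), null).size());
    }
}
```

A real full text query differs in one important way: the keyword clause runs against the analyzed index rather than doing a substring match, which is what makes stemming and language specific analysis pay off.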

And that’s all there is: I can use (and hide!) the full text queries inside the same repositories I would use elsewhere.

Conclusion

If you are already using Hibernate as your ORM, have embraced Spring Data repositories and need to search some entities, then Hibernate Search may be the right approach for your project. It’s really easy to implement and also easy to use. One downside for a 12 factor app could be the fact that the index is directory based in the default setting. You can work around it, though, by using JMS or JGroups.

I have been using Hibernate Search for quite a while now on Daily Fratze and on several other internal projects as well, and for my, or rather our, purposes it has been enough.

Try it out yourself

There’s much more to learn in the demo application. Go to michael-simons/tweetarchive and see for yourself. There’s an extensive README, that should guide you through running the application yourself. The easiest way is to use a local Docker based instance.

If you like it, follow me on Twitter, I am @rotnroll666, leave a comment or a star.


06-Sep-16