Tuesday, 20 December 2016

Spring Boot Starters

Ho, ho, hooo! It looks like all members of Infinispan Community have been nice and Santa brought you Spring Boot Starters!


This will make you even more productive and your code less verbose!

Why do I need starters?


Spring Boot Starters make the bootstrapping process much easier and faster. The starter brings you required Maven dependencies as well as some predefined configuration bits.

What do I need to get started?


The starter can operate in two modes: client/server (when you connect to a remote Infinispan Server cluster) and embedded (packaged along with your app). The former is the default. It's also possible to use both those modes at the same time (store some data along with your app and connect to a remote Infinispan Server cluster to perform some other type of operations).

Assuming you have an Infinispan Server running on IP address 192.168.0.17, all you need to do is to use the following dependencies:


By default, the starter will try to locate hotrod-client.properties file. The file should contain at least the server list:
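As a rough illustration, such a file could hold a single `infinispan.client.hotrod.server_list` entry pointing at the server mentioned above. The sketch below loads that content with plain `java.util.Properties` and splits it into host:port entries. The port 11222 (the usual Hot Rod default) is an assumption here, and the `HotRodPropertiesSketch` class is purely illustrative and not part of the starter.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class HotRodPropertiesSketch {

    // Example content of hotrod-client.properties (the port is an assumption).
    static final String FILE_CONTENT =
            "infinispan.client.hotrod.server_list=192.168.0.17:11222\n";

    // Reads the server list property and splits it on ';' into host:port entries.
    static List<String> serverList(String content) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String raw = props.getProperty("infinispan.client.hotrod.server_list", "");
        List<String> servers = new ArrayList<>();
        for (String entry : raw.split(";")) {
            if (!entry.trim().isEmpty()) {
                servers.add(entry.trim());
            }
        }
        return servers;
    }

    public static void main(String[] args) {
        // Prints the parsed server entries.
        System.out.println(serverList(FILE_CONTENT));
    }
}
```

Several servers can be listed in the same property, separated by semicolons.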


It is also possible to create RemoteCacheManager's configuration manually:


That's it! Your app should successfully connect to a remote cluster and you should be able to inject RemoteCacheManager.

Using Infinispan embedded is even simpler than that. All you need to do is add one more dependency to the classpath:


The starter will provide you with a preconfigured EmbeddedCacheManager. In order to customize the configuration, use the following code snippet:

Further reading


There are two links I highly recommend reading. The first is the Spring Boot tutorial and the second is the GitHub page of the Starters project.


Kudos


Special thanks go to Marco Yuen, who donated the Spring Boot Starters code to us; Tomasz Zabłocki, who updated it to the current version; and Stéphane Nicoll, who spent a tremendous amount of time reviewing the Starters.

Monday, 19 December 2016

Data Container Changes Part 1

Infinispan 9.0 Beta 1 introduces some big changes to the Infinispan data container.  This is the first of two blog posts detailing those changes.

This post covers the changes to eviction, which now utilizes a new provider, Caffeine.  As you may already know, Infinispan has long shipped its own implementations of the LRU (Least Recently Used) and LIRS (Low Inter-reference Recency Set) algorithms for bounded caches.

Our implementations of eviction were even rewritten for Infinispan 8, but we found we still had some issues or limitations with them, especially LIRS.  Our old implementation had some problems with keeping the correct number of entries.  The new implementation, while not having that issue, had others, such as being considerably more complex.  And while it implemented the entire LIRS specification, it could have memory usage issues.  This led us to look at alternatives, and Caffeine seemed like a logical fit: it is well maintained and its author, Ben Manes, is quite responsive.

Enter Caffeine


Caffeine doesn't utilize LRU or LIRS for its eviction algorithm and instead implements TinyLFU with an admission window.  This has the benefit of a high hit ratio like LIRS, while also requiring low memory overhead like LRU.  Caffeine also provides custom weighting for objects, which allows us to reuse the code that was developed for MEMORY based eviction as well.

The only thing that Caffeine doesn't support is our idea of a custom Equivalence.  Thus Infinispan now wraps byte[] instances to ensure equals and hashCode methods work properly.  This also gives us a good opportunity to reevaluate the dataContainer configuration element.
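To see why the wrapping is needed, recall that Java arrays inherit identity-based equals and hashCode from Object, so two byte[] instances with the same content are different map keys. The sketch below shows the general idea with a hypothetical `WrappedBytes` class; it is not Infinispan's actual wrapper, only a minimal illustration of content-based equality for byte arrays.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class WrappedBytesSketch {

    // Hypothetical wrapper: gives a byte[] content-based equals and hashCode.
    static final class WrappedBytes {
        private final byte[] bytes;

        WrappedBytes(byte[] bytes) {
            this.bytes = bytes.clone(); // defensive copy keeps the key stable
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof WrappedBytes
                    && Arrays.equals(bytes, ((WrappedBytes) o).bytes);
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(bytes);
        }
    }

    // Raw arrays compare by identity, so an equal-content key is not found.
    static boolean rawLookupSucceeds() {
        Map<byte[], String> map = new HashMap<>();
        map.put(new byte[] {1, 2, 3}, "value");
        return map.containsKey(new byte[] {1, 2, 3});
    }

    // Wrapped arrays compare by content, so the lookup succeeds.
    static boolean wrappedLookupSucceeds() {
        Map<WrappedBytes, String> map = new HashMap<>();
        map.put(new WrappedBytes(new byte[] {1, 2, 3}), "value");
        return map.containsKey(new WrappedBytes(new byte[] {1, 2, 3}));
    }

    public static void main(String[] args) {
        System.out.println("raw lookup: " + rawLookupSucceeds());
        System.out.println("wrapped lookup: " + wrappedLookupSucceeds());
    }
}
```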

Deprecations


The data container configuration has thus been deprecated and is now replaced by a new configuration element named memory.   Also since we are adding a new element the eviction configuration could also be consolidated into memory, and thus eviction is also deprecated.  And last but not least the storeAsBinary configuration element has also been integrated into the new memory configuration element.  Now we have 1 configuration element instead of 3, can't beat that!

New Configuration


The new memory configuration will start out pretty simple and new elements can be added as there is a need.  The memory element is composed of a single sub-element, which can be one of three choices.  For this post we will go over two of the sub-elements: OBJECT and BINARY.

OBJECT


Object storage stores the actual objects as provided from the user in the Java Heap.  This is the default storage method when no memory configuration is provided.  This method will provide the best performance when using operations that operate upon the entire data set, such as distributed streams, indexing and local reads etc.

Unfortunately OBJECT storage only allows for COUNT based eviction, as we cannot properly estimate the size of user object types.  This could be improved in a future version if there is enough interest.  Note that you can technically configure the MEMORY eviction type with the OBJECT storage type with declarative configuration, but it will throw an exception when you build the configuration.  Therefore OBJECT only has a single element named size to determine the number of entries that can be stored in the cache.
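To make the idea of a COUNT bound concrete, here is a minimal sketch of a map that never holds more than a fixed number of entries. Note the assumptions: it uses a plain `LinkedHashMap` with LRU ordering, which is neither Caffeine's TinyLFU policy nor Infinispan's data container, and the `CountBoundedSketch` name is made up for this illustration. It only shows what "size" means for a count-bounded cache.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CountBoundedSketch {

    // Returns a map that keeps at most maxEntries entries and evicts the
    // least recently accessed entry once that bound is exceeded.
    static <K, V> Map<K, V> bounded(final int maxEntries) {
        // accessOrder = true makes iteration order follow recency of access
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            private static final long serialVersionUID = 1L;

            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Fills a cache bounded to 2 entries with 3 entries.
    static Map<String, String> demo() {
        Map<String, String> cache = bounded(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // touch "a" so "b" becomes the eldest entry
        cache.put("c", "3"); // exceeds the bound, so "b" is evicted
        return cache;
    }

    public static void main(String[] args) {
        System.out.println(demo().keySet());
    }
}
```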

An example of how Object storage can be configured:

XML

DECLARATIVE


BINARY


Binary storage stores the object in its serialized form in a byte array.  This has an interesting side effect: objects are always stored as a deep copy.  This can be useful if you want to modify an object after retrieving it without affecting the underlying cache stored object.  Since objects have to be deserialized when performing operations on them, some things such as distributed streams and local gets will be a little bit slower.

A nice benefit of storing entries as BINARY is that we can estimate the total on heap size of the object.  Thus BINARY supports both COUNT and MEMORY based eviction types.
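The deep-copy side effect and the size estimate can both be seen in a small sketch. It assumes plain Java serialization purely for illustration (Infinispan uses its own marshaller), and `BinaryStorageSketch` is a made-up name. The point is only that a value kept as bytes is isolated from later mutations of retrieved copies, and that the length of the byte array gives a natural size measure.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BinaryStorageSketch {

    // Serializes a value into the byte array that would be stored.
    static byte[] toBytes(Serializable value) {
        try (ByteArrayOutputStream bos = new ByteArrayOutputStream();
             ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(value);
            oos.flush();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deserializes the stored bytes; every call yields a fresh copy.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Mutates a retrieved copy and reports the list size seen by the next read.
    @SuppressWarnings("unchecked")
    static int sizeSeenByNextReader() {
        byte[] stored = toBytes(new ArrayList<>(Arrays.asList("a", "b")));
        List<String> copy = (List<String>) fromBytes(stored);
        copy.add("c"); // changes only the retrieved copy, not the stored bytes
        List<String> again = (List<String>) fromBytes(stored);
        return again.size();
    }

    public static void main(String[] args) {
        System.out.println("entries seen by next reader: " + sizeSeenByNextReader());
        System.out.println("stored bytes: " + toBytes("hello").length);
    }
}
```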

An example of how Binary storage can be configured:

XML

DECLARATIVE


OFF-HEAP


This option will be described in more detail in the next blog post.  Stay tuned!

Conclusion


Caffeine should bring us a great solution, while also greatly reducing our own maintenance burden.  The new memory configuration also provides a simpler solution by removing two other configuration elements.

We hope you enjoy the new changes to the data container and look out for another blog post coming soon to detail the other new changes to the data container!  In the meantime please check out our latest Infinispan 9.0 before it goes final and give us any feedback on IRC or JIRA

Thursday, 15 December 2016

Thanks Soft-Shake and Devoxx Morocco!

Last month I presented about building functional reactive applications with Infinispan, Node.js and Elm at both Soft-Shake in Geneva (slides) and Devoxx Morocco (slides).

Thanks a lot to all the participants who attended the talks and thanks also to the organisers for accepting my talk. Both conferences were really enjoyable!

At Soft-Shake I managed to attend a few presentations, and the one that really stuck with me was the one from Alexandre Masselot on "Données CFF en temps réel: tribulations techniques dans la stack Big Data" (slides). It was a very interesting use case on doing big data with the information from the Swiss Rail system. Although there was no live demo, Alexandre gave the link to a repo where you can run stuff yourself. Very cool!

On top of that, I also attended a talk by Tom Bujok on Scaling Your Application Out. Tom happens to be an old friend who since I last met him has joined Hazelcast ;)



Shortly after Soft-Shake I headed to Casablanca to speak at Devoxx Morocco. This was a fantastic conference with a lot of young attendees. The room was almost packed for my talk and I got a good reaction from the audience on both the talk and the live demo.

During the conference I also attended other talks, including a couple of Kubernetes talks by Ray Tsang, who is an Infinispan committer himself. In his presentations he uses a Kubernetes visualizer which is very cool and I'm hoping to use it in future presentations :)

No more conferences for this year, thanks to all who've attended Infinispan presentations throughout the year!

Cheers,
Galder

Thursday, 8 December 2016

Meet Ickle!


As you’ve already learned from an earlier post this week, Infinispan 9 is on its final approach to landing and is bringing a new query language. Hurray! But wait, was there something wrong with the old one(s)? Not wrong really ...  I’ll explain.

Infinispan is a data grid of several query languages. Historically, it has offered search support early in its existence by integrating with Hibernate Search which provides a powerful Java-based DSL enabling you to build Lucene queries and run them on top of your Java domain model living in the data grid. Usage of this integration is confined to embedded mode, but that still succeeds in making Java users happy.

While the Hibernate Search combination is neat and very appealing to Java users it completely leaves non-JVM languages accessing Infinispan via remote protocols out in the cold.

Enter Remote Query. Infinispan 6.0 starts to address the need of searching the grid remotely via Hot Rod. The internals are still built on top of Lucene and Hibernate Search bedrock but these technologies are now hidden behind a new query API, the QueryBuilder, an internal DSL resembling JPA criteria query. The QueryBuilder has implementations for both embedded mode and Hot Rod. This new API provides all relational operators you can think of, but no full-text search initially; we planned to add that later.

Creating a new internal DSL was fun. However, having a long term strategy for evolving it while keeping complete backward compatibility and also doing so uniformly across implementations in multiple languages proved to be a difficult challenge. So while we were contemplating adding new full-text operators to this DSL we decided to make a long leap forward and adopt a more flexible alternative by having our own string based query language instead, another DSL really, albeit an external one this time.

So after the long ado, let me introduce Ickle, Infinispan’s new query language, conspicuously resembling JP-QL.

Ickle:
  • is a light and small subset of JP-QL, hence the lovely name
  • queries Java classes and supports Protocol Buffers too
  • queries can target a single entity type
  • queries can filter on properties of embedded objects too, including collections
  • supports projections, aggregations, sorting, named parameters
  • supports indexed and non-indexed execution
  • supports complex boolean expressions
  • does not support computations in expressions (eg. user.age > sqrt(user.shoeSize + 3) is not allowed but user.age >= 18 is fine)
  • does not support joins
    • but, navigations along embedded entities are implicit joins and are allowed
    • joining on embedded collections is allowed
    • other join types not supported
  • subqueries are not supported
  • besides the normal relational operators it offers full-text operators, similar to Lucene’s  query parser
  • is now supported across various Infinispan APIs, wherever a Query produced by the QueryBuilder is accepted (even for continuous queries or in event filters for listeners!)

That is to say we squeezed JP-QL to the bare minimum and added full-text predicates that closely follow the syntax of Lucene’s query parser.

If you are familiar with JPA/JP-QL then the following example will speak for itself:

select accountId, sum(amount) from com.acme.Transaction
    where amount < 20.0
    group by accountId
    having sum(amount) > 1000.0
    order by accountId

The same query can be written using the QueryBuilder:

// queryFactory is obtained via Search.getQueryFactory(cache)
Query query = queryFactory.from(Transaction.class)
    .select(Expression.property("accountId"), Expression.sum("amount"))
    .having("amount").lt(20.0)                    // where amount < 20.0
    .groupBy("accountId")
    .having(Expression.sum("amount")).gt(1000.0)  // having sum(amount) > 1000.0
    .orderBy("accountId")
    .build();

Both examples look nice but I hope you will agree the first one is better.

Ickle supports several new predicates for full-text matching that the QueryBuilder is missing. These predicates use the : operator that you are probably familiar with from Lucene’s own query language.  This example demonstrates a simple full-text term query:

select transactionId, amount, description from com.acme.Transaction
where amount > 10 and description : "coffee"

As you can see, relational predicates and full-text predicates can be combined with boolean operators at will.

The only important thing to remark here is that relational predicates are applicable to non-analyzed fields while full-text predicates can be applied to analyzed fields only. How does indexing work, what is analysis and how do I turn it on/off for my fields? That’s the topic of a future post, so please be patient or start reading here.

Besides term queries we support several more:
  • Term                     description : "coffee"
  • Fuzzy                    description : "cofee"~2
  • Range                   amount : [40 to 90}
  • Phrase                  description : "hello world"
  • Proximity               description : "canceling fee"~3
  • Wildcard                description : "te?t"
  • Regexp                 description : /[mb]oat/
  • Boosting                description : "beer"^3 and description : "books"
You can read all about them starting from here.

But is Ickle really new? Not really. The name is new, the full-text features are new, but a JP-QL-ish query string was always internally present in the Query objects produced by the QueryBuilder since the beginning of Remote Query. That language was never exposed and specified until now. It evolved significantly over time and now it is ready for you to use it. The QueryBuilder / criteria-like API is still there as a convenience but it might go out of favor over time. It will be limited to non-full-text functionality only. As Ickle grows we’ll probably not be able to include some of the additions in the QueryBuilder in a backward compatible manner. If growing will cause too much pain we might consider deprecating it in favor of Ickle or if there is serious demand for it we might continue to evolve the QueryBuilder in a non compatible manner.

Being a string based query language, Ickle is very convenient for our REST endpoint, the CLI, and the administration console allowing you to quickly inspect the contents of the grid. You’ll be able to use it there pretty soon. We’ll also continue to expand Ickle with more advanced full-text features like spatial queries and faceting, but that’s a subject for another major version. Until then, why not grab the current 9.0 Beta1 and test drive the new query language yourself? We’d love to hear your feedback on the forum, on our issue tracker or on IRC on the #infinispan channel on Freenode.

Happy coding!

Tuesday, 6 December 2016

Distributed Stream Quality of Life Improvements

As I hope most people reading this already know, since Infinispan 8 you can utilize the entire Java 8 Stream API and have it be distributed across your cluster.  This performs the various intermediate and terminal operations on the data local to the node it lives on, providing for extreme performance.  There are some limitations and things to know as was explained at distributed-streams.

The problem with the API up to now was that, if you wanted to use lambdas, it was quite an ugly scene.  Take for example the following code snippet:

8.0 Distributed Streams Example

However, for Infinispan 9 we utilize a little syntax feature added with Java 8 [1] to add some much needed quality of life improvements.  This allows the most specific interface to be chosen when a method is overloaded.  This makes for a neat interaction when we add some new interfaces that extend both Serializable and the various functional interfaces (SerializableFunction, SerializablePredicate, SerializableSupplier, etc).  All of the Stream methods have been overloaded on the CacheStream interface to take these arguments.
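The overload trick can be tried out without Infinispan at all. In the sketch below, `SerializableFunction` is a hypothetical stand-in declared locally (not the Infinispan interface), and `describe` is a made-up overloaded method. A plain lambda is resolved to the more specific serializable overload, whereas an explicitly cast lambda, in the style needed before these interfaces existed, only matches the plain `Function` overload.

```java
import java.io.Serializable;
import java.util.function.Function;

public class SerializableLambdaSketch {

    // Hypothetical stand-in: a Function that is also Serializable.
    interface SerializableFunction<T, R> extends Function<T, R>, Serializable {
    }

    // Overload taking the general functional interface.
    static String describe(Function<Integer, Integer> f) {
        return "plain Function, serializable=" + (f instanceof Serializable);
    }

    // Overload taking the more specific, serializable interface.
    static String describe(SerializableFunction<Integer, Integer> f) {
        return "SerializableFunction, serializable=" + (f instanceof Serializable);
    }

    // A bare lambda: the compiler picks the most specific applicable overload.
    static String withLambda() {
        return describe(x -> x + 1);
    }

    // The older style: an intersection cast makes the lambda serializable,
    // but its type only matches the plain Function overload.
    static String withCast() {
        return describe((Function<Integer, Integer> & Serializable) x -> x + 1);
    }

    public static void main(String[] args) {
        System.out.println(withLambda());
        System.out.println(withCast());
    }
}
```

In both cases the resulting lambda object is serializable; the difference is that the first form needs no cast at the call site.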

This allows for the code to be much cleaner as we can see here:

9.0 Distributed Streams Example

Extra Methods

This is not the only benefit of providing the CacheStream interface: we can also provide new methods that aren't available on the standard Stream interface.  One example is the forEach method which allows the user to more easily provide a Cache that is injected on each node as required.  This way you don't have to use the clumsy CacheAware interface and can directly use lambdas as desired.

Here is an example of the new forEach method in action:

In this example we take a cache and, based on the keys in it, write those values into another cache. Since forEach doesn't have to be side effect free, you can do whatever you want inside here.

All in all these improvements should make using Distributed Streams with Infinispan much easier.  The extra methods could be extended further if users have use cases they would love to suggest.  Just let us know, and I hope you enjoy using Infinispan!

Monday, 5 December 2016

Infinispan 9.0.0.Beta1 "Ruppaner"


It took us quite a bit to get here, but we're finally ready to announce Infinispan 9.0.0.Beta1, which comes loaded with a ton of goodies.

  • Performance improvements
    • JGroups 4
    • A new algorithm for non-transactional writes (aka the Triangle) which reduces the number of RPCs required when performing writes 
    • A new, faster internal marshaller which produces smaller payloads. 
    • A new asynchronous interceptor core
  • Off-Heap support
    • Avoid the size of the data in the caches affecting your GC times
  • CaffeineMap-based bounded data container
    • Superior performance
    • More reliable eviction
  • Ickle, Infinispan's new query language
    • A limited yet powerful subset of JPQL
    • Supports full-text predicates
  • The Server Admin console now supports both Standalone and Domain modes
  • Pluggable marshallers for Kryo and ProtoStuff
  • The LevelDB cache store has been replaced with the better-maintained and faster RocksDB 
  • Spring Session support
  • Upgraded Spring to 4.3.4.RELEASE
We will be blogging about the above in detail over the coming weeks, including benchmarks and tutorials.
The following improvements were also present in our previous Alpha releases:
  • Graceful clustered shutdown / restart with persistent state
  • Support for streaming values over Hot Rod, useful when you are dealing with very large entries
  • Cloud and Containers
    • Out-of-the box support for Kubernetes discovery
  • Cache store improvements
    • The JDBC cache store now uses transactions and upserts. Also the internal connection pool is now based on HikariCP

Also, our documentation has received a big overhaul and we believe it is vastly better than before.

There will be one more Beta including further performance improvements as well as additional features, so stay tuned.
Infinispan 9 is codenamed "Ruppaner" in honor of the Konstanz brewery, since many of the improvements of this release have been brewed on the shores of the Bodensee!

Prost!

Composing the Infinispan Docker image


In the previous post we showed how to manipulate the Infinispan Docker container configuration at both runtime and boot time.

Before diving into multi-host Docker usage, in this post we'll explore how to create multi-container Docker applications involving Infinispan with the help of Docker Compose.

For this we'll look at a typical scenario of an Infinispan server backed by an Oracle database as a cache store.

All the code for this sample can be found on github.

 

Infinispan with Oracle JDBC cache store

 


In order to have a cache with persistence with Oracle, we need to do some configuration: configure the driver in the server, create the data source associated with the driver, and configure the cache itself with JDBC persistence.

Let's take a look at each of those steps:

Obtaining and configuring the driver

The driver (ojdbc6.jar) should be downloaded and placed in the 'driver' folder of the sample project.

The module.xml declaration used to make it available on the server is as follows:


Configuring the Data source

The data source is configured in the "datasource" element of the server configuration file as shown below:


and inside the "datasource/drivers" element, we need to declare the driver:


Creating the cache

The last piece is to define a cache with the proper JDBC Store:


Putting all together

At this point, without Docker, we would have to download and install Oracle following the specific instructions for our OS, then download the Infinispan Server, edit the configuration files, copy over the driver jar, and figure out how to launch the database and server, taking care not to have any port conflicts.

If that sounds like too much work, it's because it really is. Wouldn't it be nice to have all these wired together and launched with a simple command line? Let's take a look at the Docker way next.


 

Enter Docker Compose


Docker Compose is a tool in the Docker stack that facilitates the configuration, execution and management of related Docker containers.

By describing the application aspects in a single yaml file, it allows centralized control of the containers, including custom configuration and parameters, and it also allows runtime interactions with each of the exposed services.

Composing Infinispan

Our Docker Compose file to assemble the application is given below:

It contains two services:
  • one called oracle that uses the wnameless/oracle-xe-11g Docker image, with an environment variable to allow remote connections.
  •  another one called infinispan that uses version 8.2.5.Final of the Infinispan Server image. It is launched with a custom command pointing to the changed configuration file and it also mounts two volumes in the container: one for the driver and its module.xml and another for the folder holding our server xml configuration.

Launching

To start the application, just execute


To inspect the status of the containers:


To follow the Infinispan server logs, use:


Infinispan usually starts faster than the database, and since the server waits until the database is ready (more on that later), keep an eye on the log output for "Infinispan Server 8.2.5.Final (WildFly Core 2.0.10.Final) started". After that, both Infinispan and Oracle are properly initialized.

Testing it

Let's insert a value using Infinispan's REST endpoint and verify it was saved to the Oracle database:


To check the Oracle database, we can attach to the container and use SQL*Plus:


Other operations


It's also possible to increase and decrease the number of containers for each of the services:



A thing or two about startup order

 

When dealing with dependent containers in Docker based environments, it's highly recommended to make connection handling between the parties robust enough that a dependency which is not yet fully initialized doesn't cause the whole application to fail at startup.

Although Compose does have a depends_on instruction, it simply starts the containers in the declared order; it has no means to detect when a certain container is fully initialized and ready to serve requests before launching a dependent one.

One may be tempted to simply write some glue script to detect if a certain port is open, but that does not work in practice: the network socket may be opened, but the background service could still be in transient initialization state.

The recommended solution for this is to make whoever depends on a service retry periodically until the dependency is ready. In the Infinispan + Oracle case, we specifically configured the data source with retries to avoid failing at once if the database is not ready:

When starting the application via Compose you'll notice that Infinispan prints some WARN messages with connection exceptions until Oracle is available: don't panic, this is expected!
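The same retry-until-ready idea applies to any client of a slow-starting dependency. The following is a minimal, generic sketch in plain Java, not the data source mechanism used by the server; `RetrySketch`, `retry` and the simulated dependency are all hypothetical names introduced for illustration.

```java
import java.util.concurrent.Callable;

public class RetrySketch {

    // Calls the action until it succeeds or maxAttempts is reached,
    // sleeping delayMillis between failed attempts.
    static <T> T retry(Callable<T> action, int maxAttempts, long delayMillis) {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e; // dependency not ready yet, remember why
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting", ie);
                }
            }
        }
        throw new IllegalStateException(
                "Dependency not ready after " + maxAttempts + " attempts", last);
    }

    // Simulates a dependency that only becomes available on the third call.
    static int attemptsUntilReady() {
        int[] calls = {0};
        retry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("not ready");
            }
            return "connected";
        }, 5, 10L);
        return calls[0];
    }

    public static void main(String[] args) {
        System.out.println("attempts needed: " + attemptsUntilReady());
    }
}
```

A fixed delay keeps the sketch short; a real client might prefer a growing delay between attempts.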


Conclusion


Docker Compose is a powerful and easy to use tool to launch applications involving multiple containers: in this post it allowed us to start Infinispan plus Oracle with custom configurations with a single command.
It's also a handy tool to have during the development and testing phases of a project, especially when using/evaluating Infinispan with its many possible integrations.

Be sure to check other examples of using Docker Compose involving Infinispan: the Infinispan+Spark Twitter demo, and the Infinispan+Apache Flink demo.