Monday, 31 August 2015

Infinispan 8.0.0.Final

Dear all,

it is with the greatest pleasure that we announce the first stable release of Infinispan 8.

 The number "8" is quite special for Infinispan for two reasons:
  • it has been embedded in our logo, disguised as the infinity symbol, since the very beginning
  • it marks the move of the Infinispan code-base to Java 8
So without further ado, let's see what Infinispan 8 brings to the table:

  • A new functional-style API for interacting with caches which takes advantage of all the language goodies introduced by Java 8, such as lambdas, Optional, CompletableFuture, etc. We have already started a blog series describing the API and the reasoning behind it, and we want your opinion too.
  • Support for Java 8's Streams API which, in the context of Infinispan, becomes fully distributed: parallel streams become truly parallel!
  • Indexing and querying received a host of new features: Continuous querying, grouping and aggregation, simultaneous querying on both indexed and non-indexed fields.
  • Expired entries now trigger events, thus allowing your applications to perform operations like refresh from an external datasource, archiving, etc.
  • Eviction is now memory size-aware, so you can define the maximum amount of memory you want a cache to grow to, before entries are removed or passivated to an external store. 
  • Infinispan Server now fully supports domain mode, and that is now the recommended way for clustered operations.
  • We have a new management console for Infinispan Server which will greatly simplify configuration and monitoring without requiring an external console. This is evolving rapidly and we will be adding quite a lot of functionality in the following months.
  • We are working hard to reduce the amount of resources used by Infinispan, removing many internal locks and increasing concurrency. This work is ongoing and you will see further improvements during the 8.x series.
  • We now provide integrations with Spark and Hadoop so that you can use all of the wonderful processing tools from those ecosystems against data stored in Infinispan.
  • Both the declarative and programmatic configuration APIs have been enhanced to support templates and configuration inheritance. It should make your life easier when you need many caches configured in the same way.
  • A cache store for Redis implemented by Simon Paulger. Thanks for the contribution!
  • Lots more... Look at the resolved issues to find out what else we've fixed.
We have also made some significant changes to our website:
  • clearer layout
  • brand new use-case-driven examples
  • the download page for cache stores now clearly separates core cache stores (i.e. the ones included in the Infinispan release), extra stores (i.e. ones that need to be downloaded separately) and the version compatibility
  • project references, i.e. other open-source projects which use Infinispan

The 8.0 release marks only the beginning of a number of exciting things we will be working on, so please check out our roadmap.

An important note: we will be maintaining the 7.2 branch of Infinispan for quite a while, so, if you are still stuck with Java 7, you shouldn't worry about upgrading just yet... unless you want to use the new stuff :)

Saturday, 22 August 2015

Infinispan 8.0.0.CR1 is out!

The first release candidate of Infinispan 8 is out, bringing two much-anticipated features: Continuous Queries (ISPN-5417) and notifications for expired cache entries (ISPN-694).

Apart from that, the Functional Java 8 API had some refinements, and in the server land it's now possible to configure thread-pools along with the cache container (ISPN-5687).

Last but not least, lots of bugs were fixed: please refer to the release notes for full details and visit our downloads page to get this release.

We'd love to get your feedback, so please contact us on our IRC channel #infinispan, twitter @infinispan, mailing list or user forums.


Friday, 21 August 2015

New Functional Map API in Infinispan 8 - Introduction

In Infinispan 8.0.0.Beta3, we have introduced a new experimental API for interacting with your data which takes advantage of the functional programming additions and improved asynchronous programming capabilities available in Java 8.

Over the next few weeks we'll be introducing different aspects of the API. In this first blog post, we'll focus on why we felt there's a need for a new approach, answering a few key questions.

ConcurrentMap and JCache

Map-like key/value pair APIs have often been used for distributed caching and in-memory data grids. Initially, ConcurrentMap became popular, but it was designed to run within a single JVM, and hence some of its operations suffered in distributed environments or when persistence stores were attached. For example, methods such as 'V put(K, V)', 'V putIfAbsent(K, V)' and 'V replace(K, V)' force implementations to return the previous value. Often this value is not needed, yet it can be expensive to transfer.

JSR-107 set out to improve on this and came up with the JCache specification, which solved this particular problem by separating operations such as ConcurrentMap's 'V put(K, V)' into two operations: 'void put(K, V)' and 'V getAndPut(K, V)'. It applied the same logic to other operations such as 'replace', by providing an alternative 'getAndReplace(K, V)', etc.

However, even though JCache was designed with distributed caching in mind, it still failed to provide an API to execute operations asynchronously and hence avoid resource under-utilization by having threads waiting for remote operations to complete. 'loadAll' is probably the only exception, and it would have been the perfect candidate to return a Future or similar construct, but having to pass in a completion listener feels a bit clunky and cannot be chained easily.

In my opinion, the best parts of JCache are the 'invoke' and 'invokeAll' methods. When you look at them, you see a lot of potential to reimplement get, put, getAndPut, getAndReplace, putAll, getAll, and many others using these methods. In other words, as an implementer, all you should need to implement is those two functions, and the rest would be syntactic sugar for the user. Unfortunately, the way 'invoke' and 'invokeAll' handle arguments is a bit clunky, and really, it's just screaming for lambdas to be passed in and CompletableFuture instances to be returned (Java 8!).

So, when Infinispan moved to Java 8, we decided to revisit these concepts and see if we could come up with a better, distilled map-like interface to be used as either a caching or data grid API.

New Functional Map API

Infinispan's Functional Map API is a distilled map-like asynchronous API which uses lambdas to interact with data.

Asynchronous and Lazy

Being an asynchronous API, all methods that return a single result return a CompletableFuture which wraps the result. This lets you use the resources of your system more efficiently: you can receive callbacks when the CompletableFuture has completed, or you can chain or compose it with other CompletableFutures. If you do want to block the thread and wait for the result, just as it happens with a ConcurrentMap or JCache method call, you can simply call `CompletableFuture.get()` (for such situations, we are working on finding ways to avoid unnecessary thread creation when the caller will block on the CompletableFuture).
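
To make the callback, chaining and blocking styles concrete, here is a minimal sketch that uses only the JDK's CompletableFuture. The `fetch` method is a made-up stand-in for a remote read and is not part of Infinispan; the class and method names are assumptions for illustration only.

```java
import java.util.concurrent.CompletableFuture;

final class AsyncChainingSketch {

    // Hypothetical stand-in for a remote read: completes asynchronously with a value.
    static CompletableFuture<String> fetch(String key) {
        return CompletableFuture.supplyAsync(() -> "value-of-" + key);
    }

    // Composes two asynchronous reads without blocking any caller thread.
    static CompletableFuture<Integer> combinedLength(String key1, String key2) {
        return fetch(key1).thenCompose(v1 ->
                fetch(key2).thenApply(v2 -> v1.length() + v2.length()));
    }

    public static void main(String[] args) throws Exception {
        // Callback style: the lambda runs once the result is available.
        CompletableFuture<Void> printed =
                fetch("k1").thenAccept(v -> System.out.println("callback got " + v));

        // Blocking style: wait for the result, as a ConcurrentMap or JCache call would.
        int total = combinedLength("k1", "k2").get();
        System.out.println("total length " + total);

        // Wait for the callback too, so it has run before the JVM exits.
        printed.get();
    }
}
```

The point of the composed version is that the thread calling `combinedLength` is free to do other work until it decides to call `get()`.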

For those operations that return multiple results, the API returns instances of a Traversable interface which offers a lazy pull-style API for working with multiple results. Although push-style interfaces for handling multiple results, such as RxJava, are fully asynchronous, they're harder to use from a user's perspective. Traversable, being a lazy pull-style API, can still be asynchronous underneath since the user can decide to work on the traversable at a later stage, and the Traversable implementation itself can decide when to compute those results.

Lambda transparency

Since the content of the lambdas is transparent to Infinispan (that is, Infinispan cannot tell in advance what a lambda will do with an entry), the API has been split into 3 interfaces for read-only (ReadOnlyMap), read-write (ReadWriteMap) and write-only (WriteOnlyMap) operations respectively, in order to provide hints to the Infinispan internals on the type of work needed to support lambdas.

For example, Infinispan has been designed in such a way that our 'ConcurrentMap.get()' and 'JCache.getAll()' implementations do not require locks to be acquired. These get()/getAll() operations are read-only operations, and hence if you call our functional map ReadOnlyMap's 'eval()' or 'evalMany()' operations, you get the same benefit. A key advantage of ReadOnlyMap's 'eval()' and 'evalMany()' operations is that they take lambdas as parameters, which means the returned types are more flexible: we can return the value associated with the key, a boolean indicating whether the value has the expected contents, or some of its metadata parameters, e.g. last accessed time, last modified time, creation time, lifespan, version information, etc.

Another important hint that is required to make efficient use of the system is knowing when a write-only operation is being executed. Write-only operations require locks to be acquired and, as demonstrated by JCache's 'void removeAll()' and 'void put(K, V)' or ConcurrentMap's 'putAll()', they do not require the previous value to be queried or read. As explained above, this is a very important optimization since reading the previous value might require the persistence layer or a remote node to be queried. WriteOnlyMap's 'eval()', 'evalMany()', and 'evalAll()' follow this same pattern with the added flexibility for the lambda to decide what kind of write operation to execute.

The final type of operations we have are read-write operations, and within this category we find CAS-like (Compare-And-Swap) operations. This type of operation requires the previous value associated with the key to be read and locks to be acquired before executing the lambda. Most of the operations in ConcurrentMap and JCache fall within this domain, including 'V put(K, V)', 'boolean putIfAbsent(K, V)', 'V replace(K, V)', 'boolean replace(K, V, V)', etc. ReadWriteMap's 'eval()', 'evalMany()' and 'evalAll()' provide a way to implement the vast majority of these operations thanks to the flexibility of the lambdas passed in. So you can make CAS-like comparisons based not only on value equality but also on metadata parameter equality such as version information, and you can send back the previous value or boolean instances to signal whether the CAS-like comparison succeeded.
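
As an illustration of a CAS-like comparison on metadata rather than on value equality, the following sketch uses a plain JDK ConcurrentHashMap. It is not the ReadWriteMap API: the `Versioned` wrapper and the `replaceIfVersion` method are hypothetical names invented for this example, and the version is simply incremented on each successful replace.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class VersionedReplaceSketch {

    // Hypothetical value wrapper carrying a version number as metadata.
    static final class Versioned {
        final String value;
        final long version;

        Versioned(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final ConcurrentMap<String, Versioned> store = new ConcurrentHashMap<>();

    void put(String key, String value, long version) {
        store.put(key, new Versioned(value, version));
    }

    String get(String key) {
        Versioned current = store.get(key);
        return current == null ? null : current.value;
    }

    // Replaces the value only if the stored version equals expectedVersion.
    // Returns a boolean signalling whether the comparison succeeded.
    boolean replaceIfVersion(String key, long expectedVersion, String newValue) {
        boolean[] replaced = {false};
        // computeIfPresent reads the previous entry and updates it atomically per key.
        store.computeIfPresent(key, (k, current) -> {
            if (current.version == expectedVersion) {
                replaced[0] = true;
                return new Versioned(newValue, current.version + 1);
            }
            return current; // version mismatch: leave the entry untouched
        });
        return replaced[0];
    }

    public static void main(String[] args) {
        VersionedReplaceSketch map = new VersionedReplaceSketch();
        map.put("k", "v1", 1L);
        System.out.println(map.replaceIfVersion("k", 1L, "v2")); // version matches
        System.out.println(map.replaceIfVersion("k", 1L, "v3")); // stale version
        System.out.println(map.get("k"));
    }
}
```

Note how the operation has to read the previous entry before deciding what to write, which is exactly what separates read-write operations from write-only ones.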

$DEITY, I need to learn a new API!!!

This new functional Map-like API is meant to complement existing Key/Value Infinispan API offerings, so you'll still be able to use ConcurrentMap or JCache standard APIs if that's what suits your use case best.

The target audience for this new API is either:
  1. Distributed or persistent caching/in-memory data-grid users that want to benefit from CompletableFuture and/or Traversable for async/lazy data grid or caching data manipulation. The clear advantage here is that threads do not need to sit idle waiting for remote operations to complete; instead, they can be notified when remote operations complete and then chain those results with other subsequent operations.
  2. Users wanting to go beyond the standard operations exposed by ConcurrentMap and JCache, for example, if you want to do a replace operation using metadata parameter equality instead of value equality, or if you want to retrieve metadata information from values...etc.
Internally, we feel that this new functional Map-like API distills the Map-like APIs that we currently offer (including ConcurrentMap and JCache) and gets rid of a lot of duplication in our AdvancedCache API (e.g. 'getCacheEntry()', 'getAsync()', 'putAsync()', 'put(K, V, Metadata)', etc.). Hence, down the line, we'd want all these APIs to be implemented using the new functional Map-like API. By doing that, we hope to reduce the number of commands that our internal architecture implements, hence reducing our code base.

This new API also offers a new approach for passing per-invocation parameters, and much more flexible Metadata handling compared to our current approach. As we dig into this new API in the next blog posts, we'll explain the differences and advantages provided by these.

Functional Map API usage examples

To give you a little taste of what the API looks like, here is a write-only operation to associate a key with a value, whose CompletableFuture has been chained so that when it completes, a read-only operation can be executed to read the stored value, and when that completes, print it to the system output:
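
As a rough, JDK-only stand-in for that chain (not Infinispan's actual classes), the shape of the interaction can be sketched as follows. The `WriteView` and `ReadView` interfaces and the `evalWrite`/`evalRead` methods are hypothetical names made up for this illustration, backed by a local ConcurrentHashMap instead of a cache.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.BiConsumer;
import java.util.function.Function;

final class WriteThenReadSketch {

    // Hypothetical write-only view of one entry: it can set the value but never read it.
    interface WriteView<V> {
        void set(V value);
    }

    // Hypothetical read-only view of one entry.
    interface ReadView<V> {
        Optional<V> find();
    }

    private final ConcurrentMap<String, String> data = new ConcurrentHashMap<>();

    // Write-only evaluation: the lambda decides what to write; no previous value is returned.
    CompletableFuture<Void> evalWrite(String key, String value,
                                      BiConsumer<String, WriteView<String>> f) {
        return CompletableFuture.runAsync(() -> f.accept(value, v -> data.put(key, v)));
    }

    // Read-only evaluation: the lambda decides what to return from the entry.
    <R> CompletableFuture<R> evalRead(String key, Function<ReadView<String>, R> f) {
        return CompletableFuture.supplyAsync(
                () -> f.apply(() -> Optional.ofNullable(data.get(key))));
    }

    // Chains the read so that it only starts once the write has completed.
    CompletableFuture<String> storeThenRead(String key, String value) {
        return evalWrite(key, value, (v, view) -> view.set(v))
                .thenCompose(ignored -> evalRead(key, view -> view.find().orElse(null)));
    }

    public static void main(String[] args) {
        WriteThenReadSketch map = new WriteThenReadSketch();
        map.storeThenRead("key1", "value1")
           .thenAccept(System.out::println)
           .join(); // block only at the very end, to keep the JVM alive
    }
}
```

Because `evalRead` takes a lambda, the same method could just as well return a boolean (for example `view -> view.find().isPresent()`) instead of the value itself.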

You can find more examples of this new API in FunctionalConcurrentMap and FunctionalJCache classes, which are implementations of ConcurrentMap and JCache respectively using the new Functional Map API.

Tell me more!!

Over the next few weeks I'll be posting examples looking at the finer details of these new Functional Map APIs, but if you're eager to get started, check the classes in org.infinispan.functional package, FunctionalConcurrentMap and FunctionalJCache which are ConcurrentMap and JCache implementations based on these Functional Map APIs, and FunctionalMapTest which demonstrates operations that go beyond what ConcurrentMap and JCache offer.

Happy (functional) hacking :)


Monday, 17 August 2015

Infinispan Spark connector 0.1 released!

Dear users,

The Infinispan connector for Apache Spark has just been made available as a Spark Package!

What is it?

The Infinispan Spark connector allows tight integration with Apache Spark, allowing Spark jobs to be run against data stored in the Infinispan Server, exposing any cache as an RDD, and also writing data from any key/value RDD to a cache. It's also possible to create a DStream backed by cache events and to save any key-value DStream to a cache.

The minimum version required is Infinispan 8.0.0.Beta3.

Giving it a spin with Docker

A handy docker image that contains an Infinispan cluster co-located with an Apache Spark standalone cluster is the fastest way to try the connector. Start by launching the container that hosts the Spark Master:

And then run as many worker nodes as you want:

Using the shell

The Apache Spark shell is a convenient way to quickly run jobs in an interactive fashion. Taking advantage of the fact that Spark is already installed in the docker containers (and thus the shell), let's attach to the master:

Once inside, a Spark shell can be launched by:

That's all that's needed. The shell grabs the Infinispan connector and its dependencies and exposes them in the classpath.

Generating data and writing to Infinispan

Let's obtain a list of words from the Linux dictionary, and generate 1k random 4-word phrases. Paste the commands in the shell:

From the phrases, we'll create a key value RDD (Long, String):

To save to Infinispan:

Obtaining facts about data

To be able to explore data in the cache, the first step is to create an Infinispan RDD:

As an example job, let's calculate a histogram showing the distribution of word lengths in the phrases. This is simply a sequence of transformations expressed by:

This pipeline yields:

2 chars words: 10 occurrences
3 chars words: 37 occurrences
4 chars words: 133 occurrences
5 chars words: 219 occurrences
6 chars words: 373 occurrences
7 chars words: 428 occurrences
8 chars words: 510 occurrences
9 chars words: 508 occurrences
10 chars words: 471 occurrences
11 chars words: 380 occurrences
12 chars words: 309 occurrences
13 chars words: 238 occurrences
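
The shape of that pipeline (split each phrase into words, map each word to its length, count the words per length, sort by length) can be sketched on a local list with plain Java streams. This is not the connector's RDD code; the class name and the sample phrases are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class WordLengthHistogramSketch {

    // Counts how many words of each length occur across all phrases, sorted by length.
    static Map<Integer, Long> histogram(List<String> phrases) {
        return phrases.stream()
                .flatMap(phrase -> Arrays.stream(phrase.trim().split("\\s+")))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(
                        String::length, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> phrases = Arrays.asList(
                "alpha beta gamma delta",
                "one two three four");
        histogram(phrases).forEach((length, count) ->
                System.out.println(length + " chars words: " + count + " occurrences"));
    }
}
```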


Now let's find similar words using the Levenshtein distance algorithm. For that we need to define a function that will calculate the edit distance between two strings. As usual, paste in the shell:

Empowered by the Levenshtein distance implementation, we need another function that given a word, will find in the cache similar words according to the provided maximum edit distance:
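
For reference, both functions can be sketched in plain Java over a local word list. This is a generic dynamic-programming Levenshtein implementation, not the code pasted into the Spark shell and not tied to the cache; `distance`, `similar` and the sample words are illustrative names and data only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class SimilarWordsSketch {

    // Two-row dynamic programming edit distance; insert, delete and substitute cost 1 each.
    static int distance(String a, String b) {
        int[] previous = new int[b.length() + 1];
        int[] current = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j; // turning an empty prefix of a into b[0..j) takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            current[0] = i; // turning a[0..i) into an empty string takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int substitution = previous[j - 1] + cost;
                int deletion = previous[j] + 1;
                int insertion = current[j - 1] + 1;
                current[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }

    // Returns the words, other than the word itself, within maxDistance edits of it.
    static List<String> similar(List<String> words, String word, int maxDistance) {
        return words.stream()
                .filter(candidate -> !candidate.equals(word))
                .filter(candidate -> distance(candidate, word) <= maxDistance)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("house", "mouse", "horse", "hose", "table");
        System.out.println(similar(words, "house", 1));
    }
}
```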

Sample usage:

Where to go from here

And that concludes this first post on Infinispan-Spark integration. Be sure to check the Twitter demo for non-shell usages of the connector, including the Java and Scala APIs.

And it goes without saying, your feedback is much appreciated! :)

Tuesday, 11 August 2015

Infinispan 7.2.4.Final out including fixes for async store, Hot Rod...etc

Infinispan 7.2.4.Final is just out containing some important fixes in areas such as Hot Rod client and server, async cache store, key set iteration, as well as a Hibernate HQL parser upgrade. You can find more details about the issues fixed in our detailed release notes.

Happy hacking :)


Monday, 10 August 2015

Infinispan 8.0.0.Beta3 out with Lucene 5, Functional API, Templates...etc

Infinispan 8.0.0.Beta3 is out with a lot of new bells and whistles including:

  • Infinispan querying has been upgraded to be usable with Lucene 5, with a lot of improvements particularly in terms of memory efficiency. With this upgrade, Lucene 4 support has been removed from the community project.
  • Configuration templates are finally here, meaning that users can now define cache configurations as template configurations, and then create cache configurations using those templates as base configuration.
  • A brand new, experimental FunctionalMap API has been added that takes advantage of the lambda and async programming improvements introduced in Java 8. We consider this an advanced API that will enable Infinispan to grow beyond the well known javax.cache.Cache and java.util.concurrent.ConcurrentMap APIs. In the next few days I'll be posting multiple, detailed blog posts looking at the different aspects of the API. If you are eager to get started, you can first have a look at a ConcurrentMap implementation using FunctionalMap to get a feel for it. Your feedback is highly appreciated!
  • Pedro has completed ISPN-2849 which should provide much better performance by liberating precious JGroups threads from having to wait for locks to be acquired remotely.
To get more details, check our release notes and go to our downloads page to find out how to get started with this Infinispan version.

Happy hacking :)


Wednesday, 5 August 2015

RadarGun 2.1.0.Final

I'm happy to announce RadarGun 2.1.0.Final is officially out. RadarGun is a multi-purpose testing tool, which provides means to measure performance and test features specific to distributed systems (data grids and caches in particular).

The release contains multiple fixes and improvements listed below:
  • New plugins: jdg65, infinispan71, infinispan72, infinispan80, jcache
  • Reporting improvements (perfrepo reporter, percentile chart, net/gross throughput)
  • GaussianKeySelector
  • Enhanced listener support
  • JMX invocation stage now supports setting attributes
  • LogLogic improvements
  • Enhanced TopologyHistory & WaitForTopologySettle stage
  • Better test coverage
  • Multiple bug fixes
From this point on, the main development will continue in the 3.x line. We are preparing considerable design changes, so that RadarGun can be easily utilized to measure performance of other areas as well (e.g. JPA).

For more information about RadarGun, feel free to visit our wiki page, or try the five-minute tutorial. In case of any issues, please refer to our issue tracker.

Thanks everyone for their contributions!