Thursday, 15 February 2018

Hotrod clients C++/C# 8.2.0.Beta1 are out!

Dear Infinispanners,
C++ and C# 8.2.0.Beta1 releases are available!

These releases contain all the 8.2.0 features.

Worth a mention is the improvement in the remote execution API: we moved the JBossMarshaller basic implementation from test to the distro in order to simplify the data management on the application side. Test examples [1] and [2] have been updated accordingly.

Next step will be a CR release containing improvements on API docs (doxygen)

Check the release notes, browse the source code (C++, C#) or download the releases!

Cheers,
The Infinispan Team

Friday, 9 February 2018

RESTful queries coming to Infinispan 9.2


One of the interesting features in the upcoming Infinispan 9.2 release is the possibility to execute queries over the REST endpoint, enabling users to take advantage of the easy-to-use and expressiveness of the Ickle query language, that combines a subset of JP-QL with full-text features. You can learn more info about Ickle in a previous post.

Besides exposing query over REST, Infinispan 9.2 also adds support for mapping between JSON and Protobuf formats, allowing an efficient storage in binary format while exposing queries, reading and writing content as JSON documents.

To illustrate those new capabilities, this post will walk you through a sample app from scratch!


Sample app


Running the server

We start by running the Infinispan Server 9.2.0.CR2 (the latest release candidate):


This will get you a fresh instance of Infinispan running, with login and password 'user' and the REST port 8080 mapped to localhost. TIP: if you run more than one container, they'll form a cluster automatically.

Creating an indexed cache

Next step is to create an indexed cache called 'pokemon'. We make use of the CLI  (Command Line Interface) to create this cache. In the future, with ISPN-8529, we'll also be able to create cache with arbitrary configuration using REST, but for now we execute a CLI recipe:


Creating the schema

In order to be able to query, we need to define a protobuf schema for our data. The schema follows the Protobuf 2 format (Protobuf 3 support is coming) and allows for extensions to define indexing properties (analyzers, storage, etc).

Here's how it looks like:


The protobuf schema can contain some comments on top of fields and messages with "annotations" to control indexing. Hibernate Search users will recognize some of those pseudo annotations we are using here: they resemble closely their counterpart.


Registering the schema

Once we have our schema, we can easily register it via REST:



Populating the cache

We're now ready to put some data in the cache. As mentioned earlier, ingesting can be done by sending JSON documents directly. Once Infinispan receives those documents, it will convert them to protobuf, index and store them.

In order to match a particular inbound document to an entity in the schema, Infinispan uses a special meta field called _type that must be provided in the document. Here's an example of a JSON document that conforms to our schema:

Writing the document is easy:


we can retrieve content by key as JSON:


Querying


The new query endpoint can be called with an "action" parameter named "search", after the cache name. The simplest query, which returns all data can be done with:

http://localhost:8080/rest/pokemon?action=search&query=from Pokemon


If you do not want to return all the fields, use a Select clause:

http://localhost:8080/rest/pokemon?action=search&query=Select name, speed from Pokemon


Pagination can be controlled with the offset, max_results URL parameters:

http://localhost:8080/rest/pokemon?action=search&query=from Pokemon&offset=2&max_results=20


Grouping is also possible:

http://localhost:8080/rest/pokemon?action=search&query=select count(p.name) from Pokemon p group by generation


Example of a query result:

http://localhost:8080/rest/pokemon?action=search&query=select name,pokedex_number,against_fire from Pokemon order by against_fire asc&max_results=5

Results:




Conclusion

 

Infinispan 9.2 makes it easier to quickly ingest and query datasets using the ubiquitous JSON format, without sacrificing type safety and storage size.

By storing Protobuf, this will also enable other clients like the Hot Rod C#/C++ clients to query, read and write data simultaneously with REST clients.

The full source code for the demo, along with instructions on how to populate the whole dataset can be found at Github.

Finally, please try out this new feature in your own dataset and let us know how it goes!




Wednesday, 7 February 2018

Data Container Changes Part 3

Just over a year ago we detailed some improvements to the data container, including the availability of Off Heap storage in part 2. There have been quite a few fixes for Off Heap especially around memory size estimations with Infinispan 9.2. There is also a brand new "eviction" strategy that has a bit of a twist.

Eviction Strategy Resurrected


Some of you may have remembered that Infinispan used to have an eviction strategy. This was originally used to decide what eviction algorithm was used, such as LRU or LIRS. This was removed when the new data container was introduced. Well... it is back again, but it will be used for a slightly different purpose.

The eviction strategy still has NONE & MANUAL which are exactly the same as before.

Remove strategy


There is a new REMOVE strategy that is configured by default if eviction size is greater than 0. This strategy essentially enables eviction and removes old entries as new ones are inserted.

Exception strategy


We have a brand new "eviction" strategy that provides new functionality. It is unique in that it doesn't really evict, but rather prevent entries from being inserted.  This is the EXCEPTION strategy which blocks new entries from being inserted (or updated if they exceed memory size) by throwing a ContainerFullException when the size is reached.

This strategy only works on transactional caches that always have 2 phase commit enabled. This can be useful if you want to always have only so many entries and to give priority to currently inserted entries. This strategy has better performance than REMOVE since it doesn't have to bookkeep all entries to know what to remove as well.

Note this strategy works across all storage types: OBJECT, BINARY and OFFHEAP and works with both MEMORY and SIZE based "eviction types. This makes it just as flexible as the REMOVE eviction strategy and hope it finds some uses by people.

How to Configure EXCEPTION Strategy


This is how you can enable MEMORY based EXCEPTION "eviction" using xml configuration.
This is how you configure the same thing but programmatically.

Off Heap Memory Size Allocations & Estimations


Before the off heap memory based eviction only counted the allocated memory chunks for the stored entries themselves. This unfortunately meant that the size estimate is a bit less than what it should have been. There are a few things that we improved since then, including reducing the overhead of our allocations. Note all of the below things require no configuration changes and users should just get the benefits.

Reduced per object overhead


Prior the overhead for immutable entries with eviction, Infinispan itself use to allocate 2 chunks of memory with one being 28 bytes and adding 8 bytes to the actual object. Now we only allocate an additional 16 bytes to the object memory block itself (saving the extra allocation and requiring less on the object) when using eviction. Due to memory allocation overhead this saves much more than the 20 bytes as the allocator also has its own overhead.

We also shaved off 4 bytes off of all entries if expiration was not used, meaning overhead for an immutable cache entry without eviction only requires 21 bytes of overhead from ISPN when using off heap (retained in the same allocation block).

Per allocation memory sizing estimations


Internally ISPN allocates a new chunk of memory for each object. This is done currently to leverage the underlying OS allocator to handle features such as fragmentation or compaction (if the allocator does so). Unfortunately this means that each object has its own overhead from the allocator. Thus we now take that into account when estimating the memory used by adding 8 bytes overhead and aligning to 16 bytes. This seems to be a pretty common way for allocators to work. If possible we could allow for tweaking these values, but they are hard coded currently.

Accounting for Address Count


As was mentioned in the prior blog post about off heap, we allocate a single block of memory to hold address counters for our lookups when using Off Heap. Unfortunately we didn't account for that in the memory eviction count. We now account for that in the eviction mechanism, thus your memory eviction size must be greater than the address count rounded up to the nearest power of 2, multiplied by 8. What a mouthful...

Wrap up


Off heap has been overhauled quite a bit to try to reduce memory usage, fix bugs and more accurately estimate the memory used. We hope that along with the new eviction strategy are welcome additions to various applications.

Please make sure to contact us if you have any feedback, find any bugs or have any questions! You can get in contact with various places listed on our website.

Friday, 2 February 2018

Infinispan 9.2.0.CR2 is out!

The Infinispan team is proud to announce that Infinispan 9.2.0.CR2 has been released!

New and noteworthy in this release:

* [ISPN-8641] Wildfly 11 support
* [ISPN-8715] Local counters
* [ISPN-8695] Creation of caches from remote clients with custom configuration
* [ISPN-8427] Support for non-String keys in the REST server
* [ISPN-8619] Java Hot Rod client rewritten using Netty

For more details, consult the release notes.

As we approach the final version release quickly, users are encouraged to try it and send feedback. So, please head over to the download page and try it out. Or if you prefer you can run using Docker with:

    docker run -it jboss/infinispan-server:9.2.0.CR2

If you have any issues, please report it in our bug tracker, ask us on the forum, or join us for a friendly chat on the #infinispan IRC channel on Freenode.


Cheers! 

A different kind of template: wildcards

Infinispan's configuration templates are an extremely flexible way to create multiple caches using the same configuration. Configuration inheritance works by explicitly declaring the configuration a specific cache should use.

This works fine when you know the caches you are going to use upfront, but in more dynamic scenarios, this might not be possible. Additionally, if you are using the JCache API, there is no way for you to specify the configuration template you want to use.

Infinispan 9.2 introduces an alternative way to apply templates to caches: wildcards. By creating a template with a wildcard in its name, e.g. `basecache*`, any cache whose name matches the template name will inherit that configuration.

Let's show an example:

Above, caches `basecache-1` and `basecache-2` will use the `basecache*` configuration. This behaviour also applies when retrieving caches programmatically:


When using the JCache API, using the XML file above and the following code will achieve the same result:


NOTE: If a cache name matches multiple wildcards, i.e. it is ambiguous, an exception will be thrown.

I will be introducing other new features that Infinispan 9.2 brings to cache configuration in an upcoming blog post. Stay tuned !

Infinispan coming to JFokus!!



The Infinispan team is on the move again! After Katia's trip to Snowcamp, it's my turn to head  to JFokus, Sweden's largest developer conference.

For JFokus we've morphed the streaming data workshop we delivered last year at Devoxx Belgium and Codemotion Madrid into a 3 hour long deep dive tutorial. This will be delivered Monday 5th February at 13:30 local time.

On top of that, I'll be delivering a talk on streaming data analysis with Kubernetes where Infinispan will be featured. If you're interested make sure you come on Tuesday, 6th February at 14:00 local time.

So, if you're coming to JFokus and you're interesting in data grids, streaming data or similar topics, make sure you attend our talks.

Cheers,
Galder

Thursday, 1 February 2018

Executing Code in the Grid

Infinispan has quite a few spectacular ways of executing code in the grid. But I bet you haven't heard or aren't really familiar with those, which is disappointing. I hope to fix this, however, as we have added more information to the user guide and wanted to detail that here in this blog.

As I am sure you are aware Infinispan can be used in embedded (in your JVM) and remote (in a standalone server). Unfortunately, this means there are different ways of executing code based on which mode you are in.

Embedded

The embedded mode has the most features available and is the easiest to use. The appropriate section can be found here.

One question that seems to come up more than others is how a user can perform cache operations on all data, such as remove all elements that match a given filter. If you are curious about this one, you should check out the Examples section with the example named "Remove specific entries" as it details how a user would do exactly that.

I should also point out the new Cluster Executor section, which is similar to Streams that replaced Map Reduce, is here to replace the old Distributed Executor. With Cluster Executor and Distributed Streams there is a clearer distinction between executing code on nodes (Cluster Executor) and executing code based on data (Distributed Streams).

Server

The server is a bit more interesting and usually requires configuration ahead of time, unlike Embedded. It can be found in this section. The benefit of the server is most of these can invoke embedded operations internally.

Scripting is by far the easiest to use - just insert your script and execute - but has some limitations that we haven't been able to fix yet.

Server tasks can run pretty much any Java but require registering classes beforehand. Unfortunately, this section still needs to be filled in and should be added sometime in the near future. I would say, until then, if you are interested, you can look at some tests in github.

Takeaway

I hope this has helped users be able to find out some more information about the various ways of executing arbitrary code for your data. If you have any questions or need more clarification about the features highlighted here, please don't hesitate to let us know at any of these places.