Monday, 23 January 2017

Data Container Changes Part 2

Before the end of the year I wrote a blog post detailing some of the more recent changes that Infinispan introduced with the in-memory data container. As mentioned in that post, there were more changes still to come, and if you poked around in our new schema after Beta 1 you may have spoiled the surprise for yourself.

With the upcoming 9.0 Beta 2, I am excited to announce that Infinispan will have support for entries being stored off heap, that is, outside of the JVM heap. This has some interesting benefits and drawbacks, but we hope you will agree that in many cases the benefits far outweigh the drawbacks. Before we get into that, let's first see how you can configure your cache to utilize off heap storage.

New Configuration


The off heap configuration is another option under the new memory element that was discussed in the previous post. It is used in the same way that either OBJECT or BINARY is used.

XML


PROGRAMMATIC
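If you prefer to configure this in code, a minimal sketch could look like the one below. The storageType, size and addressCount builder methods reflect my reading of the new 9.0 memory API, so double check them against the released schema.

```java
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.cache.StorageType;

public class OffHeapConfigSketch {
   public static Configuration offHeap() {
      ConfigurationBuilder builder = new ConfigurationBuilder();
      builder.memory()
             .storageType(StorageType.OFF_HEAP) // keep entries outside the JVM heap
             .size(1_000_000)                   // optional bound on the number of entries
             .addressCount(1 << 20);            // number of bucket pointers, explained below
      return builder.build();
   }
}
```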

As you can see, the configuration is almost identical to the other storage types. The only real difference is the new address count argument, which will be explained below.

Description


Our off heap implementation uses Java's Unsafe to allocate memory outside of the Java heap. This data is stored as an array of buckets of linked list pointers, just like a standard Java HashMap. When an entry is added, the key's serialized byte[] is hashed and an appropriate bucket offset is found. The entry is then added as the first element of that bucket or, if one or more entries are already present, appended to the rear of the linked list.

All of this data is protected by an array of ReadWriteLock instances.  The number of address pointers is evenly divisible by the number of lock instances, and the number of lock instances is the number of cores your machine has, doubled and rounded up to the nearest power of two.  Thus each lock protects an equivalent slice of the address space.  This provides good lock granularity: reads will not block each other, but unfortunately a write will block both reads and other writes for the addresses it guards.
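The following is a rough illustration of that striping scheme, not the actual Infinispan code: it derives a power-of-two lock count from the core count and maps each bucket offset to the lock guarding its slice of the address table.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class StripedLockSketch {
   private final ReadWriteLock[] locks;
   private final int addressesPerLock;

   public StripedLockSketch(int addressCount) {
      // lock count: number of cores, doubled and rounded up to the next power of two
      int cores = Runtime.getRuntime().availableProcessors();
      int lockCount = nextPowerOfTwo(cores * 2);
      locks = new ReadWriteLock[lockCount];
      for (int i = 0; i < lockCount; i++) {
         locks[i] = new ReentrantReadWriteLock();
      }
      // addressCount is assumed to be a multiple of lockCount,
      // so every lock guards the same number of bucket pointers
      addressesPerLock = addressCount / lockCount;
   }

   // which lock guards the bucket at the given offset
   ReadWriteLock lockFor(int bucketOffset) {
      return locks[bucketOffset / addressesPerLock];
   }

   private static int nextPowerOfTwo(int n) {
      int highest = Integer.highestOneBit(n);
      return highest == n ? n : highest << 1;
   }
}
```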

If you are using a bounded off heap container, whether by count or by memory, a backing LRU doubly linked list is created to keep track of which elements were accessed most recently and to remove the least recently accessed element when there are too many present in the cache.

Requirements


Our off heap implementation supports all existing features of Infinispan. There are, however, some limitations and drawbacks to using the feature. This section describes these in further detail.

Serialization


Off heap essentially runs in BINARY mode, which requires entries to be serialized into their byte[] form. Thus all keys and values must be Serializable or have Infinispan Externalizers provided for them.
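For types that are not Serializable you can provide an Externalizer instead. Below is a sketch using the AdvancedExternalizer interface; the Person class and the id value are made-up examples, and the externalizer would then typically be registered via the global configuration's serialization settings.

```java
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.util.Collections;
import java.util.Set;

import org.infinispan.commons.marshall.AdvancedExternalizer;

// Hypothetical value type stored in the cache
public class Person {
   final String name;
   final int age;

   public Person(String name, int age) {
      this.name = name;
      this.age = age;
   }

   // Externalizer that writes/reads the fields explicitly, avoiding Java serialization
   public static class PersonExternalizer implements AdvancedExternalizer<Person> {
      @Override
      public void writeObject(ObjectOutput output, Person person) throws IOException {
         output.writeUTF(person.name);
         output.writeInt(person.age);
      }

      @Override
      public Person readObject(ObjectInput input) throws IOException, ClassNotFoundException {
         return new Person(input.readUTF(), input.readInt());
      }

      @Override
      public Set<Class<? extends Person>> getTypeClasses() {
         return Collections.<Class<? extends Person>>singleton(Person.class);
      }

      @Override
      public Integer getId() {
         return 1001; // must be unique across all registered externalizers
      }
   }
}
```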

Size


Currently a key and a value must each fit in a byte[]. Therefore a key or value in serialized form cannot be larger than roughly 2 Gigabytes.  This could possibly be enhanced at a later point if the need arose.  I hope you aren't transferring entries that large over your network though!

Memory Overhead


As with all cache implementations there is some overhead required to store these entries. There is a fixed overhead and a variable overhead which scales with the number of entries. I will detail these and briefly mention what they are used for.

Fixed overhead

As was mentioned, there is a new address count parameter when configuring off heap. This value determines how many linked list pointers are available. Normally you want more pointers than you have entries in the cache, since then chances are each linked list holds at most one element.  This is very similar to the int argument constructor for HashMap, the big difference being that this off heap implementation will not resize.  Thus your read/write times will be slower if you have a lot of collisions. The overhead of a pointer is 8 bytes, so approximately one million pointers take 8 Megabytes of off heap memory.

Bounded off heap requires very little extra fixed memory: just 32 bytes for head/tail pointers and a counter, plus an additional Java lock object.

Variable overhead

Unfortunately, to store your entries we need to wrap them with some extra data: every entry added to the cache carries an additional 25 bytes.  This data is used for header information and our linked list forward pointer.

Bounded off heap requires an additional address pointer for its LRU list.  Thus each entry adds a further 36 bytes on top of the 25 above. It is larger because the list is doubly linked and requires pointers both to and from the entry and its eviction node.
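Putting those numbers together, here is a tiny illustrative helper that estimates the overhead for a given setup; it only encodes the figures quoted in this post and ignores any additional internal bookkeeping.

```java
public class OffHeapOverheadEstimate {
   private static final long ADDRESS_POINTER_BYTES = 8;   // per bucket pointer
   private static final long ENTRY_HEADER_BYTES = 25;     // header + forward pointer per entry
   private static final long BOUNDED_EXTRA_BYTES = 36;    // extra per entry for the LRU list
   private static final long BOUNDED_FIXED_BYTES = 32;    // head/tail pointers and counter

   public static long estimateOverhead(long addressCount, long entries, boolean bounded) {
      long fixed = addressCount * ADDRESS_POINTER_BYTES + (bounded ? BOUNDED_FIXED_BYTES : 0);
      long perEntry = ENTRY_HEADER_BYTES + (bounded ? BOUNDED_EXTRA_BYTES : 0);
      return fixed + entries * perEntry;
   }

   public static void main(String[] args) {
      // ~1M pointers (8 MB fixed) plus 1M entries in a bounded container
      System.out.println(estimateOverhead(1 << 20, 1_000_000, true) + " bytes of overhead");
   }
}
```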

Performance


The off heap container was designed with the intent that key lookups are quite fast; in general these should perform about the same as the other storage types. However, local reads and stream operations can be a little slower, as an additional deserialization step is required.

Summary


We hope you all try out our new off heap feature! Please make sure to contact us if you have any feedback, find any bugs or have any questions!  You can get in touch with us on our forum, issue tracker, or directly on the #infinispan IRC channel on Freenode.

9.x JDBC Store Improvements

Infinispan 9 introduces several changes to the JDBC stores, in summary:
  • Configuration of DB version
  • Upsert support for store writes
  • Timestamp indexing
  • c3p0 connection pool replaced by HikariCP

DB Version Configuration


Previously, when configuring a JDBC store it was only possible for a user to specify the vendor of the underlying DB. Consequently, it was not possible for Infinispan to utilise more recent DB features, as the SQL utilised by our JDBC stores had to satisfy the capabilities of the oldest supported DB version.

In Infinispan 9 we have completely refactored the code responsible for generating SQL queries, enabling our JDBC stores to take greater advantage of optimisations and features applicable to a given database vendor and version. See the below gist for examples of how to specify the major and minor versions of your database.

Programmatic config:
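A rough programmatic sketch, assuming the new dbMajorVersion/dbMinorVersion options sit on the JDBC store builder next to the existing dialect option (verify the exact method names against the 9.0 API):

```java
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.persistence.jdbc.DatabaseType;
import org.infinispan.persistence.jdbc.configuration.JdbcStringBasedStoreConfigurationBuilder;

public class JdbcVersionConfigExample {
   public static ConfigurationBuilder postgres95Store() {
      ConfigurationBuilder builder = new ConfigurationBuilder();
      builder.persistence()
             .addStore(JdbcStringBasedStoreConfigurationBuilder.class)
                .dialect(DatabaseType.POSTGRES)   // vendor, as before
                .dbMajorVersion(9)                // new in Infinispan 9: explicit DB version
                .dbMinorVersion(5);
      return builder;
   }
}
```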
XML Config:
Note: If no version information is provided, then we attempt to retrieve version data via the JDBC driver.  This is not always possible and in such cases we default to SQL queries which are compatible with the lowest supported version of the specified DB dialect.

Upsert Support


As a consequence of the refactoring mentioned above, writes to the JDBC stores finally utilise upserts. Previously, the JDBC stores had to first select an entry, before inserting or updating a DB row depending on whether the entry previously existed.  Now, in supported DBs, store writes are performed atomically via a single SQL statement.

In some cases it may be desirable to retain the previous store behaviour; in such cases, the following property should be passed to your store's configuration and set to true: `infinispan.jdbc.upsert.disabled`.
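For example (a sketch; the property can just as well be set on the store element in XML):

```java
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.persistence.jdbc.configuration.JdbcStringBasedStoreConfigurationBuilder;

public class DisableUpsertExample {
   public static ConfigurationBuilder storeWithoutUpserts() {
      ConfigurationBuilder builder = new ConfigurationBuilder();
      builder.persistence()
             .addStore(JdbcStringBasedStoreConfigurationBuilder.class)
                // fall back to the old select-then-insert/update behaviour
                .addProperty("infinispan.jdbc.upsert.disabled", "true");
      return builder;
   }
}
```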

Timestamp Indexing


By default an index is now created on the `timestamp-column` of a JDBC store when the `create-on-start` option is set to true for a store's table.  The advantage of this index is that it prevents the DB from having to perform full table scans when purging a table of expired cache entries.  Similar to upsert support, this index is optional and can be disabled by setting the property `infinispan.jdbc.indexing.disabled` to true.

Hello HikariCP


In Infinispan 9 we welcome HikariCP as the new default implementation for the JDBC PooledConnectionFactory. HikariCP provides superior performance to c3p0 (the previous default), whilst also having a much smaller footprint. The PooledConnectionFactoryConfiguration remains the same as before, except we now include the ability to explicitly define a properties file where additional configuration parameters can be specified for the underlying HikariCP pool. For a full list of the available HikariCP configuration properties, please see the official documentation.
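For example, a sketch of a store whose pool reads extra HikariCP settings from a properties file; the propertyFile attribute is my assumption for how the extended PooledConnectionFactoryConfiguration is exposed, so check the 9.0 schema:

```java
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.persistence.jdbc.configuration.JdbcStringBasedStoreConfigurationBuilder;

public class HikariPoolExample {
   public static ConfigurationBuilder pooledStore() {
      ConfigurationBuilder builder = new ConfigurationBuilder();
      builder.persistence()
             .addStore(JdbcStringBasedStoreConfigurationBuilder.class)
                .connectionPool()
                   .connectionUrl("jdbc:h2:mem:infinispan;DB_CLOSE_DELAY=-1")
                   .driverClass("org.h2.Driver")
                   .username("sa")
                   .propertyFile("hikari.properties"); // extra HikariCP settings live here
      return builder;
   }
}
```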

Note: Support for c3p0 has been deprecated and will be removed in a future release. However, users can force c3p0 to be utilised as before by providing the system property `-Dinfinispan.jdbc.c3p0.force=true`.


Summary


We have introduced the above new features to the JDBC stores in order to improve performance and to enable us to further the store's capabilities in the future. If you're a user of the JDBC stores and have any feedback on the latest changes, or would like to request some new features/optimisations, let us know via the forum, issue tracker or the #infinispan channel on Freenode.

Wednesday, 11 January 2017

Near Cache for native C++/C# Client example

Dear Readers,

As mentioned in our previous post about the new C++/C# release 8.1.0.Beta1, the clients are now equipped with near cache support.

The near cache is an additional cache level that keeps the most recently used cache entries in an "in memory" data structure. Near cached objects are synchronized with the remote server value in the background and can be retrieved as fast as a map[] operation.

So, does your client tend to periodically focus its operations on a subset of your entries? This feature could be of help: it's easy to use, just enable it and you'll have a near cache working seamlessly under the hood.

A C++ example of a cache with near cache configuration
The last line does the magic: the INVALIDATED mode activates the near cache (the default mode is DISABLED, which means no near cache; see the Java docs), and maxEntries is the maximum number of entries that can be stored near. If the near cache is full, the oldest entry is evicted. Set maxEntries=0 for an unbounded cache (do you have enough memory?).
Now for a full example of an application that does some gets and puts and counts how many of them are served remotely and how many are served by the near cache. As you can see, the cache object is an instance of the "well known" RemoteCache class.
Entry values in the near cache are kept aligned with the remote cache state via the events subsystem: if something changes on the server, an update event (modified, expired, removed) is sent to the client, which updates its near cache accordingly.

By the way: do you know that C++/C# clients can subscribe listeners to events? In the next "native" post we will see how.

Cheers!
and thank you for reading.

Wednesday, 4 January 2017

Hotrod clients C++/C# 8.1.0.Beta1 released!

New Year, New (Beta) Clients!

I'm pleased to announce that the C++/C# clients version 8.1.0.Beta1 are out!
The big news in this release is:

  • Near Caching Support

Find the bits in the usual place: http://infinispan.org/hotrod-clients/

The feature list for 8.1 is almost done... not bad :)
Feedback, proposals, hints and lines of code are welcome!

Happy New Year,
The Infinispan Team

Tuesday, 20 December 2016

Spring Boot Starters

Ho, ho, hooo! It looks like all members of the Infinispan community have been nice and Santa brought you Spring Boot Starters!


This will make you even more productive and your code less verbose!

Why do I need starters?


Spring Boot Starters make the bootstrapping process much easier and faster. The starter brings you the required Maven dependencies as well as some predefined configuration bits.

What do I need to get started?


The starter can operate in two modes: client/server (when you connect to a remote Infinispan Server cluster) and embedded (packaged along with your app). The former is the default. It's also possible to use both those modes at the same time (store some data along with your app and connect to a remote Infinispan Server cluster to perform some other type of operations).

Assuming you have an Infinispan Server running on IP address 192.168.0.17, all you need to do is to use the following dependencies:


By default, the starter will try to locate a hotrod-client.properties file. The file should contain at least the server list:
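For the server mentioned above, and assuming the default Hot Rod port, that would be a single line:

```
infinispan.client.hotrod.server_list=192.168.0.17:11222
```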


It is also possible to create RemoteCacheManager's configuration manually:
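A minimal sketch of that, simply exposing your own RemoteCacheManager as a bean (the exact hook the starter expects may differ, so treat the wiring here as an assumption):

```java
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class InfinispanRemoteConfig {

   @Bean
   public RemoteCacheManager remoteCacheManager() {
      // configure the Hot Rod client programmatically instead of via properties
      ConfigurationBuilder builder = new ConfigurationBuilder();
      builder.addServer()
                .host("192.168.0.17")
                .port(11222);
      return new RemoteCacheManager(builder.build());
   }
}
```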


That's it! Your app should successfully connect to a remote cluster and you should be able to inject RemoteCacheManager.

Using Infinispan embedded is even simpler than that. All you need to do is add an additional dependency to the classpath:


The starter will provide you with a preconfigured EmbeddedCacheManager. In order to customize the configuration, use the following code snippet:
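One possible sketch, declaring your own EmbeddedCacheManager bean with a custom cache definition (the starter may offer a more dedicated customization hook, so treat this wiring as an assumption):

```java
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;
import org.infinispan.manager.EmbeddedCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class InfinispanEmbeddedConfig {

   @Bean
   public EmbeddedCacheManager cacheManager() {
      EmbeddedCacheManager manager = new DefaultCacheManager();
      // define a simple local cache named "sessions"
      ConfigurationBuilder cacheConfig = new ConfigurationBuilder();
      manager.defineConfiguration("sessions", cacheConfig.build());
      return manager;
   }
}
```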

Further reading


There are two links I highly recommend reading. The first is the Spring Boot tutorial and the second is the GitHub page of the Starters project.


Kudos


Special thanks go to Marco Yuen, who donated the Spring Boot Starters code to us, Tomasz Zabłocki, who updated it to the current version, and Stéphane Nicoll, who spent a tremendous amount of time reviewing the Starters.

Monday, 19 December 2016

Data Container Changes Part 1

Infinispan 9.0 Beta 1 introduces some big changes to the Infinispan data container.  This is the first of two blog posts detailing those changes.

This post will cover the changes to eviction, which now utilizes a new provider, Caffeine.  As you may already know, Infinispan has supported our own implementations of the LRU (Least Recently Used) and LIRS (Low Inter-reference Recency Set) algorithms for our bounded caches.

Our implementations of eviction were even rewritten for Infinispan 8, but we found we still had some issues or limitations with them, especially LIRS.  Our old implementation had problems keeping the correct number of entries.  The new implementation, while not having that issue, had others, such as being considerably more complex, and while it implemented the entire LIRS specification, it could have memory usage issues.  This led us to look at alternatives, and Caffeine seemed like a logical fit: it is well maintained and its author, Ben Manes, is quite responsive.

Enter Caffeine


Caffeine doesn't utilize LRU or LIRS for its eviction algorithm and instead implements TinyLFU with an admission window.  This has the benefit of a high hit ratio like LIRS, while also requiring low memory overhead like LRU.  Caffeine also provides custom weighting for objects, which allows us to reuse the code that was developed for MEMORY based eviction as well.

The only thing that Caffeine doesn't support is our idea of a custom Equivalence.  Thus Infinispan now wraps byte[] instances to ensure equals and hashCode methods work properly.  This also gives us a good opportunity to reevaluate the dataContainer configuration element.

Deprecations


The dataContainer configuration element has thus been deprecated and is replaced by a new configuration element named memory.  Since we were adding a new element, the eviction configuration could also be consolidated into memory, and thus eviction is deprecated as well.  And last but not least, the storeAsBinary configuration element has also been integrated into the new memory element.  Now we have 1 configuration element instead of 3, can't beat that!

New Configuration


The new memory configuration will start out pretty simple, and new elements can be added as the need arises.  The memory element is composed of a single sub-element that can be one of three different choices.  For this post we will go over two of the sub-elements: OBJECT and BINARY.

OBJECT


Object storage stores the actual objects, as provided by the user, in the Java heap.  This is the default storage method when no memory configuration is provided.  This method provides the best performance for operations that work on the entire data set, such as distributed streams, indexing and local reads.

Unfortunately OBJECT storage only allows for COUNT based eviction, as we cannot reliably estimate the size of arbitrary user objects.  This could be improved in a future version if there is enough interest. Note that you can technically configure the MEMORY eviction type with the OBJECT storage type in declarative configuration, but it will throw an exception when you build the configuration.  Therefore OBJECT only has a single element named size to determine the number of entries that can be stored in the cache.

An example of how Object storage can be configured:

XML

PROGRAMMATIC
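A programmatic sketch of the same thing (the storageType and size method names are my reading of the new memory builder, so verify against the released API):

```java
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.cache.StorageType;

public class ObjectStorageExample {
   public static Configuration objectStorage() {
      ConfigurationBuilder builder = new ConfigurationBuilder();
      builder.memory()
             .storageType(StorageType.OBJECT)  // store user objects directly on the heap
             .size(1_000);                     // COUNT-based bound: at most 1000 entries
      return builder.build();
   }
}
```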


BINARY


Binary storage stores the object in its serialized form in a byte array.  This has an interesting side effect: objects are always stored as a deep copy.  This can be useful if you want to modify an object after retrieving it without affecting the underlying object stored in the cache.  Since objects have to be deserialized when performing operations on them, some things such as distributed streams and local gets will be a little bit slower.

A nice benefit of storing entries as BINARY is that we can estimate the total on heap size of the object.  Thus BINARY supports both COUNT and MEMORY based eviction types.

An example of how Binary storage can be configured:

XML

PROGRAMMATIC
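And a programmatic sketch for BINARY with MEMORY based eviction (again, the builder method names are assumptions to verify against the released API):

```java
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.cache.StorageType;
import org.infinispan.eviction.EvictionType;

public class BinaryStorageExample {
   public static Configuration binaryStorage() {
      ConfigurationBuilder builder = new ConfigurationBuilder();
      builder.memory()
             .storageType(StorageType.BINARY)     // store serialized byte[] forms on the heap
             .evictionType(EvictionType.MEMORY)   // bound by estimated size in bytes
             .size(10_000_000);                   // roughly 10 MB of entries
      return builder.build();
   }
}
```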


OFF-HEAP


This option will be described in more detail in the next blog post.  Stay tuned!

Conclusion


Caffeine should bring us a great solution, while also reducing the amount of maintenance we have to do ourselves.  The new memory configuration also provides a simpler solution by replacing two other configuration elements.

We hope you enjoy the new changes to the data container, and look out for another blog post coming soon detailing the other new changes!  In the meantime please check out the latest Infinispan 9.0 before it goes final and give us any feedback on IRC or JIRA.

Thursday, 15 December 2016

Thanks Soft-Shake and Devoxx Morocco!

Last month I spoke about building functional reactive applications with Infinispan, Node.js and Elm at both Soft-Shake in Geneva (slides) and Devoxx Morocco (slides).

Thanks a lot to all the participants who attended the talks and thanks also to the organisers for accepting my talk. Both conferences were really enjoyable!

At Soft-Shake I managed to attend a few presentations, and the one that really stuck with me was Alexandre Masselot's "Données CFF en temps réel: tribulations techniques dans la stack Big Data" ("Real-time CFF data: technical tribulations in the Big Data stack") (slides). It was a very interesting use case on doing big data with the information from the Swiss Rail system. Although there was no live demo, Alexandre gave the link to a repo where you can run things yourself. Very cool!

On top of that, I also attended a talk by Tom Bujok on Scaling Your Application Out. Tom happens to be an old friend who since I last met him has joined Hazelcast ;)



Shortly after Soft-Shake I headed to Casablanca to speak at Devoxx Morocco. This was a fantastic conference with a lot of young attendees. The room was almost packed for my talk and I got a good reaction from the audience on both the talk and the live demo.

During the conference I also attended other talks, including a couple of Kubernetes talks by Ray Tsang, who is an Infinispan committer himself. In his presentations he uses a Kubernetes visualizer which is very cool and I'm hoping to use it in future presentations :)

No more conferences for this year, thanks to all who've attended Infinispan presentations throughout the year!

Cheers,
Galder