About Storage Tiers
Ehcache has three storage tiers, summarized here:
- Memory store – Heap memory that holds a copy of the hottest subset of data from the off-heap store. Subject to Java GC.
- Off-heap store – Limited in size only by available RAM. Not subject to Java GC. Can store serialized data only. Provides overflow capacity to the memory store.
- Disk store – Backs up in-memory data and provides overflow capacity to the other tiers. Can store serialized data only.
This document defines the standalone storage tiers and the element types suited to each, and then details the configuration of each storage tier.
Before running in production, it is strongly recommended that you test the tiers with the actual amount of data you expect to use in production. For information about sizing the tiers, refer to Sizing Storage Tiers.
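For orientation, the following is a minimal programmatic sketch of a single cache spanning all three tiers. It is a sketch only, assuming the Ehcache 2.x fluent configuration API (the fluent setter names are assumed to mirror the corresponding ehcache.xml attributes) and an off-heap-capable (BigMemory) installation; the cache name, sizes, and path are illustrative.
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.config.CacheConfiguration;
import net.sf.ehcache.config.Configuration;
import net.sf.ehcache.config.DiskStoreConfiguration;
import net.sf.ehcache.config.MemoryUnit;
import net.sf.ehcache.config.PersistenceConfiguration;
import net.sf.ehcache.config.PersistenceConfiguration.Strategy;

// Illustrative three-tier cache: heap holds the hot set, off-heap takes overflow, disk takes further overflow
Configuration managerConfig = new Configuration()
    .diskStore(new DiskStoreConfiguration().path("java.io.tmpdir/ehcache")); // illustrative path
CacheConfiguration cacheConfig = new CacheConfiguration()
    .name("threeTierCache")                                   // illustrative cache name
    .maxEntriesLocalHeap(10000)                               // memory store (heap) size in elements
    .overflowToOffHeap(true)                                  // enable the off-heap tier (requires BigMemory)
    .maxBytesLocalOffHeap(1, MemoryUnit.GIGABYTES)            // off-heap size
    .persistence(new PersistenceConfiguration().strategy(Strategy.LOCALTEMPSWAP)); // disk overflow, cleared on restart
managerConfig.addCache(cacheConfig);
CacheManager cacheManager = new CacheManager(managerConfig);
The same settings can instead be declared in ehcache.xml; the sections below describe the relevant attributes for each tier.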
Configuring Memory Store
The memory store is always enabled and exists in heap memory. For the best performance, allot as much heap memory as possible without triggering garbage collection (GC) pauses, and use the off-heap store to hold the data that cannot fit in heap without causing those pauses.
The memory store has the following characteristics:
- Accepts all data, whether serializable or not
- Fastest storage option
- Thread safe for use by multiple concurrent threads
The memory store is the top tier and is automatically used by Ehcache to store the data hotset because it is the fastest store. It requires no special configuration to enable, and its overall size is taken from the Java heap size. Since it exists in the heap, it is limited by Java GC constraints.
Memory Use, Spooling, and Expiry Strategy in the Memory Store
All caches specify their maximum in-memory size, in terms of the number of elements, at configuration time.
When an element is added to a cache and the addition would exceed the cache's maximum in-memory size, an existing element is either deleted (if overflow is not enabled) or evaluated for spooling to another tier (if overflow is enabled). The overflow options are overflowToOffHeap and <persistence> (disk store).
If overflow is enabled, the candidate element is checked for expiry: if it has expired, it is deleted; if not, it is spooled. Which element is evicted from the memory store is determined by the optional memoryStoreEvictionPolicy attribute specified in the configuration file. Legal values are LRU (the default), LFU, and FIFO (a configuration sketch follows this list):
- Least Recently Used (LRU) — LRU is the default setting. The last-used timestamp is updated when an element is put into the cache or an element is retrieved from the cache with a get call.
- Least Frequently Used (LFU) — For each get call on an element, its hit count is updated. When a put call is made for a new element (and assuming that the maximum limit of the memory store has been reached), the element with the lowest hit count, the Least Frequently Used element, is evicted.
- First In First Out (FIFO) — Elements are evicted in the same order as they come in. When a put call is made for a new element (and assuming that the maximum limit of the memory store has been reached), the element that was placed first (First-In) in the store is the candidate for eviction (First-Out).
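The eviction policy can also be selected programmatically. Here is a minimal sketch assuming the Ehcache 2.x API; the cache name and in-memory size are illustrative.
import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.config.CacheConfiguration;

// Cache limited to 1000 in-memory elements, evicting with LFU instead of the default LRU
CacheConfiguration config = new CacheConfiguration("hotItems", 1000);  // illustrative name and heap size
config.setMemoryStoreEvictionPolicy("LFU");   // legal values: LRU (default), LFU, FIFO
CacheManager cacheManager = CacheManager.getInstance();
cacheManager.addCache(new Cache(config));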
For all the eviction policies, there are also putQuiet() and getQuiet() methods, which do not update the last-used timestamp.
When there is a get() or a getQuiet() on an element, it is checked for expiry. If expired, it is removed and null is returned. Note that at any point in time there will usually be some expired elements in the cache. Memory sizing of an application must always take into account the maximum size of each cache.
Tip: calculateInMemorySize() is a convenience method that provides an estimate of the size (in bytes) of the memory store. It returns the serialized size of the cache, which is a rough estimate. Do not use this method in production because it has a negative effect on performance.
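To illustrate the quiet accessors and the sizing estimate mentioned above, here is a minimal sketch; it assumes an existing Cache instance named cache, and the key is illustrative.
import net.sf.ehcache.Element;

// Read without updating the last-used timestamp (does not affect LRU ordering);
// like get(), getQuiet() returns null if the element is absent or has expired.
Element element = cache.getQuiet("someKey");

// Rough, serialization-based estimate of the memory store size in bytes.
// Avoid in production: it serializes the cache contents and is expensive.
long approximateBytes = cache.calculateInMemorySize();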
An alternative is to run a dedicated expiry thread. This is a trade-off between lower memory use on the one hand and short locking periods and CPU utilization on the other; the design favors the latter. If memory use is a concern, simply reduce the tier size. For more information, refer to Sizing Storage Tiers.
Configuring Disk Store
The disk store provides a thread-safe disk-spooling facility that can be used for either additional storage or persisting data through system restarts.
This section describes local disk usage. You can find additional information about configuring the disk store in Configuring Restartability and Persistence.
Serialization
Only data that is Serializable can be placed in the disk store. Writes to and reads from the disk use the Java serialization mechanism (ObjectOutputStream and ObjectInputStream). Any non-serializable data overflowing to the disk store is removed and a NotSerializableException is thrown.
Serialization speed is affected by the size of the objects being serialized and their type. It has been found that:
- The serialization time for a Java object consisting of a large Map of String arrays was 126ms, where the serialized size was 349,225 bytes.
- The serialization time for a byte[] was 7ms, where the serialized size was 310,232 bytes.
Byte arrays are roughly 20 times faster to serialize, making them a better choice for increasing disk-store performance.
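As a sketch of the byte-array approach, the value can be pre-serialized with standard Java serialization before it is put into the cache; the helper method, key, and variable names below are illustrative.
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Pre-serialize a value to a byte[] so the disk store only has to write and read raw bytes
static byte[] toBytes(Serializable value) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
        oos.writeObject(value);
    }
    return bos.toByteArray();
}
// Illustrative use: cache.put(new net.sf.ehcache.Element("reportKey", toBytes(reportObject)));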
Configuring the Disk Store
Disk stores are configured on a per-CacheManager basis. If one or more caches require a disk store but none is configured, a default directory is used, and a warning message is logged to encourage explicit configuration of the disk store path.
Configuring a disk store is optional. If all caches use only the memory store, there is no need to configure a disk store; this simplifies configuration and uses fewer threads. It also makes it unnecessary to configure multiple disk store paths when multiple CacheManagers are being used.
Two disk store options are available:
- Temporary store (localTempSwap)
- Persistent store (localRestartable)
localTempSwap
The localTempSwap persistence strategy allows the memory store to overflow to disk when it becomes full. This option makes the disk a temporary store because overflow data does not survive restarts or failures. When the node is restarted, any existing data on disk is cleared because it is not designed to be reloaded.
If the disk store path is not specified, a default path is used, and the default will be auto-resolved in the case of a conflict with another CacheManager.
The localTempSwap disk store creates a data file for each cache on startup, called "<cache_name>.data".
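A minimal programmatic sketch of a cache using the localTempSwap strategy, assuming the Ehcache 2.x fluent configuration API; the cache name and size are illustrative.
import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.config.CacheConfiguration;
import net.sf.ehcache.config.PersistenceConfiguration;
import net.sf.ehcache.config.PersistenceConfiguration.Strategy;

// Overflow to a temporary (non-persistent) disk store; data on disk is cleared on restart
CacheConfiguration config = new CacheConfiguration("sessionData", 5000)   // illustrative name and heap size
    .persistence(new PersistenceConfiguration().strategy(Strategy.LOCALTEMPSWAP));
CacheManager cacheManager = CacheManager.getInstance();
cacheManager.addCache(new Cache(config));   // a sessionData.data file is created in the disk store path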
localRestartable
This option implements a restartable store for all in-memory data. After any restart, the data set is automatically reloaded from disk to the in-memory stores.
The path to the directory where any required disk files will be created is configured with the <diskStore> sub-element of the Ehcache configuration. In order to use the restartable store, a unique and explicitly specified path is required.
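For example, here is a sketch of a restartable cache with an explicitly specified disk store path, assuming the Ehcache 2.x fluent configuration API; the path, cache name, and size are illustrative.
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.config.CacheConfiguration;
import net.sf.ehcache.config.Configuration;
import net.sf.ehcache.config.DiskStoreConfiguration;
import net.sf.ehcache.config.PersistenceConfiguration;
import net.sf.ehcache.config.PersistenceConfiguration.Strategy;

// A unique, explicitly specified <diskStore> path is required for the restartable store
Configuration managerConfig = new Configuration()
    .diskStore(new DiskStoreConfiguration().path("/var/data/myapp/ehcache")); // illustrative path
CacheConfiguration cacheConfig = new CacheConfiguration("catalog", 10000)      // illustrative name and heap size
    .persistence(new PersistenceConfiguration().strategy(Strategy.LOCALRESTARTABLE));
managerConfig.addCache(cacheConfig);
CacheManager cacheManager = new CacheManager(managerConfig); // the data set is reloaded from disk after a restart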
The diskStore Configuration Element
Files are created in the directory specified by the <diskStore> configuration element. The <diskStore> element has one attribute called path.
<diskStore path="/path/to/store/data"/>
The value of the path attribute can be any legal file system path. For example, for Unix:
/home/application/cache
The following Java system properties are also legal values; they are translated to the corresponding paths:
- user.home - User's home directory
- user.dir - User's current working directory
- java.io.tmpdir - Default temp file path
- ehcache.disk.store.dir - A system property you would normally specify on the command line—for example, java -Dehcache.disk.store.dir=/u01/myapp/diskdir.
Subdirectories can be specified below the system property, for example:
user.dir/one
To programmatically set a disk store path:
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.config.Configuration;
import net.sf.ehcache.config.DiskStoreConfiguration;

// Assumes a Configuration object named "configuration" has already been created
DiskStoreConfiguration diskStoreConfiguration = new DiskStoreConfiguration();
diskStoreConfiguration.setPath("/my/path/dir");
configuration.addDiskStore(diskStoreConfiguration);
CacheManager mgr = new CacheManager(configuration);
Note: A CacheManager's disk store path cannot be changed once it is set in configuration. If the disk store path is changed, the CacheManager must be recycled for the new path to take effect.
Disk Store Expiry and Eviction
Expired elements are eventually evicted to free up disk space; evicted elements are also removed from the in-memory index of elements.
One thread per cache is used to remove expired elements. The optional attribute diskExpiryThreadIntervalSeconds sets the interval between runs of the expiry thread.
Important: Setting diskExpiryThreadIntervalSeconds to a low value can cause excessive disk-store locking and high CPU utilization. The default value is 120 seconds.
If a cache's disk store has a limited size, elements are evicted from the disk store when it exceeds this limit. The LFU algorithm is used for these evictions; it is not configurable or changeable.
Note: With the localTempSwap strategy, you can use maxEntriesLocalDisk or maxBytesLocalDisk at either the Cache or CacheManager level to control the size of the disk tier.
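As a sketch, these settings can also be applied programmatically; this assumes the Ehcache 2.x API and that the setter names mirror the XML attribute names, and the cache name and values are illustrative.
import net.sf.ehcache.config.CacheConfiguration;
import net.sf.ehcache.config.PersistenceConfiguration;
import net.sf.ehcache.config.PersistenceConfiguration.Strategy;

// localTempSwap cache with a capped disk tier and a custom expiry-thread interval
CacheConfiguration config = new CacheConfiguration("results", 1000)          // illustrative name and heap size
    .persistence(new PersistenceConfiguration().strategy(Strategy.LOCALTEMPSWAP));
config.setMaxEntriesLocalDisk(100000);            // cap the disk tier at 100,000 entries
config.setDiskExpiryThreadIntervalSeconds(300);   // run the expiry thread every 300 seconds (default 120)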
Turning off Disk Stores
To turn off disk store path creation, comment out the diskStore element in ehcache.xml.