Usage of Unsafe in 2023

Denis Gabaydulin
Jan 21, 2023


Does an I/O-intensive Java application still need Unsafe in 2023?
Spoiler: sometimes, yes.

For context: I work in a small team of engineers building a cutting-edge e-commerce search engine on top of Lucene at Ozon.

One of the major reasons we don’t use an off-the-shelf solution like Elasticsearch or Solr is that we are fighting for latency on the limited set of queries we have.

In an optimized search engine built on top of Lucene, the typical search query time is influenced by two main components: I/O operations, such as reading primitives from memory, and CPU-bound tasks like scoring and facet computation. This analysis assumes a single-threaded query model on a single segment, disregarding the effects of parallelization. Additionally, it is assumed that the system has enough memory to hold the entire index in memory through memory-mapped I/O in Linux. Today, I will focus on optimizing the I/O component.

Lucene users typically use MMapDirectory, which is an efficient implementation of the Directory interface using direct byte buffers and memory-mapped I/O. However, can we improve it further?

Let’s see what problems we still have in I/O.

  1. The first problem is a limitation of the ByteBuffer API. You can’t have a buffer larger than 2 GB (because the API uses an int as the offset).
    In big indexes like ours, index files can be larger than 20 GB. The current implementation maintains an array of underlying byte buffers under the hood, plus extra logic to handle the case when a client reads data that spans more than one buffer.
    A possible solution is to replace the array of byte buffers with something that can read using the address of a mapped file (e.g. the Unsafe API).
  2. The second problem is that range checks on byte buffer accesses still can’t be completely eliminated in many situations. That has a cost if you read only a few bytes at a time. Using a quick-and-dirty benchmark that reads primitives, run on JDK 19.0.2, I got the following results:
Benchmark                                 Mode  Cnt    Score   Error   Units
ReadPrimitiveBenchmark.readArray         thrpt   10  160.688 ± 1.712  ops/us
ReadPrimitiveBenchmark.readByteBuffer    thrpt   10  161.202 ± 2.789  ops/us
ReadPrimitiveBenchmark.readMethodHandle  thrpt   10  182.154 ± 0.453  ops/us
ReadPrimitiveBenchmark.readUnsafe        thrpt   10  201.040 ± 1.856  ops/us
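
To make the first problem more concrete, here is a minimal sketch of the kind of bookkeeping a chunked reader needs. It is not Lucene's code: the class name ChunkedInput, the tiny chunk size and the little-endian layout are assumptions chosen for illustration. A long position is split into a chunk index and an offset inside the chunk, and a read that straddles two chunks falls back to a slower byte-by-byte path.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch: one logical "file" addressed by a long position,
// backed by several ByteBuffers because each buffer is indexed by an int.
final class ChunkedInput {
    private final ByteBuffer[] chunks;
    private final int chunkSizePower;   // chunk size is 1 << chunkSizePower
    private final long chunkSizeMask;

    ChunkedInput(byte[] data, int chunkSizePower) {
        this.chunkSizePower = chunkSizePower;
        this.chunkSizeMask = (1L << chunkSizePower) - 1;
        int chunkSize = 1 << chunkSizePower;
        int count = (data.length + chunkSize - 1) / chunkSize;
        chunks = new ByteBuffer[count];
        for (int i = 0; i < count; i++) {
            int from = i * chunkSize;
            int length = Math.min(chunkSize, data.length - from);
            // Little-endian is an assumption of this sketch.
            chunks[i] = ByteBuffer.wrap(data, from, length).slice().order(ByteOrder.LITTLE_ENDIAN);
        }
    }

    byte readByte(long pos) {
        ByteBuffer chunk = chunks[(int) (pos >>> chunkSizePower)];
        return chunk.get((int) (pos & chunkSizeMask));
    }

    int readInt(long pos) {
        ByteBuffer chunk = chunks[(int) (pos >>> chunkSizePower)];
        int offset = (int) (pos & chunkSizeMask);
        if (offset + Integer.BYTES <= chunk.limit()) {
            return chunk.getInt(offset);   // fast path: value lies inside one chunk
        }
        // Slow path: the value straddles a chunk boundary, so it is
        // assembled byte by byte in little-endian order.
        return (readByte(pos) & 0xFF)
                | (readByte(pos + 1) & 0xFF) << 8
                | (readByte(pos + 2) & 0xFF) << 16
                | (readByte(pos + 3) & 0xFF) << 24;
    }
}
```

With a raw address and a long offset there are no chunks, so the index arithmetic and the slow path are not needed.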

If your application reads or writes data in large chunks, the range check doesn’t cost much relative to the work done. In that case you should use a byte buffer or the new foreign memory API (an incubating API in JDK 17, a preview API in JDK 19). But in my case, load testing with real data showed that it could be a problem.
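
The difference between a checked and an unchecked read can be sketched as follows. This is not the benchmark from this post: the class ReadPaths and its methods are hypothetical, and the memory obtained from allocateMemory only stands in for the base address of a mapped file. The Unsafe instance is obtained through the well-known reflective lookup of the theUnsafe field, which recent JDKs may warn about.

```java
import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import sun.misc.Unsafe;

// Hypothetical sketch contrasting a bounds-checked and an unchecked read.
final class ReadPaths {
    private static final Unsafe UNSAFE = loadUnsafe();

    // sun.misc.Unsafe has no public constructor; the usual workaround is to
    // read its private static "theUnsafe" field reflectively.
    private static Unsafe loadUnsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Checked path: the index is an int and is validated against the limit.
    static long readWithByteBuffer(ByteBuffer buffer, int index) {
        return buffer.getLong(index);
    }

    // Unchecked path: a raw 64-bit address with no validation at all.
    // A wrong address can crash the JVM instead of throwing an exception.
    static long readWithUnsafe(long baseAddress, long offset) {
        return UNSAFE.getLong(baseAddress + offset);
    }

    // Writes a value into each kind of memory and reads it back.
    static long[] roundTrip(long value) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(16).order(ByteOrder.nativeOrder());
        buffer.putLong(8, value);
        long viaBuffer = readWithByteBuffer(buffer, 8);

        // Assumption: this block stands in for the address of a mapped file.
        long address = UNSAFE.allocateMemory(16);
        try {
            UNSAFE.putLong(address + 8, value);
            return new long[] {viaBuffer, readWithUnsafe(address, 8)};
        } finally {
            UNSAFE.freeMemory(address);
        }
    }

    public static void main(String[] args) {
        long[] result = roundTrip(0x1122334455667788L);
        System.out.println(Long.toHexString(result[0]) + " " + Long.toHexString(result[1]));
    }
}
```

The checked path throws IndexOutOfBoundsException on a bad index, while the unchecked path trades that safety for fewer instructions per read. That trade only pays off when each read is tiny.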

The last problem is in the design of MMapDirectory. Before reading from a mapped file, you must ensure that the underlying file is still valid. In the MMapDirectory implementation there is a single mapped file and many clones of the input object, each with its own state (offset, limit, etc.). All of them must check that the opened buffer is still valid before every read operation. The cheapest way to do that is to read a flag with plain read semantics. You probably wonder how that can work in a multi-threaded environment. I leave it to the reader to find the answer :-) So, with the current design of the API, we can’t do anything about it :-(
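
The shape of that check can be sketched like this. It is a simplified illustration, not Lucene's implementation: the names GuardedInputSketch, Guard and Input are hypothetical, IllegalStateException is a stand-in for whatever exception a real implementation throws, and the sketch deliberately ignores the cross-thread visibility question raised above.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the "check a flag before every read" pattern.
final class GuardedInputSketch {

    // One guard per mapped file, shared by the original input and all clones.
    static final class Guard {
        // Plain (non-volatile) field: the cheapest possible read, but with
        // no visibility guarantee across threads on its own.
        private boolean invalidated;

        void invalidate() {
            invalidated = true;
        }

        byte getByte(ByteBuffer buffer, int index) {
            if (invalidated) {
                throw new IllegalStateException("already closed");
            }
            return buffer.get(index);   // absolute read, then the usual range check
        }
    }

    // Each input has its own position but shares the guard and the buffer.
    static final class Input {
        private final Guard guard;
        private final ByteBuffer buffer;
        private int position;

        Input(Guard guard, ByteBuffer buffer) {
            this.guard = guard;
            this.buffer = buffer;
        }

        Input cloneInput() {
            Input copy = new Input(guard, buffer);
            copy.position = position;
            return copy;
        }

        byte readByte() {
            return guard.getByte(buffer, position++);
        }
    }
}
```

Every readByte call pays for the flag check in addition to the buffer's own range check, which is why the cost shows up so clearly on single-byte reads.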

In the picture you can see that the buffer validation check consumes half of the time of the readByte function.

Once I had found these problems, I decided to write a test implementation of the Directory API on top of Unsafe and, as expected, got a performance improvement: latency dropped by about 5–20%, depending on the query type.

Updates:

  • Retested the primitive benchmark with OpenJDK 19.
  • Fixed a bug in the Unsafe test: an extra int -> long cast. Unsafe still kicks!
