
STF-383: Fix off-heap memory growth in FileMode.MEMORY #368

Merged
horgh merged 5 commits into main from greg/stf-383 on May 11, 2026

Conversation

@oschwald oschwald (Member) commented May 7, 2026

Summary

  • Avoid off-heap buffer cache growth in MEMORY mode. FileChannel.read() into a heap buffer populates sun.nio.ch.Util.BufferCache, a per-thread cache that retains the largest direct buffer ever requested for the JVM's lifetime. Under chunked MEMORY mode that meant up to chunkSize × loading-thread-count bytes of off-heap memory pinned permanently. Switching to FileInputStream bypasses the cache (see the sketch after this list). FileMode.MEMORY_MAPPED is unaffected (it uses FileChannel.map, not read).
  • Test MEMORY mode across the chunkSizes matrix. Adds a package-private Reader(File, FileMode, int) constructor and a testMemoryMode test parametrized over the existing chunkSizes() provider, so the multi-chunk + remainder branch in BufferHolder now has integration coverage.
  • Add mise configuration. mise.toml / mise.lock for local Java + Maven version management, matching minfraud-api-java. CI uses actions/setup-java directly and is unaffected.
  • Prepare for 4.1.0. Version bump + changelog entry.

Note: the linked Linear ticket suggested Files.readAllBytes() as the fix, but that wouldn't work. Files.readAllBytes is implemented over Channels.newInputStream(FileChannel) and still routes through FileChannel.read, and therefore the BufferCache. Only java.io.FileInputStream (whose native read(byte[]) copies via SetByteArrayRegion without going through NIO) actually bypasses the cache. This was previously commented on STF-383.
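
For illustration, a minimal sketch of the FileInputStream-based load in the single-chunk case (identifiers such as database and loadIntoHeap are placeholders, not the actual BufferHolder code; imports assumed: java.io.*, java.nio.ByteBuffer):

    // Sketch only: FileInputStream.read(byte[]) copies natively and never
    // touches sun.nio.ch.Util.BufferCache, unlike FileChannel.read(heapBuffer).
    static ByteBuffer loadIntoHeap(File database) throws IOException {
        try (FileInputStream stream = new FileInputStream(database)) {
            long size = stream.getChannel().size(); // fstat on the open fd
            byte[] bytes = new byte[Math.toIntExact(size)];
            int offset = 0;
            while (offset < bytes.length) {
                int read = stream.read(bytes, offset, bytes.length - offset);
                if (read == -1) {
                    throw new IOException("Unexpected end of stream");
                }
                offset += read;
            }
            return ByteBuffer.wrap(bytes).asReadOnlyBuffer();
        }
    }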

Test plan

  • mvn test (199/199 pass, including 3 new chunked MEMORY-mode test runs)
  • mvn checkstyle:check (clean, no violations)
  • CI matrix passes (Java 17/21/24 × Linux/Windows/macOS)

🤖 Generated with Claude Code

oschwald and others added 3 commits May 7, 2026 22:37
`FileChannel.read(ByteBuffer)` with a heap-backed buffer causes the
JDK to substitute a temporary direct buffer obtained from a per-thread
cache (`sun.nio.ch.Util.BufferCache`). With chunk sizes near
`Integer.MAX_VALUE`, a single MEMORY-mode database load leaves up to
~2 GB of direct memory cached on the loading thread for that thread's
lifetime. Repeated loads on different threads compound the growth.

Open the database via `FileInputStream` and delegate to the existing
chunked `InputStream` read path. `FileInputStream.read(byte[])` is
implemented natively without going through the NIO buffer cache, so it
avoids the leak entirely. The MMAP path is unchanged, since
`FileChannel.map()` does not use the cache.

Note: `Files.readAllBytes()` and `Files.newInputStream()` would NOT
fix this, as both are backed by `Channels.newInputStream(FileChannel)`
internally and still trigger the cache.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
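
To observe the retention this commit message describes, here is a standalone sketch (the file path and buffer size are arbitrary placeholders; the "direct" BufferPoolMXBean counts the temporary direct buffer the JDK substitutes for the heap destination):

    import java.lang.management.BufferPoolMXBean;
    import java.lang.management.ManagementFactory;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;

    public final class BufferCacheDemo {
        public static void main(String[] args) throws Exception {
            BufferPoolMXBean direct = ManagementFactory
                .getPlatformMXBeans(BufferPoolMXBean.class).stream()
                .filter(b -> "direct".equals(b.getName()))
                .findFirst()
                .orElseThrow();
            long before = direct.getMemoryUsed();
            try (FileChannel channel = FileChannel.open(Path.of("some.mmdb"))) {
                // Heap-backed destination: the JDK substitutes a temporary
                // direct buffer drawn from the per-thread BufferCache.
                channel.read(ByteBuffer.allocate(64 * 1024 * 1024));
            }
            // The direct buffer stays cached on this thread after close.
            System.out.println("direct pool grew by "
                + (direct.getMemoryUsed() - before) + " bytes");
        }
    }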
Manages local Java and Maven versions via mise, matching the setup
in minfraud-api-java. CI is unaffected since the GitHub Actions
workflows use setup-java directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The existing chunkSizes parametrized matrix in ReaderTest only routed
through Reader(File, int chunkSize), which hardcodes
FileMode.MEMORY_MAPPED. As a result the chunked file-MEMORY load path
in BufferHolder had no integration coverage: a regression that sized the
remainder chunk incorrectly, or a change to EOF handling, could ship silently.

Add a package-private Reader(File, FileMode, int chunkSize) constructor
and a testMemoryMode(int chunkSize) test that mirrors test(int) but in
MEMORY mode. With chunk sizes 512/2048 against the test DBs (1285 and
2794 bytes), the multi-chunk + remainder branch is now exercised end
to end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request addresses an issue with unbounded off-heap memory growth in FileMode.MEMORY by replacing FileChannel.read() with FileInputStream. This change prevents the JDK from caching large direct ByteBuffers in per-thread storage (sun.nio.ch.Util.BufferCache). The update also includes a version bump to 4.1.0, the addition of mise configuration for development environment management, and new tests for memory mode. Feedback was provided regarding potential integer overflow when calculating chunk counts for extremely large databases, which could lead to an incorrectly sized buffers array.

Comment on lines 39 to 42
var fullChunks = (int) (size / chunkSize);
var remainder = (int) (size % chunkSize);
var totalChunks = fullChunks + (remainder > 0 ? 1 : 0);
var buffers = new ByteBuffer[totalChunks];


medium

The calculation of fullChunks and totalChunks uses an explicit cast from long to int without checking for overflow. If the database size is extremely large relative to the chunkSize (e.g., a multi-terabyte database with a small custom chunkSize), fullChunks could overflow, leading to an incorrectly sized buffers array and potential data corruption or IndexOutOfBoundsException during the reading loop. While MaxMind databases are typically within int range for chunk counts, it is safer to validate this or use Math.toIntExact.

                    var fullChunksLong = size / chunkSize;
                    var remainder = (int) (size % chunkSize);
                    var totalChunksLong = fullChunksLong + (remainder > 0 ? 1 : 0);
                    if (totalChunksLong > Integer.MAX_VALUE) {
                        throw new IOException("Database size exceeds supported chunk count for the given chunkSize");
                    }
                    var totalChunks = (int) totalChunksLong;
                    var fullChunks = (int) fullChunksLong;
                    var buffers = new ByteBuffer[totalChunks];
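
For comparison, the Math.toIntExact route the comment mentions would look roughly like this sketch; it throws ArithmeticException rather than IOException on overflow:

                    var fullChunks = Math.toIntExact(size / chunkSize);
                    var remainder = (int) (size % chunkSize);
                    // Compute in long before narrowing so the +1 cannot overflow.
                    var totalChunks = Math.toIntExact(size / chunkSize + (remainder > 0 ? 1 : 0));
                    var buffers = new ByteBuffer[totalChunks];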

Member Author


(Claude replying on Greg's behalf.)

Thanks for flagging — leaving the cast unchanged in this PR for a few reasons:

  1. Pre-existing, not introduced by this PR. The same (int) (size / chunkSize) pattern (and the parallel (int) (size % chunkSize)) was in the prior multi-chunk MEMORY path before the refactor, so this isn't a regression we're shipping.
  2. Not practically reachable. A real MaxMind DB at multi-TB sizes doesn't exist; well before this cast would overflow we'd hit byte[] size limits (Integer.MAX_VALUE), ByteBuffer.allocate(int)'s int cap, and JVM heap limits.
  3. Project guidance (CLAUDE.md) says to not add validation for scenarios that can't happen — at this layer, chunkSize is set by the library, not external input.

If/when very-large-DB support beyond ~2GB chunks becomes a goal, the right move is probably a broader audit of the int casts and array indexing throughout BufferHolder / MultiBuffer, not just Math.toIntExact in this one spot. Out of scope for the current PR.


var totalRead = 0L;
for (var buffer : buffers) {
    var read = channel.read(buffer);
Contributor


Interesting:

  1. Pre-existing short-read bug, now fixed (worth calling out). The old multi-chunk MEMORY path had a latent bug:

  for (var buffer : buffers) {
      var read = channel.read(buffer);   // may be a *short* read
      if (read == -1) { break; }
      totalRead += read;
      buffer.flip();                     // flips at the short position
  }

  FileChannel.read(ByteBuffer) is not contractually obligated to fully fill the buffer — a short read would leave the chunk partially populated and flip() would expose only the
  bytes that were actually read, silently truncating the in-memory database. The new readFully loop handles this correctly. Good catch (even if accidental).
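
  For reference, the retry-until-full discipline applied to the quoted loop would look roughly like this (a sketch only; the actual fix reads via FileInputStream rather than FileChannel):

  for (var buffer : buffers) {
      while (buffer.hasRemaining()) {
          // Retry until this chunk is fully populated; -1 means premature EOF.
          if (channel.read(buffer) == -1) {
              throw new IOException("Unexpected end of stream");
          }
      }
      buffer.flip();                     // now flips at the fully populated position
  }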

Not sure if we'd want to mention this in the changelog.

Member Author


(Claude replying on Greg's behalf.)

Done in 232dd73. Traced the introducing commit via git log -S: the buggy for (var buffer : buffers) { channel.read(buffer); buffer.flip(); } loop was added in 2a1fa7f "Make buffers read-only for all paths" and first shipped in v4.0.0, so 4.0.0/4.0.1/4.0.2 are all affected.

Added a second bullet under the 4.1.0 section noting:

  • The silent-truncation risk specifically for the multi-chunk path.
  • That it only affects databases larger than ~2GB (the default chunk size).
  • That the new chunked read loop retries until each chunk is fully populated.

The single-chunk MEMORY path has the same FileChannel.read(ByteBuffer) contract concern but at least throws when channel.read(buffer) != size, so only the multi-chunk path silently truncates — that's the one worth calling out.

// mode that would mean chunkSize bytes of off-heap memory held per loader
// thread for the JVM's lifetime.
try (FileInputStream stream = new FileInputStream(database)) {
    long size = database.length();
Contributor


  2. Minor robustness regression: size source no longer matches the open fd. BufferHolder.java:33-34:

  try (FileInputStream stream = new FileInputStream(database)) {
      long size = database.length();

  database.length() is a path-based stat(2), not a query against the open fd. The old code used channel.size() on the already-open channel, which is atomic with the open. Theoretical race: if the file is replaced between new FileInputStream(...) and database.length(), you could get a size that doesn't match the bytes the stream will deliver, and then readFully either throws "Unexpected end of stream" (truncation) or silently stops short of the file's real end (extension or replacement with a larger file). Same general class of race as before, but slightly wider window.

  Suggestion: use stream.getChannel().size() instead. Lazily fetching the channel doesn't populate the NIO buffer cache; only channel.read(heapBuffer) does that, and we're not calling it. This keeps the size lookup consistent with the open file descriptor without reintroducing the leak.
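
  A sketch of the suggested shape (the surrounding chunked read is elided):

  try (FileInputStream stream = new FileInputStream(database)) {
      // fstat on the already-open fd: atomic with the open, and neither
      // getChannel() nor size() allocates from the direct-buffer cache.
      long size = stream.getChannel().size();
      // ... proceed with the chunked read against the same stream
  }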

Member Author


(Claude replying on Greg's behalf.)

Fixed in 2624387: database.length() → stream.getChannel().size(). Verified the suggestion doesn't reintroduce the cache: FileInputStream.getChannel() lazily creates the FileChannelImpl, and .size() is an fstat-only syscall, no IOUtil.read / Util.getTemporaryDirectBuffer. Added a one-line comment noting both why (fd-atomic) and that getChannel().size() is cache-safe.

oschwald and others added 2 commits May 11, 2026 21:51
`database.length()` is a path-based `stat(2)`, which can race with file
replacement between the `FileInputStream` open and the size lookup —
the bytes the stream then delivers may not match the size we read.
Switch to `stream.getChannel().size()`, which is an `fstat` on the
already-open file descriptor and therefore atomic with the open.

`FileInputStream.getChannel()` is lazy and `.size()` is a stat-only
syscall, so this does not reintroduce the per-thread direct ByteBuffer
cache that the previous commit was fixing — that cache only grows
when `FileChannel.read(heapByteBuffer)` is called.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The prior multi-chunk MEMORY load loop (introduced in 4.0.0 by
2a1fa7f, "Make buffers read-only for all paths") called
FileChannel.read(ByteBuffer) and trusted it to fully fill the
destination — a short read would have silently truncated the
in-memory database. The new readFully helper retries until each
chunk is full, so this is incidentally fixed. Worth noting in the
changelog so anyone investigating a >2GB MEMORY-mode corruption can
trace it back.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@horgh horgh merged commit 74815f0 into main May 11, 2026
27 checks passed
@horgh horgh deleted the greg/stf-383 branch May 11, 2026 22:12