OpenZL JNI – Java API documentation

Java bindings for the Meta OpenZL compressor.

Getting started

Add the portable Java artifact together with a second dependency on the same artifact whose classifier matches your platform. Replace the classifier with macos_arm64 or windows_amd64 as needed.

<dependency>
  <groupId>io.github.hybledav</groupId>
  <artifactId>openzl-jni</artifactId>
  <version>VERSION</version>
</dependency>

<dependency>
  <groupId>io.github.hybledav</groupId>
  <artifactId>openzl-jni</artifactId>
  <version>VERSION</version>
  <classifier>linux_amd64</classifier>
</dependency>

The Java façade is compatible with Java 21+. Classes are compiled with --release 11. OpenZLNative.load() extracts the bundled libopenzl_jni library automatically.

Published coordinates: io.github.hybledav:openzl-jni on Maven Central. Replace VERSION with the release listed there.

Quick start: byte array round trip

import io.github.hybledav.OpenZLCompressor;
import io.github.hybledav.OpenZLCompressionInfo;
import java.nio.charset.StandardCharsets;

byte[] payload = "openzl-jni quick start".getBytes(StandardCharsets.UTF_8);

try (OpenZLCompressor compressor = new OpenZLCompressor()) {
    byte[] compressed = compressor.compress(payload);
    byte[] restored = compressor.decompress(compressed);

    OpenZLCompressionInfo info = compressor.inspect(compressed);
    System.out.printf("restored=%s, compressed=%d bytes%n",
            java.util.Arrays.equals(payload, restored),
            info.compressedSize());
}

By default, OpenZLCompressor uses the ZSTD graph. The inspect call returns size information, the detected graph, the data flavor, the element count, and the format version.

Direct buffers and pooling

import io.github.hybledav.OpenZLBufferManager;
import io.github.hybledav.OpenZLCompressor;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

byte[] payload = "direct buffers keep JNI zero-copy".getBytes(StandardCharsets.UTF_8);

try (OpenZLBufferManager buffers = OpenZLBufferManager.builder()
        .minimumCapacity(1 << 12)
        .alignment(256)
        .build();
     OpenZLCompressor compressor = new OpenZLCompressor()) {

    ByteBuffer src = buffers.acquire(payload.length);
    src.put(payload).flip();

    ByteBuffer compressed = compressor.compress(src, buffers);
    ByteBuffer restored = compressor.decompress(compressed, buffers);

    byte[] roundTrip = new byte[restored.remaining()];
    restored.get(roundTrip);
    System.out.println("round-trip ok? " + java.util.Arrays.equals(payload, roundTrip));

    buffers.release(src);
    buffers.release(compressed);
    buffers.release(restored);
}

Use compress(src, dst) when you manage the destination buffer yourself. The static helper OpenZLCompressor.maxCompressedSize(int) returns the upper bound used by acquireForCompression.
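The alignment(256) builder setting suggests the pool rounds requested capacities up to an alignment boundary. The standard round-up idiom for power-of-two alignments looks like this (illustrative arithmetic only, not the manager's documented policy):

```java
public class AlignUp {
    // Round `size` up to the next multiple of `alignment`,
    // assuming `alignment` is a power of two.
    static int alignUp(int size, int alignment) {
        return (size + alignment - 1) & -alignment;
    }

    public static void main(String[] args) {
        System.out.println(AlignUp.alignUp(5000, 256)); // 5120
    }
}
```

A request for 5000 bytes lands on 5120, the next multiple of 256; requests that are already aligned come back unchanged.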

Numeric helpers

import io.github.hybledav.OpenZLCompressor;
import io.github.hybledav.OpenZLCompressionInfo;
import io.github.hybledav.OpenZLGraph;

int[] readings = new int[1024];
for (int i = 0; i < readings.length; i++) {
    readings[i] = (i % 17) * 42;
}

try (OpenZLCompressor compressor = new OpenZLCompressor(OpenZLGraph.NUMERIC)) {
    byte[] compressed = compressor.compressInts(readings);
    int[] restored = compressor.decompressInts(compressed);

    OpenZLCompressionInfo info = compressor.inspect(compressed);
    System.out.println("graph=" + info.graph() + ", flavor=" + info.flavor());
}

Equivalent helpers exist for long[], float[], and double[]. Compressing an empty array returns an empty byte[]. These methods avoid extra array-to-byte conversions when the data is already typed.
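For comparison, pushing an int[] through a byte-oriented API would first require flattening it by hand, along the lines of this plain-JDK sketch (not part of the binding):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ManualIntPacking {
    // The flattening step the typed helpers make unnecessary:
    // allocate a scratch buffer and copy every element through it.
    static byte[] toBytes(int[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Integer.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (int v : values) {
            buf.putInt(v);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        System.out.println(toBytes(new int[] {1, 256}).length); // 8
    }
}
```

The byte order here is an arbitrary choice for the sketch; the point is the extra allocation and copy that compressInts and friends skip.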

Structured data with SDDL

import io.github.hybledav.OpenZLCompressor;
import io.github.hybledav.OpenZLProfile;
import io.github.hybledav.OpenZLSddl;
import java.nio.charset.StandardCharsets;

String rowStreamSddl = String.join("\n",
        "field_width = 4;",
        "Field1 = Byte[field_width];",
        "Field2 = Byte[field_width];",
        "Row = {",
        "  Field1;",
        "  Field2;",
        "};",
        "row_width = sizeof Row;",
        "input_size = _rem;",
        "row_count = input_size / row_width;",
        "expect input_size % row_width == 0;",
        "RowArray = Row[row_count];",
        ": RowArray;");

byte[] compiled = OpenZLSddl.compile(rowStreamSddl, true, 0);
byte[] payload = "12345678".repeat(128).getBytes(StandardCharsets.US_ASCII);

try (OpenZLCompressor compressor = new OpenZLCompressor()) {
    compressor.configureProfile(OpenZLProfile.SERIAL, java.util.Map.of());
    byte[] serial = compressor.compress(payload);

    compressor.reset();
    compressor.configureSddl(compiled);
    byte[] structured = compressor.compress(payload);

    System.out.printf("serial=%d B, sddl=%d B%n", serial.length, structured.length);
}

Reset between experiments to clear the previous graph state. SDDL can improve compression whenever the input layout matches the declared structure.
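As a sanity check, the arithmetic that the SDDL program's expect clause applies to the sample payload can be reproduced in plain Java:

```java
public class SddlRowMath {
    static final int FIELD_WIDTH = 4;              // field_width in the SDDL program
    static final int ROW_WIDTH = 2 * FIELD_WIDTH;  // Row = Field1 + Field2 = 8 bytes

    public static void main(String[] args) {
        int inputSize = "12345678".repeat(128).length(); // the 1024-byte sample payload
        // `expect input_size % row_width == 0` only admits evenly divisible inputs.
        System.out.println("divisible: " + (inputSize % ROW_WIDTH == 0)); // divisible: true
        System.out.println("rows: " + (inputSize / ROW_WIDTH));           // rows: 128
    }
}
```

A payload whose length is not a multiple of the row width would fail the expect clause, so pad or trim inputs before compressing with an SDDL graph.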

Inspecting frame metadata

import io.github.hybledav.OpenZLCompressor;
import io.github.hybledav.OpenZLCompressionInfo;
import io.github.hybledav.OpenZLCompressionLevel;
import java.nio.charset.StandardCharsets;

byte[] payload = "inspect me for metadata".getBytes(StandardCharsets.UTF_8);

try (OpenZLCompressor compressor = new OpenZLCompressor()) {
    compressor.setCompressionLevel(OpenZLCompressionLevel.LEVEL_5);
    byte[] compressed = compressor.compress(payload);

    OpenZLCompressionInfo info = compressor.inspect(compressed);
    System.out.printf("original=%d, compressed=%d, flavor=%s%n",
            info.originalSize(),
            info.compressedSize(),
            info.flavor());
}

OpenZLCompressionInfo also exposes compressionRatio(), the inferred OpenZLGraph, and optional element counts for structured frames. Use serialize() or serializeToJson() to persist the compressor state.

Graphs

Graph     ID  Description
AUTO      -1  Select the default graph (currently ZSTD).
ZSTD       0  General-purpose LZ77 + entropy.
GENERIC    1  Lightweight transforms before entropy coding.
NUMERIC    2  Optimised for numeric primitives.
STORE      3  No compression (passthrough).
BITPACK    4  Bit packing for small value ranges.
FSE        5  Finite State Entropy.
HUFFMAN    6  Huffman-only entropy stage.
ENTROPY    7  Generic entropy coding.
CONSTANT   8  Fast path for constant or near-constant inputs.

Profiles

Profile   Key      Use case
SERIAL    serial   General byte streams.
PYTORCH   pytorch  Tensors exported from PyTorch.
CSV       csv      Comma-separated rows.
LE I16    le-i16   Little-endian signed 16-bit sequences.
LE U16    le-u16   Little-endian unsigned 16-bit sequences.
LE I32    le-i32   Little-endian signed 32-bit values.
LE U32    le-u32   Little-endian unsigned 32-bit values.
LE I64    le-i64   Little-endian signed 64-bit values.
LE U64    le-u64   Little-endian unsigned 64-bit values.
PARQUET   parquet  Columnar data exported from Parquet.
SDDL      sddl     Precompiled SDDL programs.
SAO       sao      Sample Adaptive Optimisation pipeline.

List available profiles at runtime with OpenZLCompressor.listProfiles(). Use configureProfile(profile, Map<String,String> args) to pass profile-specific parameters.

Training

Two convenience methods expose the native trainer.

TrainOptions lets you control the maximum time, parallelism, requested sample count, and whether to compute the Pareto frontier. The helper writes samples to a temporary directory, hands it to the native trainer, and returns serialized compressors that you can feed back into configureProfile.

Streaming APIs are tracked upstream in OpenZL issue #128; follow TODO.md for progress in this repository.