# OpenZL JNI – Java API documentation

Java bindings for the Meta OpenZL compressor.
## Getting started

Add the portable Java artifact and the classifier artifact that matches your platform. Replace the classifier with `macos_arm64` or `windows_amd64` when needed.
```xml
<dependency>
    <groupId>io.github.hybledav</groupId>
    <artifactId>openzl-jni</artifactId>
    <version>VERSION</version>
</dependency>
<dependency>
    <groupId>io.github.hybledav</groupId>
    <artifactId>openzl-jni</artifactId>
    <version>VERSION</version>
    <classifier>linux_amd64</classifier>
</dependency>
```
The Java façade is compatible with Java 21+; classes are compiled with `--release 11`. `OpenZLNative.load()` extracts the bundled `libopenzl_jni` library automatically.

Published coordinates: `io.github.hybledav:openzl-jni` on Maven Central. Replace `VERSION` with the release listed there.
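`OpenZLNative.load()` handles the extraction for you; calling it explicitly at startup is one way to fail fast if the platform artifact is missing. A minimal sketch:

```java
import io.github.hybledav.OpenZLNative;

// Extracts and loads the bundled libopenzl_jni eagerly, so a missing or
// mismatched platform artifact surfaces at startup rather than on first use.
OpenZLNative.load();
```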
## Quick start: byte array round trip
```java
import java.nio.charset.StandardCharsets;

import io.github.hybledav.OpenZLCompressor;
import io.github.hybledav.OpenZLCompressionInfo;

byte[] payload = "openzl-jni quick start".getBytes(StandardCharsets.UTF_8);
try (OpenZLCompressor compressor = new OpenZLCompressor()) {
    byte[] compressed = compressor.compress(payload);
    byte[] restored = compressor.decompress(compressed);
    OpenZLCompressionInfo info = compressor.inspect(compressed);
    System.out.printf("restored=%s, compressed=%d bytes%n",
            java.util.Arrays.equals(payload, restored),
            info.compressedSize());
}
```
By default `OpenZLCompressor` uses the `ZSTD` graph. The `inspect` call returns size information, the detected graph, data flavour, element count, and format version.
## Direct buffers and pooling
```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

import io.github.hybledav.OpenZLBufferManager;
import io.github.hybledav.OpenZLCompressor;

byte[] payload = "direct buffers keep JNI zero-copy".getBytes(StandardCharsets.UTF_8);
try (OpenZLBufferManager buffers = OpenZLBufferManager.builder()
         .minimumCapacity(1 << 12)
         .alignment(256)
         .build();
     OpenZLCompressor compressor = new OpenZLCompressor()) {
    ByteBuffer src = buffers.acquire(payload.length);
    src.put(payload).flip();
    ByteBuffer compressed = compressor.compress(src, buffers);
    ByteBuffer restored = compressor.decompress(compressed, buffers);
    byte[] roundTrip = new byte[restored.remaining()];
    restored.get(roundTrip);
    System.out.println("round-trip ok? " + java.util.Arrays.equals(payload, roundTrip));
    buffers.release(src);
    buffers.release(compressed);
    buffers.release(restored);
}
```
Use `compress(src, dst)` when you manage the destination buffer yourself. The static helper `OpenZLCompressor.maxCompressedSize(int)` returns the upper bound used by `acquireForCompression`.
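For example, sizing the destination with the worst-case bound before calling the two-argument overload might look like this. A sketch only: it assumes `compress(src, dst)` fills `dst` and returns the number of compressed bytes; check the Javadoc for the exact contract.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

import io.github.hybledav.OpenZLCompressor;

byte[] payload = "manage the destination yourself".getBytes(StandardCharsets.UTF_8);
try (OpenZLCompressor compressor = new OpenZLCompressor()) {
    ByteBuffer src = ByteBuffer.allocateDirect(payload.length);
    src.put(payload).flip();
    // Worst-case output size for an input of this length.
    ByteBuffer dst = ByteBuffer.allocateDirect(
            OpenZLCompressor.maxCompressedSize(payload.length));
    // Assumption: the overload returns the compressed byte count.
    int written = compressor.compress(src, dst);
    System.out.println("compressed " + written + " bytes");
}
```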
## Numeric helpers
```java
import io.github.hybledav.OpenZLCompressor;
import io.github.hybledav.OpenZLCompressionInfo;
import io.github.hybledav.OpenZLGraph;

int[] readings = new int[1024];
for (int i = 0; i < readings.length; i++) {
    readings[i] = (i % 17) * 42;
}
try (OpenZLCompressor compressor = new OpenZLCompressor(OpenZLGraph.NUMERIC)) {
    byte[] compressed = compressor.compressInts(readings);
    int[] restored = compressor.decompressInts(compressed);
    OpenZLCompressionInfo info = compressor.inspect(compressed);
    System.out.println("graph=" + info.graph() + ", flavor=" + info.flavor());
}
```
Array helpers exist for `long`, `float`, and `double`; empty arrays return `byte[0]`. These methods avoid extra array-to-byte conversions when the data is already typed.
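Assuming the `long` helpers mirror the `int` naming (`compressLongs`/`decompressLongs` — verify against the actual API), a round trip looks the same:

```java
import io.github.hybledav.OpenZLCompressor;
import io.github.hybledav.OpenZLGraph;

long[] timestamps = new long[512];
for (int i = 0; i < timestamps.length; i++) {
    timestamps[i] = 1_700_000_000_000L + i * 250L; // monotonically increasing millis
}
try (OpenZLCompressor compressor = new OpenZLCompressor(OpenZLGraph.NUMERIC)) {
    // Assumed names, mirroring compressInts/decompressInts.
    byte[] compressed = compressor.compressLongs(timestamps);
    long[] restored = compressor.decompressLongs(compressed);
    System.out.println("round-trip ok? " + java.util.Arrays.equals(timestamps, restored));
}
```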
## Structured data with SDDL
```java
import java.nio.charset.StandardCharsets;

import io.github.hybledav.OpenZLCompressor;
import io.github.hybledav.OpenZLProfile;
import io.github.hybledav.OpenZLSddl;

String rowStreamSddl = String.join("\n",
        "field_width = 4;",
        "Field1 = Byte[field_width];",
        "Field2 = Byte[field_width];",
        "Row = {",
        "    Field1;",
        "    Field2;",
        "};",
        "row_width = sizeof Row;",
        "input_size = _rem;",
        "row_count = input_size / row_width;",
        "expect input_size % row_width == 0;",
        "RowArray = Row[row_count];",
        ": RowArray;");
byte[] compiled = OpenZLSddl.compile(rowStreamSddl, true, 0);
byte[] payload = "12345678".repeat(128).getBytes(StandardCharsets.US_ASCII);
try (OpenZLCompressor compressor = new OpenZLCompressor()) {
    compressor.configureProfile(OpenZLProfile.SERIAL, java.util.Map.of());
    byte[] serial = compressor.compress(payload);
    compressor.reset();
    compressor.configureSddl(compiled);
    byte[] structured = compressor.compress(payload);
    System.out.printf("serial=%d B, sddl=%d B%n", serial.length, structured.length);
}
```
Reset between experiments to clear the previous graph state. SDDL can improve compression whenever the input layout matches the declared structure.
## Inspecting frame metadata
```java
import java.nio.charset.StandardCharsets;

import io.github.hybledav.OpenZLCompressor;
import io.github.hybledav.OpenZLCompressionInfo;
import io.github.hybledav.OpenZLCompressionLevel;

byte[] payload = "inspect me for metadata".getBytes(StandardCharsets.UTF_8);
try (OpenZLCompressor compressor = new OpenZLCompressor()) {
    compressor.setCompressionLevel(OpenZLCompressionLevel.LEVEL_5);
    byte[] compressed = compressor.compress(payload);
    OpenZLCompressionInfo info = compressor.inspect(compressed);
    System.out.printf("original=%d, compressed=%d, flavor=%s%n",
            info.originalSize(),
            info.compressedSize(),
            info.flavor());
}
```
`OpenZLCompressionInfo` also exposes `compressionRatio()`, the inferred `OpenZLGraph`, and optional element counts for structured frames. Use `serialize()` or `serializeToJson()` to persist the compressor state.
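A short sketch of persisting that state, assuming `serialize()` lives on the compressor and returns a `byte[]` snapshot while `serializeToJson()` returns a `String`; verify both shapes against the Javadoc:

```java
import java.nio.file.Files;
import java.nio.file.Path;

import io.github.hybledav.OpenZLCompressor;

try (OpenZLCompressor compressor = new OpenZLCompressor()) {
    // Assumed shapes: serialize() -> byte[], serializeToJson() -> String.
    byte[] snapshot = compressor.serialize();
    Files.write(Path.of("compressor.bin"), snapshot);
    System.out.println(compressor.serializeToJson());
}
```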
## Graphs
| Graph | ID | Description |
|---|---|---|
| AUTO | -1 | Select the default graph (currently ZSTD). |
| ZSTD | 0 | General-purpose LZ77 + entropy coding. |
| GENERIC | 1 | Lightweight transforms before entropy coding. |
| NUMERIC | 2 | Optimised for numeric primitives. |
| STORE | 3 | No compression (passthrough). |
| BITPACK | 4 | Bitpacking for small value ranges. |
| FSE | 5 | Finite State Entropy. |
| HUFFMAN | 6 | Huffman-only entropy stage. |
| ENTROPY | 7 | Generic entropy coding. |
| CONSTANT | 8 | Fast path for constant or near-constant inputs. |
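Any of these can be passed to the constructor, as in the numeric example above. A quick sketch comparing two graphs on the same payload:

```java
import java.nio.charset.StandardCharsets;

import io.github.hybledav.OpenZLCompressor;
import io.github.hybledav.OpenZLGraph;

byte[] payload = "abcabcabc".repeat(200).getBytes(StandardCharsets.UTF_8);
try (OpenZLCompressor zstd = new OpenZLCompressor(OpenZLGraph.ZSTD);
     OpenZLCompressor store = new OpenZLCompressor(OpenZLGraph.STORE)) {
    // STORE is a passthrough, so it approximates the uncompressed baseline
    // (plus frame overhead); ZSTD should shrink the repetitive input.
    System.out.printf("zstd=%d B, store=%d B%n",
            zstd.compress(payload).length,
            store.compress(payload).length);
}
```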
## Profiles
| Profile | Key | Use case |
|---|---|---|
| SERIAL | serial | General byte streams. |
| PYTORCH | pytorch | Tensors exported from PyTorch. |
| CSV | csv | Comma-separated rows. |
| LE I16 | le-i16 | Little-endian signed 16-bit sequences. |
| LE U16 | le-u16 | Little-endian unsigned 16-bit sequences. |
| LE I32 | le-i32 | Little-endian signed 32-bit values. |
| LE U32 | le-u32 | Little-endian unsigned 32-bit values. |
| LE I64 | le-i64 | Little-endian signed 64-bit values. |
| LE U64 | le-u64 | Little-endian unsigned 64-bit values. |
| PARQUET | parquet | Columnar data exported from Parquet. |
| SDDL | sddl | Precompiled SDDL programs. |
| SAO | sao | SAO star-catalog records (Smithsonian Astrophysical Observatory). |
List available profiles at runtime with `OpenZLCompressor.listProfiles()`. Use `configureProfile(profile, Map<String, String> args)` to pass profile-specific parameters.
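A short sketch, assuming `listProfiles()` returns an iterable or array of profile key strings; the `"separator"` argument below is purely illustrative, not a documented key:

```java
import io.github.hybledav.OpenZLCompressor;
import io.github.hybledav.OpenZLProfile;

try (OpenZLCompressor compressor = new OpenZLCompressor()) {
    // Enumerate what the loaded native library actually supports.
    for (String profile : OpenZLCompressor.listProfiles()) {
        System.out.println(profile);
    }
    // Profile-specific arguments travel in the map; "separator" is a
    // hypothetical key used here only for illustration.
    compressor.configureProfile(OpenZLProfile.CSV, java.util.Map.of("separator", ","));
}
```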
## Training
Two convenience methods expose the native trainer:

- `OpenZLCompressor.train(String profile, byte[][] inputs, TrainOptions opts)`
- `OpenZLCompressor.trainFromDirectory(String profile, String dir, TrainOptions opts)`

`TrainOptions` lets you control the maximum time, parallelism, requested sample count, and whether to compute the Pareto frontier. The helper writes samples to a temporary directory, hands it to the native trainer, and returns serialized compressors that you can feed back into `configureProfile`.
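A hedged sketch of a training run. The `TrainOptions` construction below (the builder and the `maxTimeSeconds`/`parallelism` names) is an illustrative guess, as is the `byte[][]` return shape; only the `train(String, byte[][], TrainOptions)` signature comes from the listing above:

```java
import java.nio.charset.StandardCharsets;

import io.github.hybledav.OpenZLCompressor;
import io.github.hybledav.TrainOptions;

byte[][] samples = {
    "sample payload one".getBytes(StandardCharsets.UTF_8),
    "sample payload two".getBytes(StandardCharsets.UTF_8),
};
// Hypothetical builder and option names; check TrainOptions for the real API.
TrainOptions opts = TrainOptions.builder()
        .maxTimeSeconds(60)
        .parallelism(4)
        .build();
// Returns serialized compressors (assumed byte[][]) to feed back into
// configureProfile, per the description above.
byte[][] trained = OpenZLCompressor.train("serial", samples, opts);
System.out.println("trained " + trained.length + " compressor(s)");
```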
Streaming APIs are tracked upstream in OpenZL issue #128; follow `TODO.md` for progress in this repository.