// open-source line
← products

Aether

Intelligence-Driven Compression.

17.1% smaller
vs gzip-9 (Silesia)
66
Neural Parameters
~25 KiB
Memory Per Block
285
Tests Passing

Modern problems require modern solutions.

Standard archivers use fixed pipelines designed in the 1990s: gzip (LZ77 + Huffman), bzip2 (BWT + Huffman), xz (LZMA2), zstd (speed-focused). AetherArch separates modeling from coding — any predictor can be plugged in, and a custom byte-aligned range coder adapts to whatever probabilities it receives.

On the Silesia Corpus (202 MiB, 12 files — industry standard), AetherArch achieves 26.45% overall ratio, beating gzip-9 by 17.1% and closing in on bzip2-9 (only 2.8% gap). Also beats zstd -19 on text (27.37% vs 27.78%) and gzip-9 on internal benchmarks.

The neural stack adds gain at every layer: raw Order-0 baseline (4.769 bpb), ContextMixer (-12%), LZ4/LZ77 preprocess (-35%), BWT transform (-10%), RUNA/RUNB RLE (-8%), and NeuralSsmPredictor (-1%) — beating gzip-9 even before the neural stage.

V0.2.3 Performance: ~1.3 MiB/s on internal corpus. Bottleneck: BWT sort (~50%), NeuralSsm predict+update (~25%), Range coding (~15%). Memory scaling: BWT allocates ~10× chunk size peak RSS per thread (40 MiB/thread at 4 MiB max chunk).

Neural SSM Predictor

A diagonal linear State Space Model (SSM) fused with a hierarchical RLE predictor. 66 learnable parameters that adapt from scratch per block — no pre-trained weights needed. Retuned D=32, lr=0.01, o2=0.30.

Adaptive Multi-Method Routing

Per-chunk routing picks smallest: BWT + MTF + RLE + Neural SSM + range coding, LZ77, plain predictor, zstd-19 fallback, or store. Content-type detection groups files semantically.

6-Crate Ecosystem (v0.2.3)

Core library, CLI, C FFI (cbindgen), REST API server, Wasm target (decompress-only), and Python bindings (PyO3). Encryption: AES-256-GCM / ChaCha20-Poly1305 with Argon2id KDF.

The Technical Edge

Why experts choose Aether

01

Neural SSM Architecture

Byte → 32-dim embedding → SSM update (32 exponential moving averages at timescales 0.5…0.999) → binary classifiers for run symbols → adaptive mixer weighting SSM vs RLE by recent log-likelihood → order-2 literal context blend.

02

Archive Format v1.0

Self-describing, random-access, integrity-checked. Header (48B) + payload + footer (32B). BLAKE3 per file + CRC32 per block. Random-access extraction without full decompression for enterprise cloud backends.

03

V0.2.3 Optimization Peak

Direct CDF overrides (2.6× speedup), div→mul optimization (+20% e2e), O(rank) MTF, LZ77 early-exit (skip when BWT < 55%), sync-predictor skip via flag, custom byte-aligned range coder, LTO + codegen-units=1.

Ready to secure
the future?

Request Expert Briefing