Craton HSM

Security Model

Craton HSM is a software HSM. Its cryptographic boundary is the process address space of the application that loads libcraton_hsm.* (or, for remote deployments, the process running craton-hsm-daemon). This page describes the trust boundaries, roles, and protection mechanisms that the module provides inside that boundary, and states clearly what is out of scope.

A software HSM does not replace a hardware HSM when the threat model includes a compromised operating system, physical access, or side-channels that require hardware mitigation. Craton HSM is appropriate where the host OS is trusted (for example a hardened Linux deployment with encrypted volumes, SELinux/AppArmor, and restricted operator access) and a well-engineered, auditable cryptographic module is required.

Trust Boundaries

+-----------------------------------------------------------+
|  Cryptographic module boundary (process address space)    |
|                                                           |
|  +------------------+   +---------------------+           |
|  | PKCS#11 C ABI    |   | HsmCore             |           |
|  | (libcraton_hsm)  |   | SessionManager      |           |
|  +---------+--------+   | ObjectStore         |           |
|            |            | CryptoBackend       |           |
|            v            | HmacDrbg            |           |
|  +------------------+   | AuditLog            |           |
|  | catch_unwind     |-->| Self-test harness   |           |
|  +------------------+   +---------------------+           |
|                                                           |
+-----------------------------------------------------------+
         ^                               ^
         |                               |
  dlopen / LoadLibrary        gRPC over mTLS (via daemon)
         |                               |
+--------+---------+           +---------+----------+
| Host application |           | Remote application |
+------------------+           +--------------------+

The boundary is a process. Everything inside it is written in Rust, and unsafe code is confined to the FFI layer and the platform memory-locking shim. Everything outside the boundary is untrusted input that must be validated before it touches module state.
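As a minimal sketch of the catch_unwind stage in the diagram, the wrapper below converts any panic raised inside the module into a generic return code before control returns to the caller. The names ffi_guard and example_entry_point are hypothetical and are not taken from the Craton HSM source; using CKR_GENERAL_ERROR as the fallback code is also an assumption.

```rust
use std::os::raw::c_ulong;
use std::panic::{catch_unwind, AssertUnwindSafe};

// CK_RV is an unsigned long in the PKCS#11 C headers.
type CkRv = c_ulong;
const CKR_OK: CkRv = 0x0000_0000;
const CKR_GENERAL_ERROR: CkRv = 0x0000_0005;

/// Hypothetical helper: run `body` and turn a panic into a return code
/// so that no unwind ever reaches the C caller.
fn ffi_guard<F>(body: F) -> CkRv
where
    F: FnOnce() -> CkRv,
{
    match catch_unwind(AssertUnwindSafe(body)) {
        Ok(rv) => rv,
        // The panic payload is dropped here; nothing about it is
        // reported back across the boundary.
        Err(_) => CKR_GENERAL_ERROR,
    }
}

/// Hypothetical entry point. A real export would also need an
/// unmangled symbol name and pointer validation for its arguments.
pub extern "C" fn example_entry_point(value: u32) -> CkRv {
    ffi_guard(|| {
        if value == 0 {
            panic!("internal invariant violated");
        }
        CKR_OK
    })
}
```

Calling example_entry_point(0) returns CKR_GENERAL_ERROR rather than unwinding into the host application.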

Roles

Craton HSM implements three roles: the two PKCS#11 user types (Security Officer and User) and the unauthenticated public state. Role membership is bound to a session; an authenticated state does not outlive the session that performed C_Login.

Role	PKCS#11 `CK_USER_TYPE`	Purpose	Authentication
Security Officer (SO)	`CKU_SO`	Token initialisation, user PIN management, token reinitialisation	PIN verified via PBKDF2-HMAC-SHA256
User	`CKU_USER`	All cryptographic operations, object management	PIN verified via PBKDF2-HMAC-SHA256
Unauthenticated	(no login)	Digest over public data, random generation, reading public objects	None

The SO cannot perform cryptographic operations on user key material. The User cannot reinitialise the token. Unauthenticated sessions cannot see CKA_PRIVATE=true objects at all.

See policy for the formal operation-to-role matrix.
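The role rules above can be sketched as predicates over the session state. This is an illustration only: the type and function names are hypothetical, the five states are the standard PKCS#11 session states, and treating private objects as invisible to the SO is an assumption taken from PKCS#11 R/W SO semantics rather than from the Craton HSM source.

```rust
/// The five standard PKCS#11 session states.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SessionState {
    RoPublic,
    RwPublic,
    RoUser,
    RwUser,
    RwSo,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Role {
    Unauthenticated,
    User,
    SecurityOfficer,
}

impl SessionState {
    /// The role is derived from the state, never stored separately.
    fn role(self) -> Role {
        match self {
            SessionState::RoPublic | SessionState::RwPublic => Role::Unauthenticated,
            SessionState::RoUser | SessionState::RwUser => Role::User,
            SessionState::RwSo => Role::SecurityOfficer,
        }
    }
}

/// Only the User role may run cryptographic operations on user keys.
fn can_use_user_keys(state: SessionState) -> bool {
    state.role() == Role::User
}

/// Only the SO may manage the user PIN.
fn can_init_user_pin(state: SessionState) -> bool {
    state.role() == Role::SecurityOfficer
}

/// Public objects are visible to everyone. Private objects are visible
/// to the User role only (assumption: the SO cannot see them either).
fn can_see_object(state: SessionState, is_private: bool) -> bool {
    !is_private || state.role() == Role::User
}
```

Deriving the role from the state on every check means a permission can never outlive a state change such as logout.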

Authentication

PINs are the only authentication factor at the PKCS#11 layer. The module never stores plaintext PINs; it stores a PBKDF2-HMAC-SHA256 hash with a per-PIN random salt.

Parameter	Value
PRF	HMAC-SHA256
Iteration count	600,000 (configurable upward)
Salt	32 bytes from the OS CSPRNG, per PIN
Derived length	32 bytes
Allowed PIN length	4–64 bytes (configurable; deployments should enforce at least 8)
Comparison	`subtle::ConstantTimeEq`

PINs are never written to stdout or the audit log. CLI tools read PINs through rpassword, which disables terminal echo. Over gRPC, PINs travel only inside a mutually authenticated TLS channel and are wiped from memory immediately after verification.
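The reason for a constant-time comparison can be illustrated without any dependencies. The function below is only a sketch of the idea behind subtle::ConstantTimeEq, which the module uses: it touches every byte and accumulates the differences instead of returning at the first mismatch. The name ct_eq_sketch is hypothetical, and plain Rust like this gives no guarantee that the optimiser preserves the timing behaviour, which is why production code should keep using subtle.

```rust
/// Illustrative only: compares two derived PIN hashes without an
/// early exit on the first differing byte.
fn ct_eq_sketch(a: &[u8], b: &[u8]) -> bool {
    // The derived length is fixed (32 bytes), so a length mismatch
    // reveals nothing about the secret.
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        // OR together the XOR of every byte pair; the loop always
        // runs to the end regardless of where a difference occurs.
        diff |= x ^ y;
    }
    diff == 0
}
```

A naive `a == b` on slices may stop at the first mismatching byte, which lets an attacker who can measure timing learn how many leading bytes were correct.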

Lockout

Each role has its own failed-login counter. When a counter reaches the configured threshold (default 10), C_Login returns CKR_PIN_LOCKED for that role. Recovery:

Locked User PIN — the Security Officer can reset it via C_InitPIN.
Locked SO PIN — the token must be reinitialised via C_InitToken.

The lockout state is persisted with the token so that process restarts do not reset the counter.
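The lockout behaviour described above can be sketched as a small counter. The type and method names are hypothetical. Two details are assumptions not stated in this document: that a successful login resets the counter, and that the failure which reaches the threshold already reports the locked state.

```rust
#[derive(Debug, PartialEq, Eq)]
enum LoginOutcome {
    Success,
    PinIncorrect,
    PinLocked,
}

/// One counter per role. `failures` is the value that would be
/// persisted with the token so that a restart does not reset it.
struct LockoutCounter {
    failures: u32,
    threshold: u32,
}

impl LockoutCounter {
    fn new(threshold: u32) -> Self {
        LockoutCounter { failures: 0, threshold }
    }

    fn is_locked(&self) -> bool {
        self.failures >= self.threshold
    }

    /// `pin_ok` is the result of the PIN verification step.
    fn record_attempt(&mut self, pin_ok: bool) -> LoginOutcome {
        // Once locked, even a correct PIN is refused.
        if self.is_locked() {
            return LoginOutcome::PinLocked;
        }
        if pin_ok {
            // Assumption: success clears earlier failures.
            self.failures = 0;
            return LoginOutcome::Success;
        }
        self.failures += 1;
        if self.is_locked() {
            LoginOutcome::PinLocked
        } else {
            LoginOutcome::PinIncorrect
        }
    }
}
```

In a persistent deployment the updated count would need to be written to storage before the result is returned, otherwise a crash between the two steps would lose a failed attempt.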

Authorization

Authorization is layered on top of the role model:

Session state — the session's logged-in role is authoritative. Every privileged operation checks the state, not just a "has logged in" flag. The 5-state session FSM (R/O Public, R/W Public, R/O User, R/W User, R/W SO) is enforced in src/session/session.rs.
Object attributes — CKA_PRIVATE, CKA_SENSITIVE, and CKA_EXTRACTABLE gate what an authenticated session can see, read, or export. Generated key objects default to maximum protection (PRIVATE=true, SENSITIVE=true, EXTRACTABLE=false).
Key lifecycle — keys carry CKA_START_DATE / CKA_END_DATE and move through SP 800-57 states (pre-activation, active, deactivated, compromised, destroyed). Deactivated keys can verify and decrypt but cannot sign, encrypt, wrap, or derive.
Mechanism policy — when algorithms.fips_approved_only = true, every *Init / *Generate* / *Derive* call validates the requested mechanism against the approved list and returns CKR_MECHANISM_INVALID on a mismatch.
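The key lifecycle rule in the list above can be sketched as a lookup from state and operation to a decision. The type names are hypothetical. The behaviour for deactivated keys follows the text; denying every operation for pre-activation, compromised, and destroyed keys is an assumption made for this sketch.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum KeyState {
    PreActivation,
    Active,
    Deactivated,
    Compromised,
    Destroyed,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Operation {
    Sign,
    Verify,
    Encrypt,
    Decrypt,
    Wrap,
    Derive,
}

/// Returns true when a key in `state` may be used for `op`.
fn lifecycle_permits(state: KeyState, op: Operation) -> bool {
    match state {
        KeyState::Active => true,
        // Deactivated keys may still process data that was protected
        // earlier, but may not protect new data.
        KeyState::Deactivated => matches!(op, Operation::Verify | Operation::Decrypt),
        // Assumption: all other states refuse every operation.
        KeyState::PreActivation | KeyState::Compromised | KeyState::Destroyed => false,
    }
}
```

A check like this would run after the session-state and object-attribute checks, so a caller that is not allowed to see the key never learns its lifecycle state.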

Data at Rest

Two storage modes exist:

Mode	Behaviour
In-memory (default)	Objects live only in the running process. All key material is zeroized on process exit. Persistence is impossible by construction.
Persistent	Objects are serialised, encrypted per-object with AES-256-GCM, and stored in a `redb` database. Access is serialised with an exclusive `flock`.

Persistent storage derives its master encryption key from the SO PIN via PBKDF2-HMAC-SHA256 (600,000 iterations, 32-byte salt). Each object gets a random 96-bit nonce; the ciphertext and nonce are stored together under an object-id key. The master encryption key is cleared from memory on logout or token reinitialisation.
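As an illustration of storing the nonce and ciphertext together, the sketch below packs and unpacks a single stored value. The layout (nonce first, then ciphertext with its 16-byte GCM tag appended) and the function names are assumptions for this example; the actual on-disk format is not specified in this document, and no encryption is performed here.

```rust
const NONCE_LEN: usize = 12; // 96-bit AES-GCM nonce
const TAG_LEN: usize = 16; // AES-GCM authentication tag

/// Hypothetical layout: nonce || ciphertext-with-tag.
fn encode_record(nonce: &[u8; NONCE_LEN], ciphertext_and_tag: &[u8]) -> Vec<u8> {
    let mut record = Vec::with_capacity(NONCE_LEN + ciphertext_and_tag.len());
    record.extend_from_slice(nonce);
    record.extend_from_slice(ciphertext_and_tag);
    record
}

/// Splits a stored value back into its nonce and ciphertext.
/// Returns None when the value is too short to hold a nonce and a tag.
fn decode_record(record: &[u8]) -> Option<([u8; NONCE_LEN], &[u8])> {
    if record.len() < NONCE_LEN + TAG_LEN {
        return None;
    }
    let (nonce_bytes, ciphertext_and_tag) = record.split_at(NONCE_LEN);
    let mut nonce = [0u8; NONCE_LEN];
    nonce.copy_from_slice(nonce_bytes);
    Some((nonce, ciphertext_and_tag))
}
```

Rejecting short values before decryption keeps malformed database entries from reaching the AEAD code at all; integrity of well-formed entries is still established by the GCM tag check.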

Memory protection for live key material:

RawKeyMaterial locks its backing pages with mlock(2) (Unix) or VirtualLock (Windows) to prevent paging to swap.
The Drop implementation overwrites the bytes with zeroize() before unlocking the pages.
Debug output replaces the bytes with [REDACTED].
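The redaction and wipe-on-drop behaviour can be sketched with the standard library alone. KeyBytes is a hypothetical stand-in for RawKeyMaterial; the volatile-write loop imitates what the zeroize crate does and is not the module's implementation, and page locking is left out entirely.

```rust
use std::fmt;
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical key container used only for this illustration.
struct KeyBytes(Vec<u8>);

impl KeyBytes {
    /// Overwrites every byte with zero using volatile writes so the
    /// compiler cannot elide the stores as dead code.
    fn wipe(&mut self) {
        for byte in self.0.iter_mut() {
            // Safe to write: `byte` is a valid, exclusive reference.
            unsafe { ptr::write_volatile(byte, 0) };
        }
        // Keep the writes from being reordered past this point.
        compiler_fence(Ordering::SeqCst);
    }
}

impl fmt::Debug for KeyBytes {
    /// Never prints the contents, only a fixed placeholder.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("KeyBytes([REDACTED])")
    }
}

impl Drop for KeyBytes {
    fn drop(&mut self) {
        // In the real module the pages would be unlocked after this.
        self.wipe();
    }
}
```

Because Debug is implemented by hand rather than derived, an accidental `{:?}` in a log statement cannot leak the bytes.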

Platform entropy comes from getrandom(2) on Linux, BCryptGenRandom on Windows, and SecRandomCopyBytes on macOS. All cryptographic randomness routes through the HMAC_DRBG adapter (DrbgRng) rather than the OS CSPRNG directly.

Data in Transit

For in-process use the module has no network surface. For remote use the gRPC daemon (craton-hsm-daemon) provides:

TLS 1.2 (minimum) or TLS 1.3 (preferred) for all transport.
Mutual TLS — the daemon requires a client certificate for every connection, validated against a configured root store with optional CRL checking.
PIN values transmitted inside the TLS channel only.
mTLS identity optionally mapped to a PKCS#11 token via the enterprise authentication providers.

See hardening for the required TLS configuration and certificate rotation procedure.

What the Module Does Not Defend Against

Cold-boot attacks — mlock prevents swap, not physical RAM dumps.
OS kernel compromise — if the kernel is hostile, all process memory is visible.
DMA or bus-level attacks — require IOMMU / hardware mitigation.
Power analysis and electromagnetic side-channels — software cannot mitigate hardware leakage.
Speculative execution attacks — mitigations depend on CPU microcode and kernel patches applied by the operator.

See threat-model for the explicit in-scope / out-of-scope list.

Security Invariants

Ten properties are enforced by code review and by tests; any violation is treated as a bug:

No panic crosses the FFI boundary — catch_unwind on every extern "C" export.
Key bytes never appear in logs — custom Debug impls redact.
Sensitive, non-extractable objects never return key bytes from C_GetAttributeValue.
PIN and HMAC comparisons are constant-time (subtle::ConstantTimeEq).
Every key container is zeroized on drop.
The session state machine is authoritative — role checks read the state, not a bare flag.
The audit log entry is written before the PKCS#11 function returns — no fire-and-forget logging.
All key material is generated through DrbgRng, the HMAC_DRBG adapter; using OsRng directly for key material is prohibited.
No unsafe in crypto paths — unsafe blocks are restricted to the ABI layer and the memory-locking shim.
Errors returned to callers are generic — the error mapping prevents internal-state leakage via error specificity.
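The last invariant can be illustrated with a mapping from detailed internal errors to coarse PKCS#11 return codes. The InternalError variants and the function name are hypothetical; only the CKR_* numeric values come from the PKCS#11 headers. Which internal failures collapse into CKR_GENERAL_ERROR is an assumption of this sketch.

```rust
use std::os::raw::c_ulong;

type CkRv = c_ulong;
const CKR_GENERAL_ERROR: CkRv = 0x0000_0005;
const CKR_MECHANISM_INVALID: CkRv = 0x0000_0070;
const CKR_PIN_INCORRECT: CkRv = 0x0000_00A0;
const CKR_PIN_LOCKED: CkRv = 0x0000_00A4;

/// Hypothetical internal error type carrying detail that is useful
/// inside the boundary but must not reach the caller.
#[derive(Debug)]
enum InternalError {
    PinMismatch,
    PinLocked,
    MechanismNotApproved { mechanism: u64 },
    StorageCorrupt { detail: String },
    TagMismatch,
}

/// Maps an internal error to the code returned across the C ABI.
/// Failures that would reveal internal state share one generic code.
fn to_ck_rv(err: &InternalError) -> CkRv {
    match err {
        InternalError::PinMismatch => CKR_PIN_INCORRECT,
        InternalError::PinLocked => CKR_PIN_LOCKED,
        InternalError::MechanismNotApproved { .. } => CKR_MECHANISM_INVALID,
        InternalError::StorageCorrupt { .. } | InternalError::TagMismatch => CKR_GENERAL_ERROR,
    }
}
```

With a mapping like this, a caller cannot tell a corrupted database entry from a failed authentication tag, while the detailed variant remains available inside the boundary.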