Platform — rjbase.io

Ingest

Stream from Kafka, Kinesis, or Pulsar, or batch-load from S3/GCS. The ingest tier applies your tokenization policies before data lands in storage.

Exactly-once semantics with an idempotent commit ledger
Schema evolution with backward-compatible reads
Backpressure-aware streaming to the hot tier
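
As a rough illustration of how an idempotent commit ledger can turn at-least-once delivery into exactly-once writes, here is a minimal in-memory sketch. The `CommitLedger` class, its fields, and the `(topic, partition, offset)` key are assumptions made for this example, not the rjbase.io API; a production ledger would be durable and would commit the key and the row atomically.

```python
from dataclasses import dataclass, field


@dataclass
class CommitLedger:
    """Hypothetical in-memory ledger; a real one would be durable and transactional."""

    committed: set = field(default_factory=set)  # keys that were already applied
    rows: list = field(default_factory=list)     # stand-in for the storage tier

    def commit(self, topic: str, partition: int, offset: int, record: dict) -> bool:
        """Apply a record once; return False if this key was already committed."""
        key = (topic, partition, offset)
        if key in self.committed:
            return False  # redelivered message: skip it so the write stays idempotent
        self.rows.append(record)
        self.committed.add(key)
        return True


ledger = CommitLedger()
first = ledger.commit("payments", 0, 42, {"amount": 10})
replay = ledger.commit("payments", 0, 42, {"amount": 10})  # simulated redelivery
```

The replayed message is rejected by the ledger, so the row is stored once even though it was delivered twice.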

Vault

Sensitive values are replaced with deterministic tokens. The raw data lives in an HSM-backed vault, addressable only by short-lived capabilities issued by the policy engine.

Format-preserving encryption (FF3-1) and random tokens
Per-namespace key isolation with hardware roots of trust
Online key rotation without table rewrites
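
To make "deterministic tokens" and "per-namespace key isolation" concrete, the sketch below derives a token with HMAC-SHA256 from Python's standard library. This is a swapped-in technique for illustration only: it is not FF3-1, it does not preserve the input format, and the `tokenize` function, the `tok_` prefix, and the demo keys are all hypothetical. In the real system the keys would live in the HSM-backed vault rather than in code.

```python
import hashlib
import hmac


def tokenize(value: str, namespace_key: bytes) -> str:
    """Return a deterministic token: same value and key always give the same token."""
    digest = hmac.new(namespace_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:32]  # truncated for readability; illustrative choice


# Demo keys only; assumed to be per-namespace secrets held by the vault.
key_a = b"demo-key-namespace-a"
key_b = b"demo-key-namespace-b"

t1 = tokenize("4111-1111-1111-1111", key_a)
t2 = tokenize("4111-1111-1111-1111", key_a)  # same namespace: identical token
t3 = tokenize("4111-1111-1111-1111", key_b)  # other namespace: unrelated token
```

Because the token is deterministic within a namespace, joins and group-bys on tokenized columns still work, while a different namespace key yields a token that cannot be correlated with the first.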

Policy engine

A declarative policy graph — RBAC + ABAC + purpose-of-use — gates every read. Detokenization is never implicit; it is a policy decision with a written reason in the audit ledger.

Row-level scoping evaluated at plan time
Column masking with multiple reveal levels
Justification-required reveals with tamper-evident audit
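
One common way to make an audit trail tamper-evident is a hash chain, where each entry's hash covers the previous entry's hash. The sketch below pairs that with a justification-required reveal. It is a simplified illustration, not the rjbase.io policy engine: the `reveal` rule, the `fraud-analyst` role, and the entry fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def append_entry(log: list, entry: dict) -> None:
    """Append an entry whose hash covers the previous hash plus its own body."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = json.dumps(entry, sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
    log.append({"entry": entry, "prev": prev_hash, "hash": entry_hash})


def verify(log: list) -> bool:
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev_hash = GENESIS
    for item in log:
        body = json.dumps(item["entry"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
        if item["prev"] != prev_hash or item["hash"] != expected:
            return False
        prev_hash = item["hash"]
    return True


def reveal(log: list, user: str, role: str, column: str, reason: str) -> bool:
    """Hypothetical rule: only fraud analysts may reveal, and only with a reason."""
    allowed = role == "fraud-analyst" and bool(reason.strip())
    append_entry(log, {"user": user, "column": column, "reason": reason, "allowed": allowed})
    return allowed


audit_log = []
ok = reveal(audit_log, "dana", "fraud-analyst", "card_number", "chargeback case 1187")
denied = reveal(audit_log, "sam", "fraud-analyst", "card_number", "")
intact = verify(audit_log)

audit_log[0]["entry"]["reason"] = "edited later"  # simulate tampering
tampered_detected = not verify(audit_log)
```

Both the granted and the denied request are logged, and editing an earlier entry after the fact causes verification to fail.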

Storage

Tiered, open-format, and pluggable. Hot partitions on NVMe, warm on local SSD, cold on your object store. One catalog, one query.

Iceberg-compatible table format · Parquet underneath
Z-ordering, bloom filters, and partition pruning by default
Bring your own bucket — we never own your data at rest

Compute

A vectorized SQL engine with adaptive query execution, plus a Python dataframe interface that compiles to the same plans.

Per-tenant compute pools that isolate tenants from noisy neighbors
Result caching with policy-aware invalidation
Autoscaling backed by spot-friendly schedulers
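
One simple way to get policy-aware invalidation is to include the policy version in the cache key, so a policy change makes older results unreachable without an explicit purge. The sketch below assumes that approach; the `PolicyAwareCache` class and its key layout are hypothetical and may differ from how the engine actually invalidates results.

```python
class PolicyAwareCache:
    """Hypothetical result cache keyed by query, principal, and policy version."""

    def __init__(self) -> None:
        self._store = {}

    def get(self, sql: str, principal: str, policy_version: int):
        """Return the cached result, or None if nothing matches this exact key."""
        return self._store.get((sql, principal, policy_version))

    def put(self, sql: str, principal: str, policy_version: int, result: list) -> None:
        """Store a result under the policy version that was in force when it ran."""
        self._store[(sql, principal, policy_version)] = result


cache = PolicyAwareCache()
query = "SELECT email FROM users"

cache.put(query, "analyst-1", 7, ["tok_ab12", "tok_cd34"])
hit = cache.get(query, "analyst-1", 7)          # same policy version: cache hit
after_change = cache.get(query, "analyst-1", 8) # policy bumped: treated as a miss
other_user = cache.get(query, "analyst-2", 7)   # different principal: miss
```

Keying on the principal as well keeps one user's masked or revealed results from being served to another user with different permissions.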

Catalog

A single source of truth for tables, schemas, lineage, and the policies attached to them. Backward-compatible with the Iceberg REST catalog spec.

Time-travel queries on any tokenized table
End-to-end lineage from source topic to BI dashboard
Federated views across multiple lakes

Deployment

Run it your way.

rjbase.io runs as a managed service in our cloud, in your own VPC, or fully air-gapped on hardware you control. The control plane is the same; only the deployment target changes.

Managed

We run the control plane in our SOC 2 environment. Your data stays in your bucket.

BYOC

We deploy into your AWS, GCP, or Azure account. You see every API call.

On-prem

Install with a Helm chart on your own Kubernetes cluster. Air-gapped installs are supported.

Sovereign

Regional residency with HSM-backed roots of trust, for the most regulated workloads.

A tokenized lake, end to end.