Roadmap

Released

v7.0.25 — 2026-05-27 ✅

Item Detail
Actions API snake_case fix ActionRecommendation now serialises as id, host, action_type etc. — eliminates "undefined" values in the Actions tab
Computed state + description API wraps each action with a state field (pending / approved / executed) and human-readable description
Tier-1 autopilot teaser Tier-1 actions are visible in the community edition with an "Autopilot Edition" badge; approve returns HTTP 402 with an upgrade hint
Tier-1 auto-execute When RUPTURA_EDITION=autopilot: Tier-1 actions are auto-approved and executed every 15s via the K8s actuator
Helm RBAC for autopilot rbac.yaml adds patch/update on Deployments, StatefulSets, and Nodes when edition: autopilot
MarkExecuted tracking Engine tracks execution state; executed actions show as state: executed rather than disappearing

v7.0.24 — 2026-05-26 ✅

Item Detail
FusedR cap fix rupture.Index() is capped at 10.0, which eliminates astronomical values (e.g. 257,000) when alphaStable ≈ epsilon
Named signal states in UI Signals tab now shows named state labels (panic / burnout / pandemic / critical) instead of "undefined"

v7.0.23 — 2026-05-22 ✅

Item Detail
UI endpoint crash fixes All Fleet, Topology, Engine, Alerts endpoints handle empty/malformed JSON without crashing
Simulator host label simulate.py sends correct host label so pipeline indexes by workload key
CI stability Pre-deploy pod cleanup, --cleanup-on-fail, 20-minute timeout, failure diagnostics step

v7.0.22 — 2026-05-22 ✅

Item Detail
OTLP datasource activation Registering an OTLP datasource now TCP-dials the endpoint to verify reachability
SSRF bypass for OTLP SSRF validation skipped for type=otlp — push endpoints are Ruptura's own NodePort
FusedR startup corruption fix Snapshots loaded from BadgerDB are sanitized: values > 10.0 reset to 0 before serving
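
The snapshot sanitization described for v7.0.22 can be sketched as follows. This is a minimal illustration, not the engine's code: the `Snapshot` type, its fields, and the `sanitize` function are hypothetical names. Only the 10.0 cap and the reset-to-0 behaviour come from the notes above.

```go
package main

import "fmt"

// maxFusedR is the cap stated in the release notes.
const maxFusedR = 10.0

// Snapshot is a hypothetical stand-in for the persisted snapshot type.
type Snapshot struct {
	Workload string
	FusedR   float64
}

// sanitize resets FusedR values above the cap to 0, in place, and returns
// how many snapshots were changed.
func sanitize(snaps []Snapshot) int {
	fixed := 0
	for i := range snaps {
		if snaps[i].FusedR > maxFusedR {
			snaps[i].FusedR = 0
			fixed++
		}
	}
	return fixed
}

func main() {
	snaps := []Snapshot{
		{Workload: "default/api", FusedR: 2.4},
		{Workload: "default/batch", FusedR: 257000},
	}
	n := sanitize(snaps)
	fmt.Println(n, snaps[0].FusedR, snaps[1].FusedR) // prints "1 2.4 0"
}
```

Running the pass once at load time, before any snapshot is served, keeps corrupted values from ever reaching the API.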

v7.0.21 — 2026-05-20 ✅

Item Detail
Workload lifecycle phases Fleet cards show Calibrating (CAL badge + progress), Active, and Rupture states distinctly
Calibration banner Detail drawer shows "Calibrating baseline — X% complete · ETA Ym" while baselines build

v7.0.10 — 2026-05-17 ✅

Item Detail
Database tab Settings → Database: per-signal-type retention config, purge by type/date/all
Actions tab Fleet → Actions: Tier-2 approve/reject with confidence score, emergency stop button

v7.0.4 — 2026-05-15 ✅

Item Detail
OTLP NodePort OTLP ingest service exposed as NodePort 31470 — send telemetry directly without port-forwarding
Workload simulator scripts/simulate.py — 5 behavioral profiles (stable, CPU stress, error bursts, traffic spikes, calibrating) injected every 5s via /api/v2/write

v7.0.3 — 2026-05-15 ✅

Item Detail
JSON crash fix get() helper in api.ts reads text before parsing — empty body no longer crashes Fleet/Topology
Real PNG logo SVG gradient replaced with actual PNG in NavBar — renders correctly in all browsers
Topology overhaul Explanation banner, edge click panel (call rate / error rate / P99 latency), better empty state
Per-workload health scores WorkloadCard now reads per-snapshot KPI values — each card shows its own health ring

v7.0.2 — 2026-05-15 ✅

Item Detail
10-signal mini-bars WorkloadCard shows all 10 KPI signals (stress → throughput) as color-coded bars
Light/dark mode Theme toggle in NavBar, persisted to localStorage, CSS variables across all components
Live Data Flow Engine view shows cumulative log/metric/trace counters with proportional stacked bar
All backend APIs wired TopologyMap, Settings ingest stats, NodeHealth all pulling from real endpoints

v7.0.1 — 2026-05-15 ✅

Item Detail
ruptura-ui pod Svelte 4 dashboard deployed as separate pod, nginx proxies /api/ to engine, injects Bearer token
Logo Ruptura logo in NavBar and Settings About section
Calibrating state WorkloadCard shows calibration progress bar and "calibrating" badge
Settings & Alerts pages Ingest Stats, data source config, active/resolved alert feed

v7.0.0 — 2026-05-15 ✅

Item Detail
v7 architecture Two-pod model: ruptura-engine (Go binary) + ruptura-ui (Svelte 4 + nginx)
SSE live event stream GET /api/v2/events — real-time rupture/recovery events, live counter in Fleet
K8s workload metadata Pod list, replicas, resources, labels under Fleet → Kubernetes tab
Node health view Nodes page showing CPU, memory, disk pressure per K8s node

v6.8.13 — 2026-05-13 ✅

Item Detail
Log/trace ingest counters /api/v2/dataflow endpoint exposes cumulative metrics/logs/traces totals
Live Data Flow Engine dashboard section showing ingest throughput
ruptura-ctl v1.0.0 CLI companion — health, status, workload queries, kubectl plugin support

v6.7.0 — 2026-05-06 ✅

Item Detail
First dashboard Initial Fleet view with FusedR heatmap, per-workload signal timelines, action log, narrative explain panel, SLO widget, health forecast. Superseded by the v7.0 Svelte 4 SPA (ruptura-ui pod).
Foundation Established the dashboard architecture and panel design that v7 builds upon.

ruptura-operator v0.6.9 — 2026-05-07 🔄

Red Hat OperatorHub certification pipeline running.

Item Detail
UBI9 base image Both ruptura and ruptura-operator images switched from gcr.io/distroless/static-debian12:nonroot to registry.access.redhat.com/ubi9/ubi-micro — satisfies Red Hat preflight BasedOnUBI check.
Required Red Hat labels name, vendor, version, release, summary, description labels added to both images — satisfies HasRequiredLabel preflight checks.
Default app image bump CSV default app image updated to ruptura:v6.7.0.
Build arg wiring CI workflows now pass VERSION build-arg so the version label reflects the actual image tag at build time.

ruptura-operator v0.6.8 — 2026-05-07 ✅

OperatorHub PR merged: https://github.com/k8s-operatorhub/community-operators/pull/8070

Item Detail
Fix: ServiceAccount never created Operator used serviceAccountName: ruptura-instance in the Deployment but never created the SA, so no Pod could start. Fixed: reconcileServiceAccount() added to the reconcile loop; SA deleted in cleanup().
Fix: RBAC missing serviceaccounts verb ClusterRole now grants create/update/patch/delete on serviceaccounts.
OLM upgrade graph replaces: ruptura-operator.v0.6.7 added to CSV — existing installations upgrade cleanly.
Prometheus metrics /metrics + /healthz on :9090; ruptura_instances_total + ruptura_reconcile_errors_total gauges.

ruptura-operator v0.6.7 — 2026-05-07 ✅

First OperatorHub release, merged into community-operators.

Item Detail
RupturaInstance CRD Manages Deployment + Service + PVC + ServiceAccount per instance
OpenShift support Route with edge TLS termination when running on OpenShift
Finalizer cleanup ruptura.io/cleanup finalizer ensures owned resources are deleted before CR removal
OLM bundle Correct dot-notation annotation keys; stable and alpha channels

v6.6.3 — 2026-05-06 ✅

Item Detail
Security: timing-safe auth Bearer token comparison uses crypto/subtle.ConstantTimeCompare — eliminates timing-oracle on the API key.
Security: auth warning Server logs WARNING at startup when RUPTURA_API_KEY is unset.
Emergency stop wired POST /api/v2/actions/emergency-stop now calls engine.EmergencyStop() (was a no-op).
Forecast signal fix Warm-up stub returns the requested signal's current value via signalValue(); nil-guard on h.store.
RUPTURA_API_KEY env var Server reads the API key from the environment when --api-key flag is absent.
Slowloris protection http.Server sets ReadHeaderTimeout: 5s.
Horizon + limit caps ?horizon= capped at 10 080 min (1 week); ?limit= capped at 1 000.
Sim robustness Injector uses http.Client{Timeout: 10s}; math/rand seeded at Run() start.
reject 404 POST /api/v2/actions/{id}/reject returns 404 for unknown IDs.
ruptura-ctl status Actions() error surfaced as a dim warning.

v6.6.1 — 2026-05-06 ✅

Item Detail
sim inject fixed CLI was sending {pattern} payload; server expects {workload, metrics}. Rewired to sim.Run() — real metric ticks per pattern.
sim.send() auth APIKey added to sim.Config; every tick sends Authorization: Bearer header.
3-segment workload refs describe workload ns/Kind/name returned 404; added the /rupture/{namespace}/{kind}/{workload} route.
Suppressions field mismatch Handler now matches workload/start/end fields sent by the CLI.
Health port label ruptura-ctl health now shows traces (OTLP :4317).

v6.6.0 — 2026-05-05 ✅

Item Detail
IMPROVE-07: Per-workload signal weight tuning POST /api/v2/config/weights + GET /api/v2/config/weights for runtime override. RUPTURA_WORKLOAD_WEIGHTS JSON env var for Helm/K8s bootstrap. Selector syntax: exact, ns/* prefix, or * wildcard. Weights normalised to 1.0 on load. Helm workloadWeights: array in values.yaml.
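
The selector syntax and the normalisation step from IMPROVE-07 could look roughly like this. It is a sketch only: the `matches` and `normalise` helpers and the `namespace/name` workload key format are assumptions, and precedence between overlapping selectors is not covered because the notes do not specify it.

```go
package main

import (
	"fmt"
	"strings"
)

// matches reports whether a selector applies to a workload key such as
// "payments/checkout". Supported forms: "*", "ns/*", or an exact key.
func matches(selector, workload string) bool {
	switch {
	case selector == "*":
		return true
	case strings.HasSuffix(selector, "/*"):
		// "payments/*" becomes the prefix "payments/".
		return strings.HasPrefix(workload, strings.TrimSuffix(selector, "*"))
	default:
		return selector == workload
	}
}

// normalise returns a copy of w scaled so the weights sum to 1.0.
// It returns nil when the sum is not positive.
func normalise(w map[string]float64) map[string]float64 {
	sum := 0.0
	for _, v := range w {
		sum += v
	}
	if sum <= 0 {
		return nil
	}
	out := make(map[string]float64, len(w))
	for k, v := range w {
		out[k] = v / sum
	}
	return out
}

func main() {
	fmt.Println(matches("payments/*", "payments/checkout")) // true
	fmt.Println(matches("payments/*", "payments-eu/checkout")) // false

	n := normalise(map[string]float64{"stress": 2, "fatigue": 1, "pressure": 1})
	fmt.Println(n["stress"], n["fatigue"], n["pressure"]) // 0.5 0.25 0.25
}
```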

v6.5.0 — 2026-05-05 ✅

Item Detail
IMPROVE-06: Edition gate RUPTURA_EDITION env var (community | autopilot). POST .../approve returns 402 in community — recommendations stay visible read-only. Full execution in autopilot. Helm edition: community in values.yaml.
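
A minimal sketch of the edition gate, assuming a handler shaped like the one below. The `editionFrom` and `approveHandler` names, the `/approve` test path, the fallback to `community` for unknown values, and the 200 response in autopilot are all illustrative assumptions. Only the `RUPTURA_EDITION` variable, its two values, and the 402 status come from the notes.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
)

// editionFrom maps the raw env value to an edition. Treating anything other
// than "autopilot" as "community" is an assumption of this sketch.
func editionFrom(raw string) string {
	if raw == "autopilot" {
		return "autopilot"
	}
	return "community"
}

// approveHandler returns 402 Payment Required in the community edition.
// The 200 response in autopilot stands in for the real approval logic.
func approveHandler(edition string) http.HandlerFunc {
	return func(w http.ResponseWriter, _ *http.Request) {
		if edition != "autopilot" {
			http.Error(w, "approve requires the autopilot edition", http.StatusPaymentRequired)
			return
		}
		w.WriteHeader(http.StatusOK)
	}
}

// approveStatus sends one in-memory POST to a hypothetical path and
// returns the resulting status code.
func approveStatus(edition string) int {
	rec := httptest.NewRecorder()
	approveHandler(edition)(rec, httptest.NewRequest(http.MethodPost, "/approve", nil))
	return rec.Code
}

func main() {
	edition := editionFrom(os.Getenv("RUPTURA_EDITION"))
	fmt.Println(edition, approveStatus(edition))
}
```

Gating only the approve call, rather than hiding the recommendation, matches the read-only behaviour described above.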

v6.4.0 — 2026-05-05 ✅

Item Detail
IMPROVE-04: Rupture fingerprinting 11-dimensional KPI vector per confirmed rupture (FusedR > 3.0). Cosine similarity ≥ 0.85 surfaced as pattern_match in every rupture response.
IMPROVE-05: Business signal layer slo_burn_velocity, blast_radius, recovery_debt in every snapshot's business block. SLO contracts in Helm values.yaml.

v6.3.0 — 2026-05-04 ✅

Item Detail
IMPROVE-01: Calibration warm-up status + calibration_progress + calibration_eta_minutes in every snapshot.
IMPROVE-02: HealthScore trend forecast health_forecast block — OLS slope → in_15min, in_30min, critical_eta_minutes.
IMPROVE-03: ruptura-sim binary Four simulation patterns (memory-leak, cascade-failure, traffic-surge, slow-burn) via POST /api/v2/sim/inject.

v6.2.2 — 2026-04-30 ✅

Item Detail
GAP-04 closed Anomaly REST endpoints: GET /api/v2/anomalies, GET /api/v2/anomalies/{host}
Dead code removed Duplicate internal/predictor/anomaly_engine.go removed; MetricPipeline interface extended
Docs updated Correct API key env var (RUPTURA_API_KEY), accurate port references, all 10 KPI signal names
Release workflow HELM_CHART path corrected; docker dependency wired for image-tag output

All v6.x engineering gaps closed as of v6.2.2.

v6.2.1 — 2026-04-30 ✅

Item Detail
FusedR in API fused_rupture_index field added to every rupture response; integration test verifies non-zero
Grafana dashboard Panel 3 now queries ruptura_kpi{signal="fused_rupture_index"}; added Panel 4 (Pressure + Contagion), Panel 6 (Throughput Collapse), workload template variable, 15s auto-refresh

v6.2.0 — 2026-04-30 ✅

Item Detail
WorkloadRef treatment unit OTLP extracts k8s.namespace.name, k8s.deployment.name, etc. Multiple pods merged into one workload view. API routes: /api/v2/rupture/{namespace}/{workload}
Adaptive per-workload baselines After 24h, thresholds become z-score deviations from Welford baseline. Batch jobs stop generating false alarms.
Narrative explain GET /api/v2/explain/{id}/narrative — structured English causal chain, not raw JSON
Topology-based contagion Real service edges from trace spans. Falls back to errors×cpu proxy when no trace data.
Maintenance windows POST/GET/DELETE /api/v2/suppressions — suppress alerts during planned deploys
Fusion end-to-end metricR (CA-ILR 15s ticker) + logR (burst detector) + traceR (OTLP span error rate) → FusedR
HealthScore formula Switched from multiplicative (collapsed aggressively) to additive-penalty model
All 10 KPI signals stress · fatigue · mood · pressure · humidity · contagion · resilience · entropy · velocity · health_score
Action engine Bounded 256-entry pending queue; Approve() / Reject() API endpoints live
BadgerDB flush on SIGTERM FlushSnapshots() called on graceful shutdown — no data loss
Token-bucket rate limiter Default 1000 req/s on ingest; configurable via RUPTURA_INGEST_RPS
Integration test Full-stack: analyzer.Update() → store.StoreSnapshot() → REST API response

v6.1.0 — 2026-04-27 ✅

§ Feature Detail
§23 gRPC ingest Real gRPC server (:9090), 4 MB max, RESOURCE_EXHAUSTED back-pressure
§24 NATS / Kafka eventbus JetStream at-least-once + franz-go exactly-once; topics: ruptura.rupture.*, ruptura.actions.tier1
§25 Adaptive ensemble weighting Online MAE-based weights, 1-hour sliding window, 60 s update cycle
§26 Kubernetes operator RupturaInstance CRD, controller-runtime reconcile, creates Deployment + Service + PVC
Go SDK (sdk/go) Full v2 API coverage, typed client
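
The notes for §25 say ensemble weights are derived online from each model's MAE but do not give the formula. Inverse-error weighting is one common scheme and is used here purely as an assumption. The `inverseMAEWeights` function, the epsilon guard, and the sample MAE values are hypothetical; the sliding window and the 60 s update cycle are omitted.

```go
package main

import "fmt"

// inverseMAEWeights assigns each model a weight proportional to 1/MAE and
// scales the result to sum to 1. A small epsilon avoids division by zero
// for a model with a perfect score. This scheme is an assumption, not the
// documented Ruptura formula.
func inverseMAEWeights(mae map[string]float64) map[string]float64 {
	const eps = 1e-9
	out := make(map[string]float64, len(mae))
	sum := 0.0
	for name, e := range mae {
		inv := 1 / (e + eps)
		out[name] = inv
		sum += inv
	}
	for name := range out {
		out[name] /= sum
	}
	return out
}

func main() {
	// Made-up MAE values for three of the ensemble models.
	w := inverseMAEWeights(map[string]float64{
		"CA-ILR":       0.5,
		"Holt-Winters": 1.0,
		"EWMA":         2.0,
	})
	fmt.Printf("%.3f %.3f %.3f\n", w["CA-ILR"], w["Holt-Winters"], w["EWMA"])
}
```

In a running system this function would be called once per update cycle with MAE values computed over the sliding window, so a model that has recently drifted loses influence within minutes.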

v6.0.0 — 2026-04-25 ✅

Clean-room rewrite from OHE v5.1 as github.com/benfradjselim/ruptura:

  • CA-ILR dual-scale ELS engine
  • 5-model ensemble (CA-ILR, ARIMA, Holt-Winters, MAD, EWMA)
  • 44-endpoint REST API v2 with XAI explainability
  • Action engine (K8s / Webhook / Alertmanager / PagerDuty) with 3-tier safety gates
  • OTLP + Prometheus remote_write + DogStatsD ingest
  • BadgerDB embedded storage
  • ≥ 70% test coverage across all packages

Planned

v7.1.0 — Q3 2026

Feature Detail
SLO config UI Configure SLO targets and error budgets directly from the dashboard
Dashboard layout customization Drag-and-drop card arrangement, pinned signals
Multi-tenant namespaces X-Org-ID header → namespace filter on all queries; per-org storage namespacing

v7.2.0 — Q4 2026

Feature Detail
Python SDK v2 async support (httpx), type stubs, full v2 API parity with Go SDK
Grafana data source plugin Native Grafana plugin for Ruptura — query KPIs and FusedR directly in Grafana panels
Cluster mode (WAL + S3) Raft-based replication, S3-compatible snapshot target (MinIO / AWS / GCS)

Engineering Gap Closure Log

All gaps from docs/judgment.md resolved in v6.2.x:

GAP / MISSING Status Closed in
GAP-01 Dual composite engine v6.2.0
GAP-02 API stubs v6.2.0
GAP-03 Fusion wiring + FusedR in API v6.2.0 + v6.2.1
GAP-04 AnomalyStore not wired to actions v6.2.2
GAP-05 Throughput collapse blind spot v6.2.0
GAP-06 In-memory only storage v6.2.0 (BadgerDB + SIGTERM flush)
GAP-07 No Grafana dashboard v6.2.1 (6 panels, workload labels)
GAP-08 OTLP route disconnect v6.2.0 (421 Misdirected Request with port guidance)
GAP-09 Sentiment disconnected from log pipeline v6.2.0
GAP-10 Treatment unit infra-only v6.2.0 (WorkloadRef)
MISSING-01 Adaptive per-workload baselines v6.2.0
MISSING-02 Narrative explain v6.2.0
MISSING-03 Real contagion from trace topology v6.2.0
MISSING-04 Maintenance windows / suppressions v6.2.0

CNCF

Ruptura is an independent open-source project targeting alignment with CNCF sandbox criteria. The project follows CNCF principles: Apache 2.0 license, open governance (GOVERNANCE.md), documented security policy (SECURITY.md), and a public roadmap.

A CNCF sandbox application requires demonstrable production adoption and a committed maintainer community. Achieving that is a long-term goal, not a current status. Production feedback and contributions from the community are the path there.


Changelog

v5.1.0 (OHE) — 2026-04-19

Go + Python SDK, Prometheus remote_write, gRPC agent, Vault integration, plugin system.

v5.0.0 (OHE) — 2026-04-12

CA-ILR dual-scale predictor, dissipative fatigue (λ recovery), METRICS.md XAI standard, BadgerDB tiered storage.