CIPHER is currently in beta · Live testing in progress · Expect changes

CIPHER Changelog

New capabilities, improvements, and refinements to the CIPHER platform.

CIPHER-min-3

March 2026 · Major Release

NEW: 5,000 candles per pair (208 days, roughly 7 months), 10x more training data than min-2. Fetched in batches of 1,000 to stay within Binance API limits.

NEW: Multi-symbol rotation training. BTC, ETH, SOL, and XRP rotate each episode. A symbol one-hot encoding in the observation space lets the agent learn pair-specific patterns.

NEW: Raw features (18) fed directly to PPO. The LSTM latent was removed from the observation, so the agent learns its own representations from RSI, MACD, Bollinger Bands, EMAs, returns, and ATR.

NEW: Test-split evaluation. Running python test_model.py --test-split evaluates the held-out 15% across all 4 symbols. Current result: +3.6% return, 71.9% win rate, 32 trades.

NEW: Model auto-versioning. Saves a named snapshot every 500 episodes with win rate and return metadata.

NEW: Reward hacking detection. 10 consecutive zero-trade episodes or 3 zero-trade validations trigger an automatic RL policy reset.

NEW: Two-database architecture. A shared DB (knowledge, market data) plus a per-version training DB (episodes, trades). The training DB can be deleted without losing knowledge.

NEW: LSTM internal validation. 80/20 train/val split with early stopping (patience=3). Weights are saved only when validation loss improves.
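The early-stopping rule in the LSTM internal validation entry above can be sketched as follows. This is a minimal illustration, not CIPHER's training loop: the function name `train_with_early_stopping` is hypothetical, and the list of validation losses stands in for real training epochs. Only patience=3 and the "save only on improvement" rule come from the entry itself.

```python
from typing import Iterable, Optional, Tuple


def train_with_early_stopping(
    val_losses: Iterable[float], patience: int = 3
) -> Tuple[Optional[int], float, int]:
    """Return (best_epoch, best_loss, epochs_run) for a stream of validation losses.

    Each item stands in for one training epoch followed by a validation pass
    (hypothetical: the real training step is not shown here).
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    epochs_run = 0

    for epoch, loss in enumerate(val_losses):
        epochs_run += 1
        if loss < best_loss:
            # Validation improved: the only point where weights would be saved.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # stop after `patience` epochs without improvement

    return best_epoch, best_loss, epochs_run


losses = [0.70, 0.66, 0.64, 0.65, 0.66, 0.67, 0.60]
best_epoch, best_loss, epochs_run = train_with_early_stopping(losses)
print(best_epoch, best_loss, epochs_run)  # prints: 2 0.64 6
```

Note that the final loss of 0.60 is never reached: training stops after three non-improving epochs, which is the trade-off a small patience value makes.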

FIX: LSTM architecture overhauled and simplified for direction prediction (BCE loss). Fixed a retrain bug that was corrupting weights.

FIX: Scaler data leakage. MinMaxScaler is now fitted on the training split only, so validation and test data no longer leak future price ranges into training.

FIX: LSTM padding. Uses zero padding instead of edge replication, so early episodes no longer see fabricated repeated prices.

FIX: Knowledge query redundancy. The knowledge store is now queried once per step instead of 2-3 times, a significant training speedup.
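To illustrate the scaler leakage fix listed above, here is a small sketch of min-max scaling fitted on the training split only. It uses plain Python helpers (`fit_min_max` and `transform`, both hypothetical names) as a stand-in for MinMaxScaler, and the prices and the 70/30 cut are made-up illustration values.

```python
from typing import List, Sequence, Tuple


def fit_min_max(train: Sequence[float]) -> Tuple[float, float]:
    """Learn scaling bounds from the training split only."""
    return min(train), max(train)


def transform(values: Sequence[float], bounds: Tuple[float, float]) -> List[float]:
    """Scale values with previously fitted bounds; results may fall outside [0, 1]."""
    lo, hi = bounds
    span = (hi - lo) or 1.0  # avoid division by zero on a flat series
    return [(v - lo) / span for v in values]


# Made-up price series, split chronologically with no shuffling.
prices = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0, 160.0, 170.0, 180.0, 200.0]
train, held_out = prices[:7], prices[7:]

bounds = fit_min_max(train)                # fitted on training data only
train_scaled = transform(train, bounds)
held_out_scaled = transform(held_out, bounds)

print(bounds)           # (100.0, 160.0)
print(held_out_scaled)  # every value exceeds 1.0
```

Held-out values above 1.0 are expected here: they show that the later price range was not used when the bounds were fitted.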

IMPROVED: Risk management tightened. 8 trades/episode (was 50), 5-candle cooldown (was 4), 100-candle max hold (was 30). Fewer but higher-quality trades.

IMPROVED: Stop loss widened to -3% and clamped to simulate a real stop-loss order, so a simulated loss cannot slip past the stop.

IMPROVED: Take profit uses the actual price move. If the market gaps past 5%, the full move is captured.

IMPROVED: Gamma reduced from 0.99 to 0.95. The agent focuses on shorter-term rewards, which better suits crypto volatility.

IMPROVED: Random episode windows instead of sequential ones, which prevents memorisation of fixed price sequences.

IMPROVED: Validation runs every 100 episodes across all 4 symbols with deterministic predictions. Weak validations trigger a policy reset.
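The asymmetric exit handling described in the stop-loss and take-profit entries above might look like the sketch below. The -3% and 5% levels come from those entries. The function name `realised_return`, the handling of short positions, and the omission of fees are assumptions made for illustration.

```python
from typing import Optional

STOP_LOSS = -0.03    # -3% stop, from the stop-loss entry
TAKE_PROFIT = 0.05   # 5% take-profit level, from the take-profit entry


def realised_return(
    entry_price: float, candle_price: float, is_long: bool = True
) -> Optional[float]:
    """Return the realised fractional return if an exit triggers, else None."""
    move = (candle_price - entry_price) / entry_price
    pnl = move if is_long else -move  # a short gains when price falls
    if pnl <= STOP_LOSS:
        return STOP_LOSS   # clamped: the stop order is assumed to fill at the stop level
    if pnl >= TAKE_PROFIT:
        return pnl         # not clamped: a gap past the target is kept in full
    return None            # neither level hit; the position stays open


print(realised_return(100.0, 95.0))   # -0.03 (a 5% drop is clamped to the stop)
print(realised_return(100.0, 101.0))  # None
```

The asymmetry is deliberate in this sketch: losses are capped at the stop level, while gains beyond the target are recorded as they actually occurred.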

CIPHER-min-2

March 2026 · Major Release

NEW: Short selling capability. CIPHER can now profit from both rising and falling markets.

NEW: Dedicated Close Short action eliminates trading ambiguity and improves short-side precision.

NEW: Smart exit rewards. CIPHER now voluntarily closes winning positions rather than waiting for time limits.

NEW: Loss-cutting pressure. Escalating pressure is applied automatically to exit losing positions beyond a 2% drawdown.

NEW: Data validation split. 70% training, 15% validation, 15% test. Guards against overfitting to historical data.

NEW: Version-isolated databases. Each model version has its own storage, so data from one version cannot contaminate another.

NEW: Version switcher in dashboard. Toggle between current and archived versions without restarting.
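The 70/15/15 data validation split listed above could be implemented roughly as follows. The changelog does not say how the split is ordered, so this sketch assumes a chronological split (oldest candles for training, newest for test). The function name `split_70_15_15` and the 1,000-candle series are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_70_15_15(candles: Sequence) -> Tuple[List, List, List]:
    """Split a time-ordered series into train, validation, and test segments."""
    n = len(candles)
    train_end = n * 70 // 100   # integer arithmetic avoids float rounding surprises
    val_end = n * 85 // 100
    return (
        list(candles[:train_end]),
        list(candles[train_end:val_end]),
        list(candles[val_end:]),
    )


candles = list(range(1000))  # placeholder for a time-ordered candle series
train, val, test = split_70_15_15(candles)
print(len(train), len(val), len(test))  # prints: 700 150 150
```

Keeping the segments contiguous and in time order means the test segment is strictly later than anything seen in training.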

IMPROVED: Dashboard updates at the start of each episode rather than waiting for completion.

IMPROVED: Knowledge retrieval disabled during policy optimisation, which cuts per-episode time in half.

IMPROVED: RSI values clamped to the valid [0, 100] range for cleaner technical feature computation.

CIPHER-min-1

January 2026 · Stable Release

NEW: Full knowledge store. 3,072 chunks from institutional trading literature, embedded and searchable.

NEW: Multi-symbol training. BTC, ETH, SOL, and XRP traded in rotation across sessions.

NEW: Live Binance integration via ccxt, with real market data and real fee simulation.

NEW: Real-time dashboard with P&L charts, trade history, and per-pair stats.

NEW: Model versioning. Save, load, compare, and benchmark model versions at any point.

NEW: REST API for external signal consumption.

IMPROVED: Anti-overtrading controls. Maximum 2 trades per 10-candle window, with escalating penalties.

IMPROVED: Session checkpoints every 100 episodes with automatic rollback capability.
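One way to picture the anti-overtrading control listed above is a sliding-window counter. The limit of 2 trades per 10-candle window comes from that entry. The class name `OvertradingGuard`, the linear escalation, and the `BASE_PENALTY` value are hypothetical, since the changelog does not specify how penalties escalate or whether excess trades are blocked outright.

```python
from collections import deque

MAX_TRADES = 2       # from the anti-overtrading entry
WINDOW = 10          # candles, from the anti-overtrading entry
BASE_PENALTY = 0.1   # hypothetical penalty unit


class OvertradingGuard:
    """Tracks recent trades and penalises those beyond the window limit."""

    def __init__(self) -> None:
        self._trade_steps = deque()

    def penalty_for_trade(self, step: int) -> float:
        """Record a trade at candle index `step` and return its reward penalty."""
        # Drop trades that have fallen out of the trailing window.
        while self._trade_steps and self._trade_steps[0] <= step - WINDOW:
            self._trade_steps.popleft()
        self._trade_steps.append(step)
        excess = len(self._trade_steps) - MAX_TRADES
        # Escalation (assumed linear): each extra trade in the window costs more.
        return BASE_PENALTY * excess if excess > 0 else 0.0


guard = OvertradingGuard()
penalties = [guard.penalty_for_trade(step) for step in (0, 1, 2, 3, 20)]
print(penalties)
```

In this run the first two trades are free, the third and fourth are penalised with growing cost, and the trade at candle 20 is free again because the earlier trades have left the window.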