roka

prune your logs


Context pruning middleware — collapse noise, rank relevance, pack to your token budget before the model ever sees it.

See the prune in action

Thousands of repetitive log lines become a focused context stream — critical errors preserved, duplicates collapsed.

deploy.log — 48,204 lines
// raw context buffer
WARN [db_pool] Connection pool 85%, retry 50ms
WARN [db_pool] Connection pool 85%, retry 50ms
WARN [db_pool] Connection pool 85%, retry 50ms
WARN [db_pool] Connection pool 85%, retry 50ms
WARN [db_pool] Connection pool 85%, retry 50ms
WARN [db_pool] Connection pool 85%, retry 50ms
CRITICAL EXCEPTION: Out of memory killer!
INFO [auth] login attempt 192.168.1.50
INFO [auth] login attempt 10.0.0.12
… 47,891 more lines
roka output — 8,000 token budget
// optimized context
[847x | lines 12-859 | 07:00:02 → 07:00:45] WARN [db_pool] Connection pool 85%, retry 50ms
CRITICAL EXCEPTION: Out of memory killer!
INFO [auth] login attempt 192.168.1.50
input: 124,480 tkn → output: 7,842 tkn
reduction: −94% · latency: 312ms
critical chunks: 1 · selected: 24/1,204
terminal — roka pipeline
$ cat huge_deploy.log | python prune.py \
--query "why did deploy fail" \
--budget 8000 --stats
detected type: log
collapsed 12,847 repetitive lines
BM25 rank → semantic re-rank
packed 38 chunks to budget (3 critical preserved)

Built for real context, not toy demos

Roka is a pruning middleware: ingest logs, code, or prose, detect structure, score relevance to your query, and deliver exactly what fits the budget.

01

Log compaction

Fingerprints repetitive lines (UUIDs, IPs, timestamps → placeholders) and collapses patterns above a threshold into summary chunks that carry provenance: repeat count, line range, and time span.
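The fingerprint-and-collapse idea can be sketched in a few lines of Python. This is an illustration, not Roka's actual code: the regexes, the `fingerprint` and `collapse` names, and the default threshold are all assumptions, and the summary line only mimics the count and line-range part of the format shown in the demo above.

```python
import re
from collections import OrderedDict

# Hypothetical placeholder patterns (assumed, not taken from Roka).
# Order matters: the most specific pattern runs first.
_PATTERNS = [
    (re.compile(r"\b[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\b"), "<UUID>"),
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b\d{2}:\d{2}:\d{2}\b"), "<TIME>"),
]


def fingerprint(line: str) -> str:
    """Replace variable tokens with placeholders so near-duplicates compare equal."""
    for pattern, placeholder in _PATTERNS:
        line = pattern.sub(placeholder, line)
    return line.strip()


def collapse(lines, threshold=3):
    """Collapse fingerprints seen more than `threshold` times into one summary line."""
    groups = OrderedDict()  # fingerprint -> [(1-based line number, original line)]
    for number, line in enumerate(lines, start=1):
        groups.setdefault(fingerprint(line), []).append((number, line))

    out = []
    for fp, members in groups.items():
        if len(members) > threshold:
            first, last = members[0][0], members[-1][0]
            out.append(f"[{len(members)}x | lines {first}-{last}] {fp}")
        else:
            # Below the threshold: keep the original lines untouched.
            out.extend(line for _, line in members)
    return out
```

Grouping by fingerprint keeps first-seen order, so a rare line that appears between two bursts of the same warning ends up after the collapsed summary rather than inside it.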

02

Critical preservation

Panics, exceptions, OOM, auth failures, and stack traces are never collapsed — always packed first regardless of score.
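One plausible way to flag such lines is a keyword pattern applied before any collapsing happens. The marker list and the `is_critical` and `split_critical` helpers below are hypothetical; the source does not say how Roka detects critical content.

```python
import re

# Hypothetical marker list (an assumption); real log formats would need tuning.
CRITICAL_RE = re.compile(
    r"panic|exception|traceback|out of memory|\boom\b|auth(?:entication)? fail",
    re.IGNORECASE,
)


def is_critical(line: str) -> bool:
    """True when the line contains one of the assumed critical markers."""
    return CRITICAL_RE.search(line) is not None


def split_critical(lines):
    """Separate critical lines from routine ones, preserving input order."""
    critical = [line for line in lines if is_critical(line)]
    routine = [line for line in lines if not is_critical(line)]
    return critical, routine
```

Only the routine half would then be handed to the compaction step, so a critical line can never be folded into a summary chunk.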

03

BM25 + semantic rank

Fast lexical scoring with BM25Okapi, then optional sentence-transformer re-ranking so chunks match your query intent.
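To make the lexical stage concrete, here is a standalone sketch of Okapi BM25 scoring using only the standard library. It is not the `BM25Okapi` class the text refers to: the whitespace tokenizer, the `k1` and `b` defaults, and the non-negative IDF variant are assumptions, and the optional sentence-transformer re-rank is left out.

```python
import math
from collections import Counter


def tokenize(text):
    """Naive lowercase whitespace tokenizer (an assumption for this sketch)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document for the given query."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    query_terms = tokenize(query)
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Non-negative IDF variant; libraries differ in this detail.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Sorting chunks by these scores in descending order gives the candidate list that a semantic re-ranker could then reorder.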

04

Token budget packer

Greedy packing with exact tiktoken counts (cl100k_base). Source diversity cap prevents one file from hogging the budget.
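A greedy packer with a per-source cap might look like the sketch below. The `Chunk` type, the `pack` function, the `max_share` parameter, and the choice to exempt critical chunks from the cap are all assumptions. The token counter is a whitespace stand-in so the example runs without extra dependencies; for exact counts you could pass a counter built on tiktoken, as noted in the comment.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    score: float
    source: str
    critical: bool = False


def approx_tokens(text: str) -> int:
    # Stand-in counter. With tiktoken installed, an exact counter would be:
    #   enc = tiktoken.get_encoding("cl100k_base")
    #   count_tokens = lambda t: len(enc.encode(t))
    return len(text.split())


def pack(chunks, budget, count_tokens=approx_tokens, max_share=0.5):
    """Greedily select chunks: critical first, then by descending score."""
    ordered = sorted(chunks, key=lambda c: (not c.critical, -c.score))
    used = 0
    per_source = {}  # source name -> tokens already packed from it
    selected = []
    for chunk in ordered:
        cost = count_tokens(chunk.text)
        if used + cost > budget:
            continue  # does not fit; a smaller chunk later still might
        source_used = per_source.get(chunk.source, 0)
        # Assumed rule: non-critical chunks from one source may not exceed
        # max_share of the total budget.
        if not chunk.critical and source_used + cost > max_share * budget:
            continue
        selected.append(chunk)
        used += cost
        per_source[chunk.source] = source_used + cost
    return selected, used
```

Note that in this sketch a critical chunk larger than the remaining budget is still skipped; how Roka handles that case is not stated in the text.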

05

Multi-format input

Auto-detects logs, code, or prose. Code can be minified (docstrings/comments stripped) and chunked by function/class scope.
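For Python input, scope-based chunking with docstring and comment stripping can be approximated with the standard `ast` module. This sketch is an assumption about one way to do it, covers only Python source, and uses a hypothetical `chunk_python` helper; the source does not describe Roka's own parser.

```python
import ast

_SCOPES = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def _drop_docstring(node):
    """Remove a leading docstring from a function or class body, in place."""
    body = node.body
    if (
        body
        and isinstance(body[0], ast.Expr)
        and isinstance(body[0].value, ast.Constant)
        and isinstance(body[0].value.value, str)
    ):
        # Keep the body syntactically valid if the docstring was its only statement.
        node.body = body[1:] or [ast.Pass()]


def chunk_python(source: str, strip_docstrings: bool = True):
    """Return (name, code) pairs, one per top-level function or class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if not isinstance(node, _SCOPES):
            continue
        if strip_docstrings:
            for sub in ast.walk(node):
                if isinstance(sub, _SCOPES):
                    _drop_docstring(sub)
        # ast.unparse regenerates code from the tree, so comments are dropped too.
        chunks.append((node.name, ast.unparse(node)))
    return chunks
```

Because the code is regenerated from the syntax tree, original formatting is not preserved, which is usually acceptable when the goal is a smaller context rather than a faithful copy.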

06

API & CLI

Pipe stdin, point at a file, or POST to /api/prune. Same pipeline everywhere — stdout is always the pruned context.

Try it now

Paste a log dump, set your query and budget, and run the live pipeline.


Integrate anywhere

Drop Roka into your agent pipeline, CI debug step, or observability stack. One POST with your raw context and query — get back pruned text plus compression stats.

  • Pipe logs through the CLI before sending them to Claude
  • Pre-process RAG chunks at ingestion time
  • Shrink incident dumps in on-call workflows

POST /api/prune
curl -X POST http://localhost:8000/api/prune \
  -H "Content-Type: application/json" \
  -d '{
    "text": "<raw logs or code>",
    "query": "why did auth fail",
    "token_budget": 8000,
    "use_semantic": true,
    "use_minification": false
  }'
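
The same request can be sent from Python with only the standard library. The payload fields and URL mirror the curl example above; the `build_prune_request` and `prune` helper names are hypothetical, and since the response schema is not documented here, the sketch simply returns the parsed JSON without assuming field names.

```python
import json
import urllib.request


def build_prune_request(text, query, token_budget=8000, use_semantic=True,
                        use_minification=False,
                        base_url="http://localhost:8000"):
    """Build a POST request for /api/prune with the fields from the curl example."""
    payload = {
        "text": text,
        "query": query,
        "token_budget": token_budget,
        "use_semantic": use_semantic,
        "use_minification": use_minification,
    }
    return urllib.request.Request(
        f"{base_url}/api/prune",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def prune(text, query, **options):
    """Send the request and return the decoded JSON body as-is."""
    request = build_prune_request(text, query, **options)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the API server running locally, `prune(open("logs.txt").read(), "why did auth fail")` would return whatever JSON the endpoint produces.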

Download Roka

Python 3.10+. Install dependencies and run the API server or use the CLI directly.

terminal
pip install fastapi uvicorn rank-bm25 tiktoken \
  sentence-transformers typer rich

python project/main.py
# → http://127.0.0.1:8000

cat logs.txt | python project/prune.py \
  --query "deploy failure" --budget 8000 --stats