Roka: Prune your logs

See the prune in action

Thousands of repetitive log lines become a focused context stream — critical errors preserved, duplicates collapsed.

deploy.log — 48,204 lines

// raw context buffer

WARN [db_pool] Connection pool 85%, retry 50ms

CRITICAL EXCEPTION: Out of memory killer!

INFO [auth] login attempt 192.168.1.50

INFO [auth] login attempt 10.0.0.12

… 47,891 more lines

roka output — 8,000 token budget

// optimized context

[847x | lines 12-859 | 07:00:02 → 07:00:45] WARN [db_pool] Connection pool 85%, retry 50ms

CRITICAL EXCEPTION: Out of memory killer!

INFO [auth] login attempt 192.168.1.50

input: 124,480 tkn → output: 7,842 tkn
reduction: −94% · latency: 312ms
critical chunks: 1 · selected: 24/1,204

terminal — roka pipeline

$ cat huge_deploy.log | python prune.py \
    --query "why did deploy fail" \
    --budget 8000 --stats

✓ detected type: log
✓ collapsed 12,847 repetitive lines
✓ BM25 rank → semantic re-rank
✓ packed 38 chunks to budget (3 critical preserved)

Built for real context, not toy demos

Roka is a pruning middleware: ingest logs, code, or prose, detect structure, score relevance to your query, and deliver exactly what fits the budget.

01

Log compaction

Fingerprints repetitive lines (UUIDs, IPs, timestamps → placeholders) and collapses patterns above a threshold into summary chunks with provenance.
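
The fingerprint-and-collapse idea can be sketched roughly as follows. This is a minimal illustration, not Roka's code: the placeholder patterns, the `fingerprint` and `collapse` names, and the default threshold of 3 are all assumptions made for the example.

```python
import re

# Hypothetical placeholder rules; Roka's real patterns are not shown on this page.
PATTERNS = [
    (re.compile(r"\b[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\b"), "<UUID>"),
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b\d{2}:\d{2}:\d{2}\b"), "<TIME>"),
]


def fingerprint(line):
    """Replace volatile tokens with placeholders so similar lines compare equal."""
    for pattern, placeholder in PATTERNS:
        line = pattern.sub(placeholder, line)
    return line.strip()


def collapse(lines, threshold=3):
    """Collapse fingerprints seen at least `threshold` times into one summary line.

    Groups are emitted in order of first occurrence; each summary keeps the
    repeat count and the first/last line numbers as provenance.
    """
    groups = {}  # fingerprint -> 1-based line numbers (dicts keep insertion order)
    for number, line in enumerate(lines, start=1):
        groups.setdefault(fingerprint(line), []).append(number)

    out = []
    for fp, numbers in groups.items():
        if len(numbers) >= threshold:
            out.append(f"[{len(numbers)}x | lines {numbers[0]}-{numbers[-1]}] {fp}")
        else:
            out.extend(lines[n - 1] for n in numbers)
    return out
```

Lines below the threshold pass through unchanged, so rare events keep their original text.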

02

Critical preservation

Panics, exceptions, OOM, auth failures, and stack traces are never collapsed — always packed first regardless of score.
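
One plausible way to implement this rule is a marker regex plus a sort key that puts critical chunks ahead of everything else. The marker list and the `is_critical` and `order_for_packing` names below are assumptions for illustration; the page does not show how Roka detects critical lines.

```python
import re

# Hypothetical markers derived from the categories named above.
CRITICAL = re.compile(
    r"panic|exception|traceback|out of memory|\boom\b"
    r"|auth(?:entication)? fail|unauthorized",
    re.IGNORECASE,
)


def is_critical(line):
    """Return True when the line contains any critical marker."""
    return bool(CRITICAL.search(line))


def order_for_packing(chunks):
    """Sort (score, text) pairs: critical chunks first, then by descending score."""
    return sorted(chunks, key=lambda c: (not is_critical(c[1]), -c[0]))
```

Because the critical flag is the first element of the sort key, a low-scoring panic still precedes a high-scoring informational line.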

03

BM25 + semantic rank

Fast lexical scoring with BM25Okapi, then optional sentence-transformer re-ranking so chunks match your query intent.
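
For readers unfamiliar with BM25, here is a dependency-free sketch of Okapi-style scoring. It stands in for the rank-bm25 library rather than reproducing it: the whitespace tokenizer, the `bm25_scores` name, and the non-negative IDF variant `ln((N - n + 0.5) / (n + 0.5) + 1)` are choices made for this example, and the semantic re-rank step is left out.

```python
import math
from collections import Counter


def tokenize(text):
    """Naive lowercase whitespace tokenizer (an assumption for this sketch)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Length normalization: long chunks need more hits to score the same.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Chunks that share no terms with the query score zero, which is the gap a semantic re-ranker is meant to cover.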

04

Token budget packer

Greedy packing with exact tiktoken counts (cl100k_base). Source diversity cap prevents one file from hogging the budget.
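
A greedy packer with a per-source cap might look like the sketch below. Several things here are assumptions: the chunk dictionary shape, the `max_share` parameter, the choice to exempt critical chunks from the cap, and `count_tokens`, which is a whitespace stand-in for the tiktoken counts Roka is described as using.

```python
from collections import defaultdict


def count_tokens(text):
    """Stand-in token counter (word count); swap in a real tokenizer for exact counts."""
    return len(text.split())


def pack(chunks, budget, max_share=0.5):
    """Greedily select chunks until the token budget is exhausted.

    chunks: dicts with 'text', 'score', 'source', and 'critical' keys (hypothetical shape).
    max_share: largest fraction of the budget a single source may occupy.
    """
    ordered = sorted(chunks, key=lambda c: (not c["critical"], -c["score"]))
    used = 0
    per_source = defaultdict(int)
    selected = []
    for chunk in ordered:
        cost = count_tokens(chunk["text"])
        if used + cost > budget:
            continue  # does not fit; a smaller chunk later still might
        over_cap = per_source[chunk["source"]] + cost > max_share * budget
        if over_cap and not chunk["critical"]:
            continue  # diversity cap: this source already holds its share
        selected.append(chunk)
        used += cost
        per_source[chunk["source"]] += cost
    return selected, used
```

Skipping an oversized chunk instead of stopping lets smaller, lower-ranked chunks fill the remaining space.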

05

Multi-format input

Auto-detects logs, code, or prose. Code can be minified (docstrings/comments stripped) and chunked by function/class scope.
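
For Python input, scope-based chunking with docstring and comment stripping can be approximated with the standard `ast` module, as sketched below. The `chunk_python` and `strip_docstrings` names are hypothetical, and the page does not say which parser Roka uses. Comments disappear because `ast` does not retain them.

```python
import ast

SCOPES = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def strip_docstrings(node):
    """Remove the leading docstring from every function or class under `node`."""
    for child in ast.walk(node):
        if not isinstance(child, SCOPES):
            continue
        body = child.body
        if (body and isinstance(body[0], ast.Expr)
                and isinstance(body[0].value, ast.Constant)
                and isinstance(body[0].value.value, str)):
            # Keep the body non-empty so the result is still valid Python.
            child.body = body[1:] or [ast.Pass()]


def chunk_python(source):
    """Return (name, minified_source) for each top-level function or class."""
    chunks = []
    for node in ast.parse(source).body:
        if isinstance(node, SCOPES):
            strip_docstrings(node)
            chunks.append((node.name, ast.unparse(node)))  # Python 3.9+
    return chunks
```

Top-level statements outside any function or class, such as imports, are dropped in this sketch; a fuller version would collect them into a chunk of their own.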

06

API & CLI

Pipe stdin, point at a file, or POST to /api/prune. Same pipeline everywhere — stdout is always the pruned context.

Try it now

Paste a log dump, set your query and budget, and run the live pipeline.

Integrate anywhere

Drop Roka into your agent pipeline, CI debug step, or observability stack. One POST with your raw context and query — get back pruned text plus compression stats.

→ Pipe logs through the CLI before sending them to Claude
→ Pre-process RAG chunks at ingestion time
→ Shrink incident dumps in on-call workflows

POST /api/prune

curl -X POST http://localhost:8000/api/prune \
  -H "Content-Type: application/json" \
  -d '{
    "text": "<raw logs or code>",
    "query": "why did auth fail",
    "token_budget": 8000,
    "use_semantic": true,
    "use_minification": false
  }'
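
The same call can be made from Python with only the standard library. This sketch mirrors the curl request above; the `build_prune_request` and `prune` helper names are made up for the example, and parsing the response as JSON is an assumption, since the response schema is not shown here.

```python
import json
import urllib.request


def build_prune_request(text, query, token_budget=8000, use_semantic=True,
                        use_minification=False,
                        url="http://localhost:8000/api/prune"):
    """Build the POST request with the same fields as the curl example."""
    payload = {
        "text": text,
        "query": query,
        "token_budget": token_budget,
        "use_semantic": use_semantic,
        "use_minification": use_minification,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def prune(text, query, **options):
    """Send the request and return the decoded body (assumed to be JSON)."""
    request = build_prune_request(text, query, **options)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `prune(raw_logs, "why did auth fail")` requires the API server to be running locally on port 8000.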

Download Roka

Requires Python 3.10+. Install the dependencies, then run the API server or use the CLI directly.

terminal

pip install fastapi uvicorn rank-bm25 tiktoken \
  sentence-transformers typer rich

python project/main.py
# → http://127.0.0.1:8000

cat logs.txt | python project/prune.py \
  --query "deploy failure" --budget 8000 --stats
