What is Lakon?

Lakon is an AI Optimization & Continuity Engine. It consists of two powerful tools:

1. Semantic Prompt Compressor

Removes everything that wastes tokens (polite phrasing, redundant context, scaffolding) while preserving every piece of signal the AI needs to answer correctly. The output is shorter, and the response is intended to be equivalent.

2. Context Snapshots (Continuity Engine)

Converts massive, messy chat logs into a clean, structured "Continuation Prompt". Never lose your AI's attention window again. Instantly resume workflows exactly where you left off.

Quick Start

Option 1 — Browser Extension (Recommended)

Install the extension and compress directly inside Claude, ChatGPT, or Gemini. No copy-paste. No tab switching.

Option 2 — Web App

No installation needed. Paste your prompt at lakonai.vercel.app/app and compress it there.


How It Works

Lakon sends your prompt to a compression backend powered by Groq. A specialized system prompt instructs the model to:

Remove filler

Polite phrasing, hedging words, redundant restatements of the request.

Restructure for attention

LLMs tend to pay the most attention to the beginning and end of a prompt (the primacy/recency effect). Lakon moves signal to those zones.

Preserve every constraint

Frameworks, formats, word counts, tone instructions — all survive compression exactly as specified.

Output only valid JSON

The backend returns compressed text, token counts, and preserved signals — never commentary.

What it returns

{
  "compressed": "Compare PostgreSQL vs MongoDB: when to use each. Skip basics. Include decision table.",
  "tokens_before": 76,
  "tokens_after": 17,
  "reduction_pct": 78,
  "signal_preserved": ["technical comparison", "decision table format"],
  "warning": null
}
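
The statistics in this response can be sanity-checked on the client. The sketch below assumes that reduction_pct is the percentage of tokens removed, rounded to the nearest integer. That formula matches the sample above (76 tokens down to 17), but it is an assumption rather than documented behavior, and the helper name is hypothetical.

```python
import json


def reduction_pct(tokens_before: int, tokens_after: int) -> int:
    """Percentage of tokens removed, rounded to the nearest integer (assumed formula)."""
    if tokens_before <= 0:
        raise ValueError("tokens_before must be positive")
    return round((1 - tokens_after / tokens_before) * 100)


# The sample response shown above, as the backend might return it.
raw = """{
  "compressed": "Compare PostgreSQL vs MongoDB: when to use each. Skip basics. Include decision table.",
  "tokens_before": 76,
  "tokens_after": 17,
  "reduction_pct": 78,
  "signal_preserved": ["technical comparison", "decision table format"],
  "warning": null
}"""

result = json.loads(raw)
computed = reduction_pct(result["tokens_before"], result["tokens_after"])

# Compare the reported figure with the recomputed one.
print("reported:", result["reduction_pct"], "recomputed:", computed)

# "warning" is null (None in Python) when the backend has nothing to flag.
if result["warning"] is not None:
    print("warning:", result["warning"])
```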

Extension Guide

Installing in Developer Mode (Chrome / Brave / Opera / Edge)

Until the extension is live on the Chrome Web Store, install it manually:

1. Download the extension ZIP: Click the download link above. Save and extract the ZIP file.

2. Open Extensions: Navigate to chrome://extensions/ in your browser.

3. Enable Developer Mode: Toggle the switch in the top-right corner of the extensions page.

4. Load Unpacked: Click "Load unpacked" and select the extracted extension folder.

5. Pin Lakon: Click the puzzle icon in your toolbar, find Lakon, and click the pin icon.

Using the Extension

After installation, open Claude, ChatGPT, or Gemini. Type your prompt as usual. You'll see a Lakon button appear next to the send button. Click it — your prompt is replaced with the compressed version instantly.

Supported Platforms

claude.ai
chatgpt.com
gemini.google.com

Playground (Web App)

The web app at lakonai.vercel.app/app lets you compress prompts without installing the extension. It uses the same backend as the extension, so compression quality is identical.

When to use the web app

You're on a computer where you can't install extensions
You want to compress a very long prompt before pasting it into an AI tool
You want to test how Lakon works before installing

Keyboard Shortcut

Press ⌘ + Enter (Mac) or Ctrl + Enter (Windows) to compress.

Context Snapshots

The Continuity Engine analyzes your massive chat logs and generates a clean "Context Snapshot" using a specialized Map-Reduce pipeline running on Llama 3.3 70B.

How it works

1. Map Phase: Lakon chunks your chat log and extracts the ultimate goal, key decisions, hard constraints, and open tasks.

2. Reduce Phase: It merges these intermediate snapshots into a final, unified JSON structure.

3. Continuation Prompt: It then generates a rich first-person briefing paragraph from the final snapshot. Paste this into a new chat to bring a fresh AI up to speed without carrying over the raw, cluttered history.
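
The map and reduce phases above can be sketched in a few lines. This is an illustration of the general pattern, not Lakon's actual pipeline: the field names, the character-based chunking, the dedupe-and-concatenate merge, and the keyword_extract stand-in (which replaces the Llama 3.3 70B call) are all assumptions made for the example.

```python
from typing import Callable

Snapshot = dict[str, list[str]]
# Hypothetical field names, mirroring the four items the map phase extracts.
FIELDS = ("goal", "decisions", "constraints", "open_tasks")


def chunk_log(log: str, max_chars: int = 4000) -> list[str]:
    """Split a log into chunks of roughly max_chars, breaking only on line
    boundaries (a single oversized line stays whole)."""
    chunks, current = [], ""
    for line in log.splitlines(keepends=True):
        if current and len(current) + len(line) > max_chars:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks


def merge(snapshots: list[Snapshot]) -> Snapshot:
    """Reduce phase: combine each field across chunks, dropping duplicates
    while keeping first-seen order."""
    merged: Snapshot = {field: [] for field in FIELDS}
    for snap in snapshots:
        for field in FIELDS:
            for item in snap.get(field, []):
                if item not in merged[field]:
                    merged[field].append(item)
    return merged


def build_snapshot(log: str, extract: Callable[[str], Snapshot],
                   max_chars: int = 4000) -> Snapshot:
    """Map every chunk through extract, then reduce the partial snapshots."""
    return merge([extract(chunk) for chunk in chunk_log(log, max_chars)])


def keyword_extract(chunk: str) -> Snapshot:
    """Toy stand-in for the model call: reads lines with explicit prefixes."""
    prefixes = {"GOAL:": "goal", "DECISION:": "decisions",
                "CONSTRAINT:": "constraints", "TODO:": "open_tasks"}
    snap: Snapshot = {field: [] for field in FIELDS}
    for line in chunk.splitlines():
        for prefix, field in prefixes.items():
            if line.startswith(prefix):
                snap[field].append(line[len(prefix):].strip())
    return snap


log = (
    "GOAL: ship the billing API\n"
    "DECISION: use PostgreSQL\n"
    "TODO: write migration\n"
    "DECISION: use PostgreSQL\n"
)
snapshot = build_snapshot(log, keyword_extract, max_chars=40)
print(snapshot)
```

In a real pipeline the extract argument would wrap a model call, and the merged structure would feed the step that writes the continuation prompt.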

API Reference

The Lakon backend exposes a simple REST API. You can call it directly from your own code.

Base URL

https://lakon-api.onrender.com

POST /compress

Compresses a prompt. Returns the compressed text and token statistics.

Request body
{
  "prompt": "string (required)",
  "task_type": "auto | coding | writing | analysis | creative | data",
  "compression_mode": "strict | balanced | creative"
}
Response
{
  "compressed": "string",
  "tokens_before": number,
  "tokens_after": number,
  "reduction_pct": number,
  "signal_preserved": string[],
  "warning": string | null
}
Example (curl)
curl -X POST https://lakon-api.onrender.com/compress \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Help me understand React hooks.", "task_type": "auto"}'
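
The same call can be made from Python with only the standard library. This is a minimal sketch: the function names are hypothetical, the "balanced" default for compression_mode is an assumption (the reference does not state a default), and the 60-second timeout is chosen only to leave room for a cold start.

```python
import json
import urllib.request

BASE_URL = "https://lakon-api.onrender.com"  # Base URL from this reference


def build_compress_request(prompt: str, task_type: str = "auto",
                           compression_mode: str = "balanced") -> urllib.request.Request:
    """Build (but do not send) a POST /compress request."""
    body = json.dumps({
        "prompt": prompt,
        "task_type": task_type,
        "compression_mode": compression_mode,  # default value is an assumption
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/compress",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compress(prompt: str, timeout: float = 60.0, **options: str) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_compress_request(prompt, **options)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling compress("Help me understand React hooks.") would return the response object described above as a Python dict. Building the request separately from sending it keeps the payload easy to inspect without touching the network.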

GET /health

Returns server status. Use this to wake the server on cold start.

curl https://lakon-api.onrender.com/health
# → {"status": "ok", "message": "Server is awake"}
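
To wake the server before sending real work, a client can poll this endpoint until it answers. The sketch below is one way to do that; the function names, retry count, and delay are hypothetical, and the fetch and sleep parameters are injectable only so the loop can be exercised without a network.

```python
import json
import time
import urllib.request
from typing import Callable

HEALTH_URL = "https://lakon-api.onrender.com/health"


def fetch_health(timeout: float = 10.0) -> dict:
    """Call GET /health once and decode the JSON body."""
    with urllib.request.urlopen(HEALTH_URL, timeout=timeout) as response:
        return json.load(response)


def wait_until_awake(fetch: Callable[[], dict] = fetch_health,
                     attempts: int = 6, delay: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until the server reports status "ok" or the attempts run out."""
    for attempt in range(attempts):
        try:
            if fetch().get("status") == "ok":
                return True
        except (OSError, ValueError):
            # OSError covers connection errors and timeouts (URLError is a
            # subclass); ValueError covers a body that is not valid JSON.
            pass
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

With the defaults, the loop waits up to about 25 seconds between the first and last attempt, not counting the time each request itself takes.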

POST /snapshot

Generates a Context Snapshot and Continuation Prompt from a raw conversation log.

Request body
{
  "conversation": "string (required, the raw chat log)"
}

Rate Limits

3 requests per minute per IP address. Designed for interactive use, not batch processing.
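
A client that sends several requests in a row can pace itself to stay under this limit. The sketch below uses a sliding 60-second window; whether the server counts requests in a sliding or a fixed window is not documented, so this is a conservative assumption, and the class name is hypothetical.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most max_calls within any window of `period` seconds."""

    def __init__(self, max_calls: int = 3, period: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self._sent: deque[float] = deque()  # timestamps of recent requests

    def wait_time(self) -> float:
        """Seconds until the next request is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Forget requests that have left the window.
        while self._sent and now - self._sent[0] >= self.period:
            self._sent.popleft()
        if len(self._sent) < self.max_calls:
            return 0.0
        return self.period - (now - self._sent[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self._sent.append(self.clock())

    def acquire(self, sleep: Callable[[float], None] = time.sleep) -> None:
        """Block until a request is allowed, then record it."""
        delay = self.wait_time()
        if delay > 0:
            sleep(delay)
        self.record()
```

Calling limiter.acquire() before each request keeps a script within three calls per minute, at the cost of pausing when the window is full.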

FAQ

Is this just removing words?
No. Lakon uses a compression model that understands LLM attention mechanics. It restructures prompts — moving signal to high-attention zones — rather than just deleting words. Every technical constraint you specify survives.
Will the AI give me a worse answer?
No. In most cases the answer is identical. In some cases it's better, because the AI receives a cleaner signal with less noise to process. The compression is specifically designed to preserve intent.
Does Lakon store my prompts?
No. Your prompt is sent to the compression backend, processed, and the result is returned. Nothing is stored or logged beyond what Render.com keeps in server logs (IP address, timestamp).
Why is the first request slow?
The backend runs on Render's free tier, which spins down after inactivity. The first request after a period of no activity takes 20–30 seconds to wake the server. Subsequent requests are fast.
Can I use this for very long prompts?
Yes, up to 12,000 tokens (roughly 9,000 words). Above that, the web app will warn you. The extension has no hard limit.
Why build this as an extension?
Because that's where the friction is. The moment you need to compress a prompt, you're already inside Claude or ChatGPT. Leaving the tab to paste into a web app creates enough friction that people stop using the tool. The extension removes that friction entirely.
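
For the 12,000-token limit mentioned in the FAQ, a prompt's size can be estimated before it is submitted. The sketch below reuses the ratio the FAQ implies (12,000 tokens for roughly 9,000 words, or 0.75 words per token). Real tokenizers vary by model and language, so this is a rough estimate, and the function names are hypothetical.

```python
import math

MAX_TOKENS = 12_000
WORDS_PER_TOKEN = 9_000 / 12_000  # 0.75, the ratio implied by the FAQ


def estimate_tokens(text: str) -> int:
    """Rough token estimate from a whitespace word count."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)


def exceeds_limit(text: str) -> bool:
    """True if the estimate is above the 12,000-token limit."""
    return estimate_tokens(text) > MAX_TOKENS
```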