Docs | Lakon

What is Lakon?

Lakon is a semantic prompt compressor. It takes the prompts you write for Claude, ChatGPT, or Gemini and removes everything that wastes tokens — polite phrasing, redundant context, scaffolding — while preserving every piece of signal the AI needs to answer correctly.

It is not a prompt improver. It is not a rewriter. It is a compression engine. The output is always shorter. The AI's response is always equivalent.

The core claim: 78% fewer tokens on average. Same quality answer from the AI.

Quick Start

Option 1 — Browser Extension (Recommended)

Install the extension and compress directly inside Claude, ChatGPT, or Gemini. No copy-paste. No tab switching.

↓ Chrome ↓ Brave ↓ Opera ↓ Edge

Option 2 — Web App

No installation needed. Paste your prompt at /app and compress directly.

How It Works

Lakon sends your prompt to a compression backend powered by Groq. A specialized system prompt instructs the model to:

Remove filler

Polite phrasing, hedging words, redundant restatements of the request.

Restructure for attention

LLMs pay most attention to the beginning and end of a prompt (primacy/recency effect). Lakon moves signal to those zones.

Preserve every constraint

Frameworks, formats, word counts, tone instructions — all survive compression exactly as specified.

Output only valid JSON

The backend returns compressed text, token counts, and preserved signals — never commentary.

What it returns

{
  "compressed": "Compare PostgreSQL vs MongoDB: when to use each. Skip basics. Include decision table.",
  "tokens_before": 76,
  "tokens_after": 17,
  "reduction_pct": 78,
  "signal_preserved": ["technical comparison", "decision table format"],
  "warning": null
}
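
The reduction_pct value in the sample is consistent with the usual percentage-saved formula. A minimal Python sketch, assuming the backend rounds to the nearest whole percent (the rounding rule is not documented on this page, and reduction_pct below is a hypothetical helper, not part of Lakon):

```python
def reduction_pct(tokens_before: int, tokens_after: int) -> int:
    """Percentage of tokens removed, rounded to the nearest integer."""
    if tokens_before <= 0:
        raise ValueError("tokens_before must be positive")
    # Assumption: nearest-integer rounding; the backend may round differently.
    return round(100 * (tokens_before - tokens_after) / tokens_before)


# Values from the sample response above: (76 - 17) / 76 is about 77.6%.
print(reduction_pct(76, 17))  # 78
```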

Extension Guide

Installing in Developer Mode (Chrome / Brave / Opera / Edge)

Until the extension is live on the Chrome Web Store, install it manually:

1. Download the extension ZIP

Click the download link above. Save and extract the ZIP file.

2. Open Extensions

Navigate to chrome://extensions/ in your browser.

3. Enable Developer Mode

Toggle the switch in the top-right corner of the extensions page.

4. Load Unpacked

Click "Load unpacked" and select the extracted extension folder.

5. Pin Lakon

Click the puzzle icon in your toolbar → find Lakon → click the pin icon.

Using the Extension

After installation, open Claude, ChatGPT, or Gemini. Type your prompt as usual. You'll see a Lakon button appear next to the send button. Click it — your prompt is replaced with the compressed version instantly.

Supported Platforms

claude.ai, chatgpt.com, gemini.google.com

Web App

The web app at lakonai.vercel.app/app lets you compress prompts without installing the extension. It uses the same backend as the extension, so compression quality is identical.

When to use the web app

→ You're on a computer where you can't install extensions

→ You want to compress a very long prompt before pasting it into an AI tool

→ You want to test how Lakon works before installing

Keyboard Shortcut

Press ⌘ + Enter (Mac) or Ctrl + Enter (Windows) to compress.

API Reference

The Lakon backend exposes a simple REST API. You can call it directly from your own code.

Base URL

https://lakon-api.onrender.com

POST /compress

Compresses a prompt. Returns the compressed text and token statistics.

Request body

{
  "prompt": "string (required)",
  "task_type": "auto | coding | writing | analysis | creative | data",
  "compression_mode": "strict | balanced | creative"
}
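
When calling the API from your own code, it can help to check the enumerated values before sending anything. The following Python sketch builds a request body from the fields listed above. build_payload is a hypothetical helper; defaulting task_type to "auto" and omitting compression_mode when it is not set are assumptions, since this page does not state the server-side defaults.

```python
# Allowed values, copied from the request body description above.
TASK_TYPES = {"auto", "coding", "writing", "analysis", "creative", "data"}
COMPRESSION_MODES = {"strict", "balanced", "creative"}


def build_payload(prompt, task_type="auto", compression_mode=None):
    """Build a /compress request body, rejecting values the docs do not list."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if task_type not in TASK_TYPES:
        raise ValueError(f"unknown task_type: {task_type!r}")
    payload = {"prompt": prompt, "task_type": task_type}
    # Assumption: compression_mode is optional and can be left out entirely.
    if compression_mode is not None:
        if compression_mode not in COMPRESSION_MODES:
            raise ValueError(f"unknown compression_mode: {compression_mode!r}")
        payload["compression_mode"] = compression_mode
    return payload
```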

Response

{
  "compressed": "string",
  "tokens_before": number,
  "tokens_after": number,
  "reduction_pct": number,
  "signal_preserved": string[],
  "warning": string | null
}
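
On the client side, the response can be decoded into a typed object so that a missing field fails early instead of surfacing later as a KeyError. A minimal Python sketch; CompressResult and parse_compress_response are hypothetical names, and the field list simply mirrors the schema above:

```python
import json
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class CompressResult:
    compressed: str
    tokens_before: int
    tokens_after: int
    reduction_pct: float
    signal_preserved: List[str]
    warning: Optional[str]


def parse_compress_response(body: str) -> CompressResult:
    """Decode a /compress response, failing loudly if a documented field is missing."""
    data = json.loads(body)
    names = [f.name for f in fields(CompressResult)]
    missing = [name for name in names if name not in data]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    # Undocumented extra fields, if any, are ignored.
    return CompressResult(**{name: data[name] for name in names})
```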

Example (curl)

curl -X POST https://lakon-api.onrender.com/compress \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Help me understand React hooks.", "task_type": "auto"}'
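
The same call can be made from Python with only the standard library. This is a sketch rather than an official client: make_compress_request and compress are hypothetical names, and the 60-second timeout is an assumption chosen to leave room for a cold start.

```python
import json
import urllib.request

BASE_URL = "https://lakon-api.onrender.com"  # Base URL given on this page


def make_compress_request(prompt: str, task_type: str = "auto") -> urllib.request.Request:
    """Build the POST /compress request without sending it."""
    body = json.dumps({"prompt": prompt, "task_type": task_type}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/compress",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compress(prompt: str, task_type: str = "auto", timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = make_compress_request(prompt, task_type)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (makes a real network call):
# compress("Help me understand React hooks.")["compressed"]
```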

GET /health

Returns server status. Use this to wake the server on cold start.

curl https://lakon-api.onrender.com/health
# → {"status": "ok", "message": "Server is awake"}
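
A client can poll this endpoint before its first compression request so that the user does not wait on a sleeping server. A hedged Python sketch: wait_until_awake and health_ok are hypothetical helpers, and the attempt count, delay, and timeout are assumptions rather than documented values.

```python
import json
import time
import urllib.request
from typing import Callable


def health_ok(base_url: str = "https://lakon-api.onrender.com", timeout: float = 35.0) -> bool:
    """Return True if GET /health answers with status "ok"."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as response:
        return json.load(response).get("status") == "ok"


def wait_until_awake(
    check: Callable[[], bool] = health_ok,
    attempts: int = 6,
    delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check() until it returns True or the attempts run out."""
    for attempt in range(attempts):
        try:
            if check():
                return True
        except OSError:
            pass  # network errors and timeouts while the server is still waking
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Passing check and sleep as parameters keeps the retry loop testable without touching the network.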

Rate Limits

3 requests per minute per IP address. Designed for interactive use, not batch processing.
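
Scripts that call the API can pace themselves to stay under this limit. The sketch below is a client-side sliding-window limiter in Python. It assumes the server counts requests over a rolling 60-second window; the page does not say whether the window is rolling or fixed, and SlidingWindowLimiter is a hypothetical class.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any `window`-second span."""

    def __init__(self, limit: int = 3, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.sent = deque()  # timestamps of recent requests, oldest first

    def wait_time(self) -> float:
        """Seconds to wait before the next request is allowed (0.0 if none)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        return self.window - (now - self.sent[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self.sent.append(self.clock())
```

Before each request, sleep for wait_time() seconds, send the request, then call record().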

FAQ

Is this just removing words?

No. Lakon uses a compression model that understands LLM attention mechanics. It restructures prompts — moving signal to high-attention zones — rather than just deleting words. Every technical constraint you specify survives.

Will the AI give me a worse answer?

No. In most cases the answer is identical. In some cases it's better, because the AI receives a cleaner signal with less noise to process. The compression is specifically designed to preserve intent.

Does Lakon store my prompts?

No. Your prompt is sent to the compression backend, processed, and the result is returned. Nothing is stored or logged beyond what Render.com keeps in server logs (IP address, timestamp).

Why is the first request slow?

The backend runs on Render's free tier, which spins down after inactivity. The first request after a period of no activity takes 20–30 seconds to wake the server. Subsequent requests are fast.

Can I use this for very long prompts?

Yes, up to 12,000 tokens (roughly 9,000 words). Above that, the web app will warn you. The extension has no hard limit.
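
To check a prompt against this limit before sending it, a rough estimate can be derived from the ratio quoted above (12,000 tokens for about 9,000 words, or 4/3 tokens per word). This Python sketch is a heuristic only: real tokenizers vary by model and language, and estimate_tokens and exceeds_limit are hypothetical helpers.

```python
TOKEN_LIMIT = 12_000              # web app warning threshold from this page
TOKENS_PER_WORD = 12_000 / 9_000  # ratio implied by the figures above


def estimate_tokens(text: str) -> int:
    """Rough token estimate from a whitespace word count."""
    return round(len(text.split()) * TOKENS_PER_WORD)


def exceeds_limit(text: str) -> bool:
    """True if the estimated token count is above the documented limit."""
    return estimate_tokens(text) > TOKEN_LIMIT
```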

Why build this as an extension?

Because that's where the friction is. The moment you need to compress a prompt, you're already inside Claude or ChatGPT. Leaving the tab to paste into a web app creates enough friction that people stop using the tool. The extension removes that friction entirely.