
Frequently Asked Questions

What is a context window in LLMs?

A context window is the maximum number of tokens (input + output combined) that a language model can process in a single request. If your prompt uses 80% of the context window, only 20% is left for the model's response.
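As a rough illustration, the budget math is simple subtraction; the 128K window size below is an assumed example value, not any particular model's limit:

```python
# Sketch of context-window budgeting: the window covers input + output
# combined, so whatever the prompt uses is unavailable to the response.
CONTEXT_WINDOW = 128_000  # assumed example window size, in tokens

def remaining_tokens(prompt_tokens: int, window: int = CONTEXT_WINDOW) -> int:
    """Tokens left for the model's response after the prompt."""
    return max(window - prompt_tokens, 0)

# A prompt filling 80% of the window leaves only 20% for the response.
prompt_tokens = int(CONTEXT_WINDOW * 0.80)        # 102,400 tokens
print(remaining_tokens(prompt_tokens))            # 25,600 tokens left
```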

How do I know if my prompt is too long?

If your prompt uses more than 80% of a model's context window, the response will be limited. The Prompt Sizer tool shows exactly how much room is left. We recommend keeping prompts under 50% of the context window for most use cases.

What's the difference between tokens and words?

Tokens are the fundamental units LLMs process. A token is roughly ¾ of a word in English. "Hello world" is 2 words but typically 2 tokens, while "unprecedented" is 1 word but may be 3-4 tokens. Code tends to use more tokens per character than natural language.
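Exact counts require the model's own tokenizer, but the ¾-of-a-word rule of thumb can be sketched directly; this heuristic is an approximation and will disagree with real tokenizers on short or unusual text:

```python
import math

def estimate_tokens(text: str) -> int:
    """Rough token estimate from the ~3/4-word rule of thumb
    (1 word ~ 1.33 tokens). Not a real tokenizer; treat the
    result as a ballpark figure only."""
    words = len(text.split())
    return math.ceil(words / 0.75)

print(estimate_tokens("Paste your system prompt here"))
```

For precise counts, use the tokenizer that ships with the model you are targeting; word-based heuristics drift most on code and non-English text.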

Which model has the largest context window?

As of 2026, Llama 4 Scout has the largest context window at 10 million tokens. Google's Gemini 2.5 models and OpenAI's GPT-4.1 family support 1 million tokens. Most other models support 128K-200K tokens.

How is cost calculated?

Cost is calculated per million tokens. Input tokens (your prompt) and output tokens (the model's response) are priced separately. We show the cost of your prompt plus what a full-context-length response would cost — the maximum you'd pay.
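The worst-case calculation above can be sketched as follows; the prices and the 200K window are assumed example figures, since real per-million-token rates vary by model:

```python
# Assumed example rates, in dollars per 1M tokens (real prices vary by model).
INPUT_PRICE = 3.00
OUTPUT_PRICE = 15.00

def max_cost(prompt_tokens: int, window: int) -> float:
    """Prompt cost plus the cost of a response that fills the rest
    of the context window: the maximum you'd pay for one request."""
    output_tokens = window - prompt_tokens
    return (prompt_tokens * INPUT_PRICE
            + output_tokens * OUTPUT_PRICE) / 1_000_000

# A 10K-token prompt in an assumed 200K-token window:
print(f"${max_cost(10_000, 200_000):.2f}")  # $2.88
```

Output tokens are typically priced several times higher than input tokens, which is why the full-context-response figure dominates the maximum.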
