Tokens are the units in which LLMs process text (typically pieces of words), and the context window is the maximum number of tokens an LLM can consider at once. Understanding both matters for using LLMs effectively, managing costs, and working within their limits.
What tokens are
TOKEN → the unit LLMs process text in (not words/characters, but PIECES):
→ text is split into tokens (roughly 4 characters or 0.75 words each in English)
→ e.g. 'unbelievable' might be 3 tokens; common words are often 1 token
→ the model processes and generates token by token
→ input and output are both measured in tokens (this is what costs and limits are counted in)
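
As a rough sketch of the rule of thumb above, the following Python estimates a token count from character and word counts. The function names and constants are illustrative assumptions, not part of any library, and a real tokenizer will give different counts, especially for code or non-English text.

```python
import math

# Rough English averages taken from the rule of thumb above (assumptions,
# not exact values for any particular tokenizer).
CHARS_PER_TOKEN = 4
WORDS_PER_TOKEN = 0.75


def estimate_tokens_by_chars(text: str) -> int:
    """Estimate token count from character length (about 4 chars per token)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_by_words(text: str) -> int:
    """Estimate token count from word count (about 0.75 words per token)."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)


sample = "The quick brown fox jumps over the lazy dog."
print(estimate_tokens_by_chars(sample))  # 44 characters / 4 -> 11
print(estimate_tokens_by_words(sample))  # 9 words / 0.75 -> 12
```

The two estimates differ slightly, which is expected: both are approximations, useful for budgeting but not a substitute for counting with the model's actual tokenizer.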
