AI cheat sheet
AI terms in plain English.
This page turns common AI words into plain language. If a launch post, benchmark chart, or product demo sounds confusing, start here, then go back to the story.
- Short definitions
- Why each term matters
- Useful for news readers
How to use this page
If a company says a model has a bigger context window, better reasoning, or agent skills, you can scan this page in a minute and know what those words actually mean before deciding whether the claim is impressive.
Model basics
These are the words that show up in almost every AI launch, product page, or news story.
Core term
LLM
A large language model is a system trained on a huge amount of writing so it can predict words and respond like a chat partner.
Why it matters: when a company says it launched a new model, this is usually what it means.
Core term
Token
A token is a small chunk of text. Models read and write tokens, not full paragraphs all at once.
Why it matters: token limits affect how much the model can read, remember, and answer in one go.
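For the curious, here is a rough sketch of token counting in Python. It is only an illustration: real tokenizers split text into sub-word pieces (often around 1.3 tokens per English word), so this word-level count is just an estimate.

```python
# Toy illustration only. Real tokenizers (BPE and similar) split text into
# sub-word chunks, so actual counts are usually a bit higher than this.
def rough_token_count(text: str) -> int:
    # Split on whitespace as a crude stand-in for real tokenization.
    return len(text.split())

sentence = "Models read and write tokens, not full paragraphs."
print(rough_token_count(sentence))
```

This is why a "128,000 token" limit does not mean 128,000 words: the real number of words that fits is somewhat lower.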
Core term
Context window
This is how much text, code, or conversation a model can keep in view at one time before older details start falling out.
Why it matters: a bigger context window can help with long documents, large code files, and longer chats.
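A minimal sketch of what "older details fall out" means in practice: chat apps often trim the oldest messages so the conversation fits the window. The word-count budget here is a stand-in for a real token count.

```python
def trim_to_window(messages: list, max_tokens: int) -> list:
    """Keep the most recent messages whose rough token total fits the window."""
    kept, total = [], 0
    for msg in reversed(messages):          # walk from newest to oldest
        cost = len(msg.split())             # crude word-level token estimate
        if total + cost > max_tokens:
            break                           # older messages fall out of view
        kept.append(msg)
        total += cost
    return list(reversed(kept))

chat = ["Hi there", "Tell me about context windows", "Why does size matter"]
print(trim_to_window(chat, 8))
```

With a budget of 8, the oldest message is dropped first, which is why a long chat can "forget" its own beginning.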
Core term
Fine-tuning
Fine-tuning means taking a general model and training it more so it becomes better at one job, one style, or one company’s data.
Why it matters: this is how companies try to make a broad model more useful for their own workflows.
Core term
Inference
Inference is the moment the model does the actual work of answering your prompt after training is already finished.
Why it matters: many pricing and speed claims are really about making inference cheaper or faster.
Core term
Open weights
Open weights means a company shares a model's trained files (its weights) so other people can run the model themselves, usually with some license rules attached.
Why it matters: open weights can make it easier to self-host, inspect, or customize a model.
Trust and quality words
These are the terms that usually tell you whether a model is trustworthy, tested, or still shaky.
Risk word
Hallucination
This means the model says something that sounds confident but is wrong, made up, or unsupported by evidence.
Why it matters: a polished answer can still be false, so confidence alone should never be treated as proof.
Risk word
Benchmark
A benchmark is a test used to compare models. It can be useful, but it does not always tell you how well a tool works in normal real-world tasks.
Why it matters: a benchmark win can sound huge even when the everyday product barely changes.
Risk word
Safety policy
A safety policy is the rulebook for what a model should refuse, warn about, or handle carefully.
Why it matters: policy changes can affect what a model will answer, who gets access, and how risky the product feels.
Quality word
RAG
RAG stands for retrieval-augmented generation. It means the model looks up outside information first, then answers using that extra material.
Why it matters: this can improve accuracy when the system actually uses fresh or trusted sources.
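A toy sketch of the "look up first, then answer" idea. Real RAG systems match meaning with embeddings and a vector database; this version just counts shared words, which is enough to show the shape of it.

```python
import re

def words(text: str) -> set:
    # Lowercase word set; real systems compare meaning, not exact words.
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(question: str, documents: list) -> str:
    # Pick the document that shares the most words with the question.
    return max(documents, key=lambda d: len(words(question) & words(d)))

def build_prompt(question: str, documents: list) -> str:
    source = retrieve(question, documents)
    return f"Using this source: {source}\n\nQuestion: {question}"

docs = [
    "Paris is the capital of France.",
    "Tokyo is the capital of Japan.",
]
print(build_prompt("What is the capital of France?", docs))
```

The model then answers from the retrieved source instead of memory alone, which is where the accuracy gain comes from, if the sources are good.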
Quality word
Evaluation
An evaluation is a broader set of tests used to check whether a model is accurate, safe, reliable, or useful for a specific job.
Why it matters: serious teams show evaluations, not just marketing lines and dramatic demo videos.
Quality word
Reasoning
Reasoning usually means the model is better at multi-step thinking, problem solving, or sticking with a harder task without drifting.
Why it matters: this is one of the most common upgrade claims, so it helps to ask what real tasks improved.
Product and rollout words
These are the words that explain what the AI product can actually do for a person or a company.
Product term
Agent
An agent is an AI system that does more than chat. It can plan steps, use tools, and carry out tasks with less hand-holding.
Why it matters: agent claims usually suggest the product can take action instead of only talking.
Product term
Multimodal
Multimodal means the system can work with more than one kind of input or output, like text, image, audio, or video.
Why it matters: multimodal systems can read a photo, hear a question, and answer in text or voice.
Product term
API
An API is the way developers connect their own app or website to an AI model so the AI becomes part of another product.
Why it matters: many model launches matter most because they change what developers can build or afford.
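To make "API" concrete, here is the kind of request a developer's app typically sends to a model provider. The field names below are illustrative, not any specific vendor's schema, and no network call is made.

```python
import json

def build_chat_request(model: str, user_message: str) -> str:
    """Build the JSON body an app might POST to a model provider's API.

    Field names here are generic placeholders; every provider documents
    its own exact schema.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return json.dumps(payload)

print(build_chat_request("example-model", "Summarize this article for me."))
```

The point for news readers: a cheaper or faster API changes what thousands of other apps can build, which is often the real story behind a launch.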
Product term
Latency
Latency is the delay between the moment you ask for something and the moment the system starts answering.
Why it matters: lower latency makes AI feel faster, smoother, and more usable in real products.
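Latency is easy to measure yourself: time the gap before the first piece of the answer arrives. The "model" below is a stand-in that just sleeps, not a real system.

```python
import time

def time_first_chunk(generate):
    """Return the first chunk of a streamed answer and how long it took."""
    start = time.perf_counter()
    first_chunk = next(generate())          # wait for the first piece only
    return first_chunk, time.perf_counter() - start

def fake_stream():
    # Stand-in for a real model: pause, then stream the answer in pieces.
    time.sleep(0.05)
    yield "Hello"
    yield " world"

chunk, delay = time_first_chunk(fake_stream)
print(f"first chunk {chunk!r} after {delay:.3f}s")
```

Products often report this "time to first token" because users feel that delay much more than the total time to finish the answer.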
Product term
Deployment
Deployment means moving a model from an announcement or test into the real world where people, teams, or customers can actually use it.
Why it matters: the jump from demo to deployment is where many bold claims either hold up or fall apart.
Product term
Copilot
Copilot usually means an AI helper that sits inside another app and tries to assist you while you work.
Why it matters: the word sounds friendly, but the real question is what work it can truly save you.
Quick questions to ask when you read AI news
- Is this a real product, or only a demo?
- Who can use it right now, and who cannot?
- What is actually new, and what is just a new name for an older tool?
- Did the company show proof, customer use, or independent testing?
