AI buzzwords are everywhere. Terms are thrown around in dialogue that feel like they've existed for decades but have only emerged in the past year or two. I've experienced a broad disparity between two groups of people: those I would consider "AI power users," regularly on the bleeding edge of every new innovation, and those who have only dabbled in the subject and wouldn't be able to grasp whatever "agentic" thing everyone is raving about. I do my best to be aware of the wave of AI hype and keep myself informed of all the lingo. If you're feeling like you're behind, or have no idea what "an agentic workflow integrating with an MCP server using the latest model from Anthropic with the larger context window" is, read on.
I'd like to provide short explainers of a few terms related to the GenAI tools that have sprung up within just the last 12 months. I will approach this from first principles, so that anyone feeling behind on the discourse can quickly catch up. We will cover the following terms: LLM, Hallucination, Context Window, System Prompt, Agent, MCP, Tool, and Skill.
A lot of what is presented as fact or best-practice was accurate as far as I know at the time of writing. These statements might get stale within a year or a month, it's truly hard to say with the pace of change.
Large Language Model
Large Language Models or LLMs are the foundation of the latest wave of AI hype and the core of what powers most of the modern AI technology mentioned below. You can read up on the theory and science behind LLMs as well as the years of research that went into taking them from concept to product. There are even some who argue that LLMs are fundamentally the wrong path to take humanity into a new golden age of AI and technology. At their core, they are computationally massive systems trained to predict what comes next given an input, specifically, what token (a unit of text, roughly a word or word fragment) follows said input. That basic premise, scaled up with massive computational power and training data sourced from the internet, manifests in higher-level abstractions like the chat interfaces you've likely already used.
Models like ChatGPT, Claude, and Gemini are typically referenced as "frontier models" and stand at the cutting edge of generative AI capabilities. There are dozens of other models available, spanning various specializations and scales, with some being fully open source.
Hallucination
One of the most important characteristics of LLMs to understand, one that shapes all the techniques built on top of them, is that they are fundamentally non-deterministic. Said another way, they're probabilistic, meaning if you provide the same input twice given exactly the same context, there is no guarantee the output will be the same. You know that disclaimer tucked into basically every LLM chat tool you might use? "Insert LLM name here can make mistakes." This disclaimer warns of hallucination, which refers to a model generating confident but factually incorrect or fabricated information. This stems from the model's probabilistic nature. It has no inherent ability to understand its inputs or access ground truth about the world. If you'd like a demonstration, just try asking a model like Claude Sonnet 4.6 how many R's are in the word strawberry.
Context Window
LLMs have what is called a context window, which is a measure, in tokens, of how much text the model can consider at once when generating a response. Modern LLMs commonly have context windows of hundreds of thousands of tokens up to millions.
One might assume that a model with a 1 million token context window would deliver consistent output quality all the way up to that limit. However, modern LLMs suffer from what's colloquially known as context rot. Once a model's context window reaches around 80% of its limit, the model's output quality takes a significant drop. Because LLMs are functionally probabilistic "next word" guessers, every time a new input gets sent in, it is really just appended to the end of the ongoing session and that entire session is then sent back into the model for the following output to be generated.
When operating over the course of an extended session with an LLM, it's recommended to periodically either start a fresh session when the utilized context window is approaching around 80%, or perform what is known as compacting the session. This basically feeds the entire session history back into a fresh session with the model to be summarized and used as a more token-efficient representation of the context built up in the session up to that point.
System Prompt
Products built on LLMs will typically use what's called a System Prompt, which is a set of instructions automatically inserted before your input, giving the model guiding principles and constraints to follow when generating responses. You can browse Anthropic's published system prompts for Claude for a real-world example.
Agent
The simplest implementation of an LLM is one of basic text input and output. This was the first iteration of ChatGPT that was released and provides more-or-less a direct experience of what LLMs provide as a technology. Prompt in, generated content out.
Agents distinguish themselves from basic LLM interactions in two key ways: First, they can execute multiple steps before producing a final output, including follow-up clarification and iterative planning. Second, they can interface with external systems via APIs, Tools, and MCP servers (explained below), allowing them to act on the world rather than just respond to it. Because of this inherent flexibility and underlying power of LLMs, all varieties of Agents have sprung up from those that can code autonomously all the way to ones that can provide automated customer service over the phone.
Model Context Protocol (MCP)
The Model Context Protocol or MCP was created by Anthropic, the creators of the Claude series of frontier models, and provides an open source specification by which LLMs can integrate with external systems. Much like REST provides a standard by which systems can publish an HTTP-based API for applications to consume, MCP provides a specification by which servers can expose an interface tailored for LLM clients.
Though competing specifications like Agent-2-Agent exist, and despite significant scrutiny from multiplecybersecurityresearchers, MCP remains a widely adopted specification with broad product support.
Tool
A Tool is how an Agent interacts outside of its own process boundary. It is a discrete, callable function that an Agent can invoke to interact with something outside its own context like reading a file, querying an API, or running a terminal command. For example, a minimal description of what a "Coding Agent" is can be summarized into 3 parts: 1. an LLM that provides the engine of the agent, 2. a set of tools that can read files, create files, and edit files, and 3. a system prompt that describes to the LLM its purpose and scope as a coding agent.
Unlike MCP, which is a broad and general protocol, individual Tools are narrowly scoped to a specific capability: a single API endpoint, a file operation, or a shell command. This has advantages such as allowing certain models to be more selective about which tools they pull into their working context for a given task. Tools can also provide a more robust security context within which a model is given permissions to operate and thereby reduce the risk of unwanted actions being executed.
Skill
Skills are an open specification from Anthropic that encode specific agent capabilities in a Markdown file, augmenting what an Agent can do beyond its default behavior. Skills allow for repeatable tasks to be defined and quickly invoked. Typically, a Skill describes a particular task, the Tools and integrations needed to accomplish it, how to use those integrations, and the step-by-step process to complete it.
One of my favorite skills is also one of the simplest: /grill-me. The Skill's author, Matt Pocock, describes it best in its opening lines, which also happen to make up about 3/4 of its total content:
Interview me relentlessly about every aspect of this plan until we reach a shared understanding. Walk down each branch of the design tree, resolving dependencies between decisions one-by-one. For each question, provide your recommended answer.
I have found these concepts cover enough of the fundamentals to navigate the discussion and innovation swirling around Generative AI. I hope you find them useful. If so, please consider passing it along to someone who's been quietly wondering what all the noise is about.
The post was authored with editorial support (Claude Sonnet 4.6) and image generation (Gemini 3.1 Pro) from Generative AI. All words are my own.
Webmentions
---
title: "A Practical Introduction to Generative AI Terminology"
subtitle: "First-principles explainers for the terms shaping today's AI-powered tooling landscape"
date: 2026-06-14
tags: ["ai", "gen-ai", "technology", "software engineering", "natural language processing"]
---
AI buzzwords are everywhere. Terms are thrown around in dialogue that feel like they've existed for decades but have only emerged in the past year or two. I've experienced a broad disparity between two groups of people: those I would consider "AI power users," regularly on the bleeding edge of every new innovation, and those who have only dabbled in the subject and wouldn't be able to grasp whatever "agentic" thing everyone is raving about. I do my best to be aware of the wave of AI hype and keep myself informed of all the lingo. If you're feeling like you're behind, or have no idea what "an agentic workflow integrating with an MCP server using the latest model from Anthropic with the larger context window" is, read on.
I'd like to provide short explainers of a few terms related to the GenAI tools that have sprung up within just the last 12 months. I will approach this from first principles, so that anyone feeling behind on the discourse can quickly catch up. We will cover the following terms: LLM, Hallucination, Context Window, System Prompt, Agent, MCP, Tool, and Skill.
> A lot of what is presented as fact or best-practice was accurate as far as I know at the time of writing. These statements might get stale within a year or a month, it's truly hard to say with the pace of change.
## Large Language Model
**Large Language Models or LLMs** are the foundation of the latest wave of AI hype and the core of what powers most of the modern AI technology mentioned below. You can [read up](https://en.wikipedia.org/wiki/Large_language_model) on the theory and science behind LLMs as well as the years of research that went into taking them from concept to product. There are even some who [argue that LLMs are fundamentally the wrong path](https://www.freethink.com/robots-ai/arc-prize-agi) to take humanity into a new golden age of AI and technology. At their core, they are computationally massive systems trained to predict what comes next given an input, specifically, what **token** (a unit of text, roughly a word or word fragment) follows said input. That basic premise, scaled up with massive computational power and training data sourced from the internet, manifests in higher-level abstractions like the chat interfaces you've likely already used.
Models like [ChatGPT](https://chatgpt.com/), [Claude](https://claude.ai/), and [Gemini](https://gemini.google.com/app) are typically referenced as "frontier models" and stand at the cutting edge of generative AI capabilities. There are [dozens of other models available](https://en.wikipedia.org/wiki/List_of_large_language_models), spanning various specializations and scales, with some being fully open source.
## Hallucination
One of the most important characteristics of LLMs to understand, one that shapes all the techniques built on top of them, is that they are fundamentally **non-deterministic**. Said another way, they're probabilistic, meaning if you provide the same input twice given exactly the same context, there is no guarantee the output will be the same. You know that disclaimer tucked into basically every LLM chat tool you might use? *"_Insert LLM name here_ can make mistakes."* This disclaimer warns of **hallucination**, which refers to a model generating confident but factually incorrect or fabricated information. This stems from the model's probabilistic nature. It has no inherent ability to understand its inputs or access ground truth about the world. If you'd like a demonstration, just try asking a model like Claude Sonnet 4.6 [how many R's are in the word strawberry](https://techcrunch.com/2024/08/27/why-ai-cant-spell-strawberry/).
## Context Window
LLMs have what is called a **context window**, which is a measure, in tokens, of how much text the model can consider at once when generating a response. Modern LLMs commonly have context windows of hundreds of thousands of tokens up to millions.
One might assume that a model with a 1 million token context window would deliver consistent output quality all the way up to that limit. However, modern LLMs suffer from what's colloquially known as **context rot**. Once a model's context window reaches around 80% of its limit, the model's output quality takes a significant drop. Because LLMs are functionally probabilistic "next word" guessers, every time a new input gets sent in, it is really just appended to the end of the ongoing session and that *entire* session is then sent back into the model for the following output to be generated.
When operating over the course of an extended session with an LLM, it's recommended to periodically either start a fresh session when the utilized context window is approaching around 80%, or perform what is known as **compacting** the session. This basically feeds the entire session history back into a fresh session with the model to be summarized and used as a more token-efficient representation of the context built up in the session up to that point.
## System Prompt
Products built on LLMs will typically use what's called a **System Prompt**, which is a set of instructions automatically inserted before your input, giving the model guiding principles and constraints to follow when generating responses. You can browse [Anthropic's published system prompts for Claude](https://platform.claude.ai/docs/en/release-notes/system-prompts) for a real-world example.
## Agent
The simplest implementation of an LLM is one of basic text input and output. This was the first iteration of ChatGPT that was released and provides more-or-less a direct experience of what LLMs provide as a technology. Prompt in, generated content out.
**Agents** distinguish themselves from basic LLM interactions in two key ways: First, they can **execute multiple steps** before producing a final output, including follow-up clarification and iterative planning. Second, they can **interface with external systems** via APIs, Tools, and MCP servers (explained below), allowing them to act on the world rather than just respond to it. Because of this inherent flexibility and underlying power of LLMs, all varieties of Agents have sprung up from those that can code autonomously all the way to ones that can provide automated customer service over the phone.
## Model Context Protocol (MCP)
The [Model Context Protocol](https://modelcontextprotocol.io/docs/getting-started/intro) or MCP was created by Anthropic, the creators of the Claude series of frontier models, and provides an open source specification by which LLMs can integrate with external systems. Much like [REST](https://en.wikipedia.org/wiki/REST) provides a standard by which systems can publish an HTTP-based API for applications to consume, MCP provides a specification by which servers can expose an interface tailored for LLM clients.
Though competing specifications like [Agent-2-Agent](https://a2a-protocol.org/latest/) exist, and despite significant scrutiny from [multiple](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/630) [cybersecurity](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/1751) [researchers](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/1750), MCP remains a widely adopted specification with broad product support.
## Tool
A **Tool** is how an Agent interacts outside of its own process boundary. It is a discrete, callable function that an Agent can invoke to interact with something outside its own context like reading a file, querying an API, or running a terminal command. For example, a minimal description of what a "Coding Agent" is can be summarized into 3 parts: 1. an LLM that provides the engine of the agent, 2. a set of tools that can read files, create files, and edit files, and 3. a system prompt that describes to the LLM its purpose and scope as a coding agent.
Unlike MCP, which is a broad and general protocol, individual Tools are narrowly scoped to a specific capability: a single API endpoint, a file operation, or a shell command. This has advantages such as allowing [certain models to be more selective](https://www.anthropic.com/engineering/advanced-tool-use) about which tools they pull into their working context for a given task. Tools can also provide a more robust security context within which a model is given permissions to operate and thereby reduce the risk of unwanted actions being executed.
## Skill
**Skills** are an [open specification](https://agentskills.io/home) from Anthropic that encode specific agent capabilities in a Markdown file, augmenting what an Agent can do beyond its default behavior. Skills allow for repeatable tasks to be defined and quickly invoked. Typically, a Skill describes a particular task, the Tools and integrations needed to accomplish it, how to use those integrations, and the step-by-step process to complete it.
One of my favorite skills is also one of the simplest: [/grill-me](https://www.skills.sh/mattpocock/skills/grill-me). The Skill's author, [Matt Pocock](https://www.skills.sh/mattpocock), describes it best in its opening lines, which also happen to make up about 3/4 of its total content:
```
Interview me relentlessly about every aspect of this plan until we reach a shared understanding. Walk down each branch of the design tree, resolving dependencies between decisions one-by-one. For each question, provide your recommended answer.
```
---
I have found these concepts cover enough of the fundamentals to navigate the discussion and innovation swirling around Generative AI. I hope you find them useful. If so, please consider passing it along to someone who's been quietly wondering what all the noise is about.
> The post was authored with editorial support (Claude Sonnet 4.6) and image generation (Gemini 3.1 Pro) from Generative AI. All words are my own.
AI buzzwords are everywhere. Terms are thrown around in dialogue that feel like they've existed for decades but have only emerged in the past year or two. I've experienced a broad disparity between two groups of people: those I would consider "AI power users," regularly on the bleeding edge of every new innovation, and those who have only dabbled in the subject and wouldn't be able to grasp whatever "agentic" thing everyone is raving about. I do my best to be aware of the wave of AI hype and keep myself informed of all the lingo. If you're feeling like you're behind, or have no idea what "an agentic workflow integrating with an MCP server using the latest model from Anthropic with the larger context window" is, read on. I'd like to provide short explainers of a few terms related to the GenAI tools that have sprung up within just the last 12 months. I will approach this from first principles, so that anyone feeling behind on the discourse can quickly catch up. We will cover the following terms: LLM, Hallucination, Context Window, System Prompt, Agent, MCP, Tool, and Skill. A lot of what is presented as fact or best-practice was accurate as far as I know at the time of writing. These statements might get stale within a year or a month, it's truly hard to say with the pace of change. Large Language Model Large Language Models or LLMs are the foundation of the latest wave of AI hype and the core of what powers most of the modern AI technology mentioned below. You can read up on the theory and science behind LLMs as well as the years of research that went into taking them from concept to product. There are even some who argue that LLMs are fundamentally the wrong path to take humanity into a new golden age of AI and technology. At their core, they are computationally massive systems trained to predict what comes next given an input, specifically, what token (a unit of text, roughly a word or word fragment) follows said input. That basic premise, scaled up with massive computational power and training data sourced from the internet, manifests in higher-level abstractions like the chat interfaces you've likely already used. Models like ChatGPT, Claude, and Gemini are typically referenced as "frontier models" and stand at the cutting edge of generative AI capabilities. There are dozens of other models available, spanning various specializations and scales, with some being fully open source. Hallucination One of the most important characteristics of LLMs to understand, one that shapes all the techniques built on top of them, is that they are fundamentally non-deterministic. Said another way, they're probabilistic, meaning if you provide the same input twice given exactly the same context, there is no guarantee the output will be the same. You know that disclaimer tucked into basically every LLM chat tool you might use? "Insert LLM name here can make mistakes." This disclaimer warns of hallucination, which refers to a model generating confident but factually incorrect or fabricated information. This stems from the model's probabilistic nature. It has no inherent ability to understand its inputs or access ground truth about the world. If you'd like a demonstration, just try asking a model like Claude Sonnet 4.6 how many R's are in the word strawberry. Context Window LLMs have what is called a context window, which is a measure, in tokens, of how much text the model can consider at once when generating a response. Modern LLMs commonly have context windows of hundreds of thousands of tokens up to millions. One might assume that a model with a 1 million token context window would deliver consistent output quality all the way up to that limit. However, modern LLMs suffer from what's colloquially known as context rot. Once a model's context window reaches around 80% of its limit, the model's output quality takes a significant drop. Because LLMs are functionally probabilistic "next word" guessers, every time a new input gets sent in, it is really just appended to the end of the ongoing session and that entire session is then sent back into the model for the following output to be generated. When operating over the course of an extended session with an LLM, it's recommended to periodically either start a fresh session when the utilized context window is approaching around 80%, or perform what is known as compacting the session. This basically feeds the entire session history back into a fresh session with the model to be summarized and used as a more token-efficient representation of the context built up in the session up to that point. System Prompt Products built on LLMs will typically use what's called a System Prompt, which is a set of instructions automatically inserted before your input, giving the model guiding principles and constraints to follow when generating responses. You can browse Anthropic's published system prompts for Claude for a real-world example. Agent The simplest implementation of an LLM is one of basic text input and output. This was the first iteration of ChatGPT that was released and provides more-or-less a direct experience of what LLMs provide as a technology. Prompt in, generated content out. Agents distinguish themselves from basic LLM interactions in two key ways: First, they can execute multiple steps before producing a final output, including follow-up clarification and iterative planning. Second, they can interface with external systems via APIs, Tools, and MCP servers (explained below), allowing them to act on the world rather than just respond to it. Because of this inherent flexibility and underlying power of LLMs, all varieties of Agents have sprung up from those that can code autonomously all the way to ones that can provide automated customer service over the phone. Model Context Protocol (MCP) The Model Context Protocol or MCP was created by Anthropic, the creators of the Claude series of frontier models, and provides an open source specification by which LLMs can integrate with external systems. Much like REST provides a standard by which systems can publish an HTTP-based API for applications to consume, MCP provides a specification by which servers can expose an interface tailored for LLM clients. Though competing specifications like Agent-2-Agent exist, and despite significant scrutiny from multiple cybersecurity researchers, MCP remains a widely adopted specification with broad product support. Tool A Tool is how an Agent interacts outside of its own process boundary. It is a discrete, callable function that an Agent can invoke to interact with something outside its own context like reading a file, querying an API, or running a terminal command. For example, a minimal description of what a "Coding Agent" is can be summarized into 3 parts: 1. an LLM that provides the engine of the agent, 2. a set of tools that can read files, create files, and edit files, and 3. a system prompt that describes to the LLM its purpose and scope as a coding agent. Unlike MCP, which is a broad and general protocol, individual Tools are narrowly scoped to a specific capability: a single API endpoint, a file operation, or a shell command. This has advantages such as allowing certain models to be more selective about which tools they pull into their working context for a given task. Tools can also provide a more robust security context within which a model is given permissions to operate and thereby reduce the risk of unwanted actions being executed. Skill Skills are an open specification from Anthropic that encode specific agent capabilities in a Markdown file, augmenting what an Agent can do beyond its default behavior. Skills allow for repeatable tasks to be defined and quickly invoked. Typically, a Skill describes a particular task, the Tools and integrations needed to accomplish it, how to use those integrations, and the step-by-step process to complete it. One of my favorite skills is also one of the simplest: /grill-me. The Skill's author, Matt Pocock, describes it best in its opening lines, which also happen to make up about 3/4 of its total content: Interview me relentlessly about every aspect of this plan until we reach a shared understanding. Walk down each branch of the design tree, resolving dependencies between decisions one-by-one. For each question, provide your recommended answer. I have found these concepts cover enough of the fundamentals to navigate the discussion and innovation swirling around Generative AI. I hope you find them useful. If so, please consider passing it along to someone who's been quietly wondering what all the noise is about. The post was authored with editorial support (Claude Sonnet 4.6) and image generation (Gemini 3.1 Pro) from Generative AI. All words are my own.