AI & LLM Glossary
Clear, practical definitions of AI concepts, from context windows to agentic memory. Built for engineering and product teams working with LLMs.
A
Adherence (Instruction Adherence)
Measuring how well AI systems follow the specific instructions and constraints provided by users.
Agent Observability
Monitoring and tracking what autonomous AI agents are doing in real time across distributed systems.
Agentic Memory System
A comprehensive memory framework for AI agents that maintains episodic, semantic, and procedural memory to enable learning and continuous improvement.
AI Access Control
Systems determining who can use AI models, which data they can access, and what they're allowed to do.
AI Agent
An autonomous system that perceives its environment, makes decisions, and takes actions to achieve specific goals without direct human intervention.
AI Auditability
The ability to create a complete record of what an AI system did, why it did it, and what inputs influenced its outputs.
AI Cost Model
The framework for understanding and predicting how much your AI system will cost to operate at different scales.
AI Data Governance
Policies and systems controlling what data goes into AI models, how it's used, and who can access it.
AI Traceability
The technical capability to follow an input through every transformation until it produces output, showing what influenced the result.
AI Vendor Lock-In Risk
The danger of becoming dependent on a specific AI provider's models or infrastructure, making it costly to switch.
Alignment
Ensuring AI systems behave in accordance with human values, goals, and constraints.
Alignment Evals
Testing whether AI system behavior aligns with specified goals, values, and constraints.
Audit Log
Comprehensive records of AI system decisions, actions, and state changes for accountability and compliance.
C
Chain-of-Thought (CoT)
A prompting technique that makes LLMs show their reasoning step-by-step, improving accuracy especially on complex reasoning tasks.
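As an illustration, a minimal chain-of-thought prompt just asks the model to reason before answering. The exact wording below is a common pattern, not a fixed standard:

```python
# The phrasing below is a widely used CoT pattern; any wording that elicits
# intermediate reasoning steps before the answer counts as CoT prompting.
question = "A store had 23 apples, sold 9, and received 12 more. How many are left?"

cot_prompt = (
    f"Question: {question}\n"
    "Let's think step by step, then state the final answer on its own line."
)
```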
Chunking
The process of dividing long documents into smaller pieces for RAG systems to store and retrieve efficiently.
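A fixed-size sliding window with overlap is one common chunking strategy. The sizes below are illustrative; production systems often split on sentence or section boundaries instead:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of at most chunk_size characters,
    each sharing `overlap` characters with its predecessor."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "".join(chr(97 + i % 26) for i in range(500))  # 500-char stand-in document
chunks = chunk_text(doc)
```

The overlap keeps sentences that straddle a boundary retrievable from at least one chunk.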
Code Agent
AI agents specialized in writing, analyzing, and executing code to solve problems programmatically.
Compliance
Ensuring AI systems adhere to applicable laws, regulations, industry standards, and ethical guidelines.
Context Compression
Reducing the token size of context information while preserving critical details and meaning.
Context Eviction
Removing or deprioritizing old information from the AI's active context to make room for new data.
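The simplest eviction policy is FIFO: drop the oldest messages until the context fits a token budget. A whitespace word count stands in for a real tokenizer here:

```python
def evict_oldest(messages, token_budget, count_tokens=lambda m: len(m.split())):
    """Drop the oldest messages (FIFO) until the rest fit the token budget.
    Word count is a stand-in for a real tokenizer."""
    kept = list(messages)
    while kept and sum(count_tokens(m) for m in kept) > token_budget:
        kept.pop(0)  # evict the oldest message first
    return kept

history = [
    "user asked about pricing tiers",   # 5 "tokens"
    "assistant listed three plans",     # 4 "tokens"
    "user chose the middle plan",       # 5 "tokens"
]
trimmed = evict_oldest(history, token_budget=9)
```

More sophisticated policies evict by relevance or importance rather than age.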
Context Management
Strategically selecting what information an AI system should consider in each interaction.
Context Retrieval
Fetching relevant past information or memories to include in current AI processing.
Context Rot
Degradation of memory quality and accuracy as stored context becomes outdated or semantically disconnected.
Context Window
The maximum amount of text an LLM can consider at once, measured in tokens.
Coordination Protocol
The rules and standards enabling multiple AI agents to work together, share information, and synchronize actions.
Cost-to-Completion
The total cost, in money or tokens, required to accomplish a task using AI, from initial attempt to satisfactory result.
Cross-Encoder Scoring
Using transformer models to score query-document pairs directly rather than encoding them separately.
Customization
Tailoring AI systems to specific organizational needs, preferences, and constraints without rebuilding from scratch.
D
Data Sovereignty (AI Context)
The principle that data, especially when used in AI systems, should remain under the control and jurisdiction of its origin country or organization.
Delegation
Agents assigning subtasks to other agents or systems, breaking complex problems into manageable pieces.
Dense Retrieval
Using learned embeddings to retrieve information based on semantic similarity rather than keyword matching.
Deployment
The process of taking a trained AI model or application from development into production where it serves real users.
Developer Agents
AI agents designed to autonomously write, test, debug, and optimize code, assisting software engineers in development tasks.
Distributed Systems
Computing architectures where AI systems are spread across multiple machines or locations, enabling scale, reliability, and geographic distribution.
Document Ranking
Sorting retrieved documents by relevance to the query using scoring or learning-to-rank models.
E
Embedding Drift
Changes in embedding model output distributions or quality over time, degrading retrieval performance.
Embeddings
Numerical representations of text that capture semantic meaning, enabling AI systems to understand similarity and relationships.
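Cosine similarity over embedding vectors is the standard way to compare meaning. The three-dimensional vectors below are hand-made stand-ins; real embeddings come from a model and have hundreds or thousands of dimensions:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hand-made toy vectors; a real system would get these from an embedding model.
v_cat = [0.9, 0.1, 0.0]
v_kitten = [0.85, 0.2, 0.05]
v_invoice = [0.0, 0.1, 0.95]
```

Related concepts ("cat", "kitten") end up closer in the vector space than unrelated ones.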
Emergent Behaviors
Complex system behaviors that arise unexpectedly from simpler components interacting, not explicitly programmed.
End-to-End Eval
Evaluating complete AI system performance across entire workflows rather than isolated components.
Enterprise Agents
AI agents deployed in organizations to autonomously execute business processes and complete multi-step tasks under organizational control.
Enterprise AI Stack
The complete set of components an enterprise organization needs to build, deploy, and manage AI systems in production.
Enterprise Framing
How to position and communicate AI capabilities to enterprise organizations by emphasizing control, governance, and business value.
Enterprise Governance
The organizational frameworks, policies, and oversight mechanisms that ensure AI systems are used appropriately and comply with requirements.
Enterprise Memory & AI Systems
Persistent memory infrastructure that lets AI systems learn from past interactions and deliver personalized, context-aware experiences at scale.
Enterprise Metrics
The suite of quantitative measurements organizations use to assess whether AI systems are delivering business value and operating as intended.
Enterprise Procurement
The organizational and contractual processes large companies use to evaluate, approve, and purchase AI systems and services.
Enterprise Workflows
Structured, automated processes within organizations that incorporate AI to automate decision-making, task routing, and multi-step operations.
Episodic Memory (AI)
AI systems storing specific experiences and interactions in chronological context with sensory/contextual details.
Evals (Evaluation Systems)
Systematic testing frameworks that measure AI system quality across multiple dimensions like accuracy, safety, and efficiency.
Event Loop (Agent Runtime)
The core execution mechanism that cycles through agent decision-making, tool execution, and state updates.
Explainability
Making AI decisions and outputs interpretable to humans, showing why the system generated specific responses.
F
Failure Modes
Systematic ways AI systems malfunction or produce wrong outputs, including hallucinations, biases, and reasoning errors.
Feedback Loop (Agentic)
Mechanisms where AI agents observe outcomes of their actions and adjust subsequent behavior based on results.
Fine-Tuning
Training a pre-trained LLM on domain-specific data to improve performance on specialized tasks without rebuilding from scratch.
G
Governance
Systems and processes that establish rules, make decisions about how something should be run, and ensure compliance with those rules.
Graph RAG
Using knowledge graphs to structure and retrieve information instead of flat vector databases.
Groundedness Evals
Testing whether AI outputs are factually supported by source materials and not hallucinated.
Guardrails
Safety mechanisms that constrain AI output to acceptable ranges and prevent harmful or out-of-scope responses.
H
Hallucination
When an LLM generates plausible-sounding but false, misleading, or fabricated information with high confidence.
Hallucination Mitigation via Retrieval
Using external knowledge sources and retrieval-augmented generation to ground AI outputs in factual information.
Hallucination Rate
Quantitative measurement of how often an AI system produces factually incorrect or unfounded outputs.
Human-in-the-Loop (HITL)
Systems where humans and AI collaborate, with humans reviewing and correcting AI decisions to improve quality.
Hybrid Search
Combining semantic similarity search with keyword/lexical matching for more robust information retrieval.
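A common pattern is a weighted blend of a lexical score and a semantic score. The semantic scores below are stubbed with a hand-made dict for illustration; in practice they would come from embedding similarity:

```python
def keyword_score(query: str, doc: str) -> float:
    """Fraction of query terms that appear in the document (lexical side)."""
    q = set(query.lower().split())
    return len(q & set(doc.lower().split())) / len(q) if q else 0.0

docs = [
    "how to reset your password",
    "password policy for administrators",
    "quarterly revenue report",
]
# Stubbed semantic scores; a real system computes embedding similarity instead.
semantic = {docs[0]: 0.9, docs[1]: 0.6, docs[2]: 0.1}

def hybrid_score(query: str, doc: str, alpha: float = 0.5) -> float:
    """Blend semantic and lexical relevance; alpha weights the semantic side."""
    return alpha * semantic[doc] + (1 - alpha) * keyword_score(query, doc)

query = "reset password"
ranked = sorted(docs, key=lambda d: hybrid_score(query, d), reverse=True)
```

The blend catches both exact-term matches and paraphrases that share no keywords with the query.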
I
Inference
The process of running a trained model to generate outputs from input data, as opposed to training which creates the model.
Integrations
Connections between AI systems and external services, data sources, and business tools that enable the AI to access information and take actions.
K
Knowledge Graph
A structured representation of entities, relationships, and attributes that captures domain knowledge as interconnected nodes and edges.
Knowledge Storage
Infrastructure for persisting and organizing information that AI systems retrieve and reason about.
Knowledge Systems
Infrastructure that stores, organizes, and retrieves structured and unstructured information to support AI reasoning and decision-making.
Knowledge Work
Jobs centered on creating, analyzing, and applying information rather than physical tasks.
L
Latency Optimization
Reducing response time for AI systems through caching, batching, model optimization, and infrastructure tuning.
Lifecycle (Model Lifecycle)
The complete journey of an AI model from conception through development, testing, deployment, monitoring, retraining, and eventual retirement.
LLM (Large Language Model)
A neural network trained on massive amounts of text data to predict and generate human-like text, often used as the reasoning engine for AI applications.
Long-Term Memory (AI)
Persistent storage of information that an AI system can access across sessions and conversations, enabling it to learn from and recall past interactions.
M
Maintenance
The ongoing operations and updates required to keep AI systems running effectively, including monitoring, bug fixes, updates, and performance optimization.
Memory & Optimization
Strategies for managing and persisting information about users, interactions, and context to improve AI performance while maintaining efficiency and privacy.
Memory & Personalization
Tailoring AI responses and memory retrieval based on individual user preferences, history, and behavior patterns.
Memory Consolidation
Processing and integrating new experiences into organized long-term memory structures for persistent learning.
Model Routing
Systems that intelligently direct requests to different AI models based on criteria like cost, latency, accuracy, or specialization.
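A minimal cost-based router picks the cheapest model that meets the task's needs. The model names, prices, and capability tiers below are made up for illustration:

```python
# Hypothetical catalog: names, prices, and capability tiers are illustrative.
MODELS = {
    "small":  {"cost_per_1k_tokens": 0.0005, "max_complexity": 1},
    "medium": {"cost_per_1k_tokens": 0.0030, "max_complexity": 2},
    "large":  {"cost_per_1k_tokens": 0.0300, "max_complexity": 3},
}

def route(task_complexity: int) -> str:
    """Return the cheapest model able to handle the task's complexity tier."""
    eligible = [(name, spec) for name, spec in MODELS.items()
                if spec["max_complexity"] >= task_complexity]
    return min(eligible, key=lambda pair: pair[1]["cost_per_1k_tokens"])[0]
```

Real routers add latency targets, per-model rate limits, and fallbacks to the selection criteria.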
Multi-Agent Systems
Architectures where multiple AI agents work together, often with different roles or specializations, to solve complex problems collaboratively.
Multimodal AI
AI systems that can process multiple types of input (text, images, audio, video) and reason across them within a single model.
O
Observability
Monitoring and understanding AI system behavior through logs, metrics, and traces to detect problems.
Orchestration
Coordinating multiple AI models, tools, and systems to work together in complex workflows.
Orchestration Layer
The infrastructure that coordinates and manages multiple AI models, services, data sources, and tools within an AI system.
P
Personalization Engine
Systems that customize AI outputs, recommendations, and experiences based on individual user preferences, behavior, and characteristics.
Policy Engine
Systems that enforce organizational rules and constraints on AI behavior, including access control, content filtering, and decision approval.
Prompt Engineering
The practice of crafting specific input text (prompts) to guide LLMs toward producing desired outputs with improved quality and consistency.
Prompt Injection
A security vulnerability where user input or untrusted data manipulates an AI model's behavior by injecting instructions into the prompt.
Prompt Template
Reusable prompt structures with placeholders that enable consistent, parameterized interactions with AI models across different inputs.
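A sketch using Python's standard `string.Template`; the template text and placeholder names are illustrative:

```python
from string import Template

# $language and $code are the placeholders filled in per request.
REVIEW_TEMPLATE = Template(
    "You are a senior $language developer.\n"
    "Review the following code and list concrete issues:\n\n$code"
)

prompt = REVIEW_TEMPLATE.substitute(
    language="Python",
    code="def divide(a, b):\n    return a / b",
)
```

Keeping the template separate from the per-request values makes prompts versionable and testable.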
R
RAG (Retrieval-Augmented Generation)
A technique that retrieves relevant information before generating responses, making LLMs more accurate and factual.
RAG Pipeline
The complete workflow of retrieving relevant documents or data and providing them to an AI model to ground its responses in external knowledge.
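A toy end-to-end sketch: retrieval here is naive keyword overlap, whereas real pipelines use embeddings and a vector store, and the final prompt would be sent to an LLM:

```python
def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank documents by query-term overlap (stand-in for vector search)."""
    q = set(query.lower().split())
    return sorted(corpus, key=lambda d: len(q & set(d.lower().split())),
                  reverse=True)[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    """Assemble the grounding prompt from the top-k retrieved documents."""
    context = "\n".join(retrieve(query, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "refunds are processed within 5 business days",
    "our office is located in Berlin",
    "contact support to request refunds",
]
prompt = build_prompt("how are refunds processed", corpus)
```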
Rate Limiting
Mechanisms that restrict how frequently users or systems can call AI APIs or services to prevent overload, control costs, and ensure fair usage.
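The token-bucket algorithm is a common implementation; a minimal single-process sketch:

```python
import time

class TokenBucket:
    """Classic token-bucket limiter: each request spends one token;
    tokens refill continuously at `refill_per_sec` up to `capacity`."""

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_per_sec = refill_per_sec
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Production AI gateways typically run the same idea against a shared store such as Redis so limits hold across servers.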
Reranking
Reordering search results using more sophisticated models or signals after initial retrieval
Retrieval Pipeline
The technical infrastructure that searches, ranks, and retrieves relevant information from knowledge bases or documents to support AI systems.
S
Safety Filters
Systems that detect and prevent AI models from producing harmful, unethical, or inappropriate content before it reaches users.
Semantic Search
Search that understands meaning rather than matching keywords, retrieving results based on conceptual similarity rather than exact word matches.
Session Management
Systems that maintain and manage conversation context, user state, and history across multiple interactions with an AI system.
Sparse Retrieval
Retrieval methods that use explicit keywords and term matching to find relevant documents, contrasting with semantic similarity-based approaches.
State Management (Agent)
Systems that track and maintain the current status, progress, and internal variables of AI agents as they work through multi-step tasks.
Structured Output
Constraining AI model outputs to specific, machine-parseable formats (JSON, XML, etc.) instead of free-form text.
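On the consuming side, structured output lets you parse and validate the model's reply instead of scraping free text. A minimal sketch with a hand-written reply standing in for model output:

```python
import json

def parse_structured(raw: str, required: list[str]) -> dict:
    """Parse a model reply expected to be JSON and verify required fields."""
    data = json.loads(raw)  # raises a ValueError subclass on non-JSON output
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"model output missing fields: {missing}")
    return data

# Stand-in for a model reply produced under a JSON-output instruction.
reply = '{"sentiment": "positive", "confidence": 0.92}'
result = parse_structured(reply, ["sentiment", "confidence"])
```

Many providers also offer schema-constrained decoding, which enforces the format at generation time rather than after the fact.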
System Prompt
Initial instructions provided to an AI model that define its role, behavior, constraints, and how it should respond to users.
T
Temperature (LLM)
A parameter controlling randomness in model outputs, from deterministic (0) to highly creative/random (1+), with different optimal values for different tasks.
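Mechanically, temperature divides the logits before the softmax; lower values sharpen the distribution toward the top token. A self-contained sketch:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then softmax. Lower temperature
    sharpens the distribution; T=0 is the greedy/argmax limit and
    cannot be plugged in directly here (division by zero)."""
    scaled = [l / temperature for l in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(logits, 0.2)  # near-deterministic
hot = softmax_with_temperature(logits, 2.0)   # flatter, more random
```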
Token Budget
The total allocation of tokens (units of text cost) available for an AI system, used to constrain spending and optimize resource allocation.
Tokenization
The process of breaking text into small pieces (tokens) that LLMs process, where tokens are not always whole words.
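A toy greedy longest-match tokenizer over a tiny hand-made vocabulary shows how one word becomes several subword tokens; real tokenizers (BPE, SentencePiece) learn far larger vocabularies from data:

```python
# Tiny hand-made vocabulary purely for illustration.
VOCAB = {"token", "ization", "un", "break", "able"}

def tokenize(word: str, vocab=VOCAB) -> list[str]:
    """Greedy longest-match segmentation of a word into subword tokens."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest match first
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # unknown: fall back to a single character
            i += 1
    return tokens
```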
Tool Use (Function Calling)
Enabling AI models to call external functions or APIs to access information and take actions.
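The application side of tool use is a registry of callables plus a dispatcher for the model's emitted call. The JSON call shape below is illustrative; each provider defines its own format, and `get_weather` is a hypothetical stub:

```python
import json

def get_weather(city: str) -> str:
    # Hypothetical stub; a real tool would call an external weather API.
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}  # tool name -> callable registry

def dispatch(tool_call_json: str):
    """Execute a model-emitted call of the illustrative form
    {"name": "...", "arguments": {...}} against the registry."""
    call = json.loads(tool_call_json)
    return TOOLS[call["name"]](**call["arguments"])

result = dispatch('{"name": "get_weather", "arguments": {"city": "Oslo"}}')
```

The tool's return value is normally fed back into the conversation so the model can use it in its next response.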
Transformer Architecture
The underlying neural network structure used by modern large language models, based on self-attention mechanisms for processing sequential data.
W
Warm-Up (Model)
Pre-loading and initializing AI models before serving requests to reduce latency and improve response times for users.
Workflow Automation
Using AI to automatically execute multi-step business processes, reducing manual work and enabling faster, more consistent operations.