AI Terms Glossary: Your Guide to Common AI Terminology

So You've Heard These AI Terms and Nodded Along; Let's Fix That

Artificial intelligence is changing the world, and it's inventing a whole new vocabulary to describe how. Spend five minutes reading about AI and you'll run into LLMs, RAG, RLHF, and a dozen other terms that can make even very smart people in tech feel out of their depth. This glossary is an attempt to fix that: a living document, updated regularly as the field evolves.


AGI (Artificial General Intelligence)

A nebulous term referring to AI that's more capable than the average human at many, if not most, tasks. Definitions vary:

  • OpenAI CEO Sam Altman: "Equivalent of a median human that you could hire as a co-worker."
  • OpenAI's charter: "Highly autonomous systems that outperform humans at most economically valuable work."
  • Google DeepMind: "AI that's at least as capable as humans at most cognitive tasks."

AI Agent

A tool that uses AI technologies to perform a series of tasks on your behalf — beyond basic chatbot capabilities. Examples include:

  • Filing expenses
  • Booking tickets or restaurant tables
  • Writing and maintaining code

More broadly, an agent is an autonomous system that may draw on multiple AI models to carry out multistep tasks. The infrastructure needed to deliver on agents' full capabilities is still being built out.

API Endpoints

Think of these as "buttons" on the back of software that other programs can press to make it do things. Developers use these interfaces to build integrations — allowing AI agents to control third-party services directly without manual human operation.
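To make that concrete, here's roughly what "pressing" one of those buttons looks like in code. Everything here is invented for illustration: the URL, the payload fields, and the API key do not belong to a real service.

```python
import requests

# Hypothetical endpoint for an imaginary expense-filing service.
# The URL, fields, and key below are illustrative only.
response = requests.post(
    "https://api.example.com/v1/expenses",          # the "button" being pressed
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"amount": 42.50, "category": "travel"},   # what to do when pressed
)
print(response.status_code, response.json())
```

An AI agent filing your expenses would be making calls like this behind the scenes, no clicking required.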

Chain of Thought

Breaking down a problem into smaller, intermediate steps to improve the quality of AI outputs. This approach:

  • Takes longer to get an answer
  • Is more likely to produce correct results, especially in logic or coding contexts
  • Is optimized through reinforcement learning in reasoning models
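Here's the idea at the prompt level. No model name appears because the technique is model-agnostic; the prompt text is the point:

```python
# Two ways to pose the same question to a chat model.

direct_prompt = "What is 17 * 24?"

chain_of_thought_prompt = (
    "What is 17 * 24? Think step by step: break the multiplication "
    "into smaller parts, show each intermediate result, then state "
    "the final answer."
)

# The second prompt nudges the model to produce intermediate steps,
# e.g. 17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408, which tends to
# improve accuracy on logic and arithmetic tasks.
```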

Coding Agents

Specialized AI agents applied to software development. Rather than simply suggesting code, coding agents can:

  • Write, test, and debug code autonomously
  • Handle iterative, trial-and-error work
  • Operate across entire codebases
  • Spot bugs, run tests, and push fixes with minimal human oversight

Compute

Refers to the vital computational power that allows AI models to operate. This includes:

  • GPUs, CPUs, TPUs, and other infrastructure
  • The processing fuel that powers the AI industry
  • The hardware backbone enabling model training and deployment

Deep Learning

A subset of machine learning built on a multi-layered artificial neural network (ANN) structure. Key characteristics:

  • Makes more complex correlations than simpler ML systems
  • Identifies important data characteristics automatically
  • Learns from errors through repetition and adjustment
  • Requires millions of data points and longer training times
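A minimal sketch of the "multi-layered" part, with made-up sizes and random numbers standing in for learned weights:

```python
import numpy as np

# Two stacked layers: each is a weighted sum followed by a
# nonlinearity. Training would adjust W1 and W2; here they're random.
rng = np.random.default_rng(0)
x = rng.normal(size=4)            # a tiny input "data point"

W1 = rng.normal(size=(8, 4))      # layer 1 weights
W2 = rng.normal(size=(2, 8))      # layer 2 weights

hidden = np.maximum(0, W1 @ x)    # ReLU nonlinearity between layers
output = W2 @ hidden
print(output)
```

Real networks stack dozens or hundreds of such layers, which is what lets them pick out complex correlations on their own.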

Diffusion

The tech at the heart of many art-, music-, and text-generating AI models. The process:

  • Slowly "destroys" data structure by adding noise
  • Learns a "reverse diffusion" process to restore data from noise
  • Gains the ability to recover data and generate new content
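The "destroying" half is easy to sketch. Assume the data is a simple signal; each pass of the loop blends in a little Gaussian noise:

```python
import numpy as np

rng = np.random.default_rng(0)
data = np.sin(np.linspace(0, 2 * np.pi, 100))   # clean "data": a sine wave

noisy = data
for step in range(50):
    noise = rng.normal(size=data.shape)
    noisy = 0.98 * noisy + 0.02 * noise         # add a little noise each step

# After enough steps, `noisy` is close to pure noise. A diffusion
# model learns to run this loop in reverse, denoising step by step,
# which is what lets it generate new data starting from pure noise.
```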

Distillation

A technique using a 'teacher-student' model to extract knowledge from large AI models:

  • Developers send requests to a teacher model and record outputs
  • Outputs are used to train the student model
  • Creates smaller, more efficient models with minimal loss of capability
  • Likely how OpenAI developed GPT-4 Turbo
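A toy sketch of the teacher-student idea, using invented numbers for a single example over five classes; real distillation repeats this over huge datasets with gradient descent:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

teacher_logits = np.array([2.0, 1.0, 0.2, -1.0, -2.0])   # recorded from the teacher
student_logits = np.array([1.5, 0.5, 0.5, -0.5, -1.0])   # the smaller model's guess

teacher_probs = softmax(teacher_logits)   # the "soft targets"
student_probs = softmax(student_logits)

# Distillation loss: cross-entropy between teacher and student outputs.
# Training pushes this number down so the student mimics the teacher.
loss = -(teacher_probs * np.log(student_probs)).sum()
print(loss)
```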

Fine-tuning

Further training of an AI model to optimize performance for a more specific task by feeding in new, specialized (task-oriented) data. Many AI startups use this approach to build commercial products from large language models.
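In code, fine-tuning looks like ordinary training, just starting from an already-trained model, with new data and (typically) a small learning rate. This PyTorch sketch uses a stand-in model and random data in place of a real LLM and dataset:

```python
import torch
import torch.nn as nn

pretrained = nn.Linear(16, 2)            # stand-in for a pretrained model
task_x = torch.randn(32, 16)             # new, specialized inputs
task_y = torch.randint(0, 2, (32,))      # task-specific labels

optimizer = torch.optim.AdamW(pretrained.parameters(), lr=1e-5)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(3):                   # a few passes over the new data
    optimizer.zero_grad()
    loss = loss_fn(pretrained(task_x), task_y)
    loss.backward()
    optimizer.step()
    print(epoch, loss.item())
```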

GAN (Generative Adversarial Network)

A machine learning framework involving a pair of neural networks:

  • A generator creates candidate outputs modeled on its training data
  • A discriminator evaluates them, trying to spot the artificially generated data
  • The two models compete to optimize outputs
  • Works best for narrower applications like producing realistic photos or videos
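The competition fits in a short PyTorch sketch. Both networks are tiny and the "real data" is just shifted Gaussian noise; production GANs run the same loop at enormous scale:

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))   # generator
D = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1))   # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(200):
    real = torch.randn(16, 2) + 3.0      # stand-in "real" data
    fake = G(torch.randn(16, 4))         # generator's attempt

    # Discriminator learns to label real data 1 and generated data 0.
    d_loss = bce(D(real), torch.ones(16, 1)) + \
             bce(D(fake.detach()), torch.zeros(16, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator learns to make the discriminator call its output real.
    g_loss = bce(D(fake), torch.ones(16, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```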

Hallucination

The AI industry's term for AI models making stuff up: generating incorrect information. Hallucinations:

  • Arise from gaps and knowledge limitations in a model's training data
  • Can lead to misleading or even dangerous outputs
  • Are contributing to the push toward specialized, domain-specific AIs

Inference

The process of running an AI model — setting it loose to make predictions or draw conclusions from previously seen data. Cannot happen without training. Different hardware performs inference at different speeds, from smartphone processors to high-end AI chips.

Large Language Model (LLM)

AI models used by popular AI assistants like ChatGPT, Claude, Google's Gemini, Meta's Llama, Microsoft Copilot, and Mistral's Le Chat. Characteristics:

  • Are deep neural networks made of billions of numerical parameters
  • Learn the relationships between words and phrases
  • Build a multidimensional map of language
  • Generate the most likely pattern that fits a prompt (sketched below)
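If you want to poke at one locally, GPT-2 (a small, dated, but openly available LLM) shows the core behavior: predict the most likely continuation, token by token. This assumes the Hugging Face transformers library is installed:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator(
    "The key idea behind large language models is",
    max_new_tokens=20,
)
print(result[0]["generated_text"])
```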

Memory Cache

An optimization technique that boosts inference efficiency by:

  • Saving particular calculations for future queries
  • Reducing the number of calculations a model must run
  • KV (key-value) caching is the variant used in transformer-based models
  • Drives faster results by reducing algorithmic labor
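Stripped of the transformer machinery, the principle is plain memoization: compute once, look up thereafter. The per-position function here is a stand-in for the attention math a real model would cache:

```python
cache = {}

def expensive_step(position):
    # Stand-in for the per-token key/value computation in a transformer.
    return position ** 2

def cached_step(position):
    if position not in cache:
        cache[position] = expensive_step(position)   # compute once...
    return cache[position]                           # ...look up thereafter

print(cached_step(5), cached_step(5))   # second call is a cheap lookup
```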

Neural Network

The multi-layered algorithmic structure underpinning deep learning and generative AI. Though the concept dates to the 1940s, GPUs from the video game industry unlocked its power, enabling:

  • Better performance across many domains
  • Voice recognition, autonomous navigation, drug discovery
  • Training algorithms with many more layers than previously possible

Open Source

Software or AI models whose underlying code is publicly available for anyone to use, inspect, or modify. Examples include Meta's Llama family of models and, among operating systems, Linux. Open code:

  • Enables independent safety audits
  • Contrasts with closed source (like OpenAI's GPT models)

Parallelization

Doing many things simultaneously instead of sequentially. In AI:

  • Fundamental to both training and inference
  • Modern GPUs perform thousands of calculations in parallel
  • Ability to parallelize across many chips/machines is crucial
  • Determines how quickly and cost-effectively models can be built and deployed
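You can feel the difference even on a laptop. NumPy's vectorized call hands the whole array to optimized routines that process many elements at once, the same principle a GPU scales up to thousands of parallel lanes:

```python
import numpy as np
import time

x = np.random.rand(1_000_000)

start = time.perf_counter()
total = 0.0
for v in x:                     # sequential: one element at a time
    total += v * v
print("loop:      ", time.perf_counter() - start)

start = time.perf_counter()
total = np.dot(x, x)            # parallel-friendly: whole array at once
print("vectorized:", time.perf_counter() - start)
```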

RAMageddon

The deepening shortage of random access memory (RAM) chips, driven in large part by the AI industry buying up supply for its data centers. Impacts:

  • Gaming (higher console prices)
  • Consumer electronics (the biggest dip in smartphone shipments in over a decade)
  • Enterprise computing (limited RAM for data centers)
  • Prices expected to remain high until the shortage ends

Reinforcement Learning

A method of training AI in which a system learns by trying things and receiving rewards for correct answers. Key aspects:

  • Like training a pet with treats, but for neural networks
  • Model explores environment, takes actions, updates behavior based on feedback
  • Powerful for training AI to play games, control robots, sharpen reasoning
  • RLHF (Reinforcement Learning from Human Feedback) is central to fine-tuning models
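A tiny two-action "bandit" captures the treats-for-tricks loop, with made-up reward numbers; action 1 secretly pays better, and the agent should figure that out:

```python
import random

values = [0.0, 0.0]    # the agent's estimated value of each action
counts = [0, 0]

for step in range(1000):
    # Explore occasionally; otherwise exploit the best-known action.
    if random.random() < 0.1:
        action = random.randrange(2)
    else:
        action = values.index(max(values))

    # Reward: action 1 secretly pays more on average.
    reward = random.gauss(1.0 if action == 1 else 0.2, 0.5)

    # Update the estimate for the action taken (a running average).
    counts[action] += 1
    values[action] += (reward - values[action]) / counts[action]

print(values)   # the agent should rate action 1 higher
```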

Token

The basic building blocks of human-AI communication, representing discrete segments of data. Tokens are created through tokenization, which:

  • Breaks down raw text into bite-sized units
  • Works somewhat like a compiler breaking source code into units a machine can process
  • Determines cost in enterprise settings (charged per-token)
  • Roughly analogous to "words" for AI workloads
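You can see tokenization directly with OpenAI's open source tiktoken library (assuming it's installed; encoding names vary by model family):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")   # used by several OpenAI models

tokens = enc.encode("Tokenization breaks text into bite-sized units.")
print(len(tokens))          # the count per-token pricing would bill for
print(enc.decode(tokens))   # round-trips back to the original text
```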

Token Throughput

A measure of how much AI work a system can handle at once. High token throughput:

  • Determines how many users a model can serve simultaneously
  • Affects how quickly each user receives a response
  • Key goal for AI infrastructure teams
  • Maximizing it has become an obsession in the field
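The back-of-the-envelope math, with invented numbers, shows why infrastructure teams care:

```python
# If a system sustains 10,000 tokens/second in total and each user's
# reply needs to stream at 50 tokens/second to feel responsive...
system_throughput = 10_000   # tokens per second, whole system
per_user_rate = 50           # tokens per second, per user

concurrent_users = system_throughput // per_user_rate
print(concurrent_users)      # ...roughly 200 users can be served at once
```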

Training

The process of data being fed into a model so it can learn from patterns and generate useful outputs. Characteristics:

  • The system adapts its outputs to the characteristics of the data it sees
  • Can be expensive (requires lots of inputs)
  • Volumes required have been trending upwards
  • Hybrid approaches like fine-tuning can help manage costs

Transfer Learning

Using a previously trained AI model as the starting point for developing a new model for a different but related task. Benefits:

  • Drives efficiency savings by shortcutting model development
  • Useful when data for the new task is limited
  • Has limitations — models may require additional training for domain-specific performance
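A common recipe, sketched in PyTorch with torchvision's ResNet-18: keep the layers trained on ImageNet frozen and retrain only a new final layer for a hypothetical 10-class task:

```python
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

for param in model.parameters():
    param.requires_grad = False                  # keep pretrained layers fixed

model.fc = nn.Linear(model.fc.in_features, 10)   # new task-specific head

# Training now updates only the new layer, which needs far less
# data and compute than training the whole network from scratch.
```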

Weights

Numerical parameters that determine how much importance is given to different features in training data, thereby shaping AI model output. Process:

  • Training begins with randomly assigned weights
  • Weights adjust as model seeks to match target output
  • Reflect how much each input influences the result
  • Act as multipliers on the inputs (see the worked example below)
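A worked example in miniature, with invented numbers:

```python
inputs  = [2.0, 5.0, 1.0]    # three features of a data point
weights = [0.8, 0.1, -0.5]   # the importance assigned to each feature

# Each weight multiplies its input; the results are summed.
output = sum(x * w for x, w in zip(inputs, weights))
print(output)   # 2.0*0.8 + 5.0*0.1 + 1.0*(-0.5) = 1.6
```

Training is the process of nudging those three numbers (times a few billion) until outputs match targets.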

Validation Loss

A number indicating how well an AI model is learning during training, measured on data held out from the training set; lower is better. Used to:

  • Decide when to stop training
  • Adjust hyperparameters
  • Investigate potential problems
  • Flag overfitting (when a model memorizes its training data rather than learning patterns that generalize)
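Here's the classic use in code, with invented per-epoch numbers: training stops the moment validation loss turns upward, the telltale sign of overfitting:

```python
val_losses = [0.90, 0.61, 0.48, 0.44, 0.43, 0.45, 0.49]   # one per epoch

best = float("inf")
for epoch, loss in enumerate(val_losses):
    if loss < best:
        best = loss   # still improving: keep training
    else:
        print(f"stop at epoch {epoch}: validation loss rose to {loss}")
        break
```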