
AI Glossary

38+ AI terms explained in simple language

Artificial Intelligence (AI)

The simulation of human intelligence by machines — including learning, reasoning, problem-solving, perception, and language understanding. AI is the broad field; machine learning and deep learning are subsets.

Algorithm

A set of rules or instructions a computer follows to solve a problem or accomplish a task. Machine learning algorithms learn patterns from data, while traditional algorithms follow explicit rules written by programmers.

AI Agent

An autonomous AI system that perceives its environment, makes decisions, and takes actions to achieve specific goals — browsing the web, writing code, sending emails — without constant human input.

AutoML

Automated Machine Learning — tools and techniques that automate the process of building, training, and optimizing machine learning models, making ML accessible to people without deep technical expertise.

ChatGPT

OpenAI's AI chatbot, launched in November 2022. One of the fastest-growing consumer products in history, it can write, code, analyze, and converse. It is powered by GPT models and has more than 100 million users.

Claude

Anthropic's AI assistant, designed with a focus on safety and helpfulness. Known for strong writing, long document analysis, and following complex instructions. Available at claude.ai.

Computer Vision

A field of AI that enables computers to interpret and understand visual information — images, video, and live camera feeds. Used in facial recognition, medical imaging, self-driving cars, and more.

Context Window

The maximum amount of text (measured in tokens) an AI model can process at once. GPT-4 Turbo has a 128k-token context window; Claude has up to 200k. Larger context windows allow longer documents and conversations.
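
To make the limit concrete, here is a minimal sketch of trimming a conversation so it fits a context window. It assumes the rough rule of thumb of about 4 characters per token rather than a real tokenizer, and the names `estimate_tokens` and `trim_to_window` are made up for this illustration.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token (an assumption, not a real tokenizer)."""
    return math.ceil(len(text) / 4)


def trim_to_window(messages: list[str], window_tokens: int) -> list[str]:
    """Keep the most recent messages whose estimated total fits in the window."""
    kept: list[str] = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(message)
        if used + cost > window_tokens:
            break  # the next older message would overflow the window
        kept.append(message)
        used += cost
    return list(reversed(kept))  # restore chronological order


history = ["a" * 400, "b" * 400, "c" * 400]  # about 100 tokens each
trimmed = trim_to_window(history, window_tokens=250)
print(len(trimmed))  # the oldest message is dropped
```

Real applications usually count tokens with the model's own tokenizer and may summarize old messages instead of dropping them.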

Chain of Thought (CoT)

A prompting technique that asks an AI model to show its step-by-step reasoning before giving a final answer. Adding "Think step by step" often improves accuracy on math, logic, and complex reasoning tasks.

Deep Learning

A subset of machine learning using neural networks with many layers. Responsible for breakthroughs in image recognition, speech synthesis, and language models like GPT. "Deep" refers to the many layers.

Diffusion Model

A type of generative AI that creates images by learning to reverse a noise-adding process. The model starts with random noise and gradually refines it into a coherent image. Stable Diffusion and recent versions of DALL-E are diffusion models, and Midjourney is generally understood to be diffusion-based as well.
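
The idea of adding noise and then reversing it can be sketched on a tiny 4-pixel "image". This is a toy under loud assumptions: a real diffusion model uses a trained neural network to predict the noise and works over many small steps, while this sketch does one step and simply reuses the true noise. The names `add_noise`, `remove_noise`, and `alpha_bar` are illustrative.

```python
import math
import random

rng = random.Random(0)
image = [0.0, 0.5, 1.0, 0.5]  # a tiny 4-pixel "image"


def add_noise(x0, alpha_bar, noise):
    """Forward process: blend the clean image with Gaussian noise."""
    keep = math.sqrt(alpha_bar)
    mix = math.sqrt(1 - alpha_bar)
    return [keep * x + mix * n for x, n in zip(x0, noise)]


def remove_noise(xt, alpha_bar, predicted_noise):
    """Reverse step: subtract the predicted noise and rescale to estimate the clean image."""
    keep = math.sqrt(alpha_bar)
    mix = math.sqrt(1 - alpha_bar)
    return [(x - mix * n) / keep for x, n in zip(xt, predicted_noise)]


noise = [rng.gauss(0, 1) for _ in image]
noisy = add_noise(image, alpha_bar=0.3, noise=noise)

# Assumption: a trained model would predict the noise. Here we cheat and pass the true noise.
recovered = remove_noise(noisy, alpha_bar=0.3, predicted_noise=noise)
```

Because the "prediction" here is perfect, `recovered` matches `image` up to rounding. In practice the quality of the image depends on how well the model has learned to predict the noise.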

Embedding

A numerical representation of text, images, or other data as a vector of floating-point numbers. Words with similar meanings have similar embeddings, allowing AI to understand semantic relationships. Essential for RAG and search.
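
A common way to compare two embeddings is cosine similarity. The sketch below uses hand-made 3-number vectors purely for illustration; real embeddings come from a model and usually have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return a score near 1.0 for vectors pointing the same way, near 0.0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors invented for this example, not produced by any real model.
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "car": [0.1, 0.2, 0.95],
}

cat_kitten = cosine_similarity(toy_embeddings["cat"], toy_embeddings["kitten"])
cat_car = cosine_similarity(toy_embeddings["cat"], toy_embeddings["car"])
print(cat_kitten > cat_car)  # similar meanings score higher
```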

Fine-Tuning

The process of further training a pre-trained AI model on a specific dataset to specialize it for a particular task. A fine-tuned GPT model for customer service will respond more accurately to support questions.

Foundation Model

A large AI model trained on broad data that can be adapted to many tasks. GPT-4, Claude, Gemini, and Llama are foundation models. They power thousands of applications through APIs and fine-tuning.

GPT (Generative Pre-trained Transformer)

OpenAI's family of large language models. GPT-3 launched in 2020, GPT-4 in 2023. "Generative" means it creates text, "Pre-trained" means it was first trained on a large body of text (much of it from the internet) before being adapted to specific uses, and "Transformer" is the architecture.

Generative AI

AI that can create new content — text, images, audio, video, or code. Unlike discriminative AI (which classifies), generative AI produces original output. ChatGPT, Midjourney, and Sora are all generative AI.

GPU (Graphics Processing Unit)

Computer chips originally designed for rendering graphics. GPUs are ideal for training AI models because they can perform many calculations simultaneously. Nvidia GPUs are the backbone of the AI industry.

Hallucination

When an AI model generates plausible-sounding but factually incorrect information with confidence. LLMs can "hallucinate" fake citations, wrong statistics, or events that never happened. A key challenge in AI deployment.

Hugging Face

A platform and community for sharing, discovering, and using open-source AI models and datasets. Often called the "GitHub of AI", it hosts hundreds of thousands of models, including Llama, Mistral, and BERT.

LLM (Large Language Model)

An AI model trained on massive amounts of text to understand and generate human language. Examples include GPT-4, Claude 3, Gemini, and Llama 3. "Large" refers to billions of parameters used to represent learned patterns.

LangChain

A popular Python and JavaScript framework for building applications with language models. Helps connect LLMs to databases, APIs, memory, and other tools to build AI agents and RAG applications.

Latency

The delay between sending a request to an AI model and receiving a response. Low latency is crucial for real-time applications. Time to first token is typically measured in milliseconds, while a full response can take seconds.
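
Here is a minimal sketch of measuring both numbers for a streamed response. `fake_stream` is a hypothetical stand-in for a real model API, and the delay is invented for the example.

```python
import time


def fake_stream(tokens, delay_seconds):
    """Hypothetical stand-in for a streaming model response."""
    for token in tokens:
        time.sleep(delay_seconds)  # pretend the model is computing
        yield token


def measure_latency(stream):
    """Return (time to first token, total time, token count) for any iterable stream."""
    start = time.perf_counter()
    first_token_at = None
    count = 0
    for _ in stream:
        if first_token_at is None:
            first_token_at = time.perf_counter()
        count += 1
    end = time.perf_counter()
    ttft = None if first_token_at is None else first_token_at - start
    return ttft, end - start, count


ttft, total, count = measure_latency(fake_stream(["Hello", ",", " world"], 0.01))
print(f"first token after {ttft * 1000:.0f} ms, full response after {total * 1000:.0f} ms")
```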

Machine Learning (ML)

A subset of AI where systems learn from data rather than following explicit rules. Instead of programming the rules, you feed the algorithm examples and it learns the patterns. Every AI model in this glossary is built with ML.
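
The difference between a hand-written rule and a learned one can be sketched with a toy spam filter. The feature (number of exclamation marks), the data, and the function names are all invented for this illustration; real ML uses far richer features and models.

```python
def rule_based_is_spam(exclamation_marks: int) -> bool:
    """Traditional approach: a programmer picks the threshold by hand."""
    return exclamation_marks > 5


def learn_threshold(examples):
    """ML approach: pick the threshold that misclassifies the fewest labeled examples."""
    candidates = sorted({count for count, _ in examples})

    def errors(threshold):
        return sum((count > threshold) != is_spam for count, is_spam in examples)

    return min(candidates, key=errors)


# (exclamation marks, is spam) pairs, made up for the example
examples = [(0, False), (1, False), (2, False), (3, True), (4, True), (7, True)]
threshold = learn_threshold(examples)


def learned_is_spam(exclamation_marks: int) -> bool:
    return exclamation_marks > threshold


print(rule_based_is_spam(4), learned_is_spam(4))  # the hand-written rule misses this one
```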

Model

The trained AI system that generates outputs. A model is the result of training on data: it contains parameters (numerical weights), often billions of them, that encode learned patterns. GPT-4 is rumored to have about 1.8 trillion parameters, though OpenAI has not confirmed a figure.

Multimodal AI

AI systems that can process and generate multiple types of data — text, images, audio, and video — within a single model. GPT-4V, Gemini Ultra, and Claude 3 are multimodal models that understand images and text together.

Neural Network

A computational system loosely inspired by the human brain. Consists of interconnected layers of "neurons" (mathematical functions). Deep learning uses neural networks with many layers to learn complex patterns.
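
As a rough sketch, here is a tiny network with one hidden layer of two neurons. The weights are set by hand (an assumption for illustration; a real network learns them during training) so that the network computes XOR, a classic pattern that a single neuron cannot represent.

```python
def relu(x: float) -> float:
    """A common activation function: pass positives through, clip negatives to zero."""
    return max(0.0, x)


def forward(x1: float, x2: float) -> float:
    # Hidden layer: each neuron is a weighted sum of the inputs plus a bias, then ReLU.
    h1 = relu(1.0 * x1 + 1.0 * x2 + 0.0)
    h2 = relu(1.0 * x1 + 1.0 * x2 - 1.0)
    # Output layer: a weighted sum of the hidden neurons.
    return 1.0 * h1 - 2.0 * h2


outputs = {(a, b): forward(a, b) for a in (0, 1) for b in (0, 1)}
print(outputs)  # 1.0 when exactly one input is 1, otherwise 0.0
```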

NLP (Natural Language Processing)

The field of AI focused on enabling computers to understand, interpret, and generate human language. All chatbots, text summarizers, and language translators use NLP. LLMs are the current state of the art in NLP.

n8n

A workflow automation platform that connects apps and services. Its source code is publicly available under a fair-code license. Pronounced "n-eight-n"; the name is short for "nodemation". Popular for building AI agent workflows with 400+ integrations. Free when self-hosted.

Prompt

The text input given to an AI model to guide its response. A well-written prompt produces better results. "Write a blog post about AI" is a basic prompt; adding context, format, and examples dramatically improves output quality.

Prompt Engineering

The practice of designing and optimizing prompts to get better results from AI models. Includes techniques like chain-of-thought, few-shot examples, role assignment, and structured output formatting.
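
Several of these techniques can be combined in one prompt. The sketch below assembles a role, few-shot examples, a chain-of-thought instruction, and a fixed output format. The function `build_prompt` and its wording are illustrative assumptions, not a standard API.

```python
def build_prompt(role, examples, new_review, think_step_by_step=True):
    """Assemble a sentiment prompt from a role, few-shot examples, and a format instruction."""
    lines = [f"You are {role}.", ""]
    for text, label in examples:  # few-shot examples show the expected pattern
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Review: {new_review}")
    if think_step_by_step:  # chain-of-thought instruction
        lines.append("Think step by step before answering.")
    lines.append("Finish with one line in the form 'Sentiment: positive' or 'Sentiment: negative'.")
    return "\n".join(lines)


prompt = build_prompt(
    role="a careful product review analyst",
    examples=[("Great battery life.", "positive"), ("Broke after a week.", "negative")],
    new_review="Setup was confusing but support was helpful.",
)
print(prompt)
```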

Parameters

The numerical values inside a neural network that get adjusted during training. More parameters generally mean a more capable model. GPT-4 reportedly has about 1.8 trillion (unconfirmed), Llama 3 comes in 8-billion and 70-billion versions, and small local models often have around 7 billion.
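
To see where parameter counts come from, here is a small sketch that counts the weights and biases in a simple fully connected network. The layer sizes are an arbitrary example, and large language models use more complex layers, but the counting idea is the same.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network with the given layer sizes."""
    total = 0
    for inputs, outputs in zip(layer_sizes, layer_sizes[1:]):
        total += inputs * outputs  # one weight per connection
        total += outputs           # one bias per output neuron
    return total


# Example: 784 inputs, one hidden layer of 128 neurons, 10 outputs
print(count_parameters([784, 128, 10]))  # 101770
```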

RAG (Retrieval-Augmented Generation)

A technique combining LLMs with a knowledge base. Before generating a response, the system retrieves relevant documents and includes them in the prompt. Reduces hallucination and enables AI to answer from your own data.
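
The retrieve-then-prompt flow can be sketched in a few lines. This toy ranks documents by shared words, which is a stand-in assumption: real RAG systems usually retrieve with embeddings and a vector database. The documents and function names are invented for the example, and the final call to a model is left out.

```python
def words(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words, ignoring basic punctuation."""
    return set(text.lower().replace("?", "").replace(".", "").split())


def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    question_words = words(question)
    ranked = sorted(documents, key=lambda doc: len(question_words & words(doc)), reverse=True)
    return ranked[:top_k]


def build_rag_prompt(question, documents):
    """Put the retrieved documents into the prompt ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"


documents = [
    "Refunds are issued within 14 days of purchase.",
    "Our office is closed on public holidays.",
    "Shipping takes 3 to 5 business days.",
]
prompt = build_rag_prompt("How long do refunds take?", documents)
print(prompt)  # this text would then be sent to the language model
```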

Reinforcement Learning

An AI training method where a model learns by receiving rewards for good actions and penalties for bad ones. ChatGPT uses RLHF (Reinforcement Learning from Human Feedback) to align responses with human preferences.
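
Learning from rewards can be sketched with a "multi-armed bandit", one of the simplest reinforcement learning setups. An agent repeatedly picks one of three options, gets a reward of 1 or 0, and gradually learns which option pays best. This is not RLHF itself, and the reward probabilities are made up for the example.

```python
import random


def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly pick the best-known arm, sometimes explore at random."""
    rng = random.Random(seed)
    estimates = [0.0] * len(reward_probs)  # current guess of each arm's average reward
    pulls = [0] * len(reward_probs)
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(reward_probs))  # explore
        else:
            arm = max(range(len(reward_probs)), key=lambda i: estimates[i])  # exploit
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        pulls[arm] += 1
        # Update the running average for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return estimates, pulls


estimates, pulls = run_bandit([0.2, 0.5, 0.8])
best_arm = max(range(len(estimates)), key=lambda i: estimates[i])
print(best_arm, pulls)
```

The `epsilon` setting controls the trade-off between trying new options and sticking with what has worked so far.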

Token

The basic unit AI language models process. A token is roughly 4 characters or 0.75 words. "Hello world" is 2 tokens. AI pricing and context limits are measured in tokens. GPT-4o cost $5 per million input tokens at launch; prices change often.
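
The rule of thumb above is enough for a back-of-the-envelope cost estimate. This sketch assumes 4 characters per token and uses the $5 per million figure from this entry as an example price; exact counts require the model's own tokenizer.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough estimate using about 4 characters per token."""
    return math.ceil(len(text) / 4)


def estimate_cost(tokens: int, usd_per_million: float) -> float:
    """Cost in US dollars for a given number of input tokens."""
    return tokens / 1_000_000 * usd_per_million


document = "word " * 3000  # 15,000 characters
tokens = estimate_tokens(document)
cost = estimate_cost(tokens, usd_per_million=5.0)  # example price, an assumption
print(tokens, f"${cost:.4f}")
```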

Training Data

The dataset used to train an AI model. LLMs are trained on text from the internet, books, and code. The quality and diversity of training data strongly shape the model's capabilities and potential biases.

Transformer

The neural network architecture that powers nearly all modern LLMs. Introduced in the 2017 Google paper "Attention Is All You Need." The "T" in GPT stands for Transformer. Uses attention mechanisms to process language.
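
The attention mechanism can be sketched in plain Python. Each input has a "key" and a "value"; a "query" is compared with every key, the scores are turned into weights that sum to 1, and the output is a weighted mix of the values. The vectors below are toy numbers chosen for illustration, and a real Transformer runs this in many layers and heads with learned weights.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for a single query."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]
    return weights, output


keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, output = attention([1.0, 0.0], keys, values)
print(weights)  # keys that match the query get more weight
```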

Vector Database

A database optimized for storing and searching embeddings (vector representations). Used in RAG systems to find semantically similar content. Popular options include Pinecone, Weaviate, Qdrant, and ChromaDB.
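
At its core, a vector database stores (item, vector) pairs and returns the items closest to a query vector. The in-memory class below is a toy sketch with invented vectors; `TinyVectorStore` is a made-up name, not the API of Pinecone, Weaviate, Qdrant, or ChromaDB. Production systems add persistence and approximate nearest-neighbor indexes so search stays fast over millions of vectors.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


class TinyVectorStore:
    """Minimal in-memory store that searches by comparing against every stored vector."""

    def __init__(self):
        self.items = []

    def add(self, text, vector):
        self.items.append((text, vector))

    def search(self, query_vector, top_k=2):
        ranked = sorted(self.items, key=lambda item: cosine(query_vector, item[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]


store = TinyVectorStore()
store.add("How to reset your password", [0.9, 0.1, 0.0])
store.add("Changing your login credentials", [0.8, 0.2, 0.1])
store.add("Quarterly sales report", [0.0, 0.1, 0.9])

results = store.search([1.0, 0.0, 0.0])
print(results)
```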

Zero-Shot Learning

When an AI model performs a task it was never explicitly trained for, using only the description in the prompt and no worked examples. "Classify this review as positive or negative" works zero-shot because the model understands the task from the description alone.