Decoding Key Terms of Artificial Intelligence


Key Takeaways

1. Artificial general intelligence (AGI) refers to AI that can match or surpass humans in most cognitive tasks.

2. Autonomous AI agents perform complex, multi-step tasks, going beyond simple chatbots.

3. API endpoints allow programs to integrate with each other and automate processes without human intervention.

Why it matters: Mastering these concepts is crucial for navigating the rapidly evolving technological landscape and understanding the impact of AI on various sectors.

Artificial Intelligence and Its Evolving Vocabulary

Artificial Intelligence (AI) is transforming our world, and with it, a new vocabulary is emerging to describe this revolution. As you browse articles on AI, you will encounter terms such as LLMs, RAG, RLHF, and many others, which can even confuse industry experts. This glossary is an attempt to clarify these terms. We update it regularly to keep pace with the rapid evolution of the field, just like the AI systems it describes.

Artificial General Intelligence (AGI)

Artificial general intelligence, often abbreviated as AGI, is a complex and sometimes vague concept. It generally refers to an AI that is more capable than the average human at many tasks, if not most of them. Sam Altman, CEO of OpenAI, described AGI as the equivalent of a "median human" that one could hire as a coworker. OpenAI's charter defines AGI as highly autonomous systems that outperform humans at most economically valuable work. Google DeepMind, by contrast, views AGI as an AI at least as capable as humans at most cognitive tasks. This diversity of definitions shows that even experts do not always agree on what AGI truly is.

AI Agent

An AI agent is a tool that uses AI technologies to perform a series of tasks on your behalf, going well beyond the capabilities of a simple chatbot. These agents can handle operations like filing expense reports, booking tickets, or even writing and maintaining code. However, the concept is still taking shape, and the term can mean different things to different people. The infrastructure needed to fully realize these capabilities is also still being built. Essentially, an AI agent is an autonomous system that can draw on multiple AI systems to execute complex, multi-step tasks.

API Endpoints

API endpoints can be imagined as "hidden buttons" at the back of software that other programs can activate to make it work. Developers use these interfaces to create integrations, allowing, for example, an application to pull data from another or an AI agent to directly control third-party services without human intervention. Although these endpoints are invisible to ordinary users, they are ubiquitous in smart devices and connected platforms. As AI agents gain capabilities, they become increasingly adept at discovering and utilizing these endpoints, thus opening up new possibilities for automation.
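To make the "hidden button" idea concrete, here is a minimal sketch using only Python's standard library. The `/status` path, the JSON payload, and the names `StatusHandler` and `call_status_endpoint` are assumptions invented for this illustration, not part of any real service.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class StatusHandler(BaseHTTPRequestHandler):
    """Serves one hypothetical endpoint: GET /status."""

    def do_GET(self):
        if self.path == "/status":
            body = json.dumps({"service": "demo", "ok": True}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def call_status_endpoint():
    """Start the server on a free local port, call it once, then shut it down."""
    server = HTTPServer(("127.0.0.1", 0), StatusHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    connection = HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        # This request is the "button press": one program calling another.
        connection.request("GET", "/status")
        response = connection.getresponse()
        return json.loads(response.read().decode("utf-8"))
    finally:
        connection.close()
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(call_status_endpoint())
```

No human clicks anything here: the client code plays the role that an integration or an AI agent would play when it calls a third-party service.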

Chain of Thought

A human can often answer a simple question without much thought, such as "which animal is taller, a giraffe or a cat?" More complex questions call for step-by-step reasoning. For example, if a farmer's chickens and cows have 40 heads and 120 legs between them, you will probably need to write down an equation to find the answer (20 chickens and 20 cows). In the context of AI, chain-of-thought reasoning means that a large language model breaks a problem down into intermediate steps to improve the quality of the final result. This generally takes longer, but the answer is more likely to be correct, especially in logic or coding contexts.
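The farm puzzle can be worked through explicitly, which is roughly what step-by-step reasoning looks like on paper. The sketch below is only an illustration of writing out intermediate steps; the function name `solve_heads_and_legs` is made up for this example and says nothing about how a language model reasons internally.

```python
def solve_heads_and_legs(heads, legs):
    """Solve the chickens-and-cows puzzle, recording each intermediate step."""
    steps = []

    # Step 1: pretend every animal is a chicken (2 legs each).
    legs_if_all_chickens = 2 * heads
    steps.append(f"If all {heads} animals were chickens: {legs_if_all_chickens} legs")

    # Step 2: the legs left over must come from cows (2 extra legs per cow).
    extra_legs = legs - legs_if_all_chickens
    steps.append(f"Legs unaccounted for: {extra_legs}")

    # Step 3: convert the extra legs into a number of cows.
    cows = extra_legs // 2
    steps.append(f"Each cow adds 2 extra legs, so cows = {cows}")

    # Step 4: the remaining heads belong to chickens.
    chickens = heads - cows
    steps.append(f"Chickens = {heads} - {cows} = {chickens}")

    return chickens, cows, steps


if __name__ == "__main__":
    chickens, cows, steps = solve_heads_and_legs(40, 120)
    for line in steps:
        print(line)
    print(f"Answer: {chickens} chickens and {cows} cows")
```

The final answer matches the one in the text, and each recorded step can be checked on its own.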

Coding Agent

A coding agent is a specialized version of an AI agent, applied to software development. Unlike a simple assistant that suggests code to a human, a coding agent can autonomously write, test, and debug code. It handles the iterative and trial-and-error work that often occupies developers. These agents can scan entire codebases, identify bugs, run tests, and apply fixes with minimal human supervision. Imagine it as an extremely fast intern who never sleeps and never loses focus, although human review is still necessary to validate the work.
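The write, test, and fix cycle described above can be sketched as a small loop. Everything here is a toy assumption: in a real coding agent the candidate implementations would come from an AI model, whereas `buggy_add` and `fixed_add` are hard-coded stand-ins, and `run_tests` and `agent_loop` are names invented for this example.

```python
def run_tests(candidate):
    """Return True when the candidate passes every test case."""
    cases = [((2, 3), 5), ((-1, 1), 0), ((0, 0), 0)]
    try:
        return all(candidate(*args) == expected for args, expected in cases)
    except Exception:
        return False  # a crash counts as a failed attempt


def buggy_add(a, b):
    return a - b  # first attempt: contains a bug


def fixed_add(a, b):
    return a + b  # second attempt: the fix


def agent_loop(candidates, max_attempts=5):
    """Try candidates in order until one passes the tests or attempts run out."""
    attempts = 0
    for candidate in candidates[:max_attempts]:
        attempts += 1
        if run_tests(candidate):
            return candidate.__name__, attempts
    return None, attempts


if __name__ == "__main__":
    print(agent_loop([buggy_add, fixed_add]))
```

The loop stops as soon as the tests pass, which is also the point where a human reviewer would normally step in to validate the result.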

Compute

The term "compute" usually refers to the processing power that AI models need in order to operate. This processing capability fuels the AI industry, enabling powerful models to be trained and deployed. The term is often shorthand for the kinds of hardware that provide this power, such as GPUs, CPUs, TPUs, and other infrastructure that forms the backbone of the modern AI industry.

Deep Learning

Deep learning is a subset of self-improving machine learning in which algorithms are built on a multi-layer artificial neural network structure. This allows them to capture more complex correlations than simpler machine learning systems, such as linear models or decision trees. Inspired by the interconnected neurons of the human brain, this structure lets AI models identify the important features in data on their own, without engineers defining those features by hand. However, deep learning systems need a large number of data points to produce good results and typically take longer to train than simpler algorithms, which increases development costs.
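A tiny example shows why stacking layers matters. The XOR function (output 1 when exactly one input is 1) cannot be computed by a single linear threshold unit, but two layers can do it. In this sketch the weights are set by hand purely to illustrate the layered structure; in actual deep learning they would be learned from data, and the names `step` and `two_layer_xor` are invented for this example.

```python
def step(value):
    """Threshold activation: 1 if the weighted input is positive, else 0."""
    return 1 if value > 0 else 0


def two_layer_xor(x1, x2):
    """Compute XOR with one hidden layer of two units and one output unit."""
    # Hidden layer: two intermediate features built from the raw inputs.
    either_on = step(x1 + x2 - 0.5)  # fires when at least one input is 1
    both_on = step(x1 + x2 - 1.5)    # fires only when both inputs are 1

    # Output layer: "either, but not both".
    return step(either_on - both_on - 0.5)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, two_layer_xor(a, b))
```

The hidden units play the role of the intermediate features mentioned above: the output layer never looks at the raw inputs directly.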

Diffusion

Diffusion is a central technology for many AI models that generate art, music, and text. Inspired by physics, these systems progressively "destroy" the structure of data by adding noise until nothing recognizable remains. In physics, diffusion is a spontaneous and irreversible process, but in AI the goal is to learn a "reverse diffusion" process that recovers the original data from the noise.
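The "destroy" half of the process is easy to sketch. The update rule below (scale the data down slightly, then add Gaussian noise) is one common way to define the forward process, used here as an assumption; the `beta` value and the function name `forward_diffusion` are chosen only for illustration, and the learned reverse process is not shown.

```python
import math
import random


def forward_diffusion(data, steps, beta=0.05, seed=0):
    """Add a little Gaussian noise at every step, shrinking the original signal.

    Returns the noisy data and the fraction of the original signal that remains.
    """
    rng = random.Random(seed)
    noisy = list(data)
    signal_scale = 1.0
    keep = math.sqrt(1.0 - beta)      # how much of the previous value survives
    noise_std = math.sqrt(beta)       # how much fresh noise is mixed in
    for _ in range(steps):
        noisy = [keep * x + noise_std * rng.gauss(0.0, 1.0) for x in noisy]
        signal_scale *= keep
    return noisy, signal_scale


if __name__ == "__main__":
    original = [1.0, -1.0, 1.0, -1.0]
    for steps in (0, 10, 200):
        noisy, scale = forward_diffusion(original, steps)
        print(steps, round(scale, 4), [round(x, 2) for x in noisy])
```

After enough steps the remaining share of the original signal is close to zero, so the output is almost pure noise. A generative model is trained to undo these steps one at a time.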

Distillation

Distillation is a technique for extracting knowledge from a large AI model using a "teacher-student" setup. Developers send queries to a teacher model and record its outputs, which are then used to train a student model to approximate the teacher's behavior. This makes it possible to create a smaller, more efficient model from a larger one, with minimal distillation loss. This is likely how OpenAI developed GPT-4 Turbo, a faster version of GPT-4. While distillation is commonly used internally by AI companies, it can also be used to catch up with cutting-edge models, although distilling from a competitor's model may violate the terms of service of its API.
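The teacher-student procedure can be sketched in a few lines. This is a deliberately tiny stand-in: the "teacher" is an ordinary function rather than a large model, the "student" is a straight line fitted by least squares, and the names `teacher`, `record_teacher_outputs`, and `train_student` are hypothetical.

```python
def teacher(x):
    """Stand-in for a large model: we can only query it, not inspect it."""
    return 3.0 * x + 2.0


def record_teacher_outputs(queries):
    """Send each query to the teacher and keep the (query, output) pairs."""
    return [(x, teacher(x)) for x in queries]


def train_student(pairs):
    """Fit a small student (a line y = slope * x + intercept) by least squares."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    variance = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    pairs = record_teacher_outputs([0.0, 1.0, 2.0, 3.0, 4.0])
    slope, intercept = train_student(pairs)
    print(slope, intercept)
```

The student never sees the teacher's internals or its original training data, only the recorded outputs, which is the defining feature of the approach.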

Fine-Tuning

Fine-tuning means further training an AI model to optimize its performance for a more specific task or domain than its original training covered, typically by feeding it new, specialized data. Many AI startups use large language models as the base for a commercial product and try to make it more useful for a target sector or task by supplementing the earlier training cycles with fine-tuning on their own domain-specific knowledge and expertise.
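In code, the key point is that training continues from existing weights instead of starting from scratch. The sketch below assumes a toy linear model whose "pretrained" weights are given, and a small hypothetical domain dataset; the function names, learning rate, and step count are illustrative choices, not a recipe for real language models.

```python
def mean_squared_error(weights, data):
    """Average squared gap between the model's predictions and the targets."""
    slope, intercept = weights
    return sum((slope * x + intercept - y) ** 2 for x, y in data) / len(data)


def fine_tune(pretrained_weights, domain_data, learning_rate=0.05, steps=200):
    """Continue gradient descent from pretrained weights on new domain data."""
    slope, intercept = pretrained_weights
    n = len(domain_data)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to each weight.
        grad_slope = sum(2 * (slope * x + intercept - y) * x for x, y in domain_data) / n
        grad_intercept = sum(2 * (slope * x + intercept - y) for x, y in domain_data) / n
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


if __name__ == "__main__":
    pretrained = (1.0, 0.0)  # weights from earlier, general-purpose training
    domain_data = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # specialized data
    tuned = fine_tune(pretrained, domain_data)
    print(mean_squared_error(pretrained, domain_data))
    print(mean_squared_error(tuned, domain_data))
```

The error on the specialized data drops after fine-tuning, while the starting point was inherited from the earlier training rather than chosen at random.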

Generative Adversarial Network (GAN)

A Generative Adversarial Network, or GAN, is a machine learning framework behind some significant developments in generative AI, including tools that produce realistic data such as deepfakes. GANs use a pair of neural networks: a generator that produces data and a discriminator that evaluates it. The two are set up to compete, with the generator trying to get its outputs past the discriminator and the discriminator trying to spot the artificially generated data. This contest pushes AI outputs to become more realistic without further human intervention. GANs work best for narrow applications, such as producing realistic photos or videos.

Hallucination

Hallucination is the AI industry's term for models making things up, that is, generating information that is incorrect. It is a major challenge for AI quality, as hallucinations produce misleading outputs and can lead to real-world risks, such as harmful medical advice. The phenomenon is often attributed to gaps in training data, which is prompting the development of more specialized AI models to reduce the risk of misinformation.

Inference

Inference is the process of running an AI model: the model makes predictions or draws conclusions based on patterns it learned from previously seen data. Inference cannot happen without prior training, because a model must learn patterns in a dataset before it can extrapolate effectively. Many types of hardware can perform inference, from smartphone processors to powerful GPUs, but not all of them run models equally well. Very large models require powerful infrastructure to make predictions quickly.

Large Language Model (LLM)

Large Language Models, or LLMs, are the AI models used by popular AI assistants such as ChatGPT, Claude, Google’s Gemini, Meta’s Llama, Microsoft’s Copilot, or Mistral’s Le Chat. When you interact with an AI assistant, you are communicating with an LLM that processes your request, often with the help of various available tools, such as web browsing or code interpreters. These models are deep neural networks that...