Artificial Intelligence
Field of computer science dedicated to creating systems capable of performing tasks that normally require human intelligence, from reasoning and perception to language generation.
What it is
Artificial intelligence is the field of computer science that seeks to create systems capable of performing tasks that traditionally require human intelligence: reasoning, learning, perceiving, generating language, and making decisions.
It's not a new concept — the term was coined in 1956 at the Dartmouth conference — but the convergence of three factors transformed it in recent years: massive amounts of data, accessible compute power (GPUs, TPUs), and advances in neural network architectures, particularly the Transformer introduced in 2017.
Layers of the current ecosystem
Foundation Models
Foundation models are large-scale neural networks trained on massive amounts of unlabeled data. They're called "foundation" models because they serve as a base for many downstream tasks without requiring complete retraining.
Examples: GPT-4, Claude, Gemini, Llama, Mistral.
Their key characteristic is emergence: capabilities that weren't explicitly programmed but arise from training at scale, such as chain-of-thought reasoning, cross-language translation, or code generation.
Large Language Models (LLMs)
A subset of foundation models specialized in processing and generating text. They use the Transformer architecture with attention mechanisms that allow them to capture long-range relationships in text sequences.
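The attention mechanism at the heart of the Transformer can be sketched in a few lines. This is an illustrative toy with made-up dimensions, not the implementation of any specific model: each query position produces a weighted mix of the value vectors, with weights given by query-key similarity.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query row mixes the value rows V, weighted by
    softmax-normalized query-key similarity."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (seq_q, seq_k) pairwise similarities
    # softmax over the key axis: attention weights sum to 1 per query
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V               # (seq_q, d_v) mixed value vectors

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8              # toy sizes chosen for illustration
Q, K, V = (rng.standard_normal((seq_len, d_model)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

Because every query attends to every key, this is what lets the model capture the long-range relationships mentioned above, at a quadratic cost in sequence length.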
Current LLMs don't just generate text — they can follow complex instructions, maintain context in long conversations, and use external tools when configured to do so.
Generative AI
The application of foundation models to create new content: text, code, images, audio, video. It's the most visible layer for end users and the one that has driven massive adoption since 2022.
Interaction paradigms
The way humans interact with AI systems has evolved rapidly:
- Prompting: the user writes natural language instructions and the model responds
- RAG (Retrieval-Augmented Generation): the model queries external sources before responding, reducing hallucinations
- Tool Use: the model invokes APIs, databases, or external services to complete tasks
- Autonomous agents: systems that combine reasoning, memory, and tools to execute complex multi-step tasks with minimal human intervention
Each paradigm builds on the previous one. AI agents represent the current frontier.
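The tool-use loop that agents build on can be sketched as follows. The model, the tool names, and the message format here are all hypothetical stand-ins (real providers each define their own tool-calling API); the point is the control flow: the model requests a tool, the host executes it, and the result is fed back until the model produces a final answer.

```python
import json

# Hypothetical tool registry the host exposes to the model.
TOOLS = {
    "add": lambda a, b: a + b,
}

def fake_model(messages):
    """Stand-in for an LLM: requests a tool once, then answers.
    A real model would decide this from the conversation content."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "add", "args": {"a": 2, "b": 3}}
    return {"answer": f"The result is {messages[-1]['content']}"}

def run_agent_step(user_prompt):
    messages = [{"role": "user", "content": user_prompt}]
    while True:
        reply = fake_model(messages)
        if "tool" in reply:  # model requested a tool: execute and loop
            result = TOOLS[reply["tool"]](**reply["args"])
            messages.append({"role": "tool", "content": json.dumps(result)})
        else:                # model produced its final answer
            return reply["answer"]

print(run_agent_step("What is 2 + 3?"))  # The result is 5
```

RAG follows the same shape with retrieval as the tool, and an autonomous agent extends the loop with planning and memory across many such steps.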
Practical considerations
- Hallucinations: LLMs generate plausible but not necessarily correct text. All output must be verified.
- Context window: models can only process a limited number of tokens at once. This directly limits how long a conversation or document can be.
- Cost: more capable models are more expensive per token. Model choice must balance capability with budget.
- Latency: text generation is sequential (token by token). For real-time applications, this is a limiting factor.
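The context-window and cost considerations above lend themselves to back-of-the-envelope budgeting. The figures below are illustrative assumptions, not any provider's real limits or prices; the 4-characters-per-token heuristic is a rough rule of thumb for English text.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits_in_context(history: list[str], window: int = 8000,
                    reserve_for_reply: int = 1000) -> bool:
    """Check a conversation against a hypothetical token window,
    keeping room for the model's reply."""
    used = sum(estimate_tokens(m) for m in history)
    return used + reserve_for_reply <= window

def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  in_price: float = 3.0, out_price: float = 15.0) -> float:
    """Cost in USD given per-million-token prices (hypothetical values).
    Output tokens are typically priced higher than input tokens."""
    return (prompt_tokens * in_price + completion_tokens * out_price) / 1e6

history = ["Hello model"] * 100
print(fits_in_context(history))                # True
print(round(estimate_cost(10_000, 2_000), 4))  # 0.06
```

In practice, exact token counts come from the provider's own tokenizer; heuristics like this are only for quick feasibility checks.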
References
- Attention Is All You Need — Vaswani et al., 2017. The paper that introduced the Transformer architecture.
- On the Opportunities and Risks of Foundation Models — Stanford CRFM, 2021. Foundational report on large-scale models.
- Anthropic — Creators of Claude and the Model Context Protocol.
- OpenAI — Creators of GPT-4 and pioneers in commercial generative AI.