Skip to content

Building AI Agents on Red Hat AI

A hands-on tutorial that takes you from zero to a deployed AI agent system on Red Hat AI, using the fips-agents toolkit.

This tutorial was last verified against fipsagents v0.31.0 and fips-agents CLI v0.15.3 (June 2026). See Feature Highlights for the full capability set.

What is Red Hat AI?

Red Hat AI is Red Hat's portfolio for building and running AI on the hybrid cloud. In this tutorial, "Red Hat AI" specifically means OpenShift AI running on OpenShift: OpenShift is the underlying Kubernetes platform, OpenShift AI is the MLOps layer that manages model serving (via KServe and vLLM), and your agents run as ordinary OpenShift workloads alongside. Later modules introduce OGX (the LlamaStack distribution bundled with OpenShift AI), which moves tool orchestration, safety shields, and observability into the platform — the agent delegates these concerns to OpenShift AI rather than handling them itself. The supplementary Models as a Service module adds governed model access with API keys, token quotas, and usage tracking, all managed through OpenShift AI's dashboard.

What you'll build

By the end of this tutorial, you'll have a complete system running on Red Hat AI:

Browser → Chat UI → Gateway → Agent → MCP Server (calculus tools)
                              vLLM (gpt-oss-20b)
  • A Calculus Helper agent that solves math problems using remote tools
  • A calculus MCP server with 8 SymPy-powered tools (integration, differentiation, limits, etc.)
  • An HTTP gateway that proxies OpenAI-compatible API requests
  • A chat UI for browser-based interaction

Prerequisites

The full prerequisite checklist — cluster, OpenShift AI, LLM serving, CLI tools, and registry access — is its own module:

0. Before You Begin

Two paths

The tutorial supports two paths. Path A is the full experience: an OpenShift cluster with OpenShift AI and a GPU serving gpt-oss-20b via vLLM. Path B is for students without GPU access (Developer Sandbox, CRC, or any cluster without a GPU node) — you supply an external OpenAI-compatible model URL and deploy everything else on your cluster. Both paths are documented in Before You Begin.

Modules

Module What you'll do
1. Scaffold Your Agent Create an agent project, explore every file
2. Configure and Deploy Edit config, deploy to OpenShift, verify
3. Build an MCP Server Create a calculus tool server from scratch
4. Wire MCP to Agent Connect the tools, update the prompt
5. Gateway and UI Deploy the full stack, test end-to-end
6. Code Execution Sandbox Deploy a sandbox, give the agent code execution
7. Extend with AI Use AI-assisted slash commands to add capabilities
8. Production Hardening Secrets, FIPS, scaling, observability (metrics, traces, sessions, user feedback)
9. File Uploads Drag-drop uploads, Docling parsing, MIME validation, ClamAV scanning
10. Platform Mode and Guardrails Graduate tool orchestration, shields, and tracing to OGX server-side. Requires fipsagents 0.21+
11. Scaling with llm-d Conceptual: disaggregated prefill/decode + KV-cache routing behind OGX

Supplementary modules

These standalone modules extend the tutorial with RHOAI 3.4 platform features. They are independent of each other and can be completed in any order after the prerequisites listed in each module.

Module What you'll do
Agent Memory with MemoryHub Add cross-session memory via MemoryHub's MCP-based semantic store
Models as a Service Deploy MaaS: subscription-based model governance, API key auth, token quotas, usage tracking (RHOAI 3.4+)
MCP Gateway Deploy MCP Gateway: centralized tool access, auth, rate limiting across MCP servers (RHOAI 3.4+, Tech Preview)

Reference

Deep-dive pages linked from the tutorial:

How to follow along

Each module builds on the previous one. You'll run real commands, edit real files, and deploy real services. The completed code is in this repository if you get stuck:

  • calculus-agent/ -- the finished agent
  • calculus-helper/ -- the finished MCP server

Feature highlights

The tutorial's module sequence is stable at v0.31.0. All features below are included in the baseline. For details on any capability, consult docs/architecture.md in the agent-template repo.

Version Feature What it adds
0.20.0 Image input Multimodal message support in astep_stream()
0.21.0 Platform mode Delegate LLM orchestration to OGX server-side (Module 10 covers this)
0.22.0 Subagent-as-tool Register peer agents in agent.yaml, auto-get a delegate_to_agent tool
0.22.0 Question tool Structured questions from agent to operator with ask_user
0.23.0 Session compaction LLM-driven summarization of old messages on context overflow
0.23.0 Doom-loop detection Breaks stuck tool-call loops automatically
0.23.0 Per-tool permissions Allow/deny/ask gates on individual tool calls
0.24.0 Event-triggered mode React to webhooks, cron, Kafka, Redis — not just chat
0.24.0 Session fork & revert Branch conversation history for exploration
0.24.0 OTEL trace fidelity Configurable detail levels for trace replay
0.25.0 Kafka/Redis sources Event-triggered agents can consume Kafka topics and Redis Streams
0.26.0 State recovery Reducer-based checkpoint/replay for long-running agents
0.27.0 Graph store Apache AGE property-graph backend for entity/relationship persistence
0.28.0 Per-turn cost ceiling max_cost_per_turn_usd enforcement in the agent loop
0.29.0 Work-item coordination WorkItemStore with lease-based checkout, 5 stock LLM tools, session continuity
0.30.0 Prompt assembly Layered system prompt composition (identity, personality, governance, capabilities)
0.30.0 Trust and maturation Trust accumulation, lifecycle stages (Proto-Agent → Specialist), graduated autonomy
0.30.0 Self-healing tools learn_skill, suggest_skill, rollback_skill with trust-gated access
0.31.0 Ad-hoc spawn_agent Stock tool for ephemeral in-process agent instances with tool-subset whitelisting
0.31.0 AGENTS.md scaffold AAIF-spec agent identity file, served via /v1/agent-info

The calculus-coordinator/ directory in this repo demonstrates subagent-as-tool (v0.22.0).