1. Scaffold Your Agent
We start by creating an agent project using the fips-agents CLI. By the end
of this module, you'll understand every file in the project and have the agent
running locally.
Have you finished the prerequisites?
Make sure you've worked through 0. Before You Begin — you'll need a cluster, OpenShift AI, an LLM endpoint, the CLI tools, and registry access before Module 2.
Create the project
The fips-agents create agent command scaffolds a complete project from the
built-in template. The --local flag sets it up for local development.
When the command finishes, it prints next steps. Here is what it created:
.claude/ AGENTS.md CLAUDE.md Containerfile
Makefile README.md agent.yaml chart/
deploy.sh evals/ identity.md personality.md
prompts/ pyproject.toml redeploy.sh rules/
skills/ src/ tests/ tools/
Project structure
| Path | Purpose |
|---|---|
| src/agent.py | Your agent subclass -- most of your work happens here |
| agent.yaml | Configuration: model endpoint, tools, prompts, server settings |
| prompts/system.md | System prompt with YAML frontmatter and Markdown body |
| tools/ | Local tool implementations, one @tool-decorated file per tool |
| skills/ | Progressive-disclosure capabilities (agentskills.io spec) |
| rules/ | Behavioral constraints, one Markdown file per rule |
| chart/ | Helm chart for deploying to OpenShift |
| identity.md | Agent identity definition (name, role, personality traits) |
| personality.md | Agent personality and communication style |
| evals/ | Test scenarios and eval runner |
| Containerfile | Multi-stage build using Red Hat UBI base images |
| Makefile | Development and deployment commands |
| AGENTS.md | Open standard agent descriptor |
| pyproject.toml | Python package metadata and dependencies |
Understanding agent.yaml
This is the central configuration file. It has clearly labeled sections -- here are the key ones.
Agent identity
agent:
  name: ${AGENT_NAME:-my-agent}
  description: "A brief description of what this agent does"
  version: 0.1.0
The name and description appear in logs and on the /v1/agent-info endpoint.
We'll change these to match our calculus agent in Module 2.
Model configuration
model:
  provider: ${MODEL_PROVIDER:-openai}
  endpoint: ${MODEL_ENDPOINT:-http://llamastack:8321/v1}
  name: ${MODEL_NAME:-meta-llama/Llama-3.3-70B-Instruct}
  temperature: 0.7
  max_tokens: 4096
The provider field selects the LLM backend. The default (openai) works with
any OpenAI-compatible API -- vLLM, LlamaStack, llm-d, or even OpenAI itself.
Set it to anthropic, bedrock, or azure to route through an adapter
sidecar instead.
The ${VAR:-default} syntax means: use the environment variable VAR if it
is set, otherwise fall back to the value after :-. For example,
${MODEL_ENDPOINT:-http://llamastack:8321/v1} uses the MODEL_ENDPOINT
environment variable if available, or http://llamastack:8321/v1 as a
fallback. You don't need to edit agent.yaml. In Module 2 you'll look up your
actual model values, and a Kubernetes ConfigMap injects them as environment
variables when you deploy to OpenShift; the substitution happens automatically
at runtime.
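To make the substitution rule concrete, here is a minimal Python sketch of `${VAR:-default}` expansion. It is an illustration, not the fipsagents implementation: `expand` is a hypothetical helper, the `http://vllm:8000/v1` URL is made up for the demo, and the sketch follows the POSIX shell convention of treating an empty variable like an unset one, which the framework may or may not share.

```python
import os
import re

# Matches ${NAME} and ${NAME:-default}; the default may contain ':' and '/'.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def expand(value, env=None):
    """Expand ${VAR:-default} placeholders in a config string (hypothetical helper)."""
    env = os.environ if env is None else env

    def _replace(match):
        name, default = match.group(1), match.group(2)
        current = env.get(name)
        if current:  # set and non-empty: the environment wins
            return current
        # Unset or empty: fall back to the default, or to "" if none was given.
        return default if default is not None else ""

    return _PLACEHOLDER.sub(_replace, value)


template = "${MODEL_ENDPOINT:-http://llamastack:8321/v1}"
print(expand(template, env={}))
print(expand(template, env={"MODEL_ENDPOINT": "http://vllm:8000/v1"}))
```

With an empty environment the first call falls back to the LlamaStack default; the second call picks up the override, which is what the ConfigMap achieves at deploy time.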
MCP servers
Empty by default. In Module 4, we'll add our calculus MCP server here. Each entry can use HTTP or stdio transport:
mcp_servers:
  - url: http://mcp-server:8080/mcp/   # HTTP
  - command: /path/to/server           # stdio
    args: [--verbose]
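The two entry shapes can be told apart by their keys. The sketch below is illustrative only: `transport_of` is a hypothetical helper, not part of fipsagents, and it assumes each entry has already been parsed from YAML into a dict.

```python
def transport_of(entry):
    """Classify an mcp_servers entry as 'http' or 'stdio' (hypothetical helper)."""
    has_url = "url" in entry
    has_command = "command" in entry
    if has_url == has_command:
        # Either both keys or neither key is present.
        raise ValueError("entry needs exactly one of 'url' or 'command'")
    return "http" if has_url else "stdio"


# The two entries from the example above, as they would look once parsed.
servers = [
    {"url": "http://mcp-server:8080/mcp/"},
    {"command": "/path/to/server", "args": ["--verbose"]},
]
print([transport_of(server) for server in servers])
```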
Platform mode (optional, off by default)
When enabled, the agent delegates LLM orchestration to OGX (MCP tool calls, shield enforcement, and the inference loop) instead of running it client-side. We cover this in Module 10; leave it off until then.
Local tools and server
tools:
  local_dir: ./tools
  visibility_default: agent_only
server:
  host: ${HOST:-0.0.0.0}
  port: ${PORT:-8080}
Tools are auto-discovered from tools/ at startup. The visibility_default
controls which tool plane a tool belongs to if it doesn't declare one
explicitly (more on planes below). The server section configures the HTTP
binding -- the agent exposes an OpenAI-compatible /v1/chat/completions
endpoint that the gateway and UI communicate through.
Prompt assembly (layered mode)
The scaffold includes a prompt_assembly section that replaces the flat
system-prompt concatenation with named layers:
prompt_assembly:
  identity:
    source: identity.md
    enabled: true
  personality:
    source: personality.md
    enabled: false
  governance_enabled: true
  capabilities_enabled: true
This is what identity.md and personality.md in the project root are for.
When prompt_assembly is present, build_system_prompt() assembles the system
message from four layers in precedence order: identity (who the agent IS),
personality (how it behaves), governance (rules/), and capabilities (skills/).
Identity is on by default; personality is off until you enable it.
When prompt_assembly is absent, the legacy flat concatenation (system prompt
+ rules + skills) is used instead. We'll customize identity.md in Module 2.
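The layered assembly can be pictured with a short Python sketch. This is not the real build_system_prompt(): `assemble` is a hypothetical function, the layer texts are invented stand-ins for the files and directories named above, and the blank-line separator is an assumption.

```python
# Precedence order described above: identity first, capabilities last.
LAYER_ORDER = ("identity", "personality", "governance", "capabilities")


def assemble(layers, enabled):
    """Join the enabled, non-empty layers in precedence order (illustrative only)."""
    parts = [
        layers[name].strip()
        for name in LAYER_ORDER
        if enabled.get(name, False) and layers.get(name, "").strip()
    ]
    return "\n\n".join(parts)


layers = {
    "identity": "You are a calculus tutor.",              # stand-in for identity.md
    "personality": "Be playful.",                         # stand-in for personality.md
    "governance": "Never guess at a derivative.",         # stand-in for rules/
    "capabilities": "Skills: differentiate, integrate.",  # stand-in for skills/
}
# Mirrors the scaffold defaults: personality is the only layer switched off.
enabled = {"identity": True, "personality": False, "governance": True, "capabilities": True}
print(assemble(layers, enabled))
```

Flipping `personality` to `True` would slot its text between the identity and governance layers without touching the others.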
Understanding src/agent.py
The template gives you a MyAgent class with the minimal shape — one model
call, optional tool dispatch, return:
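from fipsagents.baseagent import BaseAgent, StepResult


class MyAgent(BaseAgent):
    async def step(self) -> StepResult:
        # Send the conversation and all registered tool schemas to the LLM.
        response = await self.call_model()
        # Execute requested tool calls, re-calling the model until none remain.
        response = await self.run_tool_calls(response)
        return StepResult.done(response.content)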
Three things to notice:
1. BaseAgent subclass. Your agent inherits from BaseAgent, which handles
   configuration, tool registration, MCP connections, prompt loading, and
   lifecycle management. You implement step().
2. The step() method. Called in a loop -- each invocation is one turn of
   reasoning. call_model() sends the conversation to the LLM with all
   registered tool schemas. run_tool_calls() executes any tool calls the LLM
   requested and re-calls the model until no more tool calls remain.
   Richer calling patterns are documented in the project's CLAUDE.md ("Calling Patterns"): structured output via call_model_json, validation-with-retry via call_model_validated, and agent-code tool dispatch via self.use_tool(). The minimal step() above is enough for the rest of this tutorial.
3. The __main__ block. Starts the agent as an HTTP server:
if __name__ == "__main__":
    from fipsagents.baseagent import load_config
    from fipsagents.server import OpenAIChatServer

    config = load_config("agent.yaml")
    server = OpenAIChatServer(
        agent_class=MyAgent,
        config_path="agent.yaml",
        title=config.agent.name,
        version=config.agent.version,
    )
    server.run(host=config.server.host, port=config.server.port)
Each incoming request creates a fresh agent instance, runs
setup() then the step() loop then shutdown(), and streams the response.
The server also provides /healthz for liveness probes and /v1/agent-info
for metadata.
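The per-request lifecycle described above (fresh instance, setup(), the step() loop, shutdown()) can be sketched without the framework. Everything here is a toy stand-in: `ToyAgent`, `ToyStepResult`, `handle_request`, and the `max_steps` cap are hypothetical names and assumptions, not the fipsagents server code.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ToyStepResult:
    done: bool
    content: str = ""


class ToyAgent:
    """Stand-in for a BaseAgent subclass; finishes on its second step."""

    def __init__(self):
        self.events = []
        self.turns = 0

    async def setup(self):
        self.events.append("setup")

    async def step(self):
        self.turns += 1
        self.events.append(f"step {self.turns}")
        return ToyStepResult(done=self.turns >= 2, content="final answer")

    async def shutdown(self):
        self.events.append("shutdown")


async def handle_request(agent_class, max_steps=10):
    agent = agent_class()  # fresh instance per request
    await agent.setup()
    try:
        for _ in range(max_steps):
            result = await agent.step()
            if result.done:
                return result.content, agent.events
        raise RuntimeError("step limit reached without a final answer")
    finally:
        await agent.shutdown()  # runs on success and on error


content, events = asyncio.run(handle_request(ToyAgent))
print(content)
print(events)
```

Putting shutdown() in a `finally` block is one way to guarantee cleanup when a step raises; whether the real server does exactly this is not stated in this module.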
Understanding prompts/system.md
The system prompt uses Markdown with YAML frontmatter:
---
name: system
description: System prompt for the agent
temperature: 0.3
variables:
  - name: role
    type: string
    description: One-line role description used to focus the agent
    default: "a helpful assistant"
---

You are {role}.

## Instructions

1. Use the tools available to you to accomplish the user's request.
2. If the request is ambiguous, ask a clarifying question before acting.
3. If you cannot complete the request, say so explicitly rather than
   speculating.
The frontmatter declares metadata and template variables. Variables use
{variable_name} syntax and are substituted when loaded. We'll replace this
generic prompt with one tailored to the calculus domain in Module 4.
The prompts.system field in agent.yaml designates which prompt file
becomes the system prompt (defaults to system, which loads
prompts/system.md). At startup, build_system_prompt() loads this file,
appends all rules from rules/, and appends the skill manifest from
skills/.
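As a rough picture of what loading this file involves, the sketch below splits the frontmatter from the body and substitutes `{role}`. It is a simplification under stated assumptions: `split_frontmatter` and `render` are hypothetical helpers, the frontmatter is left as raw text (a real loader would parse it as YAML to read the defaults), and the prompt is a trimmed copy of the file above.

```python
import re

PROMPT = """---
name: system
variables:
  - name: role
    default: "a helpful assistant"
---
You are {role}.
"""


def split_frontmatter(text):
    """Return (frontmatter, body) for a document that opens with a '---' block."""
    _, frontmatter, body = text.split("---", 2)
    return frontmatter.strip(), body.strip()


def render(body, values):
    """Replace {name} placeholders, leaving unknown names untouched."""
    return re.sub(r"\{(\w+)\}", lambda m: str(values.get(m.group(1), m.group(0))), body)


frontmatter, body = split_frontmatter(PROMPT)
print(render(body, {"role": "a helpful assistant"}))
```

A regex substitution is used instead of `str.format` so that stray braces elsewhere in a prompt body do not raise an error.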
The tool system
BaseAgent uses a two-plane model for tools:
| Plane | Visibility | Who calls it | Example |
|---|---|---|---|
| Plane 1 | agent_only | Your Python code via self.use_tool() | Formatting, validation, internal logic |
| Plane 2 | llm_only | The LLM via tool-calling protocol | Web search, code execution, MCP tools |
There's also both for tools callable from either side, but it's rare.
The template includes examples of each. Here's the plane 2 tool
(tools/web_search.py):
from fipsagents.baseagent.tools import tool


@tool(
    description="Search the web for information on a topic",
    visibility="llm_only",
)
async def web_search(query: str) -> str:
    """Search the web and return relevant results.

    Args:
        query: The search query string.
    """
    # ... implementation ...
And the plane 1 tool (tools/format_citations.py):
from fipsagents.baseagent.tools import tool


@tool(
    description="Format raw URLs and titles into clean citation strings",
    visibility="agent_only",
)
def format_citations(urls: list, titles: list) -> str:
    """Format URLs and titles into numbered citation lines.

    Args:
        urls: List of source URLs.
        titles: List of source titles (same length as urls).
    """
    # ... implementation ...
Because format_citations uses visibility="agent_only", it is only callable from your Python code via self.use_tool(). It does not appear in the LLM's tool schema and is not included in the /v1/agent-info tool list.
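The visibility rule amounts to two filters over one registry. The following sketch shows the idea with plain data; `ToolEntry`, `llm_visible`, and `agent_callable` are hypothetical names, and the real registry in fipsagents may be organized differently.

```python
from dataclasses import dataclass


@dataclass
class ToolEntry:
    name: str
    visibility: str  # "agent_only", "llm_only", or "both"


def llm_visible(tools):
    """Names the LLM would see in its tool schema: plane 2 plus 'both'."""
    return [t.name for t in tools if t.visibility in ("llm_only", "both")]


def agent_callable(tools):
    """Names agent code could reach directly: plane 1 plus 'both'."""
    return [t.name for t in tools if t.visibility in ("agent_only", "both")]


registry = [
    ToolEntry("web_search", "llm_only"),
    ToolEntry("format_citations", "agent_only"),
]
print(llm_visible(registry))
print(agent_callable(registry))
```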
Key conventions:
- One file per tool in tools/. Files starting with _ are skipped.
- Type hints are mandatory -- the registry builds JSON schemas from them.
- Google-style Args: docstrings become per-parameter descriptions.
- Use async def for I/O. Sync functions run in a thread executor.
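To see why type hints and Args: docstrings matter, here is a rough sketch of how a schema could be derived from them using only the standard library. It is not the fipsagents registry: `build_schema` is a hypothetical function, it handles only bare built-in types, and the `limit` parameter is invented for the example.

```python
import inspect
import re
from typing import get_type_hints

# Only bare built-in types are mapped in this sketch.
_JSON_TYPES = {str: "string", int: "integer", float: "number",
               bool: "boolean", list: "array", dict: "object"}


def build_schema(func):
    """Derive a JSON-schema-like dict from type hints and a Google-style docstring."""
    hints = get_type_hints(func)
    doc = inspect.getdoc(func) or ""
    args_section = doc.split("Args:", 1)[1] if "Args:" in doc else ""
    # Collect "name: description" pairs from the Args: section.
    descriptions = dict(re.findall(r"^[ \t]*(\w+):[ \t]*(.+)$", args_section, re.MULTILINE))
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        if name not in hints:
            raise TypeError(f"parameter {name!r} needs a type hint")
        properties[name] = {
            "type": _JSON_TYPES[hints[name]],
            "description": descriptions.get(name, ""),
        }
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value means the caller must supply it
    return {"type": "object", "properties": properties, "required": required}


def web_search(query: str, limit: int = 5) -> str:
    """Search the web and return relevant results.

    Args:
        query: The search query string.
        limit: Maximum number of results (hypothetical parameter).
    """
    return ""


print(build_schema(web_search))
```

A parameter without a hint raises immediately in this sketch, which mirrors the "type hints are mandatory" convention.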
MCP tools are always plane 2
Tools discovered from MCP servers are automatically registered with
llm_only visibility, regardless of the visibility_default setting.
The LLM decides when to call them, just like local plane 2 tools.
Run it locally
Install dependencies and verify the scaffold starts correctly:
make install creates a virtual environment in .venv/ and installs
fipsagents plus your project's dependencies. make run-local starts the
HTTP server on port 8080. The agent won't be able to reach an LLM yet --
the defaults point at a LlamaStack endpoint that doesn't exist. That's
expected; you'll configure the real endpoint in Module 2.
Once you see Uvicorn running on http://0.0.0.0:8080, test it by requesting the /v1/agent-info endpoint. The response looks like this:
{
  "agent": {
    "name": "my-agent",
    "description": "A brief description of what this agent does",
    "version": "0.1.0"
  },
  "model": {
    "name": "meta-llama/Llama-3.3-70B-Instruct",
    "temperature": 0.7,
    "max_tokens": 4096
  },
  "system_prompt": "You are a helpful assistant.\n\n## Instructions\n...",
  "tools": [
    {
      "name": "ask_user",
      "description": "Ask the user a clarifying question"
    },
    {
      "name": "spawn_agent",
      "description": "Spawn a sub-agent to handle a delegated task"
    }
  ]
}
The model.name value shows the default from agent.yaml. This will
change to your actual model once you deploy with the correct ConfigMap
values in Module 2.
The tools array lists every tool registered with llm_only or both
visibility. The two shown above are stock tools that BaseAgent always
registers -- ask_user for interactive clarification and spawn_agent for
multi-agent delegation. system_prompt reflects the rendered prompt text
after rule and skill injection.
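If you prefer to inspect the metadata from Python, a small script along these lines should work. It is a sketch under assumptions: `fetch_agent_info` and `tool_names` are hypothetical helpers, the endpoint is assumed to answer a plain GET, and `http://localhost:8080` assumes the default port from agent.yaml.

```python
import json
from urllib.request import urlopen


def fetch_agent_info(base_url="http://localhost:8080"):
    """GET /v1/agent-info from a running agent and parse the JSON body."""
    with urlopen(f"{base_url}/v1/agent-info", timeout=5) as response:
        return json.load(response)


def tool_names(info):
    """Extract the tool names from an agent-info payload."""
    return [tool["name"] for tool in info.get("tools", [])]


# Trimmed copy of the response shown above; with the server running,
# fetch_agent_info() would return the full payload instead.
sample = {
    "agent": {"name": "my-agent", "version": "0.1.0"},
    "tools": [{"name": "ask_user"}, {"name": "spawn_agent"}],
}
print(tool_names(sample))
```

With the server from `make run-local` up, `tool_names(fetch_agent_info())` would list the tools the LLM can see.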
MemoryHub log line
On first start you'll see MemoryHub config at .memoryhub.yaml has no
server_url — memory disabled (set server_url to enable). That's expected
— the scaffold ships a stub .memoryhub.yaml, and the agent falls back
to NullMemoryClient cleanly. The core tutorial works without memory.
When you're ready to add cross-session recall, see
Agent Memory with MemoryHub.
Stop the server with Ctrl+C.
What's next
The scaffolded project starts and serves its health and info endpoints. In Module 2, you'll find your model endpoint in OpenShift, configure the agent, and deploy it to the cluster.