Anatomy of a Well-Designed Skill
A deep dive into the four pillars of skill design: clear descriptions, well-typed parameters, thorough error handling, and predictable output, with annotated code examples and practical guidance.
A skill is only as useful as an agent’s ability to understand it, invoke it correctly, and interpret the results. You can build the most powerful capability in the world, but if the agent can’t figure out when to use it or how to call it, it may as well not exist.
This guide breaks down the four pillars that separate well-designed skills from frustrating ones: clear descriptions, well-typed parameters, thorough error handling, and predictable output. We’ll walk through a complete skill definition, annotate every decision, and give you a framework for evaluating your own designs.
The four pillars
Before getting into code, here’s the mental model. Every skill has four surfaces that the agent interacts with:
| Pillar | What it answers | Who benefits |
|---|---|---|
| Description | “Should I use this skill right now?” | The agent’s planning/reasoning layer |
| Parameters | “How do I invoke this correctly?” | The agent’s tool-calling layer |
| Error handling | “What went wrong and what should I try next?” | The agent’s recovery logic |
| Output contract | “What did I get back and what does it mean?” | The agent’s next-step reasoning |
If any one of these is weak, the whole skill degrades. A perfect implementation with a vague description will rarely get invoked. Flawless parameters with poor error messages will leave the agent stuck when anything goes wrong.
Pillar 1: clear descriptions
The description is the single most important piece of a skill definition. It’s the agent’s main basis for deciding whether this skill is the right tool for the current task. Think of it as a job posting: it needs to attract the right candidates (invocations) and repel the wrong ones.
What a good description contains
- What the skill does, in one sentence of plain language
- When to use it, with specific scenarios and concrete examples
- When NOT to use it (this is often more important than the positive case)
- What it returns, so the agent knows what to expect
- Key limitations like context windows, rate limits, or format constraints
Annotated example
const databaseQuerySkill = {
  name: "query_database",
  description: `Execute a read-only SQL query against the application database
and return the results as structured rows.
Use this when:
- You need to look up specific records (users, orders, products)
- You need to compute aggregates (counts, sums, averages)
- You need to check whether data exists before taking action
Do NOT use this when:
- You need to modify data (use execute_mutation instead)
- You need to run schema migrations (use run_migration instead)
- The query might return more than 10,000 rows (use export_data instead)
Returns: An object with 'columns' (array of column names),
'rows' (array of row arrays), and 'rowCount' (total matched).
Results are capped at 500 rows. If truncated, the 'truncated'
field will be true.
The database uses PostgreSQL syntax. Tables include: users,
orders, products, inventory, audit_log.`,
};
Every sentence in that description is doing work. The agent now knows:
- This is read-only (it won’t accidentally mutate data)
- There are sibling skills for mutations, migrations, and large exports
- Results have a known shape and are capped at 500 rows
- The database dialect is PostgreSQL
- The available tables are listed explicitly
Common description mistakes
Too terse: “Queries the database.” The agent has no idea when to prefer this over other data-access skills.
Too generic: “A flexible tool for working with data.” Could mean anything. The agent will either overuse it or ignore it.
Missing negative guidance: Without “Do NOT use this when…” clauses, the agent may try to insert data through your read-only query skill, then fail in confusing ways.
Implementation details instead of intent: “Uses pg-pool with a 30-second timeout and connection pooling.” The agent doesn’t care about your connection pool. It cares about what problems this skill solves.
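Some of the mistakes above can be caught mechanically before a skill ships. The following is a minimal sketch, not an established tool: `lintDescription`, its rule names, the 15-word cutoff, and the phrase patterns are all assumptions chosen for illustration, and a real check would need tuning against your own skill catalog.

```typescript
// Hypothetical heuristic linter for skill descriptions.
// The rule names, the 15-word cutoff, and the phrase patterns are
// illustrative assumptions, not an established standard.
interface DescriptionIssue {
  rule: "too-terse" | "missing-negative-guidance" | "missing-return-shape";
  message: string;
}

function lintDescription(description: string): DescriptionIssue[] {
  const issues: DescriptionIssue[] = [];
  const text = description.trim();
  const wordCount = text === "" ? 0 : text.split(/\s+/).length;

  // Too terse: very short descriptions rarely say when to use the skill.
  if (wordCount < 15) {
    issues.push({
      rule: "too-terse",
      message: `Only ${wordCount} words; say when to use the skill and what it returns.`,
    });
  }

  // Missing negative guidance: look for an explicit "do not use" clause.
  if (!/do not use|don't use/i.test(text)) {
    issues.push({
      rule: "missing-negative-guidance",
      message: "Add a 'Do NOT use this when...' clause that names alternatives.",
    });
  }

  // Missing return shape: look for any mention of what is returned.
  if (!/\breturns?\b/i.test(text)) {
    issues.push({
      rule: "missing-return-shape",
      message: "Describe what the skill returns.",
    });
  }

  return issues;
}
```

Running it on “Queries the database.” reports all three issues. A check like this cannot judge whether a description is generic or focused on implementation details; those still need a human read.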
Pillar 2: well-typed parameters
Parameters are the contract between the agent’s intent and your skill’s execution. The goal is to make it as easy as possible for the agent to construct a valid invocation on the first try.
Parameter design principles
Use descriptive names. The parameter name is often the strongest signal the agent has about what value to supply.
| Bad | Good | Why |
|---|---|---|
| q | search_query | Self-documenting |
| n | max_results | States the purpose |
| t | file_type | Eliminates ambiguity |
| opts | (flatten into individual params) | Avoids nested objects |
Include descriptions with examples. A parameter description should answer: what is this, what format does it expect, and what are some valid values?
parameters: {
  type: "object",
  properties: {
    query: {
      type: "string",
      // The example uses PostgreSQL interval syntax, matching the database dialect.
      description: "SQL SELECT statement to execute. Must be read-only " +
        "(SELECT, WITH, EXPLAIN). Example: SELECT name, email FROM users " +
        "WHERE created_at > NOW() - INTERVAL '7 days'"
    },
    timeout_seconds: {
      type: "number",
      description: "Maximum execution time in seconds. Defaults to 10. " +
        "Increase for complex analytical queries, but values over 30 " +
        "may indicate the query needs optimization.",
      default: 10
    },
    format: {
      type: "string",
      enum: ["rows", "csv", "markdown"],
      description: "Output format. 'rows' returns structured JSON (default), " +
        "'csv' returns comma-separated text, 'markdown' returns a formatted table.",
      default: "rows"
    }
  },
  required: ["query"]
}
Set sensible defaults. Every optional parameter should have a default that works for the common case. If 90% of invocations use the same value, make that the default.
Use enums for constrained choices. When a parameter can only take specific values, enumerate them. This stops the agent from inventing invalid values and gives it a clear menu to choose from.
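The two principles above (defaults and enums) can be enforced in one small step before the skill runs. This is a sketch under stated assumptions: `ParamSpec` and `resolveParams` are hypothetical names, only string and number parameters are handled, and checking required parameters is left out.

```typescript
// Hypothetical, simplified parameter spec: only the fields this sketch needs.
interface ParamSpec {
  type: "string" | "number";
  enum?: readonly string[];
  default?: string | number;
}

type ResolveResult =
  | { success: true; values: Record<string, string | number> }
  | { success: false; error: string; suggestion: string };

// Fill in defaults for omitted parameters, then reject values outside an enum.
function resolveParams(
  specs: Record<string, ParamSpec>,
  input: Record<string, string | number | undefined>
): ResolveResult {
  const values: Record<string, string | number> = {};

  for (const [name, spec] of Object.entries(specs)) {
    const value = input[name] ?? spec.default;
    if (value === undefined) {
      continue; // No value and no default; required-ness is checked elsewhere.
    }
    if (spec.enum && !spec.enum.includes(String(value))) {
      return {
        success: false,
        error: `Invalid value for '${name}': '${value}'`,
        suggestion: `Use one of: ${spec.enum.join(", ")}`,
      };
    }
    values[name] = value;
  }

  return { success: true, values };
}
```

With the `format` and `timeout_seconds` specs from the schema above, an empty input resolves to `format: "rows"` and `timeout_seconds: 10`, while `format: "xml"` is rejected with the list of valid choices.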
Keep the parameter object flat. Deeply nested objects are significantly harder for agents to construct correctly. If you find yourself nesting, consider whether those nested fields should be separate top-level parameters or even a separate skill.
Validation at the boundary
Always validate parameters at the entry point of your skill, before doing any real work. This produces clear, immediate error messages rather than cryptic failures deep in the implementation.
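async function executeQuery(params: {
  query: string;
  timeout_seconds?: number;
  format?: "rows" | "csv" | "markdown";
}) {
  // Validate immediately, before doing any real work.
  // Agents can omit required fields at runtime, so check the type as well.
  const query = typeof params.query === "string" ? params.query.trim() : "";
  if (!query) {
    return { success: false, error: "Query cannot be empty" };
  }
  // Prefix check only: a first-line filter, not a security boundary.
  // In PostgreSQL a WITH query can contain INSERT/UPDATE/DELETE, so the
  // database connection itself should also be read-only.
  const normalized = query.toUpperCase();
  if (
    !normalized.startsWith("SELECT") &&
    !normalized.startsWith("WITH") &&
    !normalized.startsWith("EXPLAIN")
  ) {
    return {
      success: false,
      error:
        "Only SELECT, WITH, and EXPLAIN statements are allowed. " +
        "Use execute_mutation for INSERT, UPDATE, or DELETE operations.",
    };
  }
  const timeout = params.timeout_seconds ?? 10;
  // Number.isFinite also rejects NaN, which would pass the range comparisons.
  if (!Number.isFinite(timeout) || timeout < 1 || timeout > 60) {
    return {
      success: false,
      error: `Timeout must be between 1 and 60 seconds. Got: ${timeout}`,
    };
  }
  // Proceed with execution...
}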
Pillar 3: thorough error handling
Agents will send unexpected inputs. External services will fail. Files will be missing. Network requests will time out. The question isn’t whether errors will occur, but how your skill communicates them when they do.
The error response pattern
Every error response from a skill should contain three things:
- What happened, as a clear and specific description of the failure
- Why it happened, with enough context for the agent to understand the cause
- What to do next, as a concrete suggestion for recovery
// Pattern: structured error with recovery guidance
interface SkillError {
  success: false;
  error: string;      // What happened
  code?: string;      // Machine-readable error type
  suggestion: string; // What the agent should try next
}
Error categories and responses
Different error types call for different recovery suggestions:
// Input validation error -> tell the agent how to fix the input
{
  success: false,
  error: "Invalid date format: '03/26/2026'",
  code: "INVALID_INPUT",
  suggestion: "Use ISO 8601 format: '2026-03-26'"
}
// Resource not found -> tell the agent how to find the right resource
{
  success: false,
  error: "Table 'user_accounts' does not exist",
  code: "NOT_FOUND",
  suggestion: "Available tables: users, orders, products, inventory, " +
    "audit_log. Did you mean 'users'?"
}
// Permission denied -> tell the agent this path is blocked
{
  success: false,
  error: "Cannot access /etc/shadow: permission denied",
  code: "PERMISSION_DENIED",
  suggestion: "This file requires elevated privileges and cannot be " +
    "read by this skill. Consider reading a different file."
}
// Rate limited -> tell the agent to wait
{
  success: false,
  error: "API rate limit exceeded",
  code: "RATE_LIMITED",
  suggestion: "Rate limit resets in approximately 30 seconds. " +
    "Wait before retrying this request."
}
// Timeout -> tell the agent to simplify
{
  success: false,
  error: "Query timed out after 10 seconds",
  code: "TIMEOUT",
  suggestion: "The query was too complex. Try adding a LIMIT clause, " +
    "narrowing the date range, or breaking it into smaller queries."
}
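The machine-readable `code` field is what lets the calling side branch without parsing prose. As a rough sketch of that idea, the hypothetical `planRecovery` function below maps each code from the examples above to an action; the `RecoveryAction` names are assumptions made for illustration, and a real agent may reason over the `suggestion` text rather than use a fixed table.

```typescript
interface SkillError {
  success: false;
  error: string;
  code?: string;
  suggestion: string;
}

// Hypothetical agent-side actions; the names are assumptions for illustration.
type RecoveryAction =
  | "revise_input"
  | "choose_other_resource"
  | "abandon_path"
  | "wait_and_retry"
  | "simplify_request"
  | "follow_suggestion";

// Map an error code to a recovery action. Unknown or missing codes fall
// back to reading the human-readable suggestion.
function planRecovery(err: SkillError): RecoveryAction {
  switch (err.code) {
    case "INVALID_INPUT":
      return "revise_input";
    case "NOT_FOUND":
      return "choose_other_resource";
    case "PERMISSION_DENIED":
      return "abandon_path";
    case "RATE_LIMITED":
      return "wait_and_retry";
    case "TIMEOUT":
      return "simplify_request";
    default:
      return "follow_suggestion";
  }
}
```

The fallback branch is why `suggestion` stays required even when a `code` is present: an error with an unrecognized code still tells the agent what to try.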
Never swallow errors silently
One of the worst things a skill can do is return an empty or misleading success response when something actually failed. If a search returns zero results because the index is down, that’s different from zero results because nothing matched. Make the distinction clear:
// Bad: agent thinks nothing matched
{ matches: [], totalMatches: 0 }
// Good: agent knows something went wrong
{
  success: false,
  error: "Search index unavailable",
  suggestion: "The search service is temporarily down. Try again in a few minutes."
}
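One way to keep that distinction is to catch backend failures at the skill boundary and convert them into an explicit error, so an empty array only ever means “nothing matched.” The sketch below assumes a hypothetical synchronous `SearchBackend` interface that throws when the index is unreachable; a real backend would likely be asynchronous and would separate outages from other failures.

```typescript
// Hypothetical backend: returns matches, or throws when the index is unreachable.
interface SearchBackend {
  query(term: string): string[];
}

type SearchResult =
  | { success: true; matches: string[]; totalMatches: number }
  | { success: false; error: string; suggestion: string };

function search(backend: SearchBackend, term: string): SearchResult {
  try {
    const matches = backend.query(term);
    // Reaching this point means the index answered; an empty list is a real result.
    return { success: true, matches, totalMatches: matches.length };
  } catch (err) {
    // Simplification: every thrown error is treated as an outage.
    const reason = err instanceof Error ? err.message : String(err);
    return {
      success: false,
      error: `Search index unavailable: ${reason}`,
      suggestion: "The search service may be temporarily down. Try again in a few minutes.",
    };
  }
}
```

The failure is never folded into `{ matches: [], totalMatches: 0 }`, so the agent can tell “try different search terms” apart from “try again later.”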
Pillar 4: predictable output
The output contract is what the agent relies on to plan its next steps. If the output shape changes depending on inputs, or if important metadata is sometimes present and sometimes absent, the agent’s reasoning becomes unreliable.
Structured vs. freeform output
Always prefer structured output over freeform text. Structured data is easier for the agent to parse, reference, and reason about.
// Structured: agent can access specific fields
{
  users: [
    { id: 1, name: "Alice", email: "alice@example.com" },
    { id: 2, name: "Bob", email: "bob@example.com" }
  ],
  totalCount: 47,
  page: 1,
  hasMore: true
}
// Freeform: agent has to parse text to extract information
"Found 47 users. Showing page 1:\n1. Alice (alice@example.com)\n2. Bob (bob@example.com)\n..."
Include metadata
Every response should include enough metadata for the agent to decide what to do next without guessing:
| Metadata field | Purpose | Example |
|---|---|---|
| totalCount | Are there more results beyond what was returned? | 47 |
| truncated | Was the response cut short? | true |
| page / hasMore | Is there pagination to follow? | 1 / true |
| executionTime | Was this slow? Should the agent optimize? | "2.3s" |
| warnings | Non-fatal issues the agent should know about | ["Column 'age' contains nulls"] |
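As an illustration of how several of these fields can be produced together, here is a minimal pagination sketch. `PageResponse` and `buildPage` are hypothetical names, the whole result set is assumed to be in memory, and `pageSize` is assumed to be a positive integer.

```typescript
interface PageResponse<T> {
  success: true;
  items: T[];
  totalCount: number;
  page: number;
  hasMore: boolean;
  warnings: string[];
}

// Slice one page out of an in-memory result set and attach the metadata
// the agent needs to decide whether to request another page.
function buildPage<T>(all: readonly T[], page: number, pageSize: number): PageResponse<T> {
  const warnings: string[] = [];

  // Fall back to page 1 on an invalid page number, and say so in `warnings`.
  let safePage = page;
  if (!Number.isInteger(page) || page < 1) {
    safePage = 1;
    warnings.push(`Requested page ${page} is invalid; returned page 1 instead.`);
  }

  const start = (safePage - 1) * pageSize;
  const items = all.slice(start, start + pageSize);

  return {
    success: true,
    items,
    totalCount: all.length,
    page: safePage,
    hasMore: start + items.length < all.length,
    warnings,
  };
}
```

`warnings` is always present, even when empty, so the agent never has to check whether the field exists before reading it.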
Consistent shape
The response shape should be identical regardless of whether results are found or not. Don’t return an array when there are results and null when there are none:
// Bad: shape changes based on results
results.length > 0 ? results : null
// Good: shape is always the same
{
  matches: results, // empty array when no results
  totalMatches: results.length,
  truncated: false
}
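A fixed response type makes the consistent shape checkable by the compiler. In the sketch below, `MatchResponse`, `findMatches`, and `describeMatches` are hypothetical names, and the `cap` argument stands in for whatever row limit the skill enforces.

```typescript
interface MatchResponse<T> {
  success: true;
  matches: T[];
  totalMatches: number;
  truncated: boolean;
}

// Always returns the same shape: `matches` is an array even when nothing matched.
function findMatches<T>(
  items: readonly T[],
  predicate: (item: T) => boolean,
  cap: number
): MatchResponse<T> {
  const all = items.filter(predicate);
  return {
    success: true,
    matches: all.slice(0, cap),
    totalMatches: all.length,
    truncated: all.length > cap,
  };
}

// Consumer code needs no null check or shape test before reading fields.
function describeMatches<T>(response: MatchResponse<T>): string {
  return response.truncated
    ? `Showing ${response.matches.length} of ${response.totalMatches} matches`
    : `Found ${response.totalMatches} matches`;
}
```

Because the empty case and the truncated case share one type, the consumer never branches on the shape of the response, only on its values.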
Putting it all together: a complete skill
Here’s a complete skill definition that puts all four pillars into practice. Study the annotations to understand why each piece is there.
const createNoteSkill = {
  name: "create_note",
  // PILLAR 1: Clear description with when-to-use,
  // when-not-to-use, and return shape
  description: `Create a new note in the user's notebook with a title
and body. Use this when the user asks to save, record, or remember
something. Do NOT use this for creating files in the codebase (use
write_file instead) or for creating tasks (use create_task instead).
Returns the created note's ID, title, and timestamp. If a note with
the same title already exists, returns an error with the existing
note's ID so you can update it instead.`,
  // PILLAR 2: Well-typed parameters with descriptions and validation
  parameters: {
    type: "object",
    properties: {
      title: {
        type: "string",
        description:
          "Note title. 1-200 characters. Must be unique " +
          "within the notebook.",
      },
      body: {
        type: "string",
        description:
          "Note content. Supports Markdown formatting. " +
          "Maximum 50,000 characters.",
      },
      tags: {
        type: "array",
        items: { type: "string" },
        description:
          "Optional tags for categorization. Example: " +
          "['meeting-notes', 'project-alpha']",
        default: [],
      },
    },
    required: ["title", "body"],
  },
};
// `db` and `isDuplicateTitleError` are assumed to be provided by the application.
async function handleCreateNote(params: {
  title: string;
  body: string;
  tags?: string[];
}) {
  // PILLAR 2 (continued): Input validation
  // Trim once so validation, storage, and lookup all use the same title.
  const title = params.title.trim();
  if (!title) {
    return {
      success: false,
      error: "Title cannot be empty",
      suggestion: "Provide a descriptive title for the note.",
    };
  }
  if (title.length > 200) {
    return {
      success: false,
      error: `Title too long (${title.length} chars, max 200)`,
      suggestion: "Shorten the title and put details in the body.",
    };
  }
  // Enforce the body limit stated in the parameter description.
  if (params.body.length > 50000) {
    return {
      success: false,
      error: `Body too long (${params.body.length} chars, max 50,000)`,
      suggestion: "Shorten the body or split it across multiple notes.",
    };
  }
  try {
    const note = await db.notes.create({
      title,
      body: params.body,
      tags: params.tags ?? [],
    });
    // PILLAR 4: Predictable, structured output with metadata
    return {
      success: true,
      note: {
        id: note.id,
        title: note.title,
        createdAt: note.createdAt.toISOString(),
      },
      message: `Note "${note.title}" created successfully.`,
    };
  } catch (err) {
    // PILLAR 3: Specific error handling with recovery guidance
    if (isDuplicateTitleError(err)) {
      const existing = await db.notes.findByTitle(title);
      return {
        success: false,
        error: `A note with title "${title}" already exists`,
        // Optional chaining guards against the note vanishing between calls.
        existingNoteId: existing?.id,
        suggestion:
          "Use update_note with this ID to modify the " +
          "existing note, or choose a different title.",
      };
    }
    // A caught value is not guaranteed to be an Error instance.
    const reason = err instanceof Error ? err.message : String(err);
    return {
      success: false,
      error: `Failed to create note: ${reason}`,
      suggestion: "This may be a temporary issue. Try again.",
    };
  }
}
Design checklist
Before shipping a skill, run through this checklist:
- Description states what the skill does in one sentence
- Description includes at least one “when to use” scenario
- Description includes at least one “when NOT to use” scenario with alternatives
- Description describes the return format
- Parameters have descriptive names (no single-letter abbreviations)
- Parameters each have a description with at least one example
- Parameters use enums for constrained choices
- Optional parameters have sensible defaults
- Parameter object is flat (no unnecessary nesting)
- Input validation happens at the skill boundary with clear messages
- Error responses include what happened, why, and what to try next
- Error shape is consistent across all error types
- Success responses are structured, not freeform text
- Response shape is consistent whether results are found or not
- Metadata is included (counts, truncation flags, pagination info)
If your skill passes every item, agents should be able to use it reliably.
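Several of the parameter items in the checklist are mechanical enough to check in code. The following sketch is one possible approach rather than a standard tool: `lintParameters` and the simplified `ParameterSchema` type are hypothetical, and it covers only single-letter names, missing descriptions, optional parameters without defaults, and nested objects.

```typescript
// Simplified, hypothetical view of a skill's parameter schema.
interface PropertySchema {
  type: string;
  description?: string;
  default?: unknown;
  enum?: readonly string[];
}

interface ParameterSchema {
  type: "object";
  properties: Record<string, PropertySchema>;
  required?: readonly string[];
}

// Return one human-readable problem per failed checklist item.
function lintParameters(schema: ParameterSchema): string[] {
  const problems: string[] = [];
  const required = new Set(schema.required ?? []);

  for (const [name, prop] of Object.entries(schema.properties)) {
    if (name.length === 1) {
      problems.push(`${name}: single-letter name`);
    }
    if (!prop.description) {
      problems.push(`${name}: missing description`);
    }
    if (!required.has(name) && prop.default === undefined) {
      problems.push(`${name}: optional parameter has no default`);
    }
    if (prop.type === "object") {
      problems.push(`${name}: nested object; consider flattening`);
    }
  }

  return problems;
}
```

The remaining items, such as whether a description names the right alternatives or whether error suggestions are actually useful, still need human review.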
Next steps
Now that you understand the anatomy of a single skill, explore Skill Design Principles to learn how skills compose together, when to split a skill into multiple pieces, and how to manage side effects across a skill ecosystem.