Anatomy of a Well-Designed Skill

A skill is only as useful as an agent’s ability to understand it, invoke it correctly, and interpret the results. You can build the most powerful capability in the world, but if the agent can’t figure out when to use it or how to call it, it may as well not exist.

This guide breaks down the four pillars that separate well-designed skills from frustrating ones: clear descriptions, well-typed parameters, thorough error handling, and predictable output. We’ll walk through a complete skill definition, annotate every decision, and give you a framework for evaluating your own designs.

The four pillars

Before getting into code, here’s the mental model. Every skill has four surfaces that the agent interacts with:

Pillar	What it answers	Who benefits
Description	“Should I use this skill right now?”	The agent’s planning/reasoning layer
Parameters	“How do I invoke this correctly?”	The agent’s tool-calling layer
Error handling	“What went wrong and what should I try next?”	The agent’s recovery logic
Output contract	“What did I get back and what does it mean?”	The agent’s next-step reasoning

If any one of these is weak, the whole skill degrades. A perfect implementation with a vague description will rarely get invoked. Flawless parameters with poor error messages will leave the agent stuck when anything goes wrong.

Pillar 1: clear descriptions

The description is the single most important piece of a skill definition. It’s the agent’s main basis for deciding whether this skill is the right tool for the current task. Think of it as a job posting: it needs to attract the right candidates (invocations) and repel the wrong ones.

What a good description contains

What the skill does, in one sentence of plain language
When to use it, with specific scenarios and concrete examples
When NOT to use it (this is often more important than the positive case)
What it returns, so the agent knows what to expect
Key limitations like context windows, rate limits, or format constraints

Annotated example

const databaseQuerySkill = {
  name: "query_database",
  description: `Execute a read-only SQL query against the application database
and return the results as structured rows.

Use this when:
- You need to look up specific records (users, orders, products)
- You need to compute aggregates (counts, sums, averages)
- You need to check whether data exists before taking action

Do NOT use this when:
- You need to modify data (use execute_mutation instead)
- You need to run schema migrations (use run_migration instead)
- The query might return more than 10,000 rows (use export_data instead)

Returns: An object with 'columns' (array of column names),
'rows' (array of row arrays), and 'rowCount' (total matched).
Results are capped at 500 rows. If truncated, the 'truncated'
field will be true.

The database uses PostgreSQL syntax. Tables include: users,
orders, products, inventory, audit_log.`,
};

Every sentence in that description is doing work. The agent now knows:

This is read-only (it won’t accidentally mutate data)
There are sibling skills for mutations, migrations, and large exports
Results have a known shape and a cap at 500 rows
The database dialect is PostgreSQL
The available tables are listed explicitly

Common description mistakes

Too terse: “Queries the database.” The agent has no idea when to prefer this over other data-access skills.

Too generic: “A flexible tool for working with data.” Could mean anything. The agent will either overuse it or ignore it.

Missing negative guidance: Without “Do NOT use this when…” clauses, the agent may try to insert data through your read-only query skill, then fail in confusing ways.

Implementation details instead of intent: “Uses pg-pool with a 30-second timeout and connection pooling.” The agent doesn’t care about your connection pool. It cares about what problems this skill solves.
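
The elements of a good description and the mistakes above can be turned into a rough automated check. What follows is a minimal sketch rather than a standard tool: the name `lintDescription`, the 80-character threshold, and the marker phrases it searches for are all assumptions chosen to match the phrasing used in this guide.

```typescript
// Hypothetical helper: flags descriptions that are likely to confuse an agent.
// The marker phrases and the length threshold are assumptions, not a standard.
interface DescriptionIssue {
  rule: string;
  message: string;
}

function lintDescription(description: string): DescriptionIssue[] {
  const issues: DescriptionIssue[] = [];
  const text = description.trim();

  // "Too terse": a single short sentence rarely says when to use the skill.
  if (text.length < 80) {
    issues.push({ rule: "too-terse", message: "Description is under 80 characters." });
  }
  // Positive guidance: when should the agent pick this skill?
  if (!/use this when/i.test(text)) {
    issues.push({ rule: "missing-when", message: 'No "Use this when" guidance found.' });
  }
  // Negative guidance: when should the agent pick a sibling skill instead?
  if (!/do not use/i.test(text)) {
    issues.push({ rule: "missing-when-not", message: 'No "Do NOT use" guidance found.' });
  }
  // Output contract: what comes back?
  if (!/returns/i.test(text)) {
    issues.push({ rule: "missing-returns", message: "Return value is not described." });
  }
  return issues;
}

// The "too terse" example from above trips every rule.
const terseIssues = lintDescription("Queries the database.");
console.log(terseIssues.map((issue) => issue.rule));
```

A check like this cannot judge whether the guidance is accurate, only whether it is present, so treat it as a reminder rather than a gate.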

Pillar 2: well-typed parameters

Parameters are the contract between the agent’s intent and your skill’s execution. The goal is to make it as easy as possible for the agent to construct a valid invocation on the first try.

Parameter design principles

Use descriptive names. The parameter name is often the strongest signal the agent has about what value belongs there.

Bad	Good	Why
`q`	`search_query`	Self-documenting
`n`	`max_results`	States the purpose
`t`	`file_type`	Eliminates ambiguity
`opts`	(flatten into individual params)	Avoids nested objects

Include descriptions with examples. A parameter description should answer: what is this, what format does it expect, and what are some valid values?

parameters: {
  type: "object",
  properties: {
    query: {
      type: "string",
      description: "SQL SELECT statement to execute. Must be read-only " +
        "(SELECT, WITH, EXPLAIN). Example: SELECT name, email FROM users " +
        "WHERE created_at > NOW() - INTERVAL '7 days'"
    },
    timeout_seconds: {
      type: "number",
      description: "Maximum execution time in seconds (1-60). Defaults to 10. " +
        "Increase for complex analytical queries, but values over 30 " +
        "may indicate the query needs optimization.",
      default: 10
    },
    format: {
      type: "string",
      enum: ["rows", "csv", "markdown"],
      description: "Output format. 'rows' returns structured JSON (default), " +
        "'csv' returns comma-separated text, 'markdown' returns a formatted table.",
      default: "rows"
    }
  },
  required: ["query"]
}

Set sensible defaults. Every optional parameter should have a default that works for the common case. If 90% of invocations use the same value, make that the default.

Use enums for constrained choices. When a parameter can only take specific values, enumerate them. This stops the agent from inventing invalid values and gives it a clear menu to choose from.

Keep the parameter object flat. Deeply nested objects are significantly harder for agents to construct correctly. If you find yourself nesting, consider whether those nested fields should be separate top-level parameters or even a separate skill.
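
Defaults and enums pay off when the skill applies them mechanically before running. The sketch below shows one way that might look for a flat parameter object. `ParamSpec` and `resolveParams` are hypothetical names, and the schema shape is a deliberately small subset of JSON Schema, not a full validator.

```typescript
// Hypothetical schema shape: a small, flat subset of JSON Schema.
interface ParamSpec {
  type: "string" | "number";
  enum?: string[];
  default?: string | number;
}

type ParamValues = Record<string, string | number>;

type Resolved =
  | { ok: true; values: ParamValues }
  | { ok: false; error: string };

// Fill in defaults for omitted optional parameters and reject values that
// have the wrong type or fall outside an enum.
function resolveParams(
  specs: Record<string, ParamSpec>,
  required: string[],
  args: ParamValues
): Resolved {
  const values: ParamValues = {};
  for (const name of Object.keys(specs)) {
    const spec = specs[name];
    const provided = args[name];
    if (provided === undefined) {
      if (required.includes(name)) {
        return { ok: false, error: `Missing required parameter '${name}'.` };
      }
      if (spec.default !== undefined) values[name] = spec.default;
      continue;
    }
    if (typeof provided !== spec.type) {
      return { ok: false, error: `Parameter '${name}' must be a ${spec.type}.` };
    }
    if (spec.enum && !spec.enum.includes(String(provided))) {
      return {
        ok: false,
        error: `Parameter '${name}' must be one of: ${spec.enum.join(", ")}. Got: ${provided}`,
      };
    }
    values[name] = provided;
  }
  return { ok: true, values };
}

// Mirrors the query_database parameters shown earlier.
const querySpecs: Record<string, ParamSpec> = {
  query: { type: "string" },
  timeout_seconds: { type: "number", default: 10 },
  format: { type: "string", enum: ["rows", "csv", "markdown"], default: "rows" },
};

console.log(resolveParams(querySpecs, ["query"], { query: "SELECT 1" }));
console.log(resolveParams(querySpecs, ["query"], { query: "SELECT 1", format: "xml" }));
```

The first call comes back with `timeout_seconds` and `format` filled in. The second is rejected with a message that lists the valid choices, which gives the agent what it needs to correct the call.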

Validation at the boundary

Always validate parameters at the entry point of your skill, before doing any real work. This produces clear, immediate error messages rather than cryptic failures deep in the implementation.

async function executeQuery(params: {
  query: string;
  timeout_seconds?: number;
  format?: "rows" | "csv" | "markdown";
}) {
  // Validate immediately
  if (!params.query.trim()) {
    return { success: false, error: "Query cannot be empty" };
  }

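  // A prefix check is only a first line of defense: in PostgreSQL a WITH
  // clause can wrap INSERT/UPDATE/DELETE, and EXPLAIN ANALYZE executes the
  // statement, so the database connection should also be read-only.
  const normalized = params.query.trim().toUpperCase();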
  if (
    !normalized.startsWith("SELECT") &&
    !normalized.startsWith("WITH") &&
    !normalized.startsWith("EXPLAIN")
  ) {
    return {
      success: false,
      error:
        "Only SELECT, WITH, and EXPLAIN statements are allowed. " +
        "Use execute_mutation for INSERT, UPDATE, or DELETE operations.",
    };
  }

  const timeout = params.timeout_seconds ?? 10;
  if (!Number.isFinite(timeout) || timeout < 1 || timeout > 60) {
    return {
      success: false,
      error: `Timeout must be between 1 and 60 seconds. Got: ${timeout}`,
    };
  }

  // Proceed with execution...
}

Pillar 3: thorough error handling

Agents will send unexpected inputs. External services will fail. Files will be missing. Network requests will time out. The question isn’t whether errors will occur, but how your skill communicates them when they do.

The error response pattern

Every error response from a skill should contain three things:

What happened, as a clear and specific description of the failure
Why it happened, with enough context for the agent to understand the cause
What to do next, as a concrete suggestion for recovery

// Pattern: structured error with recovery guidance
interface SkillError {
  success: false;
  error: string; // What happened and why
  code?: string; // Machine-readable error type
  suggestion: string; // What the agent should try next
}
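
One way to make this pattern hard to skip is to pair the error shape with a success shape in a discriminated union, so the compiler insists on checking `success` before reading the payload. This is a sketch under assumptions: `SkillFailure`, `SkillSuccess`, `SkillResult`, `failure`, and `summarize` are illustrative names, and only the field names come from the interface above.

```typescript
// Assumed names throughout; only the field names follow the interface above.
interface SkillFailure {
  success: false;
  error: string; // What happened and why
  code?: string; // Machine-readable error type
  suggestion: string; // What the agent should try next
}

interface SkillSuccess<T> {
  success: true;
  data: T;
}

type SkillResult<T> = SkillSuccess<T> | SkillFailure;

// Small constructor so every failure path has to supply a suggestion.
function failure(code: string, error: string, suggestion: string): SkillFailure {
  return { success: false, code, error, suggestion };
}

// The `success` flag discriminates the union, so `data` is only readable
// once the failure case has been ruled out.
function summarize(result: SkillResult<number>): string {
  if (result.success) {
    return `Row count: ${result.data}`;
  }
  return `${result.error} Next step: ${result.suggestion}`;
}

console.log(summarize({ success: true, data: 3 }));
console.log(
  summarize(failure("TIMEOUT", "Query timed out after 10 seconds.", "Add a LIMIT clause."))
);
```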

Error categories and responses

Different error types call for different recovery suggestions:

// Input validation error -> tell the agent how to fix the input
{
  success: false,
  error: "Invalid date format: '03/26/2026'",
  code: "INVALID_INPUT",
  suggestion: "Use ISO 8601 format: '2026-03-26'"
}

// Resource not found -> tell the agent how to find the right resource
{
  success: false,
  error: "Table 'user_accounts' does not exist",
  code: "NOT_FOUND",
  suggestion: "Available tables: users, orders, products, inventory, audit_log. " +
    "Did you mean 'users'?"
}

// Permission denied -> tell the agent this path is blocked
{
  success: false,
  error: "Cannot access /etc/shadow: permission denied",
  code: "PERMISSION_DENIED",
  suggestion: "This file requires elevated privileges and cannot be " +
    "read by this skill. Consider reading a different file."
}

// Rate limited -> tell the agent to wait
{
  success: false,
  error: "API rate limit exceeded",
  code: "RATE_LIMITED",
  suggestion: "Rate limit resets in approximately 30 seconds. " +
    "Wait before retrying this request."
}

// Timeout -> tell the agent to simplify
{
  success: false,
  error: "Query timed out after 10 seconds",
  code: "TIMEOUT",
  suggestion: "The query was too complex. Try adding a LIMIT clause, " +
    "narrowing the date range, or breaking it into smaller queries."
}
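
The `code` field earns its place when something on the calling side branches on it. As a rough illustration, a harness or agent loop might map each code to a recovery strategy. The action names and the mapping below are assumptions made for this sketch; the codes are the ones used in the examples above.

```typescript
// Hypothetical harness-side policy. The action names are invented for
// illustration; the error codes match the examples above.
type RecoveryAction =
  | "fix_input"
  | "look_up_resource"
  | "abandon_path"
  | "wait_and_retry"
  | "simplify_request"
  | "report_to_user";

function chooseRecovery(code: string | undefined): RecoveryAction {
  switch (code) {
    case "INVALID_INPUT":
      return "fix_input"; // Rebuild the call using the suggested format
    case "NOT_FOUND":
      return "look_up_resource"; // Pick from the listed alternatives
    case "PERMISSION_DENIED":
      return "abandon_path"; // Retrying will not help
    case "RATE_LIMITED":
      return "wait_and_retry"; // Same call, later
    case "TIMEOUT":
      return "simplify_request"; // Smaller query, then retry
    default:
      return "report_to_user"; // Unknown or missing code
  }
}

console.log(chooseRecovery("RATE_LIMITED")); // wait_and_retry
console.log(chooseRecovery(undefined)); // report_to_user
```

The free-text `suggestion` still matters: the code selects the strategy, while the suggestion carries specifics such as the expected format or the list of valid tables.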

Never swallow errors silently

One of the worst things a skill can do is return an empty or misleading success response when something actually failed. If a search returns zero results because the index is down, that’s different from zero results because nothing matched. Make the distinction clear:

// Bad: agent thinks nothing matched
{ matches: [], totalMatches: 0 }

// Good: agent knows something went wrong
{
  success: false,
  error: "Search index unavailable",
  suggestion: "The search service is temporarily down. Try again in a few minutes."
}
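
In practice the distinction usually comes down to where the exception is caught. The following sketch wraps a search backend so that an outage surfaces as a failure while an empty match list remains a success. `SearchIndex`, the `"IndexUnavailableError"` name, and the `UNAVAILABLE` and `INTERNAL` codes are hypothetical stand-ins, not part of any real library.

```typescript
// Hypothetical search backend for illustration only.
interface SearchIndex {
  find(term: string): string[]; // Assumed to throw when the index is down
}

type SearchResponse =
  | { success: true; matches: string[]; totalMatches: number }
  | { success: false; error: string; code: string; suggestion: string };

function searchSkill(index: SearchIndex, term: string): SearchResponse {
  try {
    const matches = index.find(term);
    // An empty array here really does mean "nothing matched".
    return { success: true, matches, totalMatches: matches.length };
  } catch (err) {
    if (err instanceof Error && err.name === "IndexUnavailableError") {
      return {
        success: false,
        error: "Search index unavailable",
        code: "UNAVAILABLE",
        suggestion: "The search service is temporarily down. Try again in a few minutes.",
      };
    }
    const reason = err instanceof Error ? err.message : String(err);
    return {
      success: false,
      error: `Search failed: ${reason}`,
      code: "INTERNAL",
      suggestion: "Retry once; if it fails again, report the error to the user.",
    };
  }
}

const docs = ["alpha release notes", "beta feedback"];
const healthyIndex: SearchIndex = {
  find: (term) => docs.filter((doc) => doc.includes(term)),
};
const downIndex: SearchIndex = {
  find: () => {
    const err = new Error("index offline");
    err.name = "IndexUnavailableError";
    throw err;
  },
};

console.log(searchSkill(healthyIndex, "gamma")); // success with zero matches
console.log(searchSkill(downIndex, "gamma")); // failure with a suggestion
```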

Pillar 4: predictable output

The output contract is what the agent relies on to plan its next steps. If the output shape changes depending on inputs, or if important metadata is sometimes present and sometimes absent, the agent’s reasoning becomes unreliable.

Structured vs. freeform output

Always prefer structured output over freeform text. Structured data is easier for the agent to parse, reference, and reason about.

// Structured: agent can access specific fields
{
  users: [
    { id: 1, name: "Alice", email: "alice@example.com" },
    { id: 2, name: "Bob", email: "bob@example.com" }
  ],
  totalCount: 47,
  page: 1,
  hasMore: true
}

// Freeform: agent has to parse text to extract information
"Found 47 users. Showing page 1:\n1. Alice (alice@example.com)\n2. Bob (bob@example.com)\n..."

Include metadata

Every response should include enough metadata for the agent to decide what to do next without guessing:

Metadata field	Purpose	Example
`totalCount`	Are there more results beyond what was returned?	`47`
`truncated`	Was the response cut short?	`true`
`page` / `hasMore`	Is there pagination to follow?	`1` / `true`
`executionTime`	Was this slow? Should the agent optimize?	`"2.3s"`
`warnings`	Non-fatal issues the agent should know about	`["Column 'age' contains nulls"]`
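
As a sketch of how these fields might be filled in, the helper below pages through an in-memory array. The field names follow the table above; the function name `paginate`, the clamping behavior for out-of-range pages, and the assumption that `pageSize` is a positive integer are choices made for this example.

```typescript
// Field names follow the metadata table; everything else is an assumption.
interface Page<T> {
  items: T[];
  totalCount: number;
  page: number;
  hasMore: boolean;
  warnings: string[];
}

// Assumes pageSize is a positive integer.
function paginate<T>(all: T[], page: number, pageSize: number): Page<T> {
  const warnings: string[] = [];
  const lastPage = Math.max(1, Math.ceil(all.length / pageSize));
  let current = page;
  if (page < 1 || page > lastPage) {
    // Clamp instead of failing, and tell the agent that we did so.
    current = Math.min(Math.max(page, 1), lastPage);
    warnings.push(`Requested page ${page} is out of range; returned page ${current}.`);
  }
  const start = (current - 1) * pageSize;
  return {
    items: all.slice(start, start + pageSize),
    totalCount: all.length,
    page: current,
    hasMore: current < lastPage,
    warnings,
  };
}

const users = Array.from({ length: 47 }, (_, i) => ({ id: i + 1 }));
const firstPage = paginate(users, 1, 20);
console.log(firstPage.totalCount, firstPage.page, firstPage.hasMore); // 47 1 true
```

With `totalCount` and `hasMore` in every response, the agent can decide whether to fetch another page without issuing a separate count query.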

Consistent shape

The response shape should be identical regardless of whether results are found or not. Don’t return an array when there are results and null when there are none:

// Bad: shape changes based on results
results.length > 0 ? results : null

// Good: shape is always the same
{
  matches: results,  // empty array when no results
  totalMatches: results.length,
  truncated: false
}
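
A small builder function is one way to keep the shape stable across every code path, including the capped case. This is a minimal sketch: `MatchResponse`, `toMatchResponse`, and the `maxMatches` cap are hypothetical names, and the cap is only loosely modeled on the 500-row limit mentioned earlier.

```typescript
interface MatchResponse<T> {
  matches: T[]; // Always an array, never null
  totalMatches: number; // Count before any cap is applied
  truncated: boolean; // True when the cap removed results
}

// Hypothetical builder: every return path goes through here, so the
// response shape cannot drift between the empty, normal, and capped cases.
function toMatchResponse<T>(results: T[], maxMatches: number): MatchResponse<T> {
  return {
    matches: results.slice(0, maxMatches),
    totalMatches: results.length,
    truncated: results.length > maxMatches,
  };
}

console.log(toMatchResponse([], 500));
console.log(toMatchResponse(["a", "b", "c"], 2));
```

The empty case yields an empty `matches` array with `truncated: false`, and the capped case keeps `totalMatches` at the uncapped count so the agent can tell how much was left out.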

Putting it all together: a complete skill

Here’s a complete skill definition that puts all four pillars into practice. Study the annotations to understand why each piece is there.

const createNoteSkill = {
  name: "create_note",
  // PILLAR 1: Clear description with when-to-use,
  // when-not-to-use, and return shape
  description: `Create a new note in the user's notebook with a title
and body. Use this when the user asks to save, record, or remember
something. Do NOT use this for creating files in the codebase (use
write_file instead) or for creating tasks (use create_task instead).

Returns the created note's ID, title, and timestamp. If a note with
the same title already exists, returns an error with the existing
note's ID so you can update it instead.`,

  // PILLAR 2: Well-typed parameters with descriptions and validation
  parameters: {
    type: "object",
    properties: {
      title: {
        type: "string",
        description:
          "Note title. 1-200 characters. Must be unique " +
          "within the notebook.",
      },
      body: {
        type: "string",
        description:
          "Note content. Supports Markdown formatting. " +
          "Maximum 50,000 characters.",
      },
      tags: {
        type: "array",
        items: { type: "string" },
        description:
          "Optional tags for categorization. Example: " +
          "['meeting-notes', 'project-alpha']",
        default: [],
      },
    },
    required: ["title", "body"],
  },
};

async function handleCreateNote(params: {
  title: string;
  body: string;
  tags?: string[];
}) {
  // PILLAR 2 (continued): Input validation
  if (!params.title.trim()) {
    return {
      success: false,
      error: "Title cannot be empty",
      suggestion: "Provide a descriptive title for the note.",
    };
  }
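  if (params.title.length > 200) {
    return {
      success: false,
      error: `Title too long (${params.title.length} chars, max 200)`,
      suggestion: "Shorten the title and put details in the body.",
    };
  }
  // The parameter description promises a 50,000-character limit, so enforce it
  if (params.body.length > 50000) {
    return {
      success: false,
      error: `Body too long (${params.body.length} chars, max 50,000)`,
      suggestion: "Shorten the body or split the content across several notes.",
    };
  }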

  try {
    const note = await db.notes.create({
      title: params.title.trim(),
      body: params.body,
      tags: params.tags ?? [],
    });

    // PILLAR 4: Predictable, structured output with metadata
    return {
      success: true,
      note: {
        id: note.id,
        title: note.title,
        createdAt: note.createdAt.toISOString(),
      },
      message: `Note "${note.title}" created successfully.`,
    };
  } catch (err) {
    // PILLAR 3: Specific error handling with recovery guidance
    if (isDuplicateTitleError(err)) {
      // Look up the same trimmed title that was passed to create()
      const existing = await db.notes.findByTitle(params.title.trim());
      return {
        success: false,
        error: `A note with title "${params.title.trim()}" already exists`,
        existingNoteId: existing?.id,
        suggestion:
          "Use update_note with this ID to modify the " +
          "existing note, or choose a different title.",
      };
    }
    // err is `unknown` under strict TypeScript, so narrow before reading .message
    const reason = err instanceof Error ? err.message : String(err);
    return {
      success: false,
      error: `Failed to create note: ${reason}`,
      suggestion: "This may be a temporary issue. Try again.",
    };
  }
}
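
The handler above refers to a `db` object and an `isDuplicateTitleError` helper that this guide does not define. To try the duplicate-title path locally, in-memory stand-ins along the following lines may be enough. Everything here is hypothetical: the `Note` shape, the ID format, the `"DuplicateTitleError"` name, and the choice to have `findByTitle` return `undefined` when nothing matches.

```typescript
// Hypothetical in-memory stand-ins for the storage layer the handler assumes.
interface Note {
  id: string;
  title: string;
  body: string;
  tags: string[];
  createdAt: Date;
}

function duplicateTitleError(title: string): Error {
  const err = new Error(`duplicate title: ${title}`);
  err.name = "DuplicateTitleError";
  return err;
}

function isDuplicateTitleError(err: unknown): boolean {
  return err instanceof Error && err.name === "DuplicateTitleError";
}

const stored: Note[] = [];

const db = {
  notes: {
    // Rejects when a note with the same title already exists.
    async create(input: { title: string; body: string; tags: string[] }): Promise<Note> {
      if (stored.some((note) => note.title === input.title)) {
        throw duplicateTitleError(input.title);
      }
      const note: Note = {
        id: `note_${stored.length + 1}`,
        title: input.title,
        body: input.body,
        tags: input.tags,
        createdAt: new Date(),
      };
      stored.push(note);
      return note;
    },
    // Resolves to undefined when no note has that title.
    async findByTitle(title: string): Promise<Note | undefined> {
      return stored.find((note) => note.title === title);
    },
  },
};

// Creating the same title twice exercises the duplicate path.
async function demo(): Promise<string[]> {
  const log: string[] = [];
  const first = await db.notes.create({ title: "Standup", body: "Notes", tags: [] });
  log.push(`created ${first.title}`);
  try {
    await db.notes.create({ title: "Standup", body: "Again", tags: [] });
  } catch (err) {
    log.push(isDuplicateTitleError(err) ? "duplicate detected" : "unexpected error");
  }
  return log;
}

demo().then((log) => console.log(log));
```

Because `findByTitle` can resolve to `undefined` in this stand-in, a handler written against it should guard the lookup result before reading its `id`.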

Design checklist

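Before shipping a skill, run through this checklist:

The description says what the skill does, when to use it, when not to use it, and what it returns
Parameter names are descriptive, and each parameter has a description with an example
Optional parameters have sensible defaults, and constrained choices use enums
Inputs are validated at the entry point, before any real work happens
Every error response says what happened, why, and what to try next
Failures are never reported as empty or misleading successes
The response shape is the same whether or not results are found
Responses include metadata such as totals, truncation, and pagination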

If your skill passes every item, you have a well-designed skill that agents will be able to use reliably.

Next steps

Now that you understand the anatomy of a single skill, explore Skill Design Principles to learn how skills compose together, when to split a skill into multiple pieces, and how to manage side effects across a skill ecosystem.