# Tool-Call Benchmark Prompt

**Project:** opus46
**Route:** `/tool-call`
**Skill:** `oneshot-tool-call`

## Prompt Used

```
/oneshot-tool-call full
```

Run the full tool-calling benchmark test suite across all 5 categories:

1. **Bash Execution** (TC-B01–B04): Echo, Python arithmetic, Node JSON, pipeline commands
2. **File Operations** (TC-F01–F06): Write, read, edit, verify, glob, grep
3. **MCP Tool Calls** (TC-M01–M04): ToolSearch, Context7 resolve, Context7 query, keyword search
4. **Skill Invocations** (TC-S01–S03): current-datetime, brand-guidelines, chart-taste
5. **Generation** (TC-G01–G03): PDF creation, PDF verification, SVG-to-PNG rasterisation

Produce a deterministic benchmark report with binary pass/fail criteria, timing data, and misfire tracking.

## Harness

- **Model:** claude-4.6-opus-high-thinking
- **Harness:** Cursor IDE
- **Date:** 2026-04-05
