droid_rules/VISION.md

# VISION

Goal: a skill for Factory Droid that launches `droid exec` runs for the different stages of a coding task, choosing a suitable model for each stage.

## Available Models

| Model Name       | Alias           |
|------------------|-----------------|
| `opus_4.6`       | Opus 4.6        |
| `gpt_5.2`        | GPT 5.2         |
| `gpt_5.3_codex`  | GPT 5.3 Codex   |
| `kimi_k2.5`      | Kimi k2.5       |

## Model Selection Criteria

| Role                      | Recommended Model            | Reason                               |
|---------------------------|------------------------------|--------------------------------------|
| The workhorse             | `kimi_k2.5`                  | Fast and cost-effective              |
| The critic                | `opus_4.6`                   | Good at reviewing and finding issues |
| The brainy one            | `gpt_5.2`                    | Highest code intelligence            |
| The coder                 | `gpt_5.3_codex`              | Specialized for code generation      |
| The fast one              | `kimi_k2.5`                  | Fastest response time                |
| The instruction follower  | `kimi_k2.5`, `gpt_5.3_codex` | Strong adherence to requirements     |
| The vision model          | `kimi_k2.5`                  | Fast vision processing               |

## Coding Task Breakdown

1. Code exploration
2. Planning/spec generation
3. Code generation
4. Formatting, linting, type checking, and other quality checks
5. Review and bug finding
6. Build, test, or run the code

## Model Rejection Criteria

### `gpt_5.2` and `gpt_5.3_codex`

- Too slow and expensive for the workhorse role
- Not recommended for exploration or tool calls
- Use strictly for planning/spec generation and code generation

### `opus_4.6`

- Produces buggy code and looks for shortcuts during code generation
- Can be a good critic and reviewer
- Never use for code generation

### `kimi_k2.5`

- OK in all areas and fast
- Never use as the primary model for large code generation
- Can be used for a second opinion
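The rejection criteria above can be written down as a small lookup, so that an orchestrating skill could refuse a bad model/stage pairing before launching anything. This is a minimal sketch, not an existing implementation: the stage identifiers, `ALLOWED_STAGES`, and `is_allowed` are hypothetical names, and allowing `opus_4.6` on every stage except code generation is an assumption (the notes only say it reviews well and must never generate code).

```python
# Stage names mirror the "Coding Task Breakdown" list; the identifiers are illustrative.
STAGES = (
    "exploration",
    "planning",
    "code_generation",
    "quality_checks",
    "review",
    "build_test_run",
)

# Stages each model may be assigned to, read off the rejection criteria.
# Assumption: opus_4.6 is allowed everywhere except code generation.
ALLOWED_STAGES = {
    "gpt_5.2": {"planning", "code_generation"},
    "gpt_5.3_codex": {"planning", "code_generation"},
    "opus_4.6": set(STAGES) - {"code_generation"},
    "kimi_k2.5": set(STAGES),
}


def is_allowed(model: str, stage: str, large_change: bool = False) -> bool:
    """Return True if `model` may be the primary model for `stage`."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    if model not in ALLOWED_STAGES:
        raise ValueError(f"unknown model: {model}")
    # kimi_k2.5 is acceptable everywhere except as primary for large code generation.
    if model == "kimi_k2.5" and stage == "code_generation" and large_change:
        return False
    return stage in ALLOWED_STAGES[model]
```

For example, `is_allowed("opus_4.6", "code_generation")` is `False`, while `is_allowed("kimi_k2.5", "code_generation")` is `True` as long as `large_change` is left at its default.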

## Model Performance Comparison

### Cost (High to Low)

| Rank | Model            |
|------|------------------|
| 1    | `opus_4.6`       |
| 2    | `gpt_5.2`        |
| 3    | `gpt_5.3_codex`  |
| 4    | `kimi_k2.5`      |

### Speed (Fast to Slow)

| Rank | Model            |
|------|------------------|
| 1    | `kimi_k2.5`      |
| 2    | `opus_4.6`       |
| 3    | `gpt_5.3_codex`  |
| 4    | `gpt_5.2`        |

### Code Intelligence (High to Low)

| Rank | Model            |
|------|------------------|
| 1    | `gpt_5.2`        |
| 2    | `gpt_5.3_codex`  |
| 3    | `opus_4.6`       |
| 4    | `kimi_k2.5`      |

### Overthinking (High to Low)

| Rank | Model            |
|------|------------------|
| 1    | `gpt_5.2`        |
| 2    | `gpt_5.3_codex`  |
| 3    | `opus_4.6`       |
| 4    | `kimi_k2.5`      |
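The four rankings above can also be kept as ordered lists, which makes questions like "which of these candidates is cheapest?" a one-liner. A rough sketch under stated assumptions: `RANKINGS`, `rank`, `cheapest`, and `fastest` are hypothetical names introduced for illustration, and the orderings are copied from the tables as written.

```python
# Each list is ordered exactly as in the corresponding table (rank 1 first).
RANKINGS = {
    "cost": ["opus_4.6", "gpt_5.2", "gpt_5.3_codex", "kimi_k2.5"],  # high to low
    "speed": ["kimi_k2.5", "opus_4.6", "gpt_5.3_codex", "gpt_5.2"],  # fast to slow
    "code_intelligence": ["gpt_5.2", "gpt_5.3_codex", "opus_4.6", "kimi_k2.5"],  # high to low
    "overthinking": ["gpt_5.2", "gpt_5.3_codex", "opus_4.6", "kimi_k2.5"],  # high to low
}


def rank(model: str, criterion: str) -> int:
    """Return the 1-based rank of `model` in the table for `criterion`."""
    return RANKINGS[criterion].index(model) + 1


def cheapest(candidates):
    """Cost is listed high to low, so the cheapest model has the largest rank."""
    return max(candidates, key=lambda m: rank(m, "cost"))


def fastest(candidates):
    """Speed is listed fast to slow, so the fastest model has the smallest rank."""
    return min(candidates, key=lambda m: rank(m, "speed"))
```

Keeping the tables as data rather than hard-coded conditionals means a re-ranking only touches `RANKINGS`.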

## Flow

1. Start the session with a good instruction follower (`kimi_k2.5` or `gpt_5.3_codex`).
2. The user asks a question or gives a task.
3. Make a todo list.
4. Exploration is always needed: launch multiple explorer droids with `kimi_k2.5`, each asked a question in natural language.
5. After exploration, evaluate the gathered context with a spec droid on `gpt_5.2`.
6. Confirm the spec with the user.
7. For code generation, use `gpt_5.3_codex` for large changes or `kimi_k2.5` for small ones.
8. After code generation, run a quality-check droid with `kimi_k2.5`.
9. Run a review droid with `opus_4.6` to find bugs and issues.
10. Run a build/test/run droid with `kimi_k2.5`.
11. Provide a summary.
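To make the flow above concrete, it can be sketched as an ordered plan of steps, each paired with the model the flow names for it. This is only an illustration: `Step` and `build_plan` are hypothetical names, the large/small decision is reduced to a boolean, and the actual `droid exec` invocation is deliberately left out because these notes do not specify its arguments.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    name: str
    model: Optional[str]  # None where the flow names no model for the step


def build_plan(large_change: bool) -> List[Step]:
    """Return the flow's steps in order, with the model assigned to each."""
    # Large code generation goes to gpt_5.3_codex, small changes to kimi_k2.5.
    coder = "gpt_5.3_codex" if large_change else "kimi_k2.5"
    return [
        Step("todo_list", None),
        Step("exploration", "kimi_k2.5"),
        Step("spec", "gpt_5.2"),
        Step("confirm_spec_with_user", None),
        Step("code_generation", coder),
        Step("quality_checks", "kimi_k2.5"),
        Step("review", "opus_4.6"),
        Step("build_test_run", "kimi_k2.5"),
        Step("summary", None),
    ]


if __name__ == "__main__":
    for step in build_plan(large_change=True):
        print(f"{step.name}: {step.model or 'no model named'}")
```

Steps with `model=None` are the ones the flow leaves to the session that started the task, such as building the todo list and confirming the spec with the user.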