Vision of the Project
We need a skill for Factory Droid that can launch `droid exec` for multiple kinds of tasks.
Available Models
| Model Name | Alias |
| --- | --- |
| opus_4.6 | Opus 4.6 |
| gpt_5.2 | GPT 5.2 |
| gpt_5.3_codex | GPT 5.3 Codex |
| kimi_k2.5 | Kimi k2.5 |
Model Selection Criteria
| Role | Recommended Model | Reason |
| --- | --- | --- |
| The workhorse | kimi_k2.5 | Fast and cost-effective |
| The critic | opus_4.6 | Good at reviewing and finding issues |
| The brainy one | gpt_5.2 | Highest code intelligence |
| The coder | gpt_5.3_codex | Specialized for code generation |
| The fast one | kimi_k2.5 | Fastest response time |
| Good instruction following | kimi_k2.5, gpt_5.3_codex | Strong adherence to requirements |
| The vision model | kimi_k2.5 | Fast vision processing |
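As a minimal sketch, the role table above could be encoded as a lookup that the driver consults before launching a droid. The short role keys (`workhorse`, `critic`, and so on) and the `model_for_role` helper are hypothetical names introduced only for illustration.

```python
# Role-to-model mapping copied from the selection table.
# The short role keys are hypothetical; only the model names come from the table.
ROLE_MODELS = {
    "workhorse": ["kimi_k2.5"],
    "critic": ["opus_4.6"],
    "brainy": ["gpt_5.2"],
    "coder": ["gpt_5.3_codex"],
    "fast": ["kimi_k2.5"],
    "instruction_following": ["kimi_k2.5", "gpt_5.3_codex"],
    "vision": ["kimi_k2.5"],
}


def model_for_role(role: str) -> str:
    """Return the first recommended model for a role."""
    try:
        return ROLE_MODELS[role][0]
    except KeyError:
        raise ValueError(f"unknown role: {role!r}") from None
```

Keeping a list per role leaves room for the rows that name more than one model; the helper simply takes the first entry.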
Coding Task Breakdown
- Code exploration
- Planning/spec generation
- Code generation
- Formatting, linting, typecheck and other quality checks
- Review and find bugs
- Build or test or run the code
Model Rejection Criteria
gpt_5.2 and gpt_5.3_codex
- Too slow and expensive for the workhorse role
- Not recommended for exploration or tool calls
- Use strictly for planning/spec gen and code gen
opus_4.6
- Produces buggy code and looks for shortcuts in code gen
- Can be a good critic and reviewer
- Never use for code gen
kimi_k2.5
- Adequate in all areas and fast
- Never use as the primary model for large code gen
- Can be used for a second opinion
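The rejection rules above could be checked in code before a droid is launched. The sketch below is one possible encoding, not a definitive one: the task names (`exploration`, `planning`, `code_gen`, `review`) and the `is_allowed` helper are assumptions made for illustration.

```python
# Models restricted to a fixed set of tasks (gpt_5.2 and gpt_5.3_codex:
# "strictly for planning/spec gen and code gen").
ALLOWED_ONLY = {
    "gpt_5.2": {"planning", "code_gen"},
    "gpt_5.3_codex": {"planning", "code_gen"},
}

# Tasks a model must never be given (opus_4.6: "never use for code gen").
FORBIDDEN = {
    "opus_4.6": {"code_gen"},
}


def is_allowed(model: str, task: str, large: bool = False) -> bool:
    """Return True if the rejection criteria permit `model` for `task`."""
    if model in ALLOWED_ONLY and task not in ALLOWED_ONLY[model]:
        return False
    if task in FORBIDDEN.get(model, set()):
        return False
    # kimi_k2.5 must not be the primary model for large code gen.
    if model == "kimi_k2.5" and task == "code_gen" and large:
        return False
    return True
```

For example, `is_allowed("opus_4.6", "review")` passes, while `is_allowed("opus_4.6", "code_gen")` is rejected.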
Model Performance Comparison
Cost (High to Low)
| Rank | Model |
| --- | --- |
| 1 | opus_4.6 |
| 2 | gpt_5.2 |
| 3 | gpt_5.3_codex |
| 4 | kimi_k2.5 |
Speed (Fast to Slow)
| Rank | Model |
| --- | --- |
| 1 | kimi_k2.5 |
| 2 | opus_4.6 |
| 3 | gpt_5.3_codex |
| 4 | gpt_5.2 |
Code Intelligence (High to Low)
| Rank | Model |
| --- | --- |
| 1 | gpt_5.2 |
| 2 | gpt_5.3_codex |
| 3 | opus_4.6 |
| 4 | kimi_k2.5 |
Overthinking (High to Low)
| Rank | Model |
| --- | --- |
| 1 | gpt_5.2 |
| 2 | opus_4.6 |
| 3 | gpt_5.3_codex |
| 4 | kimi_k2.5 |
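The four rankings above can be kept as ordered lists so that a skill can compare candidates programmatically. This is a sketch only; the metric keys and the `rank` and `cheapest` helpers are hypothetical names.

```python
# Each list is ordered exactly as in the ranking tables (rank 1 first).
RANKINGS = {
    "cost": ["opus_4.6", "gpt_5.2", "gpt_5.3_codex", "kimi_k2.5"],  # high to low
    "speed": ["kimi_k2.5", "opus_4.6", "gpt_5.3_codex", "gpt_5.2"],  # fast to slow
    "code_intelligence": ["gpt_5.2", "gpt_5.3_codex", "opus_4.6", "kimi_k2.5"],  # high to low
    "overthinking": ["gpt_5.2", "opus_4.6", "gpt_5.3_codex", "kimi_k2.5"],  # high to low
}


def rank(model: str, metric: str) -> int:
    """Return the 1-based rank of `model` for `metric`."""
    return RANKINGS[metric].index(model) + 1


def cheapest(candidates: list[str]) -> str:
    """Pick the lowest-cost candidate; cost is ranked high to low,
    so the cheapest model has the largest cost rank."""
    return max(candidates, key=lambda m: rank(m, "cost"))
```

Because the tables give only an ordering, the sketch stores ranks and not absolute prices or latencies.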
Flow
-> Start with kimi_k2.5 as the driver and entry point.
-> The user asks a question or gives a task.
-> Make a todo list.
-> Exploration is always needed. Launch multiple explorer droids with kimi_k2.5, each asking a question in natural language.
-> After exploration, evaluate the gathered context with a spec droid using gpt_5.2.
-> Confirm the spec with the user.
-> For code gen, use gpt_5.3_codex for large changes or kimi_k2.5 for small ones.
-> After code gen, run a quality check droid with kimi_k2.5.
-> Run a review droid with opus_4.6 to find bugs and issues.
-> Run a build/test/run droid with kimi_k2.5.
-> Provide a summary.
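The flow above can be sketched as an ordered plan of stages, each tied to a model. The `Stage` class, the stage names, and `build_plan` are hypothetical. Assigning the todo list and summary stages to the driver (kimi_k2.5) is an assumption, since the flow does not name a model for them. The sketch only builds the plan; how each stage is actually launched through `droid exec` is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    model: str
    needs_user_confirmation: bool = False


def build_plan(large_code_gen: bool) -> list[Stage]:
    """Return the stages of the flow in order, with the model for each."""
    # Large code gen goes to gpt_5.3_codex, small code gen to kimi_k2.5.
    coder = "gpt_5.3_codex" if large_code_gen else "kimi_k2.5"
    return [
        Stage("todo", "kimi_k2.5"),  # assumption: handled by the driver
        Stage("explore", "kimi_k2.5"),
        Stage("spec", "gpt_5.2", needs_user_confirmation=True),
        Stage("code_gen", coder),
        Stage("quality_check", "kimi_k2.5"),
        Stage("review", "opus_4.6"),
        Stage("build_test_run", "kimi_k2.5"),
        Stage("summary", "kimi_k2.5"),  # assumption: handled by the driver
    ]
```

A driver could iterate over `build_plan(large_code_gen=True)` and pause at any stage where `needs_user_confirmation` is set.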
Important Notes
- Assume that every droid exec, with any model, will try to explore the code base. For opus_4.6 and gpt_5.2, provide as much context as possible so that they do not need to explore again. gpt_5.3_codex and kimi_k2.5 are good at exploring, so they can be let loose.
- Do not create unnecessary new markdown files. Include this instruction in every droid exec prompt. Only the driver (kimi_k2.5) should create markdown files.
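One way to apply both notes is to build every droid exec prompt through a single helper. The sketch below is illustrative only: `build_prompt`, `CONTEXT_REQUIRED`, and the exact wording of the context header are assumptions, and it produces only the prompt text, not a command line.

```python
# Instruction that must accompany every droid exec prompt.
NO_MARKDOWN_RULE = "Do not create unnecessary new markdown files."

# Models that should receive full context up front instead of re-exploring.
CONTEXT_REQUIRED = {"opus_4.6", "gpt_5.2"}


def build_prompt(model: str, task: str, context: str = "") -> str:
    """Assemble a prompt: task, optional context, then the markdown rule."""
    parts = [task]
    if model in CONTEXT_REQUIRED:
        if not context:
            raise ValueError(f"{model} needs exploration context in its prompt")
        parts.append(
            "Context (already explored, do not explore the code base again):\n"
            + context
        )
    elif context:
        parts.append("Context:\n" + context)
    parts.append(NO_MARKDOWN_RULE)
    return "\n\n".join(parts)
```

Raising an error when opus_4.6 or gpt_5.2 is given no context is a design choice in this sketch: it makes a missing exploration step fail loudly instead of silently triggering a second exploration.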