new models

2026-02-06 09:36:51 +05:30
parent 10027abf0b
commit ac213793b0
3 changed files with 261 additions and 15 deletions

@@ -6,9 +6,9 @@ Need a skill for factory droid which can launch `droid exec` for multiple things
| Model Name | Alias |
|------------------|-----------------|
-| `opus_4.5` | Opus 4.5 |
+| `opus_4.6` | Opus 4.6 |
| `gpt_5.2` | GPT 5.2 |
-| `gpt_5.2_codex` | GPT 5.2 Codex |
+| `gpt_5.3_codex` | GPT 5.3 Codex |
| `kimi_k2.5` | Kimi k2.5 |
## Model Selection Criteria
@@ -16,11 +16,12 @@ Need a skill for factory droid which can launch `droid exec` for multiple things
| Role | Recommended Model | Reason |
|-----------------------------|---------------------------|-------------------------------------|
| The workhorse | `kimi_k2.5` | Fast and cost-effective |
-| The critic | `opus_4.5` | Good at reviewing and finding issues|
+| The critic | `opus_4.6` | Good at reviewing and finding issues|
| The brainy one | `gpt_5.2` | Highest code intelligence |
-| The coder | `gpt_5.2_codex` | Specialized for code generation |
+| The coder | `gpt_5.3_codex` | Specialized for code generation |
| The fast one | `kimi_k2.5` | Fastest response time |
-| Good Instructions Following | `kimi_k2.5`, `gpt_5.2_codex` | Strong adherence to requirements |
+| Good Instructions Following | `kimi_k2.5`, `gpt_5.3_codex` | Strong adherence to requirements |
| The vision model | `kimi_k2.5` | Fast vision processing |
## Coding Task Breakdown
@@ -33,13 +34,13 @@ Need a skill for factory droid which can launch `droid exec` for multiple things
## Model Rejection Criteria
-### `gpt_5.2` and `gpt_5.2_codex`
+### `gpt_5.2` and `gpt_5.3_codex`
- Too slow and expensive for the workhorse role
- Not at all suggested for exploration or tool calls
- Strictly for planning/spec gen and code gen
-### `opus_4.5`
+### `opus_4.6`
- Very buggy and looks for shortcuts in code gen
- Can be a good critic and reviewer
@@ -57,9 +58,9 @@ Need a skill for factory droid which can launch `droid exec` for multiple things
| Rank | Model |
|------|------------------|
-| 1 | `opus_4.5` |
+| 1 | `opus_4.6` |
| 2 | `gpt_5.2` |
-| 3 | `gpt_5.2_codex` |
+| 3 | `gpt_5.3_codex` |
| 4 | `kimi_k2.5` |
### Speed (Fast to Slow)
@@ -67,8 +68,8 @@ Need a skill for factory droid which can launch `droid exec` for multiple things
| Rank | Model |
|------|------------------|
| 1 | `kimi_k2.5` |
-| 2 | `opus_4.5` |
-| 3 | `gpt_5.2_codex` |
+| 2 | `opus_4.6` |
+| 3 | `gpt_5.3_codex` |
| 4 | `gpt_5.2` |
### Code Intelligence (High to Low)
@@ -76,8 +77,8 @@ Need a skill for factory droid which can launch `droid exec` for multiple things
| Rank | Model |
|------|------------------|
| 1 | `gpt_5.2` |
-| 2 | `gpt_5.2_codex` |
-| 3 | `opus_4.5` |
+| 2 | `gpt_5.3_codex` |
+| 3 | `opus_4.6` |
| 4 | `kimi_k2.5` |
### Overthinking (High to Low)
@@ -85,6 +86,20 @@ Need a skill for factory droid which can launch `droid exec` for multiple things
| Rank | Model |
|------|------------------|
| 1 | `gpt_5.2` |
-| 2 | `gpt_5.2_codex` |
-| 3 | `opus_4.5` |
+| 2 | `gpt_5.3_codex` |
+| 3 | `opus_4.6` |
| 4 | `kimi_k2.5` |
+## Flow
+-> Start with a good instruction follower (kimi_k2.5 or gpt_5.3_codex).
+User asks a question or gives a task.
+-> Make a todo list.
+-> Exploration is always needed. Launch multiple explorer droids with kimi_k2.5, asking questions in natural language.
+-> After exploration, evaluate the context with a spec droid using gpt_5.2.
+-> Confirm the spec with the user.
+-> For code gen, use gpt_5.3_codex for large code gen, or kimi_k2.5 for small code gen.
+-> After code gen, run a quality check droid with kimi_k2.5.
+-> Run a review droid with opus_4.6 to find bugs and issues.
+-> Run a build/test/run droid with kimi_k2.5.
+-> Provide a summary.
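
The stage-to-model routing in the Flow above can be sketched as a small lookup. This is only an illustration: the `Stage` type, the stage names, and the `large_code_gen` switch are hypothetical, and only the model identifiers and their order come from the Flow. The sketch does not invoke `droid exec`, since its arguments are not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step of the flow and the model assigned to it (hypothetical type)."""
    name: str
    model: str


def plan_flow(large_code_gen: bool) -> list[Stage]:
    """Return the droid stages in order, following the Flow section.

    The large/small distinction for code gen is an assumed boolean flag;
    the source only says "large" vs "small" code gen.
    """
    coder = "gpt_5.3_codex" if large_code_gen else "kimi_k2.5"
    return [
        Stage("explore", "kimi_k2.5"),         # multiple explorer droids
        Stage("spec", "gpt_5.2"),              # evaluate context, write spec
        Stage("code_gen", coder),              # large vs small code gen
        Stage("quality_check", "kimi_k2.5"),   # post-codegen quality check
        Stage("review", "opus_4.6"),           # find bugs and issues
        Stage("build_test_run", "kimi_k2.5"),  # build, test, run
    ]


if __name__ == "__main__":
    for stage in plan_flow(large_code_gen=True):
        print(f"{stage.name}: {stage.model}")
```

Keeping the routing in one function means a future model swap, like the one in this commit, would touch a single place rather than every stage.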