new models

2026-02-06 09:36:51 +05:30
parent 10027abf0b
commit ac213793b0
3 changed files with 261 additions and 15 deletions

138
DROIDS.md Normal file

@@ -0,0 +1,138 @@
# Factory Droids
A system for orchestrating AI droids to handle complex coding tasks through specialized roles.
## Overview
Factory Droids uses `droid exec` to run AI agents non-interactively, each specializing in different aspects of software development.
## Available Commands
```bash
droid exec --help # Show exec command options (includes model list)
droid --help # Show all droid commands
droid exec --list-tools # List available tools for a model
```
> **Tip:** Run `droid exec --help` to see all available models including BYOK custom models.
## Quick Start
```bash
# Read-only analysis (default)
droid exec "analyze the codebase structure"
# With file input
droid exec -f prompt.txt
# With specific model
droid exec --model custom:kimi-k2.5 "explore the project"
# Low autonomy - safe file operations
droid exec --auto low "add JSDoc comments"
# Medium autonomy - development tasks
droid exec --auto medium "install deps and run tests"
# High autonomy - production operations
droid exec --auto high "fix, test, commit and push"
```
## Available Models (BYOK)
| Model ID | Name | Reasoning |
|-----------------------------------|----------------------|-----------|
| `custom:kimi-k2.5` | Kimi K2.5 | Yes |
| `custom:claude-opus-4.6` | Claude Opus 4.6 | Yes |
| `custom:gpt-5.3-codex` | GPT 5.3 Codex | Yes |
| `custom:gpt-5.2` | GPT 5.2 | Yes |
## Droid Roles
| Droid | Model | Purpose | Auto Level |
|------------|-------------------------------|---------------------------------------|------------|
| Explorer | `custom:kimi-k2.5` | Code exploration and research | high |
| Spec | `custom:gpt-5.2` | Planning and specification generation | high |
| Coder | `custom:gpt-5.3-codex` | Large code generation | high |
| Coder-lite | `custom:kimi-k2.5` | Small code generation and fixes | high |
| Quality | `custom:kimi-k2.5` | Formatting, linting, type checking | high |
| Reviewer | `custom:claude-opus-4.6` | Code review and bug finding | high |
| Runner | `custom:kimi-k2.5` | Build, test, and execution | high |
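The role table can also be read as a lookup from role to `droid exec` invocation. The following is a minimal Python sketch of that idea, not something the droid CLI provides: `ROLE_MODELS` and `build_command` are hypothetical names, the reviewer ID follows the Available Models table, and only the `--model` and `--auto` flags documented in this file are used.

```python
# Hypothetical helper: maps the droid roles from the table to an argv list.
# Model IDs are copied from this document; nothing here is part of the droid CLI.
ROLE_MODELS = {
    "explorer": "custom:kimi-k2.5",
    "spec": "custom:gpt-5.2",
    "coder": "custom:gpt-5.3-codex",
    "coder-lite": "custom:kimi-k2.5",
    "quality": "custom:kimi-k2.5",
    "reviewer": "custom:claude-opus-4.6",
    "runner": "custom:kimi-k2.5",
}

AUTO_LEVELS = {"low", "medium", "high"}


def build_command(role: str, prompt: str, auto: str = "high") -> list[str]:
    """Return the argv for `droid exec` for a role. The command is not executed."""
    if role not in ROLE_MODELS:
        raise ValueError(f"unknown role: {role!r}")
    if auto not in AUTO_LEVELS:
        raise ValueError(f"unknown autonomy level: {auto!r}")
    return ["droid", "exec", "--model", ROLE_MODELS[role], "--auto", auto, prompt]
```

Returning an argv list instead of a shell string avoids quoting problems when the prompt contains spaces or quotes; the list can be passed directly to `subprocess.run`.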
## Workflow
1. **Start** with a good instruction follower (`custom:kimi-k2.5` or `custom:gpt-5.3-codex`)
2. **Make** a todo list
3. **Explore** - Launch multiple explorer droids with `custom:kimi-k2.5` in parallel
4. **Spec** - Evaluate context with spec droid using `custom:gpt-5.2`
5. **Confirm** spec with user
6. **Code** - Use `custom:gpt-5.3-codex` for large code gen, `custom:kimi-k2.5` for small
7. **Quality** - Run quality check droid with `custom:kimi-k2.5 --auto high`
8. **Review** - Run review droid with `custom:claude-opus-4.6 --auto high`
9. **Run** - Run build/test droid with `custom:kimi-k2.5 --auto high`
10. **Summarize** - Provide final summary
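Step 3 of the workflow above (parallel explorers) could be sketched as follows. This is an illustrative Python sketch under assumptions: `run_droid`, `explore_in_parallel`, and `max_workers` are hypothetical names, and `droid` is assumed to be on `PATH`. The runner is injectable so the fan-out logic can be tried without invoking the CLI.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence

EXPLORER_MODEL = "custom:kimi-k2.5"  # explorer model named in the workflow


def run_droid(argv: Sequence[str]) -> str:
    """Run one droid command and return its stdout (assumes `droid` is on PATH)."""
    result = subprocess.run(list(argv), capture_output=True, text=True, check=True)
    return result.stdout


def explore_in_parallel(
    questions: Sequence[str],
    runner: Callable[[Sequence[str]], str] = run_droid,
    max_workers: int = 4,
) -> list[str]:
    """Launch one explorer droid per question; results keep the input order."""
    commands = [
        ["droid", "exec", "--model", EXPLORER_MODEL, "--auto", "high", question]
        for question in questions
    ]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of `commands` in its results.
        return list(pool.map(runner, commands))
```

Threads are sufficient here because each worker only waits on a child process. The collected answers would then be handed to the spec droid in step 4.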
## Autonomy Levels
| Level | Flag | Description |
|---------|----------------|-------------------------------------------------------|
| Default | (none) | Read-only - safest for reviewing planned changes |
| Low | `--auto low` | Basic file operations, no system changes |
| Medium | `--auto medium`| Development ops - install packages, build, local git operations |
| High | `--auto high` | Production ops - git push, deploy, migrations |
| Unsafe | `--skip-permissions-unsafe` | Bypass all checks - DANGEROUS! |
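A wrapper script might translate these levels into flags and refuse the unsafe mode unless it is requested explicitly. The sketch below is one possible guard in Python; `AUTONOMY_FLAGS`, `autonomy_args`, and the `allow_unsafe` parameter are hypothetical and not part of the droid CLI.

```python
# Flags copied from the autonomy table; the guard logic is an assumption.
AUTONOMY_FLAGS = {
    "default": [],  # read-only, no flag
    "low": ["--auto", "low"],
    "medium": ["--auto", "medium"],
    "high": ["--auto", "high"],
    "unsafe": ["--skip-permissions-unsafe"],
}


def autonomy_args(level: str, allow_unsafe: bool = False) -> list[str]:
    """Return the CLI flags for an autonomy level, guarding the unsafe mode."""
    if level not in AUTONOMY_FLAGS:
        raise ValueError(f"unknown autonomy level: {level!r}")
    if level == "unsafe" and not allow_unsafe:
        raise PermissionError("unsafe mode must be enabled with allow_unsafe=True")
    return list(AUTONOMY_FLAGS[level])  # copy so callers cannot mutate the table
```

For example, `["droid", "exec", *autonomy_args("low"), "add JSDoc comments"]` yields the same command as the Quick Start entry for low autonomy.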
## Command Options
```
Usage: droid exec [options] [prompt]
Arguments:
prompt The prompt to execute
Options:
-o, --output-format <format> Output format (default: "text")
--input-format <format> Input format: stream-json for multi-turn
-f, --file <path> Read prompt from file
--auto <level> Autonomy level: low|medium|high
--skip-permissions-unsafe Skip ALL permission checks (unsafe)
-s, --session-id <id> Existing session to continue
-m, --model <id> Model ID (default: claude-opus-4-5-20251101)
-r, --reasoning-effort <level> Reasoning effort (model-specific)
--enabled-tools <ids> Enable specific tools
--disabled-tools <ids> Disable specific tools
--cwd <path> Working directory path
--log-group-id <id> Log group ID for filtering logs
--list-tools List available tools and exit
-h, --help Display help
```
## Authentication
Create API key: https://app.factory.ai/settings/api-keys
```bash
export FACTORY_API_KEY=fk-... && droid exec "fix the bug"
```
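When `droid exec` is launched from a script, it can help to fail early if the key is missing. A minimal Python sketch, assuming only that the key is read from the `FACTORY_API_KEY` environment variable as shown above; `require_factory_api_key` is a hypothetical helper name.

```python
import os
from typing import Mapping, Optional


def require_factory_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Factory API key from the environment or raise a clear error."""
    source = os.environ if env is None else env
    key = source.get("FACTORY_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "FACTORY_API_KEY is not set; create a key at "
            "https://app.factory.ai/settings/api-keys"
        )
    return key
```

The optional `env` argument exists only so the check can be exercised with a plain dictionary; in normal use the function reads `os.environ`.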
## Examples
```bash
# Analysis (read-only)
droid exec "Review the codebase for security vulnerabilities"
# Documentation
droid exec --auto low "add JSDoc comments to all functions"
droid exec --auto low "fix typos in README.md"
# Development
droid exec --auto medium "install deps, run tests, fix issues"
droid exec --auto medium "update packages and resolve conflicts"
# Production
droid exec --auto high "fix bug, test, commit, and push to main"
droid exec --auto high "deploy to staging after running tests"
# Continue session
droid exec -s <session-id> "continue previous task"
```


@@ -6,9 +6,9 @@ Need a skill for factory droid which can launch `droid exec` for multiple things
| Model Name | Alias |
|------------------|-----------------|
-| `opus_4.5` | Opus 4.5 |
+| `opus_4.6` | Opus 4.6 |
| `gpt_5.2` | GPT 5.2 |
-| `gpt_5.2_codex` | GPT 5.2 Codex |
+| `gpt_5.3_codex` | GPT 5.3 Codex |
| `kimi_k2.5` | Kimi k2.5 |
## Model Selection Criteria
@@ -16,11 +16,12 @@ Need a skill for factory droid which can launch `droid exec` for multiple things
| Role | Recommended Model | Reason |
|-----------------------------|---------------------------|-------------------------------------|
| The workhorse | `kimi_k2.5` | Fast and cost-effective |
-| The critic | `opus_4.5` | Good at reviewing and finding issues|
+| The critic | `opus_4.6` | Good at reviewing and finding issues|
| The brainy one | `gpt_5.2` | Highest code intelligence |
-| The coder | `gpt_5.2_codex` | Specialized for code generation |
+| The coder | `gpt_5.3_codex` | Specialized for code generation |
| The fast one | `kimi_k2.5` | Fastest response time |
-| Good Instructions Following | `kimi_k2.5`, `gpt_5.2_codex` | Strong adherence to requirements |
+| Good Instructions Following | `kimi_k2.5`, `gpt_5.3_codex` | Strong adherence to requirements |
| The vision model | `kimi_k2.5` | Fast vision processing |
## Coding Task Breakdown
@@ -33,13 +34,13 @@ Need a skill for factory droid which can launch `droid exec` for multiple things
## Model Rejection Criteria
-### `gpt_5.2` and `gpt_5.2_codex`
+### `gpt_5.2` and `gpt_5.3_codex`
- Too slow and expensive for the workhorse role
- Not at all suggested for exploration or tool calls
- Strictly for planning/spec gen and code gen
-### `opus_4.5`
+### `opus_4.6`
- Very buggy and looks for shortcuts in code gen
- Can be a good critic and reviewer
@@ -57,9 +58,9 @@ Need a skill for factory droid which can launch `droid exec` for multiple things
| Rank | Model |
|------|------------------|
-| 1 | `opus_4.5` |
+| 1 | `opus_4.6` |
| 2 | `gpt_5.2` |
-| 3 | `gpt_5.2_codex` |
+| 3 | `gpt_5.3_codex` |
| 4 | `kimi_k2.5` |
### Speed (Fast to Slow)
@@ -67,8 +68,8 @@ Need a skill for factory droid which can launch `droid exec` for multiple things
| Rank | Model |
|------|------------------|
| 1 | `kimi_k2.5` |
-| 2 | `opus_4.5` |
-| 3 | `gpt_5.2_codex` |
+| 2 | `opus_4.6` |
+| 3 | `gpt_5.3_codex` |
| 4 | `gpt_5.2` |
### Code Intelligence (High to Low)
@@ -76,8 +77,8 @@ Need a skill for factory droid which can launch `droid exec` for multiple things
| Rank | Model |
|------|------------------|
| 1 | `gpt_5.2` |
-| 2 | `gpt_5.2_codex` |
-| 3 | `opus_4.5` |
+| 2 | `gpt_5.3_codex` |
+| 3 | `opus_4.6` |
| 4 | `kimi_k2.5` |
### Overthinking (High to Low)
@@ -85,6 +86,20 @@ Need a skill for factory droid which can launch `droid exec` for multiple things
| Rank | Model |
|------|------------------|
| 1 | `gpt_5.2` |
-| 2 | `gpt_5.2_codex` |
-| 3 | `opus_4.5` |
+| 2 | `gpt_5.3_codex` |
+| 3 | `opus_4.6` |
| 4 | `kimi_k2.5` |
+## Flow
+-> Start with a good instruction follower (kimi_k2.5 or gpt_5.3_codex).
+User asks a question or gives a task.
+-> Make a todo list.
+-> Exploration is always needed. Launch multiple explorer droids with kimi_k2.5, asking questions in natural language.
+-> After exploration, evaluate the context with a spec droid using gpt_5.2.
+-> Confirm the spec with the user.
+-> For code gen, use gpt_5.3_codex for large changes or kimi_k2.5 for small ones.
+-> After code gen, run a quality check droid with kimi_k2.5.
+-> Run a review droid with opus_4.6 to find bugs and issues.
+-> Run a build/test/run droid with kimi_k2.5.
+-> Provide a summary.
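The ranking tables in this file can drive model selection mechanically. The Python sketch below encodes the post-change Speed and Code Intelligence orderings and picks the best-ranked model from a candidate set; `SPEED`, `CODE_INTELLIGENCE`, and `pick` are hypothetical names used only for illustration.

```python
from typing import Iterable, Sequence

# Orderings copied from the ranking tables (best first), after this change.
SPEED = ["kimi_k2.5", "opus_4.6", "gpt_5.3_codex", "gpt_5.2"]
CODE_INTELLIGENCE = ["gpt_5.2", "gpt_5.3_codex", "opus_4.6", "kimi_k2.5"]


def pick(candidates: Iterable[str], ranking: Sequence[str]) -> str:
    """Return the candidate that appears earliest in the given ranking."""
    allowed = set(candidates)
    for model in ranking:
        if model in allowed:
            return model
    raise ValueError("no candidate appears in the ranking")
```

For instance, choosing between `gpt_5.2` and `opus_4.6` gives `opus_4.6` when ranked by speed and `gpt_5.2` when ranked by code intelligence.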

93
settings.json Normal file

@@ -0,0 +1,93 @@
{
  "logoAnimation": "off",
  "customModels": [
    {
      "model": "Kimi-K2.5",
      "id": "custom:Kimi-K2.5-(BYOK)-0",
      "index": 0,
      "baseUrl": "http://localhost:8383",
      "apiKey": "sk-abcd",
      "displayName": "Kimi K2.5 (BYOK)",
      "maxOutputTokens": 131072,
      "noImageSupport": false,
      "provider": "anthropic"
    },
    {
      "model": "Kimi-for-Coding",
      "id": "custom:Kimi-for-Coding-(BYOK)-1",
      "index": 1,
      "baseUrl": "http://localhost:8383",
      "apiKey": "sk-abcd",
      "displayName": "Kimi for Coding (BYOK)",
      "noImageSupport": false,
      "provider": "anthropic"
    },
    {
      "model": "Opus-4.5",
      "id": "custom:Opus-4.5-(BYOK)-2",
      "index": 2,
      "baseUrl": "http://localhost:8383",
      "apiKey": "sk-abcd",
      "displayName": "Opus 4.5 (BYOK)",
      "maxOutputTokens": 128000,
      "extraArgs": {
        "parallel_tool_calls": true,
        "thinking": {
          "type": "enabled",
          "budget_tokens": 120000
        }
      },
      "noImageSupport": true,
      "provider": "anthropic"
    },
    {
      "model": "Gpt-5.3-Codex",
      "id": "custom:Gpt-5.3-Codex-(BYOK)-3",
      "index": 3,
      "baseUrl": "http://localhost:8383/v1",
      "apiKey": "sk-abcd",
      "displayName": "Gpt 5.3 Codex (BYOK)",
      "maxOutputTokens": 128000,
      "extraArgs": {
        "parallel_tool_calls": true,
        "reasoning": {
          "effort": "xhigh"
        }
      },
      "noImageSupport": true,
      "provider": "openai"
    },
    {
      "model": "Gpt-5.2",
      "id": "custom:Gpt-5.2-(BYOK)-4",
      "index": 4,
      "baseUrl": "http://localhost:8383/v1",
      "apiKey": "sk-abcd",
      "displayName": "Gpt 5.2 (BYOK)",
      "maxOutputTokens": 128000,
      "extraArgs": {
        "parallel_tool_calls": true,
        "reasoning": {
          "effort": "xhigh"
        }
      },
      "noImageSupport": true,
      "provider": "openai"
    }
  ],
  "sessionDefaultSettings": {
    "model": "custom:Gpt-5.3-Codex-(BYOK)-3",
    "autonomyMode": "auto-low",
    "specModeReasoningEffort": "none",
    "reasoningEffort": "none"
  },
  "cloudSessionSync": false,
  "ideAutoConnect": true,
  "includeCoAuthoredByDroid": false,
  "showTokenUsageIndicator": true,
  "showThinkingInMainView": true,
  "allowBackgroundProcesses": true,
  "ideExtensionPromptedAt": {
    "vscode": 1769532708384
  }
}
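Because `sessionDefaultSettings.model` refers to an entry in `customModels` by `id`, a small consistency check can catch a stale reference after models are added or renamed. The following Python sketch assumes only the key names visible in this file; `load_settings` and `check_default_model` are hypothetical helpers, not part of any Factory tooling.

```python
import json


def load_settings(path: str) -> dict:
    """Read a settings.json file into a dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def check_default_model(settings: dict) -> str:
    """Return the default model ID if it matches a customModels entry, else raise."""
    known_ids = {model["id"] for model in settings.get("customModels", [])}
    default = settings.get("sessionDefaultSettings", {}).get("model")
    if default not in known_ids:
        raise ValueError(f"default model {default!r} is not defined in customModels")
    return default
```

Calling `check_default_model(load_settings("settings.json"))` on the file above would return `custom:Gpt-5.3-Codex-(BYOK)-3`, since that ID is present in the `customModels` list.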