new models

2026-02-06 09:36:51 +05:30
parent 10027abf0b
commit ac213793b0
3 changed files with 261 additions and 15 deletions
--- a/DROIDS.md
+++ b/DROIDS.md
@@ -0,0 +1,138 @@
+# Factory Droids
+
+A system for orchestrating AI droids to handle complex coding tasks through specialized roles.
+
+## Overview
+
+Factory Droids uses `droid exec` to run AI agents non-interactively, each specializing in different aspects of software development.
+
+## Available Commands
+
+```bash
+droid exec --help              # Show exec command options (includes model list)
+droid --help                   # Show all droid commands
+droid exec --list-tools        # List available tools for a model
+```
+
+> **Tip:** Run `droid exec --help` to see all available models including BYOK custom models.
+
+## Quick Start
+
+```bash
+# Read-only analysis (default)
+droid exec "analyze the codebase structure"
+
+# With file input
+droid exec -f prompt.txt
+
+# With specific model
+droid exec --model custom:kimi-k2.5 "explore the project"
+
+# Low autonomy - safe file operations
+droid exec --auto low "add JSDoc comments"
+
+# Medium autonomy - development tasks
+droid exec --auto medium "install deps and run tests"
+
+# High autonomy - production operations
+droid exec --auto high "fix, test, commit and push"
+```
+
+## Available Models (BYOK)
+
+| Model ID                          | Name                 | Reasoning |
+|-----------------------------------|----------------------|-----------|
+| `custom:kimi-k2.5`                | Kimi K2.5            | Yes       |
+| `custom:claude-opus-4.6`          | Claude Opus 4.6      | Yes       |
+| `custom:gpt-5.3-codex`            | GPT 5.3 Codex        | Yes       |
+| `custom:gpt-5.2`                  | GPT 5.2              | Yes       |
+
+## Droid Roles
+
+| Droid      | Model                         | Purpose                               | Auto Level |
+|------------|-------------------------------|---------------------------------------|------------|
+| Explorer   | `custom:kimi-k2.5`            | Code exploration and research         | high       |
+| Spec       | `custom:gpt-5.2`              | Planning and specification generation | high       |
+| Coder      | `custom:gpt-5.3-codex`        | Large code generation                 | high       |
+| Coder-lite | `custom:kimi-k2.5`            | Small code generation and fixes       | high       |
+| Quality    | `custom:kimi-k2.5`            | Formatting, linting, type checking    | high       |
+| Reviewer   | `custom:claude-opus-4.6`      | Code review and bug finding           | high       |
+| Runner     | `custom:kimi-k2.5`            | Build, test, and execution            | high       |
+
+## Workflow
+
+1. **Start** with a good instruction follower (`custom:kimi-k2.5` or `custom:gpt-5.3-codex`)
+2. **Make** a todo list
+3. **Explore** - Launch multiple explorer droids with `custom:kimi-k2.5` in parallel
+4. **Spec** - Evaluate context with spec droid using `custom:gpt-5.2`
+5. **Confirm** spec with user
+6. **Code** - Use `custom:gpt-5.3-codex` for large code gen, `custom:kimi-k2.5` for small
+7. **Quality** - Run quality check droid with `custom:kimi-k2.5 --auto high`
+8. **Review** - Run review droid with `custom:claude-opus-4.6 --auto high`
+9. **Run** - Run build/test droid with `custom:kimi-k2.5 --auto high`
+10. **Summarize** - Provide final summary
+
+## Autonomy Levels
+
+| Level   | Flag                        | Description                                          |
+|---------|-----------------------------|------------------------------------------------------|
+| Default | (none)                      | Read-only - safest for reviewing planned changes     |
+| Low     | `--auto low`                | Basic file operations, no system changes             |
+| Medium  | `--auto medium`             | Development ops - install packages, build, git local |
+| High    | `--auto high`               | Production ops - git push, deploy, migrations        |
+| Unsafe  | `--skip-permissions-unsafe` | Bypass all checks - DANGEROUS!                       |
+
+## Command Options
+
+```
+Usage: droid exec [options] [prompt]
+
+Arguments:
+  prompt                      The prompt to execute
+
+Options:
+  -o, --output-format <format>    Output format (default: "text")
+  --input-format <format>         Input format: stream-json for multi-turn
+  -f, --file <path>               Read prompt from file
+  --auto <level>                  Autonomy level: low|medium|high
+  --skip-permissions-unsafe       Skip ALL permission checks (unsafe)
+  -s, --session-id <id>           Existing session to continue
+  -m, --model <id>                Model ID (default: claude-opus-4-5-20251101)
+  -r, --reasoning-effort <level>  Reasoning effort (model-specific)
+  --enabled-tools <ids>           Enable specific tools
+  --disabled-tools <ids>          Disable specific tools
+  --cwd <path>                    Working directory path
+  --log-group-id <id>             Log group ID for filtering logs
+  --list-tools                    List available tools and exit
+  -h, --help                      Display help
+```
+
+## Authentication
+
+Create API key: https://app.factory.ai/settings/api-keys
+
+```bash
+export FACTORY_API_KEY=fk-... && droid exec "fix the bug"
+```
+
+## Examples
+
+```bash
+# Analysis (read-only)
+droid exec "Review the codebase for security vulnerabilities"
+
+# Documentation
+droid exec --auto low "add JSDoc comments to all functions"
+droid exec --auto low "fix typos in README.md"
+
+# Development
+droid exec --auto medium "install deps, run tests, fix issues"
+droid exec --auto medium "update packages and resolve conflicts"
+
+# Production
+droid exec --auto high "fix bug, test, commit, and push to main"
+droid exec --auto high "deploy to staging after running tests"
+
+# Continue session
+droid exec -s <session-id> "continue previous task"
+```
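
The role table and workflow in DROIDS.md can be sketched as a small dry-run wrapper. This is a minimal sketch and not part of the commit: the helper names `role_model` and `droid_cmd` are hypothetical, the prompts are placeholders, and the script only prints `droid exec` invocations (using the `--model` and `--auto` flags listed under Command Options) instead of running them.

```shell
#!/usr/bin/env sh
# Hypothetical helper: map a droid role to the model ID from the Droid Roles table.
role_model() {
  case "$1" in
    explorer|coder-lite|quality|runner) echo "custom:kimi-k2.5" ;;
    spec)     echo "custom:gpt-5.2" ;;
    coder)    echo "custom:gpt-5.3-codex" ;;
    # ID spelled as in the Available Models table.
    reviewer) echo "custom:claude-opus-4.6" ;;
    *) echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: print (not run) the droid exec command for a role.
# Every role in the table uses --auto high, so that level is fixed here.
droid_cmd() {
  _model="$(role_model "$1")" || return 1
  printf 'droid exec --model %s --auto high "%s"\n' "$_model" "$2"
}

# Dry run of workflow steps 3-9 with placeholder prompts.
droid_cmd explorer "map the module layout"
droid_cmd spec     "draft a spec from the exploration notes"
droid_cmd coder    "implement the confirmed spec"
droid_cmd quality  "format, lint, and type-check the changes"
droid_cmd reviewer "review the changes for bugs"
droid_cmd runner   "build and run the tests"
```

Printing the commands first makes it easy to confirm the role-to-model mapping before granting any droid high autonomy.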
--- a/VISION.md
+++ b/VISION.md
@@ -6,9 +6,9 @@ Need a skill for factory droid which can launch `droid exec` for multiple things

 | Model Name       | Alias           |
 |------------------|-----------------|
-| `opus_4.5`       | Opus 4.5        |
+| `opus_4.6`       | Opus 4.6        |
 | `gpt_5.2`        | GPT 5.2         |
-| `gpt_5.2_codex`  | GPT 5.2 Codex   |
+| `gpt_5.3_codex`  | GPT 5.3 Codex   |
 | `kimi_k2.5`      | Kimi k2.5       |

 ## Model Selection Criteria
@@ -16,11 +16,12 @@ Need a skill for factory droid which can launch `droid exec` for multiple things
 | Role                        | Recommended Model         | Reason                              |
 |-----------------------------|---------------------------|-------------------------------------|
 | The workhorse               | `kimi_k2.5`               | Fast and cost-effective             |
-| The critic                  | `opus_4.5`                | Good at reviewing and finding issues|
+| The critic                  | `opus_4.6`                | Good at reviewing and finding issues|
 | The brainy one              | `gpt_5.2`                 | Highest code intelligence           |
-| The coder                   | `gpt_5.2_codex`           | Specialized for code generation     |
+| The coder                   | `gpt_5.3_codex`           | Specialized for code generation     |
 | The fast one                | `kimi_k2.5`               | Fastest response time               |
-| Good Instructions Following | `kimi_k2.5`, `gpt_5.2_codex` | Strong adherence to requirements |
+| Good Instructions Following | `kimi_k2.5`, `gpt_5.3_codex` | Strong adherence to requirements |
+| The vision model            | `kimi_k2.5`               | Fast vision processing              |

 ## Coding Task Breakdown

@@ -33,13 +34,13 @@ Need a skill for factory droid which can launch `droid exec` for multiple things

 ## Model Rejection Criteria

-### `gpt_5.2` and `gpt_5.2_codex`
+### `gpt_5.2` and `gpt_5.3_codex`

 - Too slow and expensive for the workhorse role
 - Not at all suggested for exploration or tool calls
 - Strictly for planning/spec gen and code gen

-### `opus_4.5`
+### `opus_4.6`

 - Very buggy and looks for shortcuts in code gen
 - Can be a good critic and reviewer
@@ -57,9 +58,9 @@ Need a skill for factory droid which can launch `droid exec` for multiple things

 | Rank | Model            |
 |------|------------------|
-| 1    | `opus_4.5`       |
+| 1    | `opus_4.6`       |
 | 2    | `gpt_5.2`        |
-| 3    | `gpt_5.2_codex`  |
+| 3    | `gpt_5.3_codex`  |
 | 4    | `kimi_k2.5`      |

 ### Speed (Fast to Slow)
@@ -67,8 +68,8 @@ Need a skill for factory droid which can launch `droid exec` for multiple things
 | Rank | Model            |
 |------|------------------|
 | 1    | `kimi_k2.5`      |
-| 2    | `opus_4.5`       |
-| 3    | `gpt_5.2_codex`  |
+| 2    | `opus_4.6`       |
+| 3    | `gpt_5.3_codex`  |
 | 4    | `gpt_5.2`        |

 ### Code Intelligence (High to Low)
@@ -76,8 +77,8 @@ Need a skill for factory droid which can launch `droid exec` for multiple things
 | Rank | Model            |
 |------|------------------|
 | 1    | `gpt_5.2`        |
-| 2    | `gpt_5.2_codex`  |
-| 3    | `opus_4.5`       |
+| 2    | `gpt_5.3_codex`  |
+| 3    | `opus_4.6`       |
 | 4    | `kimi_k2.5`      |

 ### Overthinking (High to Low)
@@ -85,6 +86,20 @@ Need a skill for factory droid which can launch `droid exec` for multiple things
 | Rank | Model            |
 |------|------------------|
 | 1    | `gpt_5.2`        |
-| 2    | `gpt_5.2_codex`  |
-| 3    | `opus_4.5`       |
+| 2    | `gpt_5.3_codex`  |
+| 3    | `opus_4.6`       |
 | 4    | `kimi_k2.5`      |
+
+## Flow
+
+-> Start with a good instruction follower (kimi_k2.5 or gpt_5.3_codex).
+User asks a question or gives a task.
+-> Make a todo list.
+-> Exploration is always needed. Launch multiple explorer droids with kimi_k2.5, asking questions in natural language.
+-> After exploration, evaluate context with spec droid with gpt_5.2.
+-> Confirm spec with user.
+-> For code gen, use gpt_5.3_codex for large code gen, or kimi_k2.5 for small code gen.
+-> After code gen, run quality check droid with kimi_k2.5.
+-> Run review droid with opus_4.6 to find bugs and issues.
+-> Run build/test/run droid with kimi_k2.5.
+-> Provide summary
--- a/settings.json
+++ b/settings.json
@@ -0,0 +1,93 @@
+{
+  "logoAnimation": "off",
+  "customModels": [
+    {
+      "model": "Kimi-K2.5",
+      "id": "custom:Kimi-K2.5-(BYOK)-0",
+      "index": 0,
+      "baseUrl": "http://localhost:8383",
+      "apiKey": "sk-abcd",
+      "displayName": "Kimi K2.5 (BYOK)",
+      "maxOutputTokens": 131072,
+      "noImageSupport": false,
+      "provider": "anthropic"
+    },
+    {
+      "model": "Kimi-for-Coding",
+      "id": "custom:Kimi-for-Coding-(BYOK)-1",
+      "index": 1,
+      "baseUrl": "http://localhost:8383",
+      "apiKey": "sk-abcd",
+      "displayName": "Kimi for Coding (BYOK)",
+      "noImageSupport": false,
+      "provider": "anthropic"
+    },
+    {
+      "model": "Opus-4.5",
+      "id": "custom:Opus-4.5-(BYOK)-2",
+      "index": 2,
+      "baseUrl": "http://localhost:8383",
+      "apiKey": "sk-abcd",
+      "displayName": "Opus 4.5 (BYOK)",
+      "maxOutputTokens": 128000,
+      "extraArgs": {
+        "parallel_tool_calls": true,
+        "thinking": {
+          "type": "enabled",
+          "budget_tokens": 120000
+        }
+      },
+      "noImageSupport": true,
+      "provider": "anthropic"
+    },
+    {
+      "model": "Gpt-5.3-Codex",
+      "id": "custom:Gpt-5.3-Codex-(BYOK)-3",
+      "index": 3,
+      "baseUrl": "http://localhost:8383/v1",
+      "apiKey": "sk-abcd",
+      "displayName": "Gpt 5.3 Codex (BYOK)",
+      "maxOutputTokens": 128000,
+      "extraArgs": {
+        "parallel_tool_calls": true,
+        "reasoning": {
+          "effort": "xhigh"
+        }
+      },
+      "noImageSupport": true,
+      "provider": "openai"
+    },
+    {
+      "model": "Gpt-5.2",
+      "id": "custom:Gpt-5.2-(BYOK)-4",
+      "index": 4,
+      "baseUrl": "http://localhost:8383/v1",
+      "apiKey": "sk-abcd",
+      "displayName": "Gpt 5.2 (BYOK)",
+      "maxOutputTokens": 128000,
+      "extraArgs": {
+        "parallel_tool_calls": true,
+        "reasoning": {
+          "effort": "xhigh"
+        }
+      },
+      "noImageSupport": true,
+      "provider": "openai"
+    }
+  ],
+  "sessionDefaultSettings": {
+    "model": "custom:Gpt-5.3-Codex-(BYOK)-3",
+    "autonomyMode": "auto-low",
+    "specModeReasoningEffort": "none",
+    "reasoningEffort": "none"
+  },
+  "cloudSessionSync": false,
+  "ideAutoConnect": true,
+  "includeCoAuthoredByDroid": false,
+  "showTokenUsageIndicator": true,
+  "showThinkingInMainView": true,
+  "allowBackgroundProcesses": true,
+  "ideExtensionPromptedAt": {
+    "vscode": 1769532708384
+  }
+}
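
The `id` values in `customModels` are what a `--model` argument would presumably need to match, so listing them can help when comparing against the IDs used in DROIDS.md. The following is a minimal sketch under that assumption: `list_model_ids` is a hypothetical helper, it relies on each `"id"` key sitting on its own line as in the file above, and the sample input is abbreviated from two of the entries.

```shell
# Hypothetical helper: print each "id" value from a settings.json-style file.
# Assumes one "id" key per line; a real JSON parser would be more robust.
list_model_ids() {
  sed -n 's/^[[:space:]]*"id":[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Sample input: two entries abbreviated from the settings.json above.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
{
  "customModels": [
    { "model": "Kimi-K2.5",
      "id": "custom:Kimi-K2.5-(BYOK)-0",
      "provider": "anthropic" },
    { "model": "Gpt-5.2",
      "id": "custom:Gpt-5.2-(BYOK)-4",
      "provider": "openai" }
  ]
}
EOF

list_model_ids "$sample"
rm -f "$sample"
```

The IDs printed here (`custom:Kimi-K2.5-(BYOK)-0` and so on) differ in form from the `custom:kimi-k2.5` style used in DROIDS.md; the source does not say whether `droid` resolves one form to the other.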