
abhishekbhakat 9b35a38728 Enhance droid documentation and coding rules:

- Update coder and reviewer descriptions to clarify subagent roles.
- Improve coding rules for modularity and project structure.
- Add new semantic code search skill documentation for ColGREP.
- Introduce rules skill for accessing project coding conventions.

2026-02-16 16:16:26 +05:30

4.5 KiB



| name | description | user-invokable | disable-model-invocation |
| --- | --- | --- | --- |
| colgrep | Semantic code search using ColGREP - combines regex filtering with semantic ranking. Use when the user wants to search code by meaning, find relevant code snippets, or explore a codebase semantically. All local - code never leaves the machine. | false | false |

ColGREP Semantic Code Search

ColGREP is a semantic code search tool that combines regex filtering with semantic ranking. It uses multi-vector search (via NextPlaid) to find code by meaning, not just keywords.

When to use this skill

Searching for code by semantic meaning ("database connection pooling")
Finding relevant code snippets when exploring a new codebase
Combining pattern matching with semantic understanding
Setting up code search for a new project
When grep returns too many irrelevant results
When you don't know the exact naming conventions used in a codebase

Prerequisites

ColGREP must be installed. It's a single Rust binary with no external dependencies.

Quick Reference

Check if ColGREP is installed

which colgrep || echo "ColGREP not installed"
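
Since `which` is not available in every shell environment, a script may prefer the POSIX `command -v` builtin. The sketch below is one way to wrap that check. The function names `have_cmd` and `check_colgrep` are hypothetical helpers, not part of ColGREP.

```shell
# Return success if the named command can be resolved by the shell.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: report whether colgrep is available.
# Returns non-zero (and writes to stderr) when it is missing.
check_colgrep() {
  if have_cmd colgrep; then
    echo "ColGREP found at: $(command -v colgrep)"
  else
    echo "ColGREP not installed" >&2
    return 1
  fi
}
```

Calling `check_colgrep || exit 1` near the top of a script stops it early when the binary is missing.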

Install ColGREP

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/lightonai/next-plaid/releases/latest/download/colgrep-installer.sh | sh

Initialize index for a project

# Current directory
colgrep init

# Specific path
colgrep init /path/to/project

Basic semantic search

colgrep "database connection pooling"

Combine regex with semantic search

colgrep -e "async.*await" "error handling"

Essential Flags

| Flag | Description | Example |
| --- | --- | --- |
| `-c, --content` | Show full function/class content with syntax highlighting | `colgrep -c "authentication"` |
| `-e <pattern>` | Pre-filter with regex, then rank semantically | `colgrep -e "def.*auth" "login"` |
| `--include "*.py"` | Filter by file type | `colgrep --include "*.rs" "error handling"` |
| `--code-only` | Skip text/config files (md, yaml, json) | `colgrep --code-only "parser"` |
| `-k <n>` | Number of results (default: 15) | `colgrep -k 5 "database"` |
| `-n <lines>` | Context lines around match | `colgrep -n 10 "config"` |
| `-l, --files-only` | List only filenames | `colgrep -l "test helpers"` |
| `--json` | Output as JSON for scripting | `colgrep --json "api" \| jq '.[].unit.file'` |
| `-y` | Auto-confirm indexing for large codebases | `colgrep -y "search term"` |
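
The flags above are typically used together. As a rough sketch, a small shell function can bundle a common combination so it does not need retyping. The name `cg_in` and its argument order are hypothetical, and the sketch assumes `-c`, `-k`, `--include`, and `-e` can all be combined in a single invocation (the examples in this document only show them in pairs).

```shell
# Hypothetical helper (not part of ColGREP): search one file type and
# show the top COUNT matches with full content, optionally regex-filtered.
# Usage: cg_in GLOB COUNT QUERY [REGEX]
cg_in() {
  glob=$1
  count=$2
  query=$3
  regex=${4-}   # optional fourth argument; empty when omitted
  if [ -n "$regex" ]; then
    # Assumption: -e combines with --include and -k in one call.
    colgrep -c -k "$count" --include "$glob" -e "$regex" "$query"
  else
    colgrep -c -k "$count" --include "$glob" "$query"
  fi
}
```

For example, `cg_in "*.py" 5 "login validation" "def.*auth"` would run `colgrep -c -k 5 --include "*.py" -e "def.*auth" "login validation"`.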

How it works

1. Tree-sitter parsing - Extracts functions, methods, and classes from code
2. Structured representation - Creates a text representation of each unit from its signature, params, docstring, calls, and variables
3. LateOn-Code-edge model - A 17M-parameter model creates multi-vector embeddings (runs on CPU)
4. NextPlaid indexing - Quantized, memory-mapped, incremental index
5. Search - SQLite filtering plus semantic ranking, with grep-compatible flags
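
The final search stage follows a filter-then-rank shape: a cheap filter shrinks the candidate set, then a scoring step orders what is left. The toy below illustrates only that shape, using plain `grep`. It is not ColGREP's algorithm: the ranking step here is a keyword line count standing in for semantic ranking, and the function name `filter_then_rank` is hypothetical.

```shell
# Toy filter-then-rank over a directory tree. NOT ColGREP's algorithm:
# stage 2 uses a keyword line count where ColGREP uses semantic ranking.
# Usage: filter_then_rank REGEX KEYWORD DIR
filter_then_rank() {
  pattern=$1
  keyword=$2
  dir=$3
  # Stage 1: keep only files with at least one line matching the regex.
  grep -rlE -e "$pattern" "$dir" |
  while IFS= read -r file; do
    # Stage 2: score each surviving file by its number of keyword lines.
    # grep -c prints 0 and exits non-zero on no match, hence "|| true".
    hits=$(grep -ciF -e "$keyword" "$file" || true)
    printf '%s\t%s\n' "$hits" "$file"
  done |
  sort -rn   # highest score first
}
```

Files that fail the regex never reach the scoring step, which is why a tight pre-filter reduces the work done in the second stage.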

Recommended Workflow

For exploring a new codebase:

# 1. Initialize (one-time)
colgrep init

# 2. Search with content display to see actual code
colgrep -c -k 5 "function that handles user authentication"

# 3. Refine with regex if needed
colgrep -c -e "def.*auth" "login validation"

# 4. Filter by language
colgrep -c --include "*.py" "database connection pooling"

For finding specific patterns:

# Hybrid search: regex filter + semantic ranking
colgrep -e "class.*View" "API endpoint handling"

# Skip config files, focus on code
colgrep --code-only "error handling middleware"

# Just get filenames for further processing
colgrep -l "unit test helpers"

For scripting/automation:

# JSON output for piping to other tools
colgrep --json "configuration parser" | jq '.[] | {file: .unit.file, score: .score}'
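
Building on the JSON example, a slightly larger `jq` filter can deduplicate files and drop weak matches. This is a sketch under assumptions: it takes the output to be a top-level array whose elements carry `.unit.file` and a numeric `.score` (as the command above suggests), and it assumes a higher score means a better match. The function name `files_above` and any threshold value are hypothetical.

```shell
# Hypothetical helper: read ColGREP-style JSON on stdin and print each
# distinct file once, keeping only results with score >= MIN.
# Assumes: top-level array; each element has .unit.file and numeric .score.
# Usage: colgrep --json "query" | files_above MIN
files_above() {
  jq -r --argjson min "$1" \
    '[.[] | select(.score >= $min) | .unit.file] | unique | .[]'
}
```

A call such as `colgrep --json "configuration parser" | files_above 0.5` would then list matching files one per line, in sorted order. The score scale is not documented here, so the threshold needs to be chosen by inspecting real output first.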

Pro Tips

Use -c for initial exploration - It shows the full function or class content, so there is no need to open files separately
Use -e to narrow results - The regex pre-filter shrinks the candidate set, which is much faster than semantically ranking everything
Index auto-updates - Each search detects file changes, so there is no need to re-run init manually
Large codebases - Use -y to skip the confirmation prompt shown when indexing more than 10K files
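
In a non-interactive script, a confirmation prompt would stall the run, so `-y` pairs naturally with `-l` when the results feed another command. The sketch below assumes `colgrep -l` prints one path per line and that `-y` and `-l` can be combined. The helper name `cg_each` is hypothetical.

```shell
# Hypothetical helper: run a command once per file that ColGREP lists.
# Assumes `colgrep -l` prints one path per line.
# Usage: cg_each QUERY COMMAND [ARGS...]
cg_each() {
  query=$1
  shift
  colgrep -y -l "$query" |
  while IFS= read -r file; do
    if [ -n "$file" ]; then
      # Redirect stdin so COMMAND cannot consume the remaining file list.
      "$@" "$file" </dev/null
    fi
  done
}
```

For instance, `cg_each "unit test helpers" wc -l` would print a line count for each listed file.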

Example workflow

First-time setup for a project:
```
cd /path/to/project
colgrep init
```

Search with content display (recommended):

colgrep -c -k 5 "authentication middleware"

Refine with regex:

colgrep -c -e "def.*auth" "login validation"

The index auto-updates: each search detects file changes and refreshes the index automatically.
