Initial implementation by kimi k2 0905
63
AGENTS.md
Normal file
@@ -0,0 +1,63 @@
# CLM System - Agent Guidelines

## Important Notes

- **Always use `uv` with the `--active` flag** for dependency management
- **Read docs from context7** whenever you are in doubt or need to confirm the right way to do something

## Build/Run Commands

```bash
# Install dependencies
uv add --active streamlit langchain langchain-community pypdf2 python-docx pytesseract lancedb

# Run Streamlit app
streamlit run app.py

# Manual scan
python scripts/manual_scan.py

# Generate reports
python scripts/generate_reports.py
```

## Code Style

- **Framework**: Streamlit + LangChain + LanceDB
- **Structure**: Monolithic with modular components in `src/`
- **Imports**: Standard library first, then third-party, then local modules
- **Naming**: snake_case for functions/variables, PascalCase for classes
- **Error Handling**: Use try/except blocks and log failures through the `Logger` singleton
- **Types**: Use type hints where they help; prioritize readability

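The conventions above can be illustrated with a minimal sketch. The `Logger` class here is a hypothetical stand-in for the project's singleton, and `load_text` is an illustrative helper; neither is taken from the actual `src/` code.

```python
# Standard library imports come first; third-party and local imports
# would follow in their own groups.
import logging
from typing import Optional


class Logger:
    """Hypothetical singleton wrapper around the standard logging module."""

    _instance: Optional["Logger"] = None
    _log: logging.Logger

    def __new__(cls) -> "Logger":
        # Create the shared instance on first use, then always reuse it.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._log = logging.getLogger("clm")
        return cls._instance

    def error(self, message: str) -> None:
        self._log.error(message)


def load_text(file_path: str) -> Optional[str]:
    """Return the file's text, or None if the file cannot be read."""
    try:
        with open(file_path, encoding="utf-8") as handle:
            return handle.read()
    except OSError as exc:
        # Log through the singleton instead of silently swallowing the error.
        Logger().error(f"Could not read {file_path}: {exc}")
        return None
```

Catching the narrow `OSError` rather than a bare `except` keeps unrelated bugs visible while still logging expected I/O failures.
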
## Key Patterns

- **Document Processing Pipeline**: FileValidator → OCRProcessor → TextExtractor → Chunker → Embedder → VectorStore
- **Singletons**: ConfigurationManager, VectorDatabaseConnection, Logger
- **Strategy Pattern**: ChunkingStrategy (basic fixed-size), EmbeddingModel (single model)
- **Direct File Operations**: Simple utility functions for file I/O

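As a rough sketch of how the pipeline and strategy patterns above could fit together: the class names `ChunkingStrategy` and `Chunker` come from the list, but every method name, the `FixedSizeChunking` class, the default chunk size, and the `run_pipeline` helper are assumptions made for illustration, not the project's actual interfaces.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, List, Sequence


class ChunkingStrategy(ABC):
    """Strategy interface: each implementation decides how text is split."""

    @abstractmethod
    def chunk(self, text: str) -> List[str]:
        ...


class FixedSizeChunking(ChunkingStrategy):
    """Basic fixed-size strategy; the default size is an arbitrary assumption."""

    def __init__(self, chunk_size: int = 500) -> None:
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        self.chunk_size = chunk_size

    def chunk(self, text: str) -> List[str]:
        # Slice the text into consecutive windows of at most chunk_size characters.
        return [
            text[start:start + self.chunk_size]
            for start in range(0, len(text), self.chunk_size)
        ]


class Chunker:
    """Pipeline stage that delegates the splitting to the injected strategy."""

    def __init__(self, strategy: ChunkingStrategy) -> None:
        self.strategy = strategy

    def run(self, text: str) -> List[str]:
        return self.strategy.chunk(text)


def run_pipeline(data: Any, stages: Sequence[Callable[[Any], Any]]) -> Any:
    """Feed the output of each stage into the next one, in order."""
    for stage in stages:
        data = stage(data)
    return data
```

A two-stage run could then look like `run_pipeline(raw_text, [str.strip, Chunker(FixedSizeChunking(200)).run])`; swapping in a different strategy would not require touching the `Chunker` stage itself.
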
## Testing

```bash
# Run basic tests
python -m pytest tests/

# Test single component
python -m pytest tests/test_ingestion.py -v
```

## Linting and Type Checking

```bash
# Run ruff linter (auto-fix issues)
ruff check --fix .

# Run pyright type checker
pyright

# Run both after making changes
cd clm-system && ruff check --fix . && pyright
```

## Vector DB Choice

Use LanceDB: it is lightweight, runs locally, and requires no server setup, which fits the scope of this project.

# STRICT RULES

- Do not add `sys.path.append` fixes to any code. Always understand where you are executing code from.
- Do not use `pathlib` or `os.path`; always use `importlib.resources` and define resources in `pyproject.toml`.

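A minimal sketch of the resource-loading rule, using only the standard library. The helper name `read_resource_text` and the package and file names in the usage comment are hypothetical. How resources are declared in `pyproject.toml` depends on the build backend (with setuptools, for example, this is typically done under `[tool.setuptools.package-data]`), so that part is not shown.

```python
from importlib.resources import files


def read_resource_text(package: str, resource_name: str) -> str:
    """Read a text resource that ships inside an importable package."""
    # files() resolves the package location through the import system,
    # so the result does not depend on the current working directory.
    resource = files(package).joinpath(resource_name)
    return resource.read_text(encoding="utf-8")


# Hypothetical usage inside the project (names are assumptions):
# template = read_resource_text("clm_system", "report_template.txt")
```

Because the lookup goes through the import system, the same call works whether the code is started from the repository root, from a subdirectory, or from an installed package, which also removes the usual motivation for `sys.path.append` workarounds.
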