Initial implementation by kimi k2 0905
53
PLANNING/design.md
Normal file
@@ -0,0 +1,53 @@
# CLM System Architecture Design

## Design Patterns

### 1. Monolithic Architecture

Single FastAPI application with modular components:

- **Document Ingestion Module**: Handles multiple file formats (PDF, DOCX, TXT)
- **RAG Module**: Manages vector embeddings and retrieval
- **AI Agent Module**: Daily contract monitoring and reporting
- **Chatbot Module**: User interface for contract queries

### 2. Direct File Operations

- Simple utility functions for file I/O
- Direct file system operations for document storage
- No abstraction layer needed for this scope
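
The file utilities described above could be as small as the following sketch. The `STORAGE_DIR` location and the function names are assumptions made for illustration; the design does not specify them.

```python
from pathlib import Path

# Hypothetical storage root; the design does not name a directory.
STORAGE_DIR = Path("data/documents")


def _safe_name(filename: str) -> str:
    """Keep only the final path component so "../x" cannot escape the storage dir."""
    name = Path(filename).name
    if not name:
        raise ValueError(f"Invalid filename: {filename!r}")
    return name


def save_document(filename: str, content: bytes, storage_dir: Path = STORAGE_DIR) -> Path:
    """Write a document to the storage directory and return its path."""
    storage_dir.mkdir(parents=True, exist_ok=True)
    target = storage_dir / _safe_name(filename)
    target.write_bytes(content)
    return target


def load_document(filename: str, storage_dir: Path = STORAGE_DIR) -> bytes:
    """Read a stored document back as raw bytes."""
    return (storage_dir / _safe_name(filename)).read_bytes()


def list_documents(storage_dir: Path = STORAGE_DIR) -> list[str]:
    """Return the names of all stored documents, sorted."""
    if not storage_dir.exists():
        return []
    return sorted(p.name for p in storage_dir.iterdir() if p.is_file())
```

Plain functions over `pathlib` keep with the "no abstraction layer" decision: callers pass a filename and bytes, and nothing else needs to be configured.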

### 3. Direct File Processing

- Simple file type detection and processing functions
- Direct embedding generation using selected model
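
As a rough illustration of the detection step, the sketch below checks both the extension and the leading bytes of the file. The function names are hypothetical, and only plain-text extraction is shown; PDF and DOCX extraction would need a third-party parser that this design does not name.

```python
from pathlib import Path

# Leading bytes that identify each container format.
PDF_MAGIC = b"%PDF-"
ZIP_MAGIC = b"PK\x03\x04"  # DOCX files are ZIP archives


def detect_file_type(filename: str, head: bytes) -> str:
    """Return "pdf", "docx", or "txt"; raise ValueError for anything else."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".pdf" and head.startswith(PDF_MAGIC):
        return "pdf"
    if suffix == ".docx" and head.startswith(ZIP_MAGIC):
        return "docx"
    if suffix == ".txt":
        return "txt"
    raise ValueError(f"Unsupported or mismatched file type: {filename!r}")


def extract_text(filename: str, data: bytes) -> str:
    """Dispatch on the detected type; only the TXT branch is implemented here."""
    kind = detect_file_type(filename, data[:8])
    if kind == "txt":
        return data.decode("utf-8", errors="replace")
    # Assumption: PDF/DOCX parsing is delegated to a library chosen later.
    raise NotImplementedError(f"No extractor wired up for {kind!r} in this sketch")
```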

### 4. Strategy Pattern

- `ChunkingStrategy`: Basic fixed-size chunking
- `EmbeddingModel`: Single model (OpenAI or local)
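
One possible shape for these two strategies is sketched below. `ChunkingStrategy` and `EmbeddingModel` come from the list above; `FixedSizeChunker`, `HashEmbedding`, `index_text`, and all parameter names are assumptions. `HashEmbedding` is a deterministic stand-in so the example runs offline; it is not semantically meaningful and is not the OpenAI or local model the design refers to.

```python
import hashlib
from typing import Protocol


class ChunkingStrategy(Protocol):
    def chunk(self, text: str) -> list[str]: ...


class EmbeddingModel(Protocol):
    def embed(self, texts: list[str]) -> list[list[float]]: ...


class FixedSizeChunker:
    """Split text into windows of `size` characters that share `overlap` characters."""

    def __init__(self, size: int = 500, overlap: int = 50) -> None:
        if size <= 0 or not 0 <= overlap < size:
            raise ValueError("require size > 0 and 0 <= overlap < size")
        self.size = size
        self.overlap = overlap

    def chunk(self, text: str) -> list[str]:
        chunks: list[str] = []
        start = 0
        while start < len(text):
            chunks.append(text[start : start + self.size])
            if start + self.size >= len(text):
                break  # the last window already reaches the end of the text
            start += self.size - self.overlap
        return chunks


class HashEmbedding:
    """Placeholder model: maps text to a fixed vector derived from its SHA-256 digest."""

    def __init__(self, dim: int = 8) -> None:
        if not 0 < dim <= 32:
            raise ValueError("dim must be between 1 and 32")
        self.dim = dim

    def embed(self, texts: list[str]) -> list[list[float]]:
        vectors = []
        for text in texts:
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            vectors.append([byte / 255.0 for byte in digest[: self.dim]])
        return vectors


def index_text(text: str, chunker: ChunkingStrategy, model: EmbeddingModel):
    """Chunk the text, embed each chunk, and pair them up."""
    chunks = chunker.chunk(text)
    return list(zip(chunks, model.embed(chunks)))
```

Because `index_text` depends only on the two protocols, swapping in a different chunker or a real embedding client should not require changes to the calling code.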

### 5. Chain of Responsibility

- Document processing pipeline:
  `FileValidator` → `OCRProcessor` → `TextExtractor` → `Chunker` → `Embedder` → `VectorStore`
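
The chain could be wired up roughly as follows. This is a sketch of the first four links only, with placeholder bodies; `DocumentContext`, `Handler`, `build_pipeline`, and the `trace` field are assumptions introduced for illustration. `Embedder` and `VectorStore` would follow the same shape. Each handler performs its step and forwards the context, and a validation failure stops the chain by raising.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DocumentContext:
    """State passed along the chain; each handler fills in its part."""
    filename: str
    raw: bytes
    text: str = ""
    chunks: list[str] = field(default_factory=list)
    trace: list[str] = field(default_factory=list)  # names of handlers that ran


class Handler:
    def __init__(self) -> None:
        self._next: Optional["Handler"] = None

    def set_next(self, handler: "Handler") -> "Handler":
        self._next = handler
        return handler  # returning the new link allows a.set_next(b).set_next(c)

    def handle(self, ctx: DocumentContext) -> DocumentContext:
        self.process(ctx)
        ctx.trace.append(type(self).__name__)
        return self._next.handle(ctx) if self._next is not None else ctx

    def process(self, ctx: DocumentContext) -> None:
        raise NotImplementedError


class FileValidator(Handler):
    ALLOWED = (".pdf", ".docx", ".txt")

    def process(self, ctx: DocumentContext) -> None:
        if not ctx.filename.lower().endswith(self.ALLOWED):
            raise ValueError(f"Unsupported file: {ctx.filename!r}")


class OCRProcessor(Handler):
    def process(self, ctx: DocumentContext) -> None:
        pass  # placeholder: scanned pages would be OCR'd here


class TextExtractor(Handler):
    def process(self, ctx: DocumentContext) -> None:
        # Placeholder: only plain-text decoding is shown.
        ctx.text = ctx.raw.decode("utf-8", errors="replace")


class Chunker(Handler):
    def __init__(self, size: int = 200) -> None:
        super().__init__()
        self.size = size

    def process(self, ctx: DocumentContext) -> None:
        ctx.chunks = [ctx.text[i : i + self.size] for i in range(0, len(ctx.text), self.size)]


def build_pipeline(chunk_size: int = 200) -> Handler:
    head = FileValidator()
    head.set_next(OCRProcessor()).set_next(TextExtractor()).set_next(Chunker(chunk_size))
    return head
```

Calling `build_pipeline().handle(DocumentContext("contract.txt", data))` runs every link in order and returns the populated context.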

### 6. Singleton Pattern

- `ConfigurationManager`: Global config access
- `VectorDatabaseConnection`: Single connection
- `Logger`: Basic error logging
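
A minimal sketch of how `ConfigurationManager` and the logger might be made process-wide singletons is shown below. The environment variable names, setting keys, and the `"clm"` logger name are assumptions. `VectorDatabaseConnection` could reuse the same `__new__` approach around whichever client object the vector store provides.

```python
import logging
import os
import threading


class ConfigurationManager:
    """Process-wide configuration; every call to ConfigurationManager() returns one object."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        # Double-checked locking so concurrent first calls still create one instance.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    instance = super().__new__(cls)
                    # Hypothetical settings, read once at first use.
                    instance._settings = {
                        "vector_db_path": os.environ.get("CLM_VECTOR_DB_PATH", "./chroma"),
                        "log_level": os.environ.get("CLM_LOG_LEVEL", "INFO"),
                    }
                    cls._instance = instance
        return cls._instance

    def get(self, key: str, default=None):
        return self._settings.get(key, default)


def get_logger() -> logging.Logger:
    # logging.getLogger returns the same object for the same name,
    # so the standard library already provides singleton behavior here.
    return logging.getLogger("clm")
```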

## Data Flow

1. **Document Ingestion**: File → Validation → Processing → Storage
2. **Query Processing**: User Query → RAG Pipeline → Context Retrieval → Response Generation
3. **Daily Monitoring**: Scheduled Trigger → Contract Scan → Conflict Detection → Report Generation
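
To make the query-processing flow more concrete, here is a toy sketch of the retrieval and prompt-assembly steps using an in-memory index and cosine similarity. In the actual system the vector store would perform the search and an LLM call would produce the response; the function names, the prompt wording, and the `(chunk_text, vector)` index layout are all assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: list[float], index: list[tuple[str, list[float]]], top_k: int = 2) -> list[str]:
    """Return the text of the top_k chunks most similar to the query vector."""
    ranked = sorted(index, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble retrieved chunks and the user question into a single prompt string."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer using only the contract excerpts below.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Question: {question}"
    )
```

The string returned by `build_prompt` is what would be handed to the response-generation step.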

## Technology Stack

- **Framework**: FastAPI (async support, automatic docs)
- **Vector DB**: ChromaDB (lightweight, easy setup)
- **LLM Framework**: LangChain
- **Container**: Docker + Docker Compose

## Implementation Priority

1. Document ingestion and indexing
2. Basic RAG pipeline
3. AI agent for daily reports
4. Simple chatbot interface
5. Document similarity function