From b2ff4238af458b04ef53a7712c7e25937cc88506 Mon Sep 17 00:00:00 2001
From: AI Station Server
Date: Thu, 1 Jan 2026 18:06:28 +0100
Subject: [PATCH] readme add

---
 README.md | 593 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 305 insertions(+), 288 deletions(-)

diff --git a/README.md b/README.md
index 92192137..9a0c6fe8 100644
--- a/README.md
+++ b/README.md
@@ -1,325 +1,342 @@
-# AI Station - Document Analysis Platform
+# AI Station DFFM - Vision-RAG Hybrid System
 
-## 📋 Overview
-
-**AI Station** is an AI-based document analysis platform that uses **Retrieval-Augmented Generation (RAG)** to analyze PDFs and text documents with the **GLM-4.6:Cloud** model.
+Multi-user AI system with RAG (Retrieval-Augmented Generation) support, image analysis, and advanced document management, built on Chainlit, Ollama, and BGE-M3.
+
+## 🌟 Features
+
+### Core AI
+- **RAG Hybrid Search** with BGE-M3 (dense + sparse embeddings)
+- **Vision Analysis** via MiniCPM-V for OCR and image description
+- **Document Processing** with Docling (PDF, DOCX), preserving tables and formulas
+- **Multi-Model Support** (local Ollama + cloud models)
+- **Streaming Responses** with reduced latency
+
+### Multi-User
+- **Google OAuth2** with role-specific user profiles
+- **Isolated workspaces** per user/team
+- **Dedicated RAG collections** for separate knowledge bases
+- **Granular permissions** (admin, engineering, business, architecture)
+
+### UI/UX
+- **Custom role badge** with dedicated colors
+- **Dynamic settings** (temperature, RAG top_k, model, custom instructions)
+- **Persistent chat history** with conversation resume
+- **Auto-save of Python code** extracted from responses
+- **Real-time metrics** (response time, RAG hits, errors)
+
+### Performance
+- **Embedding caching** (LRU cache over the last 1000 queries; see the sketch after this list)
+- **Smart chunking** (2000 chars with 200-char overlap)
+- **Async operations** against Qdrant and Ollama
+- **PostgreSQL** for thread and metadata persistence
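+
+A minimal sketch of the caching and chunking described above, assuming the BGE service speaks the `/embed` JSON API shown later in this README (the names `embed_cached` and this standalone `chunk_text` are illustrative, not necessarily what `app.py` uses):
+
+```python
+from functools import lru_cache
+
+import requests
+
+BGE_API_URL = "http://192.168.1.243:8001/embed"
+
+@lru_cache(maxsize=1000)  # memoize the last 1000 distinct queries
+def embed_cached(text: str) -> tuple:
+    """Embed one query via the BGE-M3 service; repeated queries hit the cache."""
+    resp = requests.post(BGE_API_URL, json={"texts": [text]}, timeout=30)
+    resp.raise_for_status()
+    # the response shape {"embeddings": [[...]]} is an assumption
+    return tuple(resp.json()["embeddings"][0])
+
+def chunk_text(text: str, size: int = 2000, overlap: int = 200) -> list[str]:
+    """Split text into 2000-char chunks that overlap by 200 chars."""
+    step = size - overlap
+    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
+```
+
+The overlap means consecutive chunks share 200 characters, so a sentence cut at a chunk boundary still appears whole in at least one chunk.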
+
+## 🏗️ Architecture
+
+### Hardware Setup
+
+#### AI-SRV (Chainlit VM)
+- **IP**: 192.168.1.244
+- **CPU**: 16 cores (QEMU virtual)
+- **RAM**: 64 GB
+- **Storage**: 195 GB
+- **Role**: hosts the Chainlit app + PostgreSQL + Qdrant
+
+#### AI-Server (GPU Workstation)
+- **IP**: 192.168.1.243
+- **CPU**: Intel Core Ultra 7 265 (20 cores, max 6.5 GHz)
+- **RAM**: 32 GB
+- **GPU**: NVIDIA RTX A1000 (8 GB VRAM)
+- **Storage**: 936 GB NVMe
+- **Role**: Ollama models + BGE-M3 embeddings service
 
 ### Technology Stack
-- **Backend**: Python + Chainlit (LLM UI framework)
-- **LLM**: GLM-4.6:Cloud (via Ollama Cloud)
-- **Vector DB**: Qdrant (semantic search)
-- **PDF Processing**: PyMuPDF (fitz)
-- **Database**: PostgreSQL + SQLAlchemy ORM
-- **Containerization**: Docker Compose
-- **Embeddings**: nomic-embed-text (via Ollama local)
-
----
+
+```text
+┌────────────────────────────────────────┐
+│         Chainlit UI (ai-srv)           │
+│    Badge + Settings + Chat History     │
+└──────────────┬─────────────────────────┘
+               │
+┌──────────────▼─────────────────────────┐
+│       Python Backend (app.py)          │
+│   - OAuth2 Google                      │
+│   - Multi-user profiles                │
+│   - File processing orchestration      │
+└─┬────────┬──────────┬──────────┬───────┘
+  │        │          │          │
+  ▼        ▼          ▼          ▼
+┌─────┐ ┌──────┐ ┌────────┐ ┌──────────┐
+│ PG  │ │Qdrant│ │ Ollama │ │ BGE API  │
+│     │ │Vector│ │  GPU   │ │   CPU    │
+│     │ │  DB  │ │ Server │ │  Server  │
+└─────┘ └──────┘ └────────┘ └──────────┘
+ai-srv   ai-srv  ai-server   ai-server
+```
 
-## 🚀 Quick Start
-
-### Prerequisites
-- Docker & Docker Compose
-- Ollama installed locally (for embeddings)
-- Ollama Cloud account (for glm-4.6:cloud)
-
-### 1️⃣ Clone & Setup
-```bash
-git clone git@github.com:your-username/ai-station.git
-cd ai-station
-
-# Configure environment
-cat > .env << 'EOF'
-DATABASE_URL=postgresql+asyncpg://ai_user:secure_password_here@postgres:5432/ai_station
-OLLAMA_URL=http://192.168.1.243:11434
-QDRANT_URL=http://qdrant:6333
-EOF
-```
-
-### 2️⃣ Authenticate Ollama Cloud
-```bash
-ollama signin
-# Follow the link to authenticate with your Ollama account
-```
-
-### 3️⃣ Start Services
-```bash
-docker compose up -d
-docker compose logs -f chainlit-app
-```
-
-### 4️⃣ Access UI
-Navigate to: **http://localhost:8000**
-
----
+## 📋 Requirements
+
+### System
+- Docker 24.x+ with Docker Compose
+- Access to the Google Cloud Console (for OAuth2)
+- 2 servers (or VMs) with shared networking
+
+### Ollama Models (to install on ai-server)
+```bash
+ollama pull minicpm-v         # Vision model (5.5 GB)
+ollama pull glm-4.6:cloud     # Cloud reasoning
+ollama pull qwen2.5-coder:32b # Code generation (9 GB)
+ollama pull llama3.2          # Fast general purpose (4.7 GB)
+```
+
+## 🚀 Installation
+
+### 1. Clone the Repository
+```bash
+git clone <repository-url>
+cd ai-station
+```
+
+### 2. Environment Configuration
+Create `.env`:
+
+```bash
+# Database
+DATABASE_URL=postgresql+asyncpg://ai_user:CHANGE_ME@postgres:5432/ai_station
+
+# AI Services
+OLLAMA_URL=http://192.168.1.243:11434
+QDRANT_URL=http://qdrant:6333
+BGE_API_URL=http://192.168.1.243:8001/embed
+
+# OAuth Google
+OAUTH_GOOGLE_CLIENT_ID=your-client-id.apps.googleusercontent.com
+OAUTH_GOOGLE_CLIENT_SECRET=your-secret
+CHAINLIT_AUTH_SECRET=$(openssl rand -base64 32)
+```
+
+### 3. Google OAuth Configuration
+1. Go to the Google Cloud Console
+2. Create a new project → APIs & Services → Credentials
+3. Create an "OAuth 2.0 Client ID"
+4. Add the authorized redirect URIs:
+   - `https://ai.dffm.it/auth/oauth/google/callback`
+   - `http://localhost:8000/auth/oauth/google/callback` (dev)
+5. Copy the Client ID and Secret into `.env`
+
+### 4. Customize Users
+Edit `app.py` → `USER_PROFILES` (a sketch of how a profile is applied at login follows this block):
+
+```python
+USER_PROFILES = {
+    "your.email@example.com": {
+        "role": "admin",
+        "name": "Name",
+        "workspace": "workspace_name",
+        "rag_collection": "docs_collection",
+        "capabilities": ["debug", "all"],
+        "show_code": True,
+    },
+    # ... other users
+}
+```
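+
+A minimal sketch of how such a profile can be applied at login, built on Chainlit's `@cl.oauth_callback` hook (the actual logic in `app.py` may differ; storing the profile in `user.metadata` is an assumption):
+
+```python
+import chainlit as cl
+
+@cl.oauth_callback
+def oauth_callback(provider_id: str, token: str, raw_user_data: dict,
+                   default_user: cl.User) -> cl.User | None:
+    """Gate logins against USER_PROFILES; reject anyone not listed."""
+    email = raw_user_data.get("email", "")
+    profile = USER_PROFILES.get(email)  # USER_PROFILES as defined above
+    if provider_id != "google" or profile is None:
+        return None  # no anonymous or unknown accounts
+    # stash role/workspace/collection on the session user (shape assumed)
+    default_user.metadata = {**(default_user.metadata or {}), **profile}
+    return default_user
+```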
+
+### 5. Deploy
+```bash
+# Build and start
+docker compose up -d --build
+
+# Check the logs
+docker compose logs -f chainlit-app
+
+# You should see:
+# ✅ All tables created successfully.
+# Your app is available at http://localhost:8000
+```
+
+### 6. BGE-M3 Service Setup (on ai-server)
+```bash
+# Install dependencies
+pip install fastapi uvicorn FlagEmbedding torch
+
+# Save the bge_service.py file (see docs/)
+python bge_service.py
+# Listening on http://0.0.0.0:8001
+```
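+
+The canonical `bge_service.py` lives in `docs/`; for orientation, a minimal version could look like the sketch below (dense vectors only, with a response shape matching the `/embed` calls used elsewhere in this README):
+
+```python
+# bge_service.py (minimal sketch, not the canonical version from docs/)
+from fastapi import FastAPI
+from pydantic import BaseModel
+from FlagEmbedding import BGEM3FlagModel
+import uvicorn
+
+app = FastAPI()
+model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=False)  # CPU host: no fp16
+
+class EmbedRequest(BaseModel):
+    texts: list[str]
+
+@app.post("/embed")
+def embed(req: EmbedRequest):
+    # BGE-M3 can also return sparse weights; this sketch keeps dense only
+    out = model.encode(req.texts, return_dense=True, return_sparse=False)
+    return {"embeddings": [vec.tolist() for vec in out["dense_vecs"]]}
+
+if __name__ == "__main__":
+    uvicorn.run(app, host="0.0.0.0", port=8001)
+```
+
+Once it is running, smoke-test it with the `curl` command shown in the Troubleshooting section below.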
+
+## 🎯 Usage
+
+### Login
+1. Open https://ai.dffm.it in a browser (or http://localhost:8000)
+2. Click "Continue with Google"
+3. Authorize with an account configured in `USER_PROFILES`
+
+### Chat with RAG
+1. Upload PDF/DOCX files → the system indexes them automatically
+2. Ask questions → answers are grounded in context from your documents
+3. Adjust `top_k` (number of documents) via the settings
+
+### Image Analysis
+1. Upload screenshots/diagrams
+2. The system:
+   - extracts text (OCR)
+   - describes charts/tables
+   - uses the description as context for the answer
+
+### Available Settings
+- **RAG Document Count** (1-10): how many chunks to retrieve
+- **Model**: choose between local/cloud
+- **Temperature** (0-1): response creativity
+- **RAG Enabled**: toggle document retrieval on/off
+- **Custom Instruction**: personalized system prompt
+
+### Resuming a Chat
+1. Sidebar → Chat History
+2. Click a conversation → "Resume"
+3. Continue where you left off
+
+## 📊 Metrics
+
+Every response logs (to stdout):
+
+```json
+{
+  "response_time": 18.65,
+  "rag_hits": 4,
+  "model": "glm-4.6:cloud",
+  "user_role": "admin",
+  "error": null
+}
+```
+
+Collect them with:
+
+```bash
+docker logs ai-station-app | grep METRICS > metrics.log
+```
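+
+A small sketch for summarizing the collected file (assumption: every matched line ends with the JSON payload shown above):
+
+```python
+import json
+
+# Summarize metrics.log as produced by the grep command above.
+times, hits, errors = [], [], 0
+with open("metrics.log") as fh:
+    for line in fh:
+        record = json.loads(line.split("METRICS", 1)[1].strip())
+        times.append(record["response_time"])
+        hits.append(record["rag_hits"])
+        errors += record["error"] is not None
+
+if times:
+    print(f"requests:          {len(times)}")
+    print(f"avg response time: {sum(times) / len(times):.2f}s")
+    print(f"avg RAG hits:      {sum(hits) / len(hits):.1f}")
+    print(f"errors:            {errors}")
+```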
+
+## 🔧 Troubleshooting
+
+### RAG finds no documents
+- Check the collection name in `USER_PROFILES[email]["rag_collection"]`
+- Check Qdrant: `curl http://localhost:6333/collections`
+
+### HTML badge not visible
+Enable it in `.chainlit/config.toml`:
+
+```toml
+[features]
+unsafe_allow_html = true
+```
+
+### Ollama model not responding
+```bash
+# Test the connection
+curl http://192.168.1.243:11434/api/tags
+# Check that the model is available
+ollama list
+```
+
+### BGE embeddings fail
+```bash
+# Test the API
+curl -X POST http://192.168.1.243:8001/embed \
+  -H "Content-Type: application/json" \
+  -d '{"texts": ["test"]}'
+```
 
-## 📁 Project Structure
-
-```
-ai-station/
-├── app.py              # Main Chainlit application
-├── requirements.txt    # Python dependencies
-├── docker-compose.yml  # Docker services config
-├── .env                # Environment variables (gitignored)
-├── workspaces/         # User workspace directories
-│   └── admin/          # Admin user files
-└── README.md           # This file
-```
-
----
+## 📁 Project Structure
+
+```text
+ai-station/
+├── app.py                 # Main Chainlit app
+├── init_db.py             # Database schema init
+├── requirements.txt       # Python deps
+├── Dockerfile             # Container config
+├── docker-compose.yaml    # Multi-service orchestration
+├── .chainlit/
+│   └── config.toml        # UI/features config
+├── public/
+│   └── custom.css         # Custom styling
+├── workspaces/            # User file storage (volume)
+│   ├── admin_workspace/
+│   ├── engineering_workspace/
+│   └── ...
+└── .files/                # Chainlit storage (volume)
+```
 
-## 🔧 Features
-
-### ✅ Implemented
-- **PDF Upload & Processing**: Extract text from PDF documents using PyMuPDF
-- **Document Indexing**: Automatic chunking and semantic indexing via Qdrant
-- **RAG Search**: Retrieve relevant document chunks based on semantic similarity
-- **Intelligent Analysis**: GLM-4.6:Cloud analyzes documents with full context
-- **Code Extraction**: Automatically save Python code blocks from responses
-- **Chat History**: Persistent conversation storage via SQLAlchemy
-- **Streaming Responses**: Real-time token streaming via Chainlit
-
-### 🔄 Workflow
-1. User uploads PDF or TXT file
-2. System extracts text and creates semantic chunks
-3. Chunks indexed in Qdrant vector database
-4. User asks questions about documents
-5. RAG retrieves relevant chunks
-6. GLM-4.6:Cloud analyzes with full context
-7. Streaming response to user
-
----
-
-## 📊 Technical Details
-
-### Document Processing Pipeline
-
-```
-PDF Upload
-    ↓
-PyMuPDF Text Extraction
-    ↓
-Text Chunking (1500 chars, 200 char overlap)
-    ↓
-nomic-embed-text Embeddings (Ollama local)
-    ↓
-Qdrant Vector Storage
-    ↓
-Semantic Search on User Query
-    ↓
-GLM-4.6:Cloud Analysis with RAG Context
-    ↓
-Chainlit Streaming Response
-```
-
-### Key Functions
-
-| Function | Purpose |
-|----------|---------|
-| `extract_text_from_pdf()` | Convert PDF to text using PyMuPDF |
-| `chunk_text()` | Split text into overlapping chunks |
-| `get_embeddings()` | Generate embeddings via Ollama |
-| `index_document()` | Store chunks in Qdrant |
-| `search_qdrant()` | Retrieve relevant context |
-| `on_message()` | Process user queries with RAG |
-
----
-
-## 🔐 Environment Variables
-
-```env
-DATABASE_URL=postgresql+asyncpg://user:pass@postgres:5432/ai_station
-OLLAMA_URL=http://192.168.1.243:11434   # Local Ollama for embeddings
-QDRANT_URL=http://qdrant:6333           # Vector database
-```
-
-**Note**: GLM-4.6:Cloud authentication is handled automatically via `ollama signin`
-
----
-
-## 🐳 Docker Services
-
-| Service | Port | Purpose |
-|---------|------|---------|
-| `chainlit-app` | 8000 | Chainlit UI & API |
-| `postgres` | 5432 | Conversation persistence |
-| `qdrant` | 6333 | Vector database |
-| `ollama` | 11434 | Local embeddings (external) |
-
-Start/Stop:
-```bash
-docker compose up -d      # Start all services
-docker compose down       # Stop all services
-docker compose logs -f    # View logs
-docker compose restart    # Restart services
-```
-
----
+## 🔐 Security
+
+- OAuth2 mandatory (no anonymous access)
+- Workspace isolation (separate files per user)
+- HTML sanitization (configurable via `unsafe_allow_html`)
+- Environment secrets (`.env` is never committed)
+- PostgreSQL passwords changed from the defaults
 
-## 📝 Usage Examples
-
-### Example 1: Analyze Tax Document
-```
-User: "What is the total amount in the document?"
-AI Station:
-  ✅ Extracts PDF content
-  ✅ Searches relevant sections
-  ✅ Analyzes with GLM-4.6:Cloud
-  📄 Returns: "Based on the document, the total amount is..."
-```
-
-### Example 2: Multi-Document Analysis
-```
-1. Upload multiple PDFs (invoices, contracts)
-2. All documents automatically indexed
-3. Query across all documents simultaneously
-4. RAG retrieves most relevant chunks
-5. GLM-4.6:Cloud synthesizes answer
-```
-
----
-
-## 🛠️ Development
-
-### Install Dependencies
-```bash
-pip install -r requirements.txt
-```
-
-### Requirements
-```
-chainlit==1.3.2
-pydantic==2.9.2
-ollama>=0.1.0
-asyncpg>=0.29.0
-psycopg2-binary
-qdrant-client>=1.10.0
-sqlalchemy>=2.0.0
-greenlet>=3.0.0
-sniffio
-aiohttp
-alembic
-pymupdf
-python-dotenv
-```
-
-### Local Testing (without Docker)
-```bash
-# Start Ollama, PostgreSQL, Qdrant manually
-ollama serve &
-chainlit run app.py
-```
-
----
+## 🚦 Roadmap
+
+- [ ] Re-ranking with a cross-encoder (see the sketch after this list)
+- [ ] Automatic query expansion
+- [ ] Feedback loop (👍👎 on responses)
+- [ ] Conversation export to PDF/Markdown
+- [ ] Parallel multi-query RAG
+- [ ] Prometheus/Grafana monitoring
+- [ ] Adaptive chunking per document type
+- [ ] Audio input support
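+
+None of these items are implemented yet; as an illustration only, cross-encoder re-ranking could slot in after retrieval roughly like this (`BAAI/bge-reranker-v2-m3` is one candidate model, served by the same FlagEmbedding package already used for BGE-M3):
+
+```python
+from FlagEmbedding import FlagReranker
+
+# Roadmap sketch: score (query, chunk) pairs with a cross-encoder.
+reranker = FlagReranker("BAAI/bge-reranker-v2-m3", use_fp16=False)
+
+def rerank(query: str, chunks: list[str], keep: int = 4) -> list[str]:
+    """Keep only the chunks the cross-encoder scores highest for the query."""
+    scores = reranker.compute_score([[query, chunk] for chunk in chunks])
+    if not isinstance(scores, list):  # a single pair comes back as a bare float
+        scores = [scores]
+    ranked = sorted(zip(scores, chunks), key=lambda pair: pair[0], reverse=True)
+    return [chunk for _, chunk in ranked[:keep]]
+```
+
+The idea is to retrieve a generous `top_k` from Qdrant first, then keep only the best-scored chunks for the prompt: a small latency cost for higher precision.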
 
-## 🔄 Model Details
-
-### GLM-4.6:Cloud
-- **Provider**: Zhipu AI via Ollama Cloud
-- **Capabilities**: Long context, reasoning, multilingual
-- **Cost**: Free tier available
-- **Authentication**: Device key (automatic via `ollama signin`)
-
-### nomic-embed-text
-- **Local embedding model** for chunking/retrieval
-- **Dimensions**: 768
-- **Speed**: Fast, runs locally
-- **Used for**: RAG semantic search
-
----
-
-## 📈 Monitoring & Logs
-
-### Check Service Health
-```bash
-# View all logs
-docker compose logs
-
-# Follow live logs
-docker compose logs -f chainlit-app
-
-# Check specific container
-docker inspect ai-station-chainlit-app
-```
-
-### Common Issues
-| Issue | Solution |
-|-------|----------|
-| `unauthorized` error | Run `ollama signin` on server |
-| Database connection failed | Check PostgreSQL is running |
-| Qdrant unavailable | Verify `docker-compose up` completed |
-| PDF not extracted | Ensure PyMuPDF installed: `pip install pymupdf` |
-
----
-
-## 🚀 Deployment
-
-### Production Checklist
-- [ ] Set secure PostgreSQL credentials in `.env`
-- [ ] Enable SSL/TLS for Chainlit endpoints
-- [ ] Configure CORS for frontend
-- [ ] Setup log aggregation (ELK, Datadog, etc.)
-- [ ] Implement rate limiting
-- [ ] Add API authentication
-- [ ] Configure backup strategy for Qdrant
-
-### Cloud Deployment Options
-- **AWS**: ECS + RDS + VectorDB
-- **Google Cloud**: Cloud Run + Cloud SQL
-- **DigitalOcean**: App Platform + Managed Databases
-
----
-
-## 📚 API Reference
-
-### REST Endpoints (via Chainlit)
-- `POST /api/chat` - Send message with context
-- `GET /api/threads` - List conversations
-- `POST /api/upload` - Upload document
-
-### WebSocket
-- Real-time streaming responses via Chainlit protocol
-
----
-
-## 🔮 Future Features
-
-- [ ] OAuth2 Google authentication
-- [ ] Document metadata extraction (dates, amounts, entities)
-- [ ] Advanced search filters (type, date range, language)
-- [ ] Export results (PDF, CSV, JSON)
-- [ ] Analytics dashboard
-- [ ] Multi-language support
-- [ ] Document versioning
-- [ ] Compliance reporting (GDPR, audit trails)
-
----
-
-## 📞 Support
-
-### Troubleshooting
-1. Check logs: `docker compose logs chainlit-app`
-2. Verify Ollama authentication: `ollama show glm-4.6:cloud`
-3. Test Qdrant connection: `curl http://localhost:6333/health`
-4. Inspect PostgreSQL: `docker compose exec postgres psql -U ai_user -d ai_station`
-
-### Performance Tips
-- Increase chunk overlap for better context retrieval
-- Adjust embedding model based on latency requirements
-- Monitor Qdrant memory usage for large document sets
-- Implement caching for frequent queries
-
----
-
-## 📄 License
-
-MIT License - See LICENSE file
-
-## 👤 Author
-
-AI Station Team
-
----
-
-**Last Updated**: December 26, 2025
-**Version**: 1.0.0
-**Status**: Production Ready ✅
+## 📝 License
+
+MIT License - see the [LICENSE](LICENSE) file for details.
+
+Create the `LICENSE` file in the project root:
+
+```text
+MIT License
+
+Copyright (c) 2026 DFFM / Giuseppe De Franceschi
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+```
+
+## 👥 Contributors
+
+- Giuseppe De Franceschi - @defranceschi
+
+## 🙏 Credits
+
+- Chainlit - UI framework
+- Ollama - LLM runtime
+- Qdrant - Vector DB
+- BGE-M3 - Embeddings
+- Docling - Document processing
+
+---
+
+**Status**: 🔨 Pre-Production | **Last Update**: 2026-01-01
\ No newline at end of file