# AI Station DFFM - Vision-RAG Hybrid System
## 📋 Overview
**AI Station** is a multi-user AI document-analysis platform built on Chainlit, Ollama, and BGE-M3. It combines **Retrieval-Augmented Generation (RAG)**, image analysis, and advanced document management to analyze PDFs and text documents with the **GLM-4.6:Cloud** model.
## 🌟 Features
### Core AI
- **RAG Hybrid Search** with BGE-M3 (dense + sparse embeddings)
- **Vision Analysis** via MiniCPM-V for OCR and image description
- **Document Processing** with Docling (PDF, DOCX), preserving tables and formulas
- **Multi-Model Support** (local Ollama + cloud models)
- **Streaming Responses** with reduced latency

### Multi-User
- **Google OAuth2** with per-role custom profiles
- **Isolated workspaces** per user/team
- **Dedicated RAG collections** for separate knowledge bases
- **Granular permissions** (admin, engineering, business, architecture)

### UI/UX
- **Custom role badge** with dedicated colors
- **Dynamic settings** (temperature, RAG top_k, model, custom instructions)
- **Persistent chat history** with conversation resume
- **Auto-save of Python code** extracted from responses
- **Real-time metrics** (response time, RAG hits, errors)

### Performance
- **Embedding caching** (LRU cache, 1000 queries)
- **Smart chunking** (2000 chars with 200-char overlap; sketched below)
- **Async operations** against Qdrant and Ollama
- **PostgreSQL** for thread and metadata persistence
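The cache and chunker can be pictured with a minimal sketch. This is illustrative only: the helper name `embed_query`, the BGE endpoint's response shape, and the exact chunk logic in `app.py` are assumptions.

```python
# Minimal sketch of the performance layer above (assumed names and shapes).
from functools import lru_cache

import requests

BGE_API_URL = "http://192.168.1.243:8001/embed"

@lru_cache(maxsize=1000)  # LRU cache: repeated queries skip the embedding call
def embed_query(query: str) -> tuple[float, ...]:
    resp = requests.post(BGE_API_URL, json={"texts": [query]})
    resp.raise_for_status()
    # the "dense" key is an assumption about the BGE service's response shape
    return tuple(resp.json()["dense"][0])  # tuple: hashable, safe to cache

def chunk_text(text: str, size: int = 2000, overlap: int = 200) -> list[str]:
    """Overlapping chunks so answers spanning a boundary keep their context."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]
```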
## 🏗️ Architecture
### Hardware Setup
#### AI-SRV (Chainlit VM)
- **IP**: 192.168.1.244
- **CPU**: 16 core (QEMU Virtual)
- **RAM**: 64 GB
- **Storage**: 195 GB
- **Role**: Hosts the Chainlit app + PostgreSQL + Qdrant
#### AI-Server (GPU Workstation)
- **IP**: 192.168.1.243
- **CPU**: Intel Core Ultra 7 265 (20 core, max 6.5 GHz)
- **RAM**: 32 GB
- **GPU**: NVIDIA RTX A1000 (8 GB VRAM)
- **Storage**: 936 GB NVMe
- **Role**: Ollama models + BGE-M3 embeddings service
### Technology Stack
- **Backend**: Python + Chainlit (LLM UI framework)
- **LLM**: GLM-4.6:Cloud (via Ollama Cloud)
- **Vector DB**: Qdrant (semantic search)
- **PDF Processing**: PyMuPDF (fitz)
- **Database**: PostgreSQL + SQLAlchemy ORM
- **Containerization**: Docker Compose
- **Embeddings**: nomic-embed-text (via Ollama local)
---
```text
┌─────────────────────────────────────────┐
│         Chainlit UI (ai-srv)            │
│    Badge + Settings + Chat History      │
└──────────────┬──────────────────────────┘
┌──────────────▼──────────────────────────┐
│       Python Backend (app.py)           │
│  - OAuth2 Google                        │
│  - Multi-user profiles                  │
│  - File processing orchestration        │
└─┬────────┬──────────┬──────────┬────────┘
  │        │          │          │
  ▼        ▼          ▼          ▼
┌─────┐ ┌──────┐ ┌────────┐ ┌──────────┐
│ PG  │ │Qdrant│ │ Ollama │ │ BGE API  │
│     │ │Vector│ │  GPU   │ │   CPU    │
│     │ │  DB  │ │ Server │ │  Server  │
└─────┘ └──────┘ └────────┘ └──────────┘
ai-srv   ai-srv  ai-server   ai-server
```
## 📋 Requirements

### System
- Docker 24.x+ with Docker Compose
- Access to the Google Cloud Console (for OAuth2)
- An Ollama Cloud account (for glm-4.6:cloud)
- 2 servers (or VMs) with shared networking

### Ollama Models (install on ai-server)
```bash
ollama pull minicpm-v          # Vision model (5.5 GB)
ollama pull glm-4.6:cloud      # Cloud reasoning
ollama pull qwen2.5-coder:32b  # Code generation (9 GB)
ollama pull llama3.2           # Fast general purpose (4.7 GB)
```

## 🚀 Installation

### 1. Clone Repository
```bash
git clone <your-repo>
cd ai-station
```

### 2. Configure the Environment
Create `.env`:
```bash
# Database
DATABASE_URL=postgresql+asyncpg://ai_user:CHANGE_ME@postgres:5432/ai_station

# AI Services
OLLAMA_URL=http://192.168.1.243:11434
QDRANT_URL=http://qdrant:6333
BGE_API_URL=http://192.168.1.243:8001/embed

# OAuth Google
OAUTH_GOOGLE_CLIENT_ID=your-client-id.apps.googleusercontent.com
OAUTH_GOOGLE_CLIENT_SECRET=your-secret
# Generate with: openssl rand -base64 32
CHAINLIT_AUTH_SECRET=CHANGE_ME
```

### 3. Authenticate Ollama Cloud
```bash
ollama signin
# Follow the link to authenticate with your Ollama account
```

### 4. Configure Google OAuth
1. Go to the Google Cloud Console
2. Create a new project → APIs & Services → Credentials
3. Create an "OAuth 2.0 Client ID"
4. Add the authorized URIs:
   - `https://ai.dffm.it/auth/oauth/google/callback`
   - `http://localhost:8000/auth/oauth/google/callback` (dev)
5. Copy the Client ID and Secret into `.env`

### 5. Customize Users
Edit `app.py` → `USER_PROFILES`:
```python
USER_PROFILES = {
    "your.email@example.com": {
        "role": "admin",
        "name": "Name",
        "workspace": "workspace_name",
        "rag_collection": "docs_collection",
        "capabilities": ["debug", "all"],
        "show_code": True,
    },
    # ... other users
}
```

### 6. Deploy
```bash
# Build and start
docker compose up -d --build

# Check the logs
docker compose logs -f chainlit-app

# You should see:
# ✅ Tutte le tabelle create con successo.  (all tables created successfully)
# Your app is available at http://localhost:8000
```

### 7. Set Up the BGE-M3 Service (on ai-server)
```bash
# Install dependencies
pip install fastapi uvicorn FlagEmbedding torch

# Save the bge_service.py file (see docs/); a sketch follows below
python bge_service.py
# Listening on http://0.0.0.0:8001
```
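Since `bge_service.py` lives in `docs/` and is not reproduced here, the following is only a sketch of what such a service could look like. The `{"texts": [...]}` request body matches the curl test in the Troubleshooting section; the `dense`/`sparse` response keys are this sketch's own assumption.

```python
# bge_service.py - illustrative sketch; the real file lives in docs/.
import uvicorn
from fastapi import FastAPI
from pydantic import BaseModel
from FlagEmbedding import BGEM3FlagModel

app = FastAPI()
# The BGE API runs CPU-side per the architecture diagram, so fp32 here.
model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=False)

class EmbedRequest(BaseModel):
    texts: list[str]

@app.post("/embed")
def embed(req: EmbedRequest):
    out = model.encode(req.texts, return_dense=True, return_sparse=True)
    return {
        "dense": [vec.tolist() for vec in out["dense_vecs"]],
        # lexical_weights: per-text {token_id: weight} maps for sparse search
        "sparse": [dict(w) for w in out["lexical_weights"]],
    }

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8001)
```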
## 🎯 Usage

### Login
1. Open the app in a browser: https://ai.dffm.it (or http://localhost:8000)
2. Click "Continue with Google"
3. Authorize with an account configured in `USER_PROFILES`

### Chat with RAG
1. Upload PDF/DOCX files → the system indexes them automatically
2. Ask questions → answers come with context from your documents
3. Adjust `top_k` (number of documents) via settings

### Image Analysis
Upload screenshots or diagrams. The system:
1. Extracts text (OCR)
2. Describes charts/tables
3. Uses the description as context for its answer

### Available Settings
- **RAG Document Count** (1-10): how many chunks to retrieve
- **Model**: choose between local/cloud
- **Temperature** (0-1): response creativity
- **RAG Enabled**: toggle document retrieval on/off
- **Custom Instruction**: personalized system prompt

### Resume a Chat
1. Sidebar → Chat History
2. Click a conversation → "Resume"
3. Continue from where you left off
## 📊 Metrics
Every response logs to stdout:
```json
{
  "response_time": 18.65,
  "rag_hits": 4,
  "model": "glm-4.6:cloud",
  "user_role": "admin",
  "error": null
}
```
Collect them with:
```bash
docker logs ai-station-app | grep METRICS > metrics.log
```
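A short sketch of crunching the collected file (illustrative; it assumes every grepped line ends with the JSON payload shown above):

```python
# Hedged sketch: aggregate metrics.log produced by the grep above.
import json

times: list[float] = []
rag_hits = 0
with open("metrics.log") as fh:
    for line in fh:
        payload = json.loads(line[line.index("{"):])  # drop any log prefix
        times.append(payload["response_time"])
        rag_hits += payload["rag_hits"]
if times:
    print(f"{len(times)} responses, avg {sum(times)/len(times):.2f}s, "
          f"{rag_hits} RAG hits")
```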
## 🔧 Troubleshooting

### RAG finds no documents
- Verify the collection name in `USER_PROFILES[email]["rag_collection"]`
- Check Qdrant: `curl http://localhost:6333/collections`

### HTML badge not showing
Enable it in `.chainlit/config.toml`:
```toml
[features]
unsafe_allow_html = true
```

### Ollama model not responding
```bash
# Test the connection
curl http://192.168.1.243:11434/api/tags

# Check the model is available
ollama list
```

### BGE embeddings fail
```bash
# Test the API
curl -X POST http://192.168.1.243:8001/embed \
  -H "Content-Type: application/json" \
  -d '{"texts": ["test"]}'
```
---
## 📁 Project Structure
```text
ai-station/
├── app.py                 # Main Chainlit app
├── init_db.py             # Database schema init
├── requirements.txt       # Python deps
├── Dockerfile             # Container config
├── docker-compose.yaml    # Multi-service orchestration
├── .env                   # Environment variables (gitignored)
├── .chainlit/
│   └── config.toml        # UI/features config
├── public/
│   └── custom.css         # Custom styling
├── workspaces/            # User file storage (volume)
│   ├── admin_workspace/
│   ├── engineering_workspace/
│   └── ...
├── .files/                # Chainlit storage (volume)
└── README.md              # This file
```
---
## ✅ Implemented Features
- **PDF Upload & Processing**: Extract text from PDF documents using PyMuPDF
- **Document Indexing**: Automatic chunking and semantic indexing via Qdrant
- **RAG Search**: Retrieve relevant document chunks based on semantic similarity
- **Intelligent Analysis**: GLM-4.6:Cloud analyzes documents with full context
- **Code Extraction**: Automatically save Python code blocks from responses
- **Chat History**: Persistent conversation storage via SQLAlchemy
- **Streaming Responses**: Real-time token streaming via Chainlit
### 🔄 Workflow
1. User uploads a PDF or TXT file
2. The system extracts text and creates semantic chunks
3. Chunks are indexed in the Qdrant vector database
4. The user asks questions about the documents
5. RAG retrieves the relevant chunks
6. GLM-4.6:Cloud analyzes them with full context
7. The response streams back to the user (see the sketch below)
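Steps 5-7 boil down to a retrieval call plus a streaming completion. A minimal sketch with the `ollama` Python client (already in `requirements.txt`); the prompt wiring in `app.py` may differ:

```python
# Hedged sketch of workflow steps 5-7: RAG context + streaming completion.
import ollama

client = ollama.Client(host="http://192.168.1.243:11434")

def answer(question: str, context_chunks: list[str]):
    """Yield response tokens for a question, grounded in retrieved chunks."""
    prompt = (
        "Context:\n" + "\n---\n".join(context_chunks)
        + f"\n\nQuestion: {question}"
    )
    stream = client.chat(
        model="glm-4.6:cloud",
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # yields partial chunks as tokens arrive
    )
    for chunk in stream:
        yield chunk["message"]["content"]
```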
---
## 📊 Technical Details
### Document Processing Pipeline
```
PDF Upload
    ↓
PyMuPDF Text Extraction
    ↓
Text Chunking (1500 chars, 200 char overlap)
    ↓
nomic-embed-text Embeddings (Ollama local)
    ↓
Qdrant Vector Storage
    ↓
Semantic Search on User Query
    ↓
GLM-4.6:Cloud Analysis with RAG Context
    ↓
Chainlit Streaming Response
```
### Key Functions
| Function | Purpose |
|----------|---------|
| `extract_text_from_pdf()` | Convert PDF to text using PyMuPDF |
| `chunk_text()` | Split text into overlapping chunks |
| `get_embeddings()` | Generate embeddings via Ollama |
| `index_document()` | Store chunks in Qdrant |
| `search_qdrant()` | Retrieve relevant context |
| `on_message()` | Process user queries with RAG |
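How these pieces could fit together, as a hedged standalone sketch (signatures are assumptions; the authoritative versions live in `app.py`):

```python
# Hedged sketch of the key functions above (assumed signatures).
import requests
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

OLLAMA_URL = "http://192.168.1.243:11434"
qdrant = QdrantClient(url="http://qdrant:6333")

def chunk_text(text: str, size: int = 1500, overlap: int = 200) -> list[str]:
    """Split text into overlapping chunks (1500/200, per the pipeline above)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def get_embeddings(text: str) -> list[float]:
    """Embed text with the local nomic-embed-text model via Ollama."""
    r = requests.post(f"{OLLAMA_URL}/api/embeddings",
                      json={"model": "nomic-embed-text", "prompt": text})
    r.raise_for_status()
    return r.json()["embedding"]

def index_document(doc_id: str, text: str, collection: str) -> int:
    """Chunk, embed, and store a document in Qdrant."""
    if not qdrant.collection_exists(collection):
        qdrant.create_collection(
            collection_name=collection,
            vectors_config=VectorParams(size=768, distance=Distance.COSINE),
        )
    points = [
        # note: production code needs ids unique across documents
        PointStruct(id=i, vector=get_embeddings(c),
                    payload={"text": c, "doc": doc_id})
        for i, c in enumerate(chunk_text(text))
    ]
    qdrant.upsert(collection_name=collection, points=points)
    return len(points)

def search_qdrant(query: str, collection: str, top_k: int = 4) -> list[str]:
    """Return the text of the most relevant chunks for a query."""
    hits = qdrant.search(collection_name=collection,
                         query_vector=get_embeddings(query), limit=top_k)
    return [h.payload["text"] for h in hits]
```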
---
## 🔐 Environment Variables
```env
DATABASE_URL=postgresql+asyncpg://user:pass@postgres:5432/ai_station
OLLAMA_URL=http://192.168.1.243:11434 # Local Ollama for embeddings
QDRANT_URL=http://qdrant:6333 # Vector database
```
**Note**: GLM-4.6:Cloud authentication is handled automatically via `ollama signin`
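A small sketch of loading these variables at startup (python-dotenv is already in `requirements.txt`; the default values here are the ones from this README):

```python
# Hedged sketch: read .env at startup.
import os
from dotenv import load_dotenv

load_dotenv()  # loads .env from the working directory
DATABASE_URL = os.environ["DATABASE_URL"]  # required: fail fast if missing
OLLAMA_URL = os.getenv("OLLAMA_URL", "http://192.168.1.243:11434")
QDRANT_URL = os.getenv("QDRANT_URL", "http://qdrant:6333")
```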
---
## 🐳 Docker Services
| Service | Port | Purpose |
|---------|------|---------|
| `chainlit-app` | 8000 | Chainlit UI & API |
| `postgres` | 5432 | Conversation persistence |
| `qdrant` | 6333 | Vector database |
| `ollama` | 11434 | Local embeddings (external) |
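For orientation, a hedged sketch of what `docker-compose.yaml` could look like for the three in-repo services. The real file ships with the repo; image tags and volume names here are assumptions, and Ollama runs externally on ai-server:

```yaml
# Hedged sketch of docker-compose.yaml (illustrative; see the real file).
services:
  chainlit-app:
    build: .
    ports:
      - "8000:8000"
    env_file: .env
    depends_on:
      - postgres
      - qdrant
    volumes:
      - ./workspaces:/app/workspaces
      - ./.files:/app/.files
  postgres:
    image: postgres:16          # assumed tag
    environment:
      POSTGRES_USER: ai_user
      POSTGRES_PASSWORD: CHANGE_ME
      POSTGRES_DB: ai_station
    volumes:
      - pg_data:/var/lib/postgresql/data
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"
    volumes:
      - qdrant_data:/qdrant/storage
volumes:
  pg_data:
  qdrant_data:
```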
Start/Stop:
```bash
docker compose up -d     # Start all services
docker compose down      # Stop all services
docker compose logs -f   # View logs
docker compose restart   # Restart services
```
---
## 🔐 Security
- OAuth2 required (no anonymous access)
- Workspace isolation (separate files per user)
- HTML sanitization (configurable via `unsafe_allow_html`)
- Environment secrets (`.env` never committed)
- PostgreSQL passwords changed from the defaults
---
## 📝 Usage Examples
### Example 1: Analyze Tax Document
```
User: "Qual è l'importo totale del documento?"
AI Station:
✅ Extracts PDF content
✅ Searches relevant sections
✅ Analyzes with GLM-4.6:Cloud
📄 Returns: "Based on the document, the total amount is..."
```
### Example 2: Multi-Document Analysis
```
1. Upload multiple PDFs (invoices, contracts)
2. All documents automatically indexed
3. Query across all documents simultaneously
4. RAG retrieves most relevant chunks
5. GLM-4.6:Cloud synthesizes answer
```
---
## 🛠️ Development

### Install Dependencies
```bash
pip install -r requirements.txt
```

### Requirements
```text
chainlit==1.3.2
pydantic==2.9.2
ollama>=0.1.0
asyncpg>=0.29.0
psycopg2-binary
qdrant-client>=1.10.0
sqlalchemy>=2.0.0
greenlet>=3.0.0
sniffio
aiohttp
alembic
pymupdf
python-dotenv
```

### Local Testing (without Docker)
```bash
# Start Ollama, PostgreSQL, Qdrant manually
ollama serve &
chainlit run app.py
```
---
## 🚦 Roadmap
- [ ] Re-ranking with a cross-encoder
- [ ] Automatic query expansion
- [ ] Feedback loop (👍/👎 on responses)
- [ ] Conversation export to PDF/Markdown
- [ ] Parallel multi-query RAG
- [ ] Prometheus/Grafana monitoring
- [ ] Adaptive chunking per document type
- [ ] Audio input support
---
## 📝 License
MIT License - see the [LICENSE](LICENSE) file for details.

Create a `LICENSE` file in the project root:
```text
MIT License

Copyright (c) 2026 DFFM / Giuseppe De Franceschi

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
```
## 🔄 Model Details

### GLM-4.6:Cloud
- **Provider**: Zhipu AI via Ollama Cloud
- **Capabilities**: Long context, reasoning, multilingual
- **Cost**: Free tier available
- **Authentication**: Device key (automatic via `ollama signin`)

### nomic-embed-text
- **Local embedding model** for chunking/retrieval
- **Dimensions**: 768
- **Speed**: Fast, runs locally
- **Used for**: RAG semantic search
---
## 👥 Contributors
- Giuseppe De Franceschi - @defranceschi

## 🙏 Credits
- Chainlit - UI framework
- Ollama - LLM runtime
- Qdrant - Vector DB
- BGE-M3 - Embeddings
- Docling - Document processing
---
## 📈 Monitoring & Logs
### Check Service Health
```bash
# View all logs
docker compose logs
# Follow live logs
docker compose logs -f chainlit-app
# Check specific container
docker inspect ai-station-chainlit-app
```
### Common Issues
| Issue | Solution |
|-------|----------|
| `unauthorized` error | Run `ollama signin` on server |
| Database connection failed | Check PostgreSQL is running |
| Qdrant unavailable | Verify `docker-compose up` completed |
| PDF not extracted | Ensure PyMuPDF installed: `pip install pymupdf` |
---
## 🚀 Deployment
### Production Checklist
- [ ] Set secure PostgreSQL credentials in `.env`
- [ ] Enable SSL/TLS for Chainlit endpoints
- [ ] Configure CORS for frontend
- [ ] Setup log aggregation (ELK, Datadog, etc.)
- [ ] Implement rate limiting
- [ ] Add API authentication
- [ ] Configure backup strategy for Qdrant
### Cloud Deployment Options
- **AWS**: ECS + RDS + VectorDB
- **Google Cloud**: Cloud Run + Cloud SQL
- **DigitalOcean**: App Platform + Managed Databases
---
## 📚 API Reference
### REST Endpoints (via Chainlit)
- `POST /api/chat` - Send message with context
- `GET /api/threads` - List conversations
- `POST /api/upload` - Upload document
### WebSocket
- Real-time streaming responses via Chainlit protocol
---
## 🔮 Future Features
- [ ] Document metadata extraction (dates, amounts, entities)
- [ ] Advanced search filters (type, date range, language)
- [ ] Export results (PDF, CSV, JSON)
- [ ] Analytics dashboard
- [ ] Multi-language support
- [ ] Document versioning
- [ ] Compliance reporting (GDPR, audit trails)
---
## 📞 Support
### Troubleshooting
1. Check logs: `docker compose logs chainlit-app`
2. Verify Ollama authentication: `ollama show glm-4.6:cloud`
3. Test Qdrant connection: `curl http://localhost:6333/healthz`
4. Inspect PostgreSQL: `docker compose exec postgres psql -U ai_user -d ai_station`
### Performance Tips
- Increase chunk overlap for better context retrieval
- Adjust embedding model based on latency requirements
- Monitor Qdrant memory usage for large document sets
- Implement caching for frequent queries
---
## 👤 Author
AI Station Team

---
**Version**: 1.0.0 | **Status**: 🔨 Pre-Production | **Last Update**: 2026-01-01