# AI Station DFFM - Vision-RAG Hybrid System
Multi-user AI system with RAG (Retrieval-Augmented Generation), image analysis, and advanced document management, built on Chainlit, Ollama, and BGE-M3.
## 🌟 Features
### Core AI
- **Hybrid RAG Search** with BGE-M3 (dense + sparse embeddings)
- **Vision Analysis** via MiniCPM-V for OCR and image description
- **Document Processing** with Docling (PDF, DOCX), preserving tables and formulas
- **Multi-Model Support** (local Ollama + cloud models)
- **Streaming Responses** with reduced latency
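Hybrid search fuses a dense (semantic) score with a sparse (lexical-weight) score. A minimal sketch of the fusion step, under the assumption of a simple weighted sum — function names and the `alpha` weight are illustrative, not the app's actual code:

```python
import math

def dense_score(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def sparse_score(q: dict[str, float], d: dict[str, float]) -> float:
    """Dot product over shared tokens of BGE-M3-style lexical weights."""
    return sum(w * d[t] for t, w in q.items() if t in d)

def hybrid_score(q_dense, d_dense, q_sparse, d_sparse, alpha: float = 0.7) -> float:
    """Weighted fusion: alpha on the dense part, (1 - alpha) on the sparse part."""
    return alpha * dense_score(q_dense, d_dense) + (1 - alpha) * sparse_score(q_sparse, d_sparse)
```

Candidates can then be ranked by `hybrid_score` before being passed to the model as context.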
### Multi-User
- **Google OAuth2** with per-role custom profiles
- **Isolated workspaces** per user/team
- **Dedicated RAG collections** for separate knowledge bases
- **Granular permissions** (admin, engineering, business, architecture)
### UI/UX
- **Custom role badge** with dedicated colors
- **Dynamic settings** (temperature, RAG top_k, model, custom instructions)
- **Persistent chat history** with conversation resume
- **Auto-save of Python code** extracted from responses
- **Real-time metrics** (response time, RAG hits, errors)
### Performance
- **Embedding caching** (LRU cache, 1000 queries)
- **Smart chunking** (2000 chars with 200-char overlap)
- **Async operations** against Qdrant and Ollama
- **PostgreSQL** for thread and metadata persistence
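The chunking and caching above can be sketched as follows (2000-character windows with a 200-character overlap, and an `lru_cache` sized to 1000 entries mirroring the embedding cache; function names are illustrative, and the cached function is a placeholder for the real embedding call):

```python
from functools import lru_cache

def chunk_text(text: str, size: int = 2000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size windows, each overlapping the previous by `overlap` chars."""
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

@lru_cache(maxsize=1000)  # mirrors the 1000-query embedding cache
def cached_embedding_key(query: str) -> str:
    """Placeholder for the real embedding call; caching is keyed on the query string."""
    return f"embedding:{query}"
```

The overlap ensures a sentence cut at a chunk boundary still appears whole in the neighboring chunk, at the cost of ~10% extra storage.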
## 🏗️ Architecture
### Hardware Setup
#### AI-SRV (Chainlit VM)
- **IP**: 192.168.1.244
- **CPU**: 16 core (QEMU Virtual)
- **RAM**: 64 GB
- **Storage**: 195 GB
- **Role**: Hosts the Chainlit app + PostgreSQL + Qdrant
#### AI-Server (GPU Workstation)
- **IP**: 192.168.1.243
- **CPU**: Intel Core Ultra 7 265 (20 core, max 6.5 GHz)
- **RAM**: 32 GB
- **GPU**: NVIDIA RTX A1000 (8 GB VRAM)
- **Storage**: 936 GB NVMe
- **Role**: Ollama models + BGE-M3 embeddings service

### Technology Stack
```text
┌─────────────────────────────────────────┐
│         Chainlit UI (ai-srv)            │
│    Badge + Settings + Chat History      │
└──────────────┬──────────────────────────┘
┌──────────────▼──────────────────────────┐
│        Python Backend (app.py)          │
│  - OAuth2 Google                        │
│  - Multi-user profiles                  │
│  - File processing orchestration        │
└─┬────────┬──────────┬──────────┬────────┘
  │        │          │          │
  ▼        ▼          ▼          ▼
┌─────┐ ┌──────┐ ┌────────┐ ┌──────────┐
│ PG  │ │Qdrant│ │ Ollama │ │ BGE API  │
│     │ │Vector│ │  GPU   │ │   CPU    │
│     │ │  DB  │ │ Server │ │  Server  │
└─────┘ └──────┘ └────────┘ └──────────┘
 ai-srv  ai-srv  ai-server   ai-server
```
## 📋 Requirements
### System
- Docker 24.x+ with Docker Compose
- Access to the Google Cloud Console (for OAuth2)
- 2 servers (or VMs) with shared networking
### Ollama Models (to install on ai-server)
```bash
ollama pull minicpm-v          # Vision model (5.5 GB)
ollama pull glm-4.6:cloud      # Cloud reasoning
ollama pull qwen2.5-coder:32b  # Code generation (9 GB)
ollama pull llama3.2           # Fast general purpose (4.7 GB)
```

## 🚀 Installation
### 1. Clone the Repository

```bash
git clone <your-repo>
cd ai-station
```
### 2. Environment Configuration

Create `.env`:

```bash
# Database
DATABASE_URL=postgresql+asyncpg://ai_user:CHANGE_ME@postgres:5432/ai_station

# AI Services
OLLAMA_URL=http://192.168.1.243:11434
QDRANT_URL=http://qdrant:6333
BGE_API_URL=http://192.168.1.243:8001/embed

# OAuth Google
OAUTH_GOOGLE_CLIENT_ID=your-client-id.apps.googleusercontent.com
OAUTH_GOOGLE_CLIENT_SECRET=your-secret
# Generate this value with: openssl rand -base64 32
CHAINLIT_AUTH_SECRET=$(openssl rand -base64 32)
```
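Before starting the stack, it can help to verify that every required variable is actually set. A small sanity-check sketch (the variable names are taken from the `.env` example above; the helper name is illustrative):

```python
import os

# All variables the stack needs at startup (from the .env example above).
REQUIRED = [
    "DATABASE_URL", "OLLAMA_URL", "QDRANT_URL", "BGE_API_URL",
    "OAUTH_GOOGLE_CLIENT_ID", "OAUTH_GOOGLE_CLIENT_SECRET",
    "CHAINLIT_AUTH_SECRET",
]

def missing_vars(env=os.environ) -> list:
    """Return the names of required variables that are unset or empty."""
    return [k for k in REQUIRED if not env.get(k)]
```

Calling `missing_vars()` at app startup and failing fast with the returned list is cheaper than debugging a half-connected stack later.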
### 3. Google OAuth Configuration

1. Go to the Google Cloud Console
2. Create a new project → APIs & Services → Credentials
3. Create an "OAuth 2.0 client ID"
4. Add the authorized redirect URIs:
   - `https://ai.dffm.it/auth/oauth/google/callback`
   - `http://localhost:8000/auth/oauth/google/callback` (dev)
5. Copy the Client ID and Secret into `.env`
### 4. Customize Users

Edit `app.py` → `USER_PROFILES`:

```python
USER_PROFILES = {
    "your.email@example.com": {
        "role": "admin",
        "name": "Name",
        "workspace": "workspace_name",
        "rag_collection": "docs_collection",
        "capabilities": ["debug", "all"],
        "show_code": True,
    },
    # ... other users
}
```
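At login, the app can resolve the profile for the OAuth e-mail and gate features on `capabilities`. A hedged sketch of that lookup — the helper names and the `"all"` wildcard convention are assumptions for illustration, not necessarily how `app.py` implements it:

```python
# Illustrative profiles in the same shape as USER_PROFILES above.
USER_PROFILES = {
    "admin@example.com": {"role": "admin", "capabilities": ["debug", "all"]},
    "dev@example.com": {"role": "engineering", "capabilities": ["debug"]},
}

def get_profile(email: str):
    """Return the profile for a configured user, or None to deny access."""
    return USER_PROFILES.get(email.lower())

def can(profile: dict, capability: str) -> bool:
    """Check a capability; the 'all' entry grants everything."""
    caps = profile.get("capabilities", [])
    return "all" in caps or capability in caps
```

Returning `None` for unknown e-mails makes the whitelist explicit: anyone not in `USER_PROFILES` never gets past the OAuth callback.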
### 5. Deploy

```bash
# Build and start
docker compose up -d --build

# Check the logs
docker compose logs -f chainlit-app

# You should see:
# ✅ Tutte le tabelle create con successo.
# Your app is available at http://localhost:8000
```
### 6. BGE-M3 Service Setup (on ai-server)

```bash
# Install dependencies
pip install fastapi uvicorn FlagEmbedding torch

# Save the bge_service.py file (see docs/)
python bge_service.py
# Listening on http://0.0.0.0:8001
```
## 🎯 Usage

### Login

1. Open in a browser: https://ai.dffm.it (or http://localhost:8000)
2. Click "Continue with Google"
3. Sign in with an account configured in `USER_PROFILES`
### Chat with RAG

1. Upload PDF/DOCX files → the system indexes them automatically
2. Ask questions → answers include context from your documents
3. Adjust `top_k` (number of documents) via settings
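Once the top chunks come back from the vector store, they are stitched into the model prompt. Roughly, it can look like this (a sketch; the template wording and function name are illustrative, not the app's actual code):

```python
def build_rag_prompt(question: str, chunks: list[str], top_k: int = 4) -> str:
    """Number the retrieved chunks and prepend them as context for the model."""
    context = "\n\n".join(f"[{i}] {c}" for i, c in enumerate(chunks[:top_k], start=1))
    return (
        "Answer using only the context below. "
        "Cite chunk numbers where relevant.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

Numbering the chunks lets the model cite `[1]`, `[2]`, ... so answers can be traced back to source documents.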
### Image Analysis

Upload screenshots or diagrams. The system:

1. Extracts text (OCR)
2. Describes charts and tables
3. Uses the description as context to answer
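Under the hood, the vision step sends the image to Ollama's generate endpoint as base64. A sketch of the request payload (`minicpm-v` as in the model list above; the helper function is illustrative):

```python
import base64
import json

def build_vision_request(image_bytes: bytes, prompt: str, model: str = "minicpm-v") -> str:
    """JSON body for POST {OLLAMA_URL}/api/generate with one attached image."""
    return json.dumps({
        "model": model,
        "prompt": prompt,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    })
```

The resulting `response` field from Ollama (the OCR text / description) is what gets injected as context for the follow-up answer.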
### Available Settings

- **RAG Document Count** (1-10): how many chunks to retrieve
- **Model**: choose local or cloud
- **Temperature** (0-1): response creativity
- **RAG Enabled**: toggles document retrieval on/off
- **Custom Instruction**: personalized system prompt
### Resuming a Chat

1. Sidebar → Chat History
2. Click a conversation → "Resume"
3. Continue from where you left off
## 📊 Metrics

Each response logs (to stdout):

```json
{
  "response_time": 18.65,
  "rag_hits": 4,
  "model": "glm-4.6:cloud",
  "user_role": "admin",
  "error": null
}
```

Collect them with:

```bash
docker logs ai-station-app | grep METRICS > metrics.log
```
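The collected log can then be summarized, e.g. mean latency, total RAG hits, and error count. A sketch (assumes one JSON object per line, as in the example above; strip any `METRICS` prefix from each line first if the logger adds one):

```python
import json

def summarize(lines: list[str]) -> dict:
    """Aggregate per-response metric records into a small report."""
    records = [json.loads(line) for line in lines if line.strip()]
    n = len(records)
    return {
        "responses": n,
        "avg_response_time": sum(r["response_time"] for r in records) / n if n else 0.0,
        "rag_hits": sum(r["rag_hits"] for r in records),
        "errors": sum(1 for r in records if r.get("error")),
    }
```

For anything beyond a quick check, the Prometheus/Grafana item on the roadmap is the better long-term home for these numbers.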
## 🔧 Troubleshooting

### RAG finds no documents

- Check the collection name in `USER_PROFILES[email]["rag_collection"]`
- Check Qdrant: `curl http://localhost:6333/collections`

### HTML badge not showing

Enable it in `.chainlit/config.toml`:

```toml
[features]
unsafe_allow_html = true
```
### Ollama model not responding

```bash
# Test connectivity
curl http://192.168.1.243:11434/api/tags

# Check the model is available (run on ai-server)
ollama list
```
### BGE embeddings fail

```bash
# Test the API
curl -X POST http://192.168.1.243:8001/embed \
  -H "Content-Type: application/json" \
  -d '{"texts": ["test"]}'
```
## 📁 Project Structure

```text
ai-station/
├── app.py # Main Chainlit app
├── init_db.py # Database schema init
├── requirements.txt # Python deps
├── Dockerfile # Container config
├── docker-compose.yaml # Multi-service orchestration
├── .chainlit/
│ └── config.toml # UI/features config
├── public/
│ └── custom.css # Custom styling
├── workspaces/ # User file storage (volume)
│ ├── admin_workspace/
│ ├── engineering_workspace/
│ └── ...
└── .files/ # Chainlit storage (volume)
```
## 🔐 Security

- OAuth2 required (no anonymous access)
- Workspace isolation (separate files per user)
- HTML sanitization (configurable via `unsafe_allow_html`)
- Environment secrets (`.env` is never committed)
- PostgreSQL passwords changed from the defaults
## 🚦 Roadmap

- Re-ranking with a cross-encoder
- Automatic query expansion
- Feedback loop (👍👎 on responses)
- Export conversations to PDF/Markdown
- Parallel multi-query RAG
- Prometheus/Grafana monitoring
- Adaptive chunking per document type
- Audio input support
## 📝 License

MIT License - see the [LICENSE](LICENSE) file for details.

Create a `LICENSE` file in the project root:

```text
MIT License

Copyright (c) 2026 DFFM / Giuseppe De Franceschi

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
```
## 👥 Contributors

- Giuseppe De Franceschi - @defranceschi
## 🙏 Credits

- **Chainlit** - UI framework
- **Ollama** - LLM runtime
- **Qdrant** - Vector DB
- **BGE-M3** - Embeddings
- **Docling** - Document processing
**Status**: 🔨 Pre-Production | **Last Update**: 2026-01-01