# AI Station DFFM - Vision-RAG Hybrid System

Multi-user AI system with RAG (Retrieval-Augmented Generation), image analysis, and advanced document management, built on Chainlit, Ollama, and BGE-M3.

## 🌟 Features

### Core AI
- **RAG Hybrid Search** with BGE-M3 (dense + sparse embeddings)
- **Vision Analysis** via MiniCPM-V for OCR and image description
- **Document Processing** with Docling (PDF, DOCX), preserving tables and formulas
- **Multi-Model Support** (local Ollama + cloud models)
- **Streaming Responses** with reduced latency

### Multi-User
- **Google OAuth2** with per-role custom profiles
- **Isolated workspaces** per user/team
- **Dedicated RAG collections** for separate knowledge bases
- **Granular permissions** (admin, engineering, business, architecture)

### UI/UX
- **Custom role badge** with dedicated colors
- **Dynamic settings** (temperature, RAG top_k, model, custom instructions)
- **Persistent chat history** with conversation resume
- **Auto-save of Python code** extracted from responses
- **Real-time metrics** (response time, RAG hits, errors)

### Performance
- **Embedding caching** (LRU cache, 1000 queries)
- **Smart chunking** (2000 chars with 200 overlap)
- **Async operations** against Qdrant and Ollama
- **PostgreSQL** for thread and metadata persistence

## πŸ—οΈ Architecture

### Hardware Setup

#### AI-SRV (Chainlit VM)
- **IP**: 192.168.1.244
- **CPU**: 16 cores (QEMU Virtual)
- **RAM**: 64 GB
- **Storage**: 195 GB
- **Role**: Hosts the Chainlit app + PostgreSQL + Qdrant

#### AI-Server (GPU Workstation)
- **IP**: 192.168.1.243
- **CPU**: Intel Core Ultra 7 265 (20 cores, max 6.5 GHz)
- **RAM**: 32 GB
- **GPU**: NVIDIA RTX A1000 (8 GB VRAM)
- **Storage**: 936 GB NVMe
- **Role**: Ollama models + BGE-M3 embeddings service

### Tech Stack

```text
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚         Chainlit UI (ai-srv)             β”‚
β”‚   Badge + Settings + Chat History        β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
               β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚        Python Backend (app.py)           β”‚
β”‚  - OAuth2 Google                         β”‚
β”‚  - Multi-user profiles                   β”‚
β”‚  - File processing orchestration         β”‚
β””β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
  β”‚        β”‚          β”‚          β”‚
  β–Ό        β–Ό          β–Ό          β–Ό
β”Œβ”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ PG  β”‚ β”‚Qdrantβ”‚ β”‚ Ollama β”‚ β”‚ BGE API  β”‚
β”‚ DB  β”‚ β”‚Vectorβ”‚ β”‚  GPU   β”‚ β”‚   CPU    β”‚
β”‚     β”‚ β”‚  DB  β”‚ β”‚ Server β”‚ β”‚  Server  β”‚
β””β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
ai-srv   ai-srv  ai-server  ai-server
```

## πŸ“‹ Requirements

### System
- Docker 24.x+ with Docker Compose
- Access to the Google Cloud Console (for OAuth2)
- 2 servers (or VMs) with shared networking

### Ollama Models (to install on ai-server)

```bash
ollama pull minicpm-v          # Vision model (5.5 GB)
ollama pull glm-4.6:cloud      # Cloud reasoning
ollama pull qwen2.5-coder:32b  # Code generation (9 GB)
ollama pull llama3.2           # Fast general purpose (4.7 GB)
```

## πŸš€ Installation

### 1. Clone the Repository

```bash
git clone
cd ai-station
```

### 2. Environment Configuration

Create `.env`:

```bash
# Database
DATABASE_URL=postgresql+asyncpg://ai_user:CHANGE_ME@postgres:5432/ai_station

# AI Services
OLLAMA_URL=http://192.168.1.243:11434
QDRANT_URL=http://qdrant:6333
BGE_API_URL=http://192.168.1.243:8001/embed

# OAuth Google
OAUTH_GOOGLE_CLIENT_ID=your-client-id.apps.googleusercontent.com
OAUTH_GOOGLE_CLIENT_SECRET=your-secret
CHAINLIT_AUTH_SECRET=$(openssl rand -base64 32)
```
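A small sketch of how the backend might read these settings at startup. The variable names match the `.env` above; the helper function and its fallback defaults are illustrative assumptions, not part of the actual `app.py`:

```python
import os

# Hypothetical helper: read the service endpoints from the environment,
# falling back to the example values from the .env above.
def load_service_config(env=os.environ):
    return {
        "ollama_url": env.get("OLLAMA_URL", "http://192.168.1.243:11434"),
        "qdrant_url": env.get("QDRANT_URL", "http://qdrant:6333"),
        "bge_api_url": env.get("BGE_API_URL", "http://192.168.1.243:8001/embed"),
        "database_url": env.get("DATABASE_URL", ""),
    }

cfg = load_service_config({"OLLAMA_URL": "http://localhost:11434"})
print(cfg["ollama_url"])  # explicit override wins
print(cfg["qdrant_url"])  # falls back to the default
```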
### 3. Google OAuth Configuration

1. Go to the Google Cloud Console
2. Create a new project β†’ APIs & Services β†’ Credentials
3. Create an "OAuth 2.0 Client ID"
4. Add the authorized redirect URIs:
   - `https://ai.dffm.it/auth/oauth/google/callback`
   - `http://localhost:8000/auth/oauth/google/callback` (dev)
5. Copy the Client ID and Secret into `.env`

### 4. Customize Users

Edit `app.py` β†’ `USER_PROFILES`:

```python
USER_PROFILES = {
    "your.email@example.com": {
        "role": "admin",
        "name": "Name",
        "workspace": "workspace_name",
        "rag_collection": "docs_collection",
        "capabilities": ["debug", "all"],
        "show_code": True,
    },
    # ... other users
}
```

### 5. Deploy

```bash
# Build and start
docker compose up -d --build

# Check the logs
docker compose logs -f chainlit-app

# You should see:
# βœ… Tutte le tabelle create con successo.  ("All tables created successfully.")
# Your app is available at http://localhost:8000
```

### 6. Set Up the BGE-M3 Service (on ai-server)

```bash
# Install dependencies
pip install fastapi uvicorn FlagEmbedding torch

# Save bge_service.py (see docs/), then run:
python bge_service.py
# Listening on http://0.0.0.0:8001
```

## 🎯 Usage

### Login
1. Open in a browser: `https://ai.dffm.it` (or `http://localhost:8000`)
2. Click "Continue with Google"
3. Authorize with an account configured in `USER_PROFILES`

### Chat with RAG
1. Upload PDF/DOCX files β†’ the system indexes them automatically
2. Ask questions β†’ answers include context from your documents
3. Adjust `top_k` (number of documents) via the settings

### Image Analysis
Upload screenshots/diagrams. The system:
1. Extracts text (OCR)
2. Describes charts/tables
3. Uses the description as context for its answer

### Available Settings
- **RAG Document Count** (1-10): how many chunks to retrieve
- **Model**: choose between local/cloud
- **Temperature** (0-1): response creativity
- **RAG Enabled**: toggle document retrieval on/off
- **Custom Instruction**: personalized system prompt

### Resuming Chats
1. Sidebar β†’ Chat History
2. Click a conversation β†’ "Resume"
3. Continue where you left off

## πŸ“Š Metrics

Every response logs to stdout:

```json
{
  "response_time": 18.65,
  "rag_hits": 4,
  "model": "glm-4.6:cloud",
  "user_role": "admin",
  "error": null
}
```
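The per-response metrics above can be aggregated offline. A minimal sketch, assuming each log line prefixes the JSON payload with a `METRICS` marker (the exact marker/format is an assumption based on the `grep METRICS` collection step):

```python
import json

# Aggregate METRICS log lines (assumed format: "METRICS {json payload}").
def summarize(lines):
    records = []
    for line in lines:
        marker = line.find("METRICS")
        if marker == -1:
            continue
        try:
            records.append(json.loads(line[marker + len("METRICS"):].strip()))
        except json.JSONDecodeError:
            continue  # skip malformed lines
    if not records:
        return {}
    return {
        "count": len(records),
        "avg_response_time": sum(r["response_time"] for r in records) / len(records),
        "errors": sum(1 for r in records if r.get("error")),
    }

log = [
    'METRICS {"response_time": 18.65, "rag_hits": 4, "error": null}',
    'METRICS {"response_time": 1.35, "rag_hits": 2, "error": "timeout"}',
]
print(summarize(log))  # {'count': 2, 'avg_response_time': 10.0, 'errors': 1}
```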
Collect them with:

```bash
docker logs ai-station-app | grep METRICS > metrics.log
```

## πŸ”§ Troubleshooting

### RAG finds no documents
- Check the collection name in `USER_PROFILES[email]["rag_collection"]`
- Check Qdrant: `curl http://localhost:6333/collections`

### HTML badge not showing
Enable it in `.chainlit/config.toml`:

```toml
[features]
unsafe_allow_html = true
```

### Ollama model not responding

```bash
# Test the connection
curl http://192.168.1.243:11434/api/tags

# Check that the model is available
ollama list
```

### BGE embeddings fail

```bash
# Test the API
curl -X POST http://192.168.1.243:8001/embed \
  -H "Content-Type: application/json" \
  -d '{"texts": ["test"]}'
```

## πŸ“ Project Structure

```text
ai-station/
β”œβ”€β”€ app.py                 # Main Chainlit app
β”œβ”€β”€ init_db.py             # Database schema init
β”œβ”€β”€ requirements.txt       # Python deps
β”œβ”€β”€ Dockerfile             # Container config
β”œβ”€β”€ docker-compose.yaml    # Multi-service orchestration
β”œβ”€β”€ .chainlit/
β”‚   └── config.toml        # UI/features config
β”œβ”€β”€ public/
β”‚   └── custom.css         # Custom styling
β”œβ”€β”€ workspaces/            # User file storage (volume)
β”‚   β”œβ”€β”€ admin_workspace/
β”‚   β”œβ”€β”€ engineering_workspace/
β”‚   └── ...
└── .files/                # Chainlit storage (volume)
```

## πŸ” Security

- OAuth2 required (no anonymous access)
- Workspace isolation (files separated per user)
- HTML sanitization (configurable via `unsafe_allow_html`)
- Environment secrets (`.env` never committed)
- PostgreSQL passwords changed from the defaults

## 🚦 Roadmap

- Re-ranking with a cross-encoder
- Automatic query expansion
- Feedback loop (πŸ‘πŸ‘Ž on responses)
- Conversation export to PDF/Markdown
- Parallel multi-query RAG
- Prometheus/Grafana monitoring
- Adaptive chunking per document type
- Audio input support

## πŸ“ License

MIT License - see the [LICENSE](LICENSE) file for details.

Create a `LICENSE` file in the project root:

```text
MIT License

Copyright (c) 2026 DFFM / Giuseppe De Franceschi

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
```

## πŸ‘₯ Contributors

- Giuseppe De Franceschi - @defranceschi

## πŸ™ Credits

- Chainlit - UI framework
- Ollama - LLM runtime
- Qdrant - Vector DB
- BGE-M3 - Embeddings
- Docling - Document processing

---

**Status**: πŸ”¨ Pre-Production | **Last Update**: 2026-01-01