readme add

2026-01-01 18:06:28 +01:00 · 2026-01-01 18:06:28 +01:00 · b2ff4238af
parent 939a3d11a7
commit b2ff4238af
1 changed files with 305 additions and 288 deletions
--- a/README.md
+++ b/README.md
@@ -1,325 +1,342 @@
-# AI Station - Document Analysis Platform
+# AI Station DFFM - Vision-RAG Hybrid System
-## 📋 Overview
+Multi-user AI system with RAG (Retrieval-Augmented Generation) support, image analysis, and advanced document management, built on Chainlit, Ollama, and BGE-M3.
-**AI Station** is an AI-based document analysis platform that uses **Retrieval-Augmented Generation (RAG)** to analyze PDFs and text documents with the **GLM-4.6:Cloud** model.
+## 🌟 Features
 ### Core AI
 - **RAG Hybrid Search** with BGE-M3 (dense + sparse embeddings)
 - **Vision Analysis** via MiniCPM-V for OCR and image description
 - **Document Processing** with Docling (PDF, DOCX), preserving tables and formulas
 - **Multi-Model Support** (local Ollama + cloud models)
 - **Streaming Responses** with reduced latency
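The hybrid search listed above pairs a dense similarity with a sparse (lexical) match. The sketch below shows one common way to fuse the two scores. The weighted-sum formula, the `alpha` weight, and all function names are assumptions for illustration; they are not taken from `app.py`.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sparse_score(query_weights: dict[str, float], doc_weights: dict[str, float]) -> float:
    """Sum of weight products over the tokens shared by query and document."""
    return sum(w * doc_weights[tok] for tok, w in query_weights.items() if tok in doc_weights)


def hybrid_score(
    dense_q: list[float],
    dense_d: list[float],
    sparse_q: dict[str, float],
    sparse_d: dict[str, float],
    alpha: float = 0.7,  # hypothetical default: weight given to the dense score
) -> float:
    """Weighted sum of the dense and sparse scores (an assumed fusion rule)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    dense = cosine_similarity(dense_q, dense_d)
    sparse = sparse_score(sparse_q, sparse_d)
    return alpha * dense + (1.0 - alpha) * sparse
```

With `alpha=1.0` the ranking is purely dense, and with `alpha=0.0` it is purely lexical, so the weight can be tuned per collection.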
 ### Multi-User
 - **Google OAuth2** with custom per-role profiles
 - **Isolated workspaces** per user/team
 - **Dedicated RAG collections** for separate knowledge bases
 - **Granular permissions** (admin, engineering, business, architecture)
 ### UI/UX
 - **Custom role badge** with dedicated colors
 - **Dynamic settings** (temperature, RAG top_k, model, custom instructions)
 - **Persistent chat history** with conversation resume
 - **Auto-save of Python code** extracted from responses
 - **Real-time metrics** (response time, RAG hits, errors)
 ### Performance
 - **Embedding caching** (LRU cache, 1000 queries)
 - **Smart chunking** (2000 characters with a 200-character overlap)
 - **Async operations** on Qdrant and Ollama
 - **PostgreSQL** for thread and metadata persistence
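The embedding cache mentioned above can be approximated with the standard library. This is a sketch only: `compute_embedding` is a hypothetical stand-in for the real BGE-M3 call, and only the cache size of 1000 comes from the list above.

```python
from functools import lru_cache

# Counts how often the (simulated) expensive embedding call actually runs.
CALLS = {"count": 0}


def compute_embedding(text: str) -> tuple[float, ...]:
    """Hypothetical stand-in for the real embedding request."""
    CALLS["count"] += 1
    return (float(len(text)), float(sum(map(ord, text)) % 97))


@lru_cache(maxsize=1000)
def cached_embedding(text: str) -> tuple[float, ...]:
    """Return the embedding for `text`, reusing earlier results for repeated queries."""
    return compute_embedding(text)
```

The result is a tuple so that cached values cannot be mutated by callers. Note that `functools.lru_cache` applied directly to an `async def` function caches the coroutine object rather than its result, so an async code path needs a different caching approach.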
 ## 🏗️ Architecture
 ### Hardware Setup
 #### AI-SRV (Chainlit VM)
 - **IP**: 192.168.1.244
 - **CPU**: 16 core (QEMU Virtual)
 - **RAM**: 64 GB
 - **Storage**: 195 GB
 - **Role**: Host Chainlit app + PostgreSQL + Qdrant
 #### AI-Server (GPU Workstation)
 - **IP**: 192.168.1.243
 - **CPU**: Intel Core Ultra 7 265 (20 core, max 6.5 GHz)
 - **RAM**: 32 GB
 - **GPU**: NVIDIA RTX A1000 (8 GB VRAM)
 - **Storage**: 936 GB NVMe
 - **Role**: Ollama models + BGE-M3 embeddings service
 ### Technology Stack
 - **Backend**: Python + Chainlit (LLM UI framework)
 - **LLM**: GLM-4.6:Cloud (via Ollama Cloud)
 - **Vector DB**: Qdrant (semantic search)
 - **PDF Processing**: PyMuPDF (fitz)
 - **Database**: PostgreSQL + SQLAlchemy ORM
 - **Containerization**: Docker Compose
 - **Embeddings**: nomic-embed-text (via Ollama local)
---
+┌─────────────────────────────────────────┐
 │          Chainlit UI (ai-srv)           │
 │     Badge + Settings + Chat History     │
 └──────────────┬──────────────────────────┘
                │
 ┌──────────────▼──────────────────────────┐
 │         Python Backend (app.py)         │
 │  - OAuth2 Google                        │
 │  - Multi-user profiles                  │
 │  - File processing orchestration        │
 └─┬────────┬──────────┬──────────┬────────┘
   │        │          │          │
   ▼        ▼          ▼          ▼
 ┌─────┐ ┌──────┐ ┌────────┐ ┌──────────┐
 │ PG  │ │Qdrant│ │ Ollama │ │ BGE API  │
 │     │ │Vector│ │  GPU   │ │   CPU    │
 │     │ │  DB  │ │ Server │ │  Server  │
 └─────┘ └──────┘ └────────┘ └──────────┘
 ai-srv  ai-srv   ai-server  ai-server
-## 🚀 Quick Start
+text
-### Prerequisites
+## 📋 Requirements
 - Docker & Docker Compose
 - Ollama installed locally (for embeddings)
 - Ollama Cloud account (for glm-4.6:cloud)
-### 1️⃣ Clone & Setup
+### System
 - Docker 24.x+ with Docker Compose
 - Access to the Google Cloud Console (for OAuth2)
 - 2 servers (or VMs) with shared networking
 ### Ollama Models (to install on ai-server)
 ```bash
-git clone git@github.com:your-username/ai-station.git
+ollama pull minicpm-v              # Vision model (5.5 GB)
 ollama pull glm-4.6:cloud          # Cloud reasoning
 ollama pull qwen2.5-coder:32b      # Code generation (9 GB)
 ollama pull llama3.2               # Fast general purpose (4.7 GB)
 🚀 Installation
 1. Clone Repository
 bash
 git clone <your-repo>
 cd ai-station
 2. Environment Configuration
 Create .env:
-# Configure environment
+bash
-cat > .env << 'EOF'
+# Database
-DATABASE_URL=postgresql+asyncpg://ai_user:secure_password_here@postgres:5432/ai_station
+DATABASE_URL=postgresql+asyncpg://ai_user:CHANGE_ME@postgres:5432/ai_station
 # AI Services
 OLLAMA_URL=http://192.168.1.243:11434
 QDRANT_URL=http://qdrant:6333
-EOF
+BGE_API_URL=http://192.168.1.243:8001/embed
 ```
-### 2️⃣ Authenticate Ollama Cloud
+# OAuth Google
-```bash
+OAUTH_GOOGLE_CLIENT_ID=your-client-id.apps.googleusercontent.com
-ollama signin
+OAUTH_GOOGLE_CLIENT_SECRET=your-secret
-# Follow the link to authenticate with your Ollama account
+CHAINLIT_AUTH_SECRET=$(openssl rand -base64 32)
 ```
-### 3️⃣ Start Services
+3. Google OAuth Configuration
-```bash
+Go to the Google Cloud Console
-docker compose up -d
+
 Create a new project → APIs & Services → Credentials
 Create an "OAuth 2.0 Client ID"
 Add the authorized redirect URIs:
 https://ai.dffm.it/auth/oauth/google/callback
 http://localhost:8000/auth/oauth/google/callback (dev)
 Copy the Client ID and Secret into .env
 4. Customize Users
 Edit app.py → USER_PROFILES:
 python
 USER_PROFILES = {
    "your.email@example.com": {
        "role": "admin",
        "name": "Name",
        "workspace": "workspace_name",
        "rag_collection": "docs_collection",
        "capabilities": ["debug", "all"],
        "show_code": True,
    },
    # ... other users
 }
 5. Deploy
 bash
 # Build and start
 docker compose up -d --build
 # Check the logs
 docker compose logs -f chainlit-app
 # You should see:
 # ✅ Tutte le tabelle create con successo.
 # Your app is available at http://localhost:8000
 6. Set Up the BGE-M3 Service (on ai-server)
 bash
 # Install dependencies
 pip install fastapi uvicorn FlagEmbedding torch
 # Save the bge_service.py file (see docs/)
 python bge_service.py
 # Listening on http://0.0.0.0:8001
 🎯 Usage
 Login
 Open in a browser: https://ai.dffm.it (or http://localhost:8000)
 Click "Continue with Google"
 Authorize with an account configured in USER_PROFILES
 Chat with RAG
 Upload PDF/DOCX files → the system indexes them automatically
 Ask questions → answers use context from the documents
 Adjust top_k (number of documents) in the settings
 Image Analysis
 Upload screenshots/diagrams
 The system:
 Extracts text (OCR)
 Describes charts/tables
 Uses the description as context for its answer
 Available Settings
 Number of RAG Documents (1-10): how many chunks to retrieve
 Model: choose between local and cloud
 Temperature (0-1): response creativity
 RAG Enabled: turn document retrieval on/off
 Custom Instruction: custom system prompt
 Resuming a Chat
 Sidebar → Chat History
 Click a conversation → "Resume"
 Continue where you left off
 📊 Metrics
 Every response logs (to stdout):
 json
 {
  "response_time": 18.65,
  "rag_hits": 4,
  "model": "glm-4.6:cloud",
  "user_role": "admin",
  "error": null
 }
 Collect them with:
 bash
 docker logs ai-station-app | grep METRICS > metrics.log
 🔧 Troubleshooting
 RAG finds no documents
 Check the collection name in USER_PROFILES[email]["rag_collection"]
 Check Qdrant: curl http://localhost:6333/collections
 HTML badge is not displayed
 Enable it in .chainlit/config.toml:
 ```
-### 4️⃣ Access UI
+```text
-Navigate to: **http://localhost:8000**
+[features]
-
+unsafe_allow_html = true
---
+Ollama model does not respond
-
+bash
-## 📁 Project Structure
+# Test the connection
 curl http://192.168.1.243:11434/api/tags
 # Check that the model is available
 ollama list
 BGE embeddings fail
 ```
 ```bash
 # Test the API
 curl -X POST http://192.168.1.243:8001/embed \
  -H "Content-Type: application/json" \
  -d '{"texts": ["test"]}'
 ```
 📁 Project Structure
 ```bash
 ai-station/
-├── app.py                 # Main Chainlit application
+├── app.py                    # Main Chainlit app
-├── requirements.txt       # Python dependencies
+├── init_db.py               # Database schema init
-├── docker-compose.yml     # Docker services config
+├── requirements.txt         # Python deps
-├── .env                   # Environment variables (gitignored)
+├── Dockerfile              # Container config
-├── workspaces/           # User workspace directories
+├── docker-compose.yaml     # Multi-service orchestration
-│   └── admin/            # Admin user files
+├── .chainlit/
-└── README.md             # This file
+│   └── config.toml         # UI/features config
 ├── public/
 │   └── custom.css          # Custom styling
 ├── workspaces/             # User file storage (volume)
 │   ├── admin_workspace/
 │   ├── engineering_workspace/
 │   └── ...
 └── .files/                 # Chainlit storage (volume)
 ```
-
+🔐 Security
 ---
 ## 🔧 Features
 ### ✅ Implemented
 - **PDF Upload & Processing**: Extract text from PDF documents using PyMuPDF
 - **Document Indexing**: Automatic chunking and semantic indexing via Qdrant
 - **RAG Search**: Retrieve relevant document chunks based on semantic similarity
 - **Intelligent Analysis**: GLM-4.6:Cloud analyzes documents with full context
 - **Code Extraction**: Automatically save Python code blocks from responses
 - **Chat History**: Persistent conversation storage via SQLAlchemy
 - **Streaming Responses**: Real-time token streaming via Chainlit
 ### 🔄 Workflow
 1. User uploads PDF or TXT file
 2. System extracts text and creates semantic chunks
 3. Chunks indexed in Qdrant vector database
 4. User asks questions about documents
 5. RAG retrieves relevant chunks
 6. GLM-4.6:Cloud analyzes with full context
 7. Streaming response to user
 ---
 ## 📊 Technical Details
 ### Document Processing Pipeline
 ```
 PDF Upload
    ↓
 PyMuPDF Text Extraction
    ↓
 Text Chunking (1500 chars, 200 char overlap)
    ↓
 nomic-embed-text Embeddings (Ollama local)
    ↓
 Qdrant Vector Storage
    ↓
 Semantic Search on User Query
    ↓
 GLM-4.6:Cloud Analysis with RAG Context
    ↓
 Chainlit Streaming Response
 ```
 ### Key Functions
 | Function | Purpose |
 |----------|---------|
 | `extract_text_from_pdf()` | Convert PDF to text using PyMuPDF |
 | `chunk_text()` | Split text into overlapping chunks |
 | `get_embeddings()` | Generate embeddings via Ollama |
 | `index_document()` | Store chunks in Qdrant |
 | `search_qdrant()` | Retrieve relevant context |
 | `on_message()` | Process user queries with RAG |
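The chunking step in the pipeline above (1500 characters with a 200-character overlap) can be sketched as follows. This is an illustrative version that assumes plain character-based splitting; the actual `chunk_text()` in `app.py` may differ, and the defaults simply mirror the numbers quoted in the pipeline.

```python
def chunk_text(text: str, chunk_size: int = 1500, overlap: int = 200) -> list[str]:
    """Split `text` into chunks of at most `chunk_size` characters.

    Consecutive chunks share `overlap` characters so that a sentence cut
    at a chunk boundary still appears whole in one of the two chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # distance between the starts of two chunks
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks
```

A larger overlap improves recall at chunk boundaries at the cost of more vectors to store and search.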
 ---
 ## 🔐 Environment Variables
 ```env
 DATABASE_URL=postgresql+asyncpg://user:pass@postgres:5432/ai_station
 OLLAMA_URL=http://192.168.1.243:11434          # Local Ollama for embeddings
 QDRANT_URL=http://qdrant:6333                  # Vector database
 ```
 **Note**: GLM-4.6:Cloud authentication is handled automatically via `ollama signin`
 ---
 ## 🐳 Docker Services
 | Service | Port | Purpose |
 |---------|------|---------|
 | `chainlit-app` | 8000 | Chainlit UI & API |
 | `postgres` | 5432 | Conversation persistence |
 | `qdrant` | 6333 | Vector database |
 | `ollama` | 11434 | Local embeddings (external) |
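To confirm that the ports in the table above are reachable, a small TCP probe is often enough. The sketch below assumes the three Docker services publish their ports on `localhost`; the `port_open` and `check_services` helpers are hypothetical names, and Ollama runs on a separate host (`192.168.1.243` in the `.env` example), so it needs its own host argument.

```python
import socket

# Ports taken from the services table; the host is supplied by the caller.
SERVICE_PORTS = {
    "chainlit-app": 8000,
    "postgres": 5432,
    "qdrant": 6333,
    "ollama": 11434,
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str, names: list[str]) -> dict[str, bool]:
    """Probe the named services on one host and report which ports accept connections."""
    return {name: port_open(host, SERVICE_PORTS[name]) for name in names}
```

For example, `check_services("localhost", ["chainlit-app", "postgres", "qdrant"])` covers the Compose services, and `check_services("192.168.1.243", ["ollama"])` covers the external one. An open port only shows that something is listening, not that the service is healthy.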
 Start/Stop:
 ```bash
-docker compose up -d      # Start all services
+OAuth2 required (no anonymous access)
-docker compose down       # Stop all services
+
-docker compose logs -f    # View logs
+Workspace isolation (separate files per user)
-docker compose restart    # Restart services
+
 HTML sanitization (configurable via unsafe_allow_html)
 Environment secrets (.env never committed)
 PostgreSQL passwords changed from the defaults
 ```
 ---
-## 📝 Usage Examples
+🚦 Roadmap
 ### Example 1: Analyze Tax Document
 ```
 User: "What is the total amount in the document?"
 AI Station: 
  ✅ Extracts PDF content
  ✅ Searches relevant sections
  ✅ Analyzes with GLM-4.6:Cloud
  📄 Returns: "Based on the document, the total amount is..."
 ```
 ### Example 2: Multi-Document Analysis
 ```
 1. Upload multiple PDFs (invoices, contracts)
 2. All documents automatically indexed
 3. Query across all documents simultaneously
 4. RAG retrieves most relevant chunks
 5. GLM-4.6:Cloud synthesizes answer
 ```
 ---
 ## 🛠️ Development
 ### Install Dependencies
 ```bash
-pip install -r requirements.txt
+ Re-ranking with a cross-encoder
 Automatic query expansion
 Feedback loop (👍👎 on responses)
 Export conversations to PDF/Markdown
 Parallel multi-query RAG
 Prometheus/Grafana monitoring
 Adaptive chunking by document type
 Audio input support
 ```
-### Requirements
+
-```
+## 📝 License
-chainlit==1.3.2
+```text
-pydantic==2.9.2
+MIT License - see the [LICENSE](LICENSE) file for details.
-ollama>=0.1.0
+Create a LICENSE file in the project root
-asyncpg>=0.29.0
+text
-psycopg2-binary
+MIT License
-qdrant-client>=1.10.0
+
-sqlalchemy>=2.0.0
+Copyright (c) 2026 DFFM / Giuseppe De Franceschi
-greenlet>=3.0.0
+
-sniffio
+Permission is hereby granted, free of charge, to any person obtaining a copy
-aiohttp
+of this software and associated documentation files (the "Software"), to deal
-alembic
+in the Software without restriction, including without limitation the rights
-pymupdf
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
-python-dotenv
+copies of the Software, and to permit persons to whom the Software is
 furnished to do so, subject to the following conditions:
 The above copyright notice and this permission notice shall be included in all
 copies or substantial portions of the Software.
 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 SOFTWARE.
 ```
-### Local Testing (without Docker)
+👥 Contributors
-```bash
+Giuseppe De Franceschi - @defranceschi
 # Start Ollama, PostgreSQL, Qdrant manually
 ollama serve &
 chainlit run app.py
 ```
---
+🙏 Credits
 Chainlit - UI framework
-## 🔄 Model Details
+Ollama - LLM runtime
-### GLM-4.6:Cloud
+Qdrant - Vector DB
 - **Provider**: Zhipu AI via Ollama Cloud
 - **Capabilities**: Long context, reasoning, multilingual
 - **Cost**: Free tier available
 - **Authentication**: Device key (automatic via `ollama signin`)
-### nomic-embed-text
+BGE-M3 - Embeddings
 - **Local embedding model** for chunking/retrieval
 - **Dimensions**: 768
 - **Speed**: Fast, runs locally
 - **Used for**: RAG semantic search
---
+Docling - Document processing
-## 📈 Monitoring & Logs
+## **Status**: 🔨 Pre-Production | **Last Update**: 2026-01-01
 ### Check Service Health
 ```bash
 # View all logs
 docker compose logs
 # Follow live logs
 docker compose logs -f chainlit-app
 # Check specific container
 docker inspect ai-station-chainlit-app
 ```
 ### Common Issues
 | Issue | Solution |
 |-------|----------|
 | `unauthorized` error | Run `ollama signin` on server |
 | Database connection failed | Check PostgreSQL is running |
 | Qdrant unavailable | Verify `docker-compose up` completed |
 | PDF not extracted | Ensure PyMuPDF installed: `pip install pymupdf` |
 ---
 ## 🚀 Deployment
 ### Production Checklist
 - [ ] Set secure PostgreSQL credentials in `.env`
 - [ ] Enable SSL/TLS for Chainlit endpoints
 - [ ] Configure CORS for frontend
 - [ ] Setup log aggregation (ELK, Datadog, etc.)
 - [ ] Implement rate limiting
 - [ ] Add API authentication
 - [ ] Configure backup strategy for Qdrant
 ### Cloud Deployment Options
 - **AWS**: ECS + RDS + VectorDB
 - **Google Cloud**: Cloud Run + Cloud SQL
 - **DigitalOcean**: App Platform + Managed Databases
 ---
 ## 📚 API Reference
 ### REST Endpoints (via Chainlit)
 - `POST /api/chat` - Send message with context
 - `GET /api/threads` - List conversations
 - `POST /api/upload` - Upload document
 ### WebSocket
 - Real-time streaming responses via Chainlit protocol
 ---
 ## 🔮 Future Features
 - [ ] OAuth2 Google authentication
 - [ ] Document metadata extraction (dates, amounts, entities)
 - [ ] Advanced search filters (type, date range, language)
 - [ ] Export results (PDF, CSV, JSON)
 - [ ] Analytics dashboard
 - [ ] Multi-language support
 - [ ] Document versioning
 - [ ] Compliance reporting (GDPR, audit trails)
 ---
 ## 📞 Support
 ### Troubleshooting
 1. Check logs: `docker compose logs chainlit-app`
 2. Verify Ollama authentication: `ollama show glm-4.6:cloud`
 3. Test Qdrant connection: `curl http://localhost:6333/healthz`
 4. Inspect PostgreSQL: `docker compose exec postgres psql -U ai_user -d ai_station`
 ### Performance Tips
 - Increase chunk overlap for better context retrieval
 - Adjust embedding model based on latency requirements
 - Monitor Qdrant memory usage for large document sets
 - Implement caching for frequent queries
 ---
 ## 📄 License
 MIT License - See LICENSE file
 ## 👤 Author
 AI Station Team
 ---
 **Last Updated**: December 26, 2025
 **Version**: 1.0.0
 **Status**: Production Ready ✅