Installation Guide¶
New in v0.1.17.1: Modular Installation
NetIntel-OCR v0.1.17.1 introduces modular installation, reducing the base installation from ~2.5GB to ~500MB. Install only the features you need!
Prerequisites¶
System Requirements¶
- Operating System: Linux (Ubuntu 20.04+, RHEL 8+, or compatible)
- Python: 3.10, 3.11, or 3.12
- Memory: Minimum 8GB RAM (16GB+ recommended for Knowledge Graph)
- Storage: 2GB minimum (varies with optional modules)
Required Software¶
```bash
# Install Python and system dependencies
sudo apt-get update
sudo apt-get install -y \
    python3.11 \
    python3.11-venv \
    python3.11-dev \
    gcc g++ \
    libgl1-mesa-glx \
    libglib2.0-0

# Install Ollama for LLM support
curl -fsSL https://ollama.com/install.sh | sh

# Pull the default OCR model
ollama pull nanonets-ocr-s:latest
```
Installation Options¶
🚀 Quick Install (Base Only)¶
The base installation includes core OCR functionality with network diagram detection (~500MB):
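The command for the base install was lost from this page; assuming the same `pip` extras syntax used throughout this guide, installing with no extras gives you the base package:

```bash
# Base install: core OCR + network diagram detection only (~500MB)
pip install netintel-ocr
```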
📦 Modular Installation¶
Choose the features you need with modular installation:
Knowledge Graph Support (+1.5GB)¶
```bash
# Install with Knowledge Graph support
pip install "netintel-ocr[kg]"

# This adds:
# - PyKEEN for knowledge graph embeddings
# - torch for deep learning
# - FalkorDB for graph storage
# - scikit-learn, matplotlib, plotly for analysis
```
Vector Store Support (+300MB)¶
```bash
# Install with vector database support
pip install "netintel-ocr[vector]"

# This adds:
# - pymilvus for Milvus integration
# - qdrant-client for Qdrant support
# - chromadb for ChromaDB
# - lancedb for LanceDB
```
API Server Support (+50MB)¶
```bash
# Install with the REST API server
pip install "netintel-ocr[api]"

# This adds:
# - FastAPI web framework
# - uvicorn ASGI server
# - python-multipart for file uploads
```
MCP Server Support (+30MB)¶
```bash
# Install with the Model Context Protocol server
pip install "netintel-ocr[mcp]"

# This adds:
# - fastmcp for the MCP protocol
# - websockets for real-time communication
```
Performance Optimizations (+200MB)¶
```bash
# Install with C++ performance optimizations
pip install "netintel-ocr[performance]"

# This adds:
# - numpy with MKL optimizations
# - numba for JIT compilation
# - cython for C extensions
```
Development Tools (+100MB)¶
```bash
# Install with development and testing tools
pip install "netintel-ocr[dev]"

# This adds:
# - pytest for testing
# - black for code formatting
# - ruff for linting
# - mypy for type checking
```
🎯 Preset Configurations¶
Production Installation¶
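The preset command itself is missing from this page. Combining the extras documented above, a reasonable production install (illustrative, not the package's official preset) would be:

```bash
# Production: knowledge graph, vector store, API server, and C++ optimizations
# (illustrative combination of the documented extras)
pip install "netintel-ocr[kg,vector,api,performance]"
```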
Cloud Deployment¶
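For cloud deployments, a plausible combination of the documented extras (illustrative; check the project's README for the official preset) favors the server components over local analysis tooling:

```bash
# Cloud: REST API + MCP server + vector store backends
# (illustrative combination of the documented extras)
pip install "netintel-ocr[api,mcp,vector]"
```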
Complete Installation¶
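To install everything, you can list every extra documented above (the package may also define an aggregate extra such as `[all]`, but that name is an assumption, not confirmed by this guide):

```bash
# Complete install: all documented extras (~2.5GB total)
pip install "netintel-ocr[kg,vector,api,mcp,performance,dev]"
```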
Checking Installation Status¶
Version Information¶
The enhanced --version command shows comprehensive installation status:
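The status report comes from the standard version flag shown later in the troubleshooting section:

```bash
netintel-ocr --version
```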
Output example:
```text
NetIntel-OCR v0.1.17.1
├── Core Components:
│   ├── C++ Core: ✓ v1.0.1
│   ├── AVX2: ✓
│   ├── OpenMP: ✓
│   └── Platform: Linux x86_64
├── Installed Modules:
│   ├── [base] Core OCR: ✓ (always installed)
│   ├── [kg] Knowledge Graph: ✓ (pykeen 1.10.1)
│   ├── [vector] Vector Store: ✓ (pymilvus 2.3.0)
│   ├── [api] API Server: ✗ (not installed)
│   └── [mcp] MCP Server: ✗ (not installed)
├── Available for Install:
│   ├── [api] API Server: pip install netintel-ocr[api]
│   │   └── Adds: fastapi, uvicorn (+50MB)
│   └── [mcp] MCP Server: pip install netintel-ocr[mcp]
│       └── Adds: fastmcp, websockets (+30MB)
└── Active Features:
    ├── FalkorDB: ✓ (connected to localhost:6379)
    ├── Milvus: ✓ (connected to localhost:19530)
    └── Ollama: ✓ (connected to localhost:11434)
```
JSON Output¶
For programmatic use, get JSON output:
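The exact flag is not shown on this page; a `--json` switch alongside `--version` is a common convention and an assumption here, not a confirmed option:

```bash
# Machine-readable installation status (assumed flag name; verify with --help)
netintel-ocr --version --json
```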
Docker Installation¶
Using Pre-built Image¶
```bash
# Pull the Docker image
docker pull netintel/netintel-ocr:0.1.17.1

# Run with base features
docker run -v $(pwd):/data netintel/netintel-ocr:0.1.17.1 process pdf document.pdf
```
Building Custom Image¶
```dockerfile
# Dockerfile for a custom installation
FROM python:3.11-slim

# Install only what you need
RUN pip install "netintel-ocr[kg,vector]"

# Your configuration...
```
Environment Configuration¶
Knowledge Graph Configuration¶
```bash
# Set environment variables for KG support
export FALKORDB_HOST=localhost
export FALKORDB_PORT=6379
export MINIRAG_LLM="ollama/gemma3:4b-it-qat"
export MINIRAG_EMBEDDING="ollama/qwen3-embedding:8b"
```
Milvus Configuration¶
```bash
# Configure the Milvus vector store
export MILVUS_HOST=localhost
export MILVUS_PORT=19530
export MILVUS_COLLECTION=netintel_vectors
```
Ollama Configuration¶
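The Ollama settings were dropped from this page. `OLLAMA_HOST` is Ollama's own standard variable; the model variable name below is an assumption for illustration, not confirmed by this guide:

```bash
# Point NetIntel-OCR at the Ollama server (port matches the status output above)
export OLLAMA_HOST=http://localhost:11434

# OCR model to use (assumed variable name; check the Configuration Guide)
export NETINTEL_OCR_MODEL="nanonets-ocr-s:latest"
```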
Upgrading from v0.1.17¶
For Minimal Users¶
If you only need core OCR:
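The upgrade command is missing here; assuming the standard `pip` workflow, upgrading the base package without extras keeps the footprint minimal:

```bash
# Upgrade core OCR only (~500MB footprint)
pip install --upgrade netintel-ocr
```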
For Full Installation Users¶
If you had everything installed in v0.1.17:
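To keep feature parity with the monolithic v0.1.17 package, reinstall with all the extras documented above (illustrative command, assuming the standard extras syntax):

```bash
# Upgrade and re-select every documented extra to match the old full install
pip install --upgrade "netintel-ocr[kg,vector,api,mcp,performance,dev]"
```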
Checking What You Need¶
Run this before upgrading to see what features you're using:
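The command itself is missing from this page; the version report shown earlier lists exactly which modules are installed and which features are active, so it is presumably:

```bash
netintel-ocr --version
```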
Troubleshooting Installation¶
Module Not Found Errors¶
If you get "module not found" errors:
```bash
# Check what's installed
netintel-ocr --version

# Install the missing module (example for KG)
pip install "netintel-ocr[kg]"
```
Connection Errors¶
If features show as "not connected":
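The checks themselves were dropped from this page. A quick sketch, using the default ports from this guide (the Ollama `/api/tags` endpoint is standard; FalkorDB speaks the Redis protocol):

```bash
# Verify each backing service is reachable on its default port
curl -s http://localhost:11434/api/tags > /dev/null && echo "Ollama: ok"
redis-cli -h localhost -p 6379 ping                  # FalkorDB: expect PONG
nc -z localhost 19530 && echo "Milvus: port open"
```

If a check fails, start the corresponding service and re-run `netintel-ocr --version`.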
Performance Issues¶
If processing is slow:
```bash
# Install performance optimizations
pip install "netintel-ocr[performance]"

# Enable GPU support (if available)
pip install "netintel-ocr[kg]"  # Includes torch with CUDA
```
Next Steps¶
- Quick Start Guide - Process your first document
- Configuration Guide - Configure NetIntel-OCR
- Knowledge Graph Setup - Enable KG features
- API Documentation - Use the REST API
Support¶
For installation issues:

- Check the Troubleshooting Guide
- View GitHub Issues
- Join the Discord Community