Installation Guide¶

New in v0.1.17.1: Modular Installation

NetIntel-OCR v0.1.17.1 introduces modular installation, reducing the base installation from ~2.5GB to ~500MB. Install only the features you need!

Prerequisites¶

System Requirements¶

Operating System: Linux (Ubuntu 20.04+, RHEL 8+, or compatible)
Python: 3.10, 3.11, or 3.12
Memory: Minimum 8GB RAM (16GB+ recommended for Knowledge Graph)
Storage: 2GB minimum (varies with optional modules)

Required Software¶

# Install Python and system dependencies
sudo apt-get update
sudo apt-get install -y \
    python3.11 \
    python3.11-venv \
    python3.11-dev \
    gcc g++ \
    libgl1-mesa-glx \
    libglib2.0-0

# Install Ollama for LLM support
curl -fsSL https://ollama.com/install.sh | sh

# Pull the default OCR model
ollama pull nanonets-ocr-s:latest

Installation Options¶

🚀 Quick Install (Base Only)¶

The base installation includes core OCR functionality with network diagram detection (~500MB):

# Install base package
pip install netintel-ocr

# Verify installation
netintel-ocr --version

📦 Modular Installation¶

Choose the features you need with modular installation:

Knowledge Graph Support (+1.5GB)¶

# Install with Knowledge Graph support
pip install "netintel-ocr[kg]"

# This adds:
# - PyKEEN for knowledge graph embeddings
# - torch for deep learning
# - FalkorDB for graph storage
# - scikit-learn, matplotlib, plotly for analysis

Vector Store Support (+300MB)¶

# Install with vector database support
pip install "netintel-ocr[vector]"

# This adds:
# - pymilvus for Milvus integration
# - qdrant-client for Qdrant support
# - chromadb for ChromaDB
# - lancedb for LanceDB

API Server Support (+50MB)¶

# Install with REST API server
pip install "netintel-ocr[api]"

# This adds:
# - FastAPI web framework
# - uvicorn ASGI server
# - python-multipart for file uploads

MCP Server Support (+30MB)¶

# Install with Model Context Protocol server
pip install "netintel-ocr[mcp]"

# This adds:
# - fastmcp for MCP protocol
# - websockets for real-time communication

Performance Optimizations (+200MB)¶

# Install with C++ performance optimizations
pip install "netintel-ocr[performance]"

# This adds:
# - numpy with MKL optimizations
# - numba for JIT compilation
# - cython for C extensions

Development Tools (+100MB)¶

# Install with development and testing tools
pip install "netintel-ocr[dev]"

# This adds:
# - pytest for testing
# - black for code formatting
# - ruff for linting
# - mypy for type checking

🎯 Preset Configurations¶

Production Installation¶

# Includes: KG + Vector + API + Performance
pip install "netintel-ocr[production]"

Cloud Deployment¶

# Includes: Vector + API + MCP
pip install "netintel-ocr[cloud]"

Complete Installation¶

# Install everything
pip install "netintel-ocr[all]"

Checking Installation Status¶

Version Information¶

The enhanced --version command shows comprehensive installation status:

netintel-ocr --version

Output example:

NetIntel-OCR v0.1.17.1
├── Core Components:
│   ├── C++ Core: ✓ v1.0.1
│   ├── AVX2: ✓
│   ├── OpenMP: ✓
│   └── Platform: Linux x86_64
├── Installed Modules:
│   ├── [base] Core OCR: ✓ (always installed)
│   ├── [kg] Knowledge Graph: ✓ (pykeen 1.10.1)
│   ├── [vector] Vector Store: ✓ (pymilvus 2.3.0)
│   ├── [api] API Server: ✗ (not installed)
│   └── [mcp] MCP Server: ✗ (not installed)
├── Available for Install:
│   ├── [api] API Server: pip install netintel-ocr[api]
│   │   └── Adds: fastapi, uvicorn (+50MB)
│   └── [mcp] MCP Server: pip install netintel-ocr[mcp]
│       └── Adds: fastmcp, websockets (+30MB)
└── Active Features:
    ├── FalkorDB: ✓ (connected to localhost:6379)
    ├── Milvus: ✓ (connected to localhost:19530)
    └── Ollama: ✓ (connected to localhost:11434)

JSON Output¶

For programmatic use, get JSON output:

netintel-ocr --version --json

Docker Installation¶

Using Pre-built Image¶

# Pull the Docker image
docker pull netintel/netintel-ocr:0.1.17.1

# Run with base features
docker run -v $(pwd):/data netintel/netintel-ocr:0.1.17.1 process pdf document.pdf

Building Custom Image¶

# Dockerfile for custom installation
FROM python:3.11-slim

# Install only what you need
RUN pip install netintel-ocr[kg,vector]

# Your configuration...

Environment Configuration¶

Knowledge Graph Configuration¶

# Set environment variables for KG support
export FALKORDB_HOST=localhost
export FALKORDB_PORT=6379
export MINIRAG_LLM="ollama/gemma3:4b-it-qat"
export MINIRAG_EMBEDDING="ollama/qwen3-embedding:8b"

Milvus Configuration¶

# Configure Milvus vector store
export MILVUS_HOST=localhost
export MILVUS_PORT=19530
export MILVUS_COLLECTION=netintel_vectors

Ollama Configuration¶

# Configure Ollama endpoint
export OLLAMA_HOST="http://localhost:11434"

Upgrading from v0.1.17¶

For Minimal Users¶

If you only need core OCR:

pip uninstall netintel-ocr
pip install netintel-ocr==0.1.17.1

For Full Installation Users¶

If you had everything installed in v0.1.17:

pip uninstall netintel-ocr
pip install "netintel-ocr[all]==0.1.17.1"

Checking What You Need¶

Run this before upgrading to see what features you're using:

netintel-ocr --version --detailed

Troubleshooting Installation¶

Module Not Found Errors¶

If you get "module not found" errors:

# Check what's installed
netintel-ocr --version

# Install missing module (example for KG)
pip install "netintel-ocr[kg]"

Connection Errors¶

If features show as "not connected":

# Start required services
docker-compose up -d falkordb milvus
ollama serve

Performance Issues¶

If processing is slow:

# Install performance optimizations
pip install "netintel-ocr[performance]"

# Enable GPU support (if available)
pip install "netintel-ocr[kg]"  # Includes torch with CUDA

Next Steps¶

Quick Start Guide - Process your first document
Configuration Guide - Configure NetIntel-OCR
Knowledge Graph Setup - Enable KG features
API Documentation - Use the REST API

Support¶

For installation issues: - Check Troubleshooting Guide - View GitHub Issues - Join Discord Community