Multi-Model Selection Guide

Overview

NetIntel-OCR supports multiple vision-language models, each suited to different tasks. Choosing a model that matches the task and the available hardware trades off accuracy, speed, and memory use.

Model Categories

OCR-Optimized Models

Best for text extraction from documents.

| Model | Speed | Accuracy | Memory | Use Case |
|---|---|---|---|---|
| Nanonets-OCR-s:latest | ⚡⚡⚡ | High | 4GB | Default OCR |
| moondream:latest | ⚡⚡ | Medium | 3GB | Fast processing |
| NetIntelOCR-7B-0925 | ⚡⚡ | Very High | 8GB | Default (v0.1.16) |

Vision-Language Models

Best for diagram understanding and component extraction.

| Model | Speed | Accuracy | Memory | Use Case |
|---|---|---|---|---|
| qwen2.5vl:7b | ⚡⚡ | Very High | 8GB | Recommended |
| llava:13b | | Highest | 16GB | Complex diagrams |
| cogvlm:latest | Slow | Highest | 32GB | Critical accuracy |
| minicpm-v:latest | ⚡⚡⚡ | Medium | 4GB | Quick preview |
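
The memory column above can drive a simple selection rule. The sketch below uses a hypothetical helper, `pick_network_model`, which is not part of NetIntel-OCR. It returns the most accurate vision-language model from the table whose listed memory fits a budget given in whole GB. The thresholds are copied from the table and assume the model is the only large memory consumer.

```shell
# Hypothetical helper (not part of NetIntel-OCR): choose a --network-model
# value from the table above based on a memory budget in whole GB.
pick_network_model() {
  mem_gb=$1
  if [ "$mem_gb" -ge 32 ]; then
    echo "cogvlm:latest"      # 32GB, highest accuracy, slow
  elif [ "$mem_gb" -ge 16 ]; then
    echo "llava:13b"          # 16GB, highest accuracy
  elif [ "$mem_gb" -ge 8 ]; then
    echo "qwen2.5vl:7b"       # 8GB, recommended
  elif [ "$mem_gb" -ge 4 ]; then
    echo "minicpm-v:latest"   # 4GB, quick preview
  else
    echo "no listed vision-language model fits in ${mem_gb}GB" >&2
    return 1
  fi
}

# Example: an 8GB budget selects the recommended model.
# netintel-ocr process file design.pdf --network-model "$(pick_network_model 8)"
```

Because `cogvlm:latest` is listed as slow, a throughput-sensitive setup may prefer to cap the rule at `llava:13b` even when 32GB is available.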

Lightweight Models

Best for quick detection and simple diagrams.

| Model | Speed | Accuracy | Memory | Use Case |
|---|---|---|---|---|
| bakllava:latest | ⚡⚡⚡ | Medium | 4GB | Fast detection |
| llava-phi3:latest | ⚡⚡⚡ | Medium | 3GB | Edge deployment |
| llama3.2-vision:11b | | High | 12GB | Balanced |

Task-Specific Recommendations

Text Extraction

# Fast text extraction
netintel-ocr process file document.pdf --model moondream:latest

# High accuracy OCR
netintel-ocr process file document.pdf --model Nanonets-OCR-s:latest

# Default balanced approach
netintel-ocr process file document.pdf --model NetIntelOCR-7B-0925

Network Diagrams

# Simple network topology
netintel-ocr process file network.pdf --network-model minicpm-v:latest

# Complex architecture
netintel-ocr process file architecture.pdf --network-model llava:13b

# Recommended for most cases
netintel-ocr process file design.pdf --network-model qwen2.5vl:7b

Flow Diagrams

# Business process flows
netintel-ocr process file process.pdf --flow-model qwen2.5vl:7b

# Complex decision trees
netintel-ocr process file workflow.pdf --flow-model llava:13b

# Quick extraction
netintel-ocr process file simple-flow.pdf --flow-model bakllava:latest

Model Selection Strategy

By Document Type

Technical Specifications

netintel-ocr process file technical-spec.pdf \
  --model Nanonets-OCR-s:latest \
  --network-model cogvlm:latest \
  --flow-model llava:13b

Marketing Materials

netintel-ocr process file brochure.pdf \
  --model moondream:latest \
  --network-model minicpm-v:latest \
  --flow-model bakllava:latest

Security Documentation

netintel-ocr process file security-guide.pdf \
  --model NetIntelOCR-7B-0925 \
  --network-model qwen2.5vl:7b \
  --security-focus

By Resource Constraints

Limited Memory (4GB)

netintel-ocr process file document.pdf \
  --model moondream:latest \
  --network-model minicpm-v:latest \
  --low-memory

GPU Available

netintel-ocr process file document.pdf \
  --model NetIntelOCR-7B-0925 \
  --network-model llava:13b \
  --gpu

CPU Only

netintel-ocr process file document.pdf \
  --model Nanonets-OCR-s:latest \
  --network-model bakllava:latest \
  --cpu-optimized
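
The three profiles above can be chosen automatically. The following sketch maps an amount of memory and a GPU flag to the corresponding option set. The helper names (`resource_flags`, `detect_mem_gb`, `detect_gpu`) are hypothetical, and the detection logic assumes a Linux host with `/proc/meminfo` and treats the presence of `nvidia-smi` as evidence of a usable GPU.

```shell
# Hypothetical helper: map resources to the option sets shown above.
# $1 = available memory in whole GB, $2 = "yes" if a GPU is usable.
resource_flags() {
  mem_gb=$1
  has_gpu=$2
  if [ "$mem_gb" -le 4 ]; then
    echo "--model moondream:latest --network-model minicpm-v:latest --low-memory"
  elif [ "$has_gpu" = "yes" ]; then
    echo "--model NetIntelOCR-7B-0925 --network-model llava:13b --gpu"
  else
    echo "--model Nanonets-OCR-s:latest --network-model bakllava:latest --cpu-optimized"
  fi
}

# Assumption: Linux, where MemTotal in /proc/meminfo is reported in kB.
detect_mem_gb() {
  awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' /proc/meminfo
}

# Assumption: an NVIDIA GPU is present when nvidia-smi is on PATH.
detect_gpu() {
  if command -v nvidia-smi >/dev/null 2>&1; then echo yes; else echo no; fi
}

# Example (the substitution is left unquoted so the flags split into words):
# netintel-ocr process file document.pdf $(resource_flags "$(detect_mem_gb)" "$(detect_gpu)")
```

The low-memory check runs first, so a small machine with a GPU still gets the lightweight models.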

Model Configuration

Default Models

Set default models in configuration:

# config.yaml
models:
  text_extraction: NetIntelOCR-7B-0925
  network_detection: qwen2.5vl:7b
  flow_detection: qwen2.5vl:7b
  component_extraction: qwen2.5vl:7b

fallbacks:
  text_extraction: Nanonets-OCR-s:latest
  network_detection: minicpm-v:latest
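
This page does not state how the `fallbacks` entries are applied. One plausible reading is "use the primary model if it is installed, otherwise the fallback". The sketch below illustrates that rule only; `resolve_model` is a hypothetical helper, and NetIntel-OCR's actual behaviour may differ.

```shell
# Hypothetical illustration of primary/fallback resolution.
# Usage: resolve_model PRIMARY FALLBACK INSTALLED...
# Prints the primary if it appears among the installed names,
# otherwise the fallback, otherwise fails.
resolve_model() {
  primary=$1
  fallback=$2
  shift 2
  for name in "$@"; do
    if [ "$name" = "$primary" ]; then
      echo "$primary"
      return 0
    fi
  done
  for name in "$@"; do
    if [ "$name" = "$fallback" ]; then
      echo "$fallback"
      return 0
    fi
  done
  echo "neither $primary nor $fallback is installed" >&2
  return 1
}

# Example: only the fallback is installed, so it is selected.
# resolve_model NetIntelOCR-7B-0925 Nanonets-OCR-s:latest Nanonets-OCR-s:latest
```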

Model-Specific Parameters

model_configs:
  qwen2.5vl:
    temperature: 0.3
    max_tokens: 4096
    top_p: 0.9

  llava:
    temperature: 0.5
    max_tokens: 8192
    num_predict: 2048

  NetIntelOCR-7B-0925:
    temperature: 0.2
    repeat_penalty: 1.1

Performance Optimization

Batch Processing

# Use fast models for batch
netintel-ocr process batch \
  --model moondream:latest \
  --network-model minicpm-v:latest \
  *.pdf

Multi-Pass Strategy

# First pass: Quick detection
netintel-ocr process file document.pdf --detect-only \
  --network-model bakllava:latest

# Second pass: Detailed extraction on detected pages
netintel-ocr process file document.pdf --pages 5,12,18 \
  --network-model llava:13b
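
The second pass needs the detected pages as a comma-separated list. The output format of the first pass is not documented here, so the sketch below assumes the page numbers have already been collected, one per line. `pages_arg` is a hypothetical helper that sorts them, drops duplicates, and joins them for `--pages`.

```shell
# Hypothetical helper: turn page numbers (one per line on stdin) into the
# comma-separated list passed to --pages in the second pass above.
# Pages are sorted numerically and duplicates are dropped.
pages_arg() {
  sort -n -u | paste -s -d, -
}

# Example with the pages used above, given out of order and with a repeat:
# printf '%s\n' 12 5 18 5 | pages_arg    # prints 5,12,18
```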

Model Caching

# Preload models
netintel-ocr model preload \
  "qwen2.5vl:7b,Nanonets-OCR-s:latest"

# Keep models in memory
netintel-ocr model keep-loaded \
  --model-cache-ttl 3600 \
  document.pdf

Model Benchmarks

Processing Speed (pages/minute)

| Task | Nanonets | qwen2.5vl | llava | minicpm-v |
|---|---|---|---|---|
| Text Only | 12 | 8 | 4 | 15 |
| Simple Diagram | 8 | 6 | 3 | 10 |
| Complex Diagram | 4 | 4 | 2 | 6 |
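
These throughput figures translate directly into a time estimate: minutes = pages / (pages per minute), rounded up. The sketch below applies that to a 100-page text-only document using the Text Only row. `estimate_minutes` is a hypothetical helper, and the result is only as good as the benchmark figures on your hardware.

```shell
# Estimate processing time from the pages/minute figures above.
# Usage: estimate_minutes PAGES PAGES_PER_MINUTE  (rounds up to whole minutes)
estimate_minutes() {
  pages=$1
  rate=$2
  echo $(( (pages + rate - 1) / rate ))
}

# Text Only throughput from the table, as model:pages_per_minute pairs.
for entry in Nanonets:12 qwen2.5vl:8 llava:4 minicpm-v:15; do
  model=${entry%%:*}
  rate=${entry##*:}
  echo "$model: ~$(estimate_minutes 100 "$rate") min for 100 text-only pages"
done
```

For 100 pages this gives roughly 9, 13, 25, and 7 minutes respectively.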

Accuracy Scores (F1)

| Task | Nanonets | qwen2.5vl | llava | minicpm-v |
|---|---|---|---|---|
| Text OCR | 0.95 | 0.92 | 0.94 | 0.85 |
| Component Detection | 0.82 | 0.91 | 0.94 | 0.78 |
| Connection Tracing | 0.75 | 0.88 | 0.92 | 0.72 |
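
Speed and accuracy can be combined into one rule: pick the fastest model that still meets a minimum F1. The sketch below does this for complex diagrams, pairing the Complex Diagram pages/minute row with the Component Detection F1 row. That pairing and the helper name `fastest_meeting_f1` are assumptions made for illustration.

```shell
# Pick the fastest model whose Component Detection F1 meets a minimum.
# Columns in the embedded data: model, Complex Diagram pages/minute, F1.
# Ties on speed are broken by the higher F1. Fails if no model qualifies.
fastest_meeting_f1() {
  awk -v min="$1" '
    $3 >= min && ($2 > speed || ($2 == speed && $3 > f1)) {
      best = $1; speed = $2; f1 = $3
    }
    END { if (best == "") exit 1; print best }
  ' <<'EOF'
Nanonets 4 0.82
qwen2.5vl 4 0.91
llava 2 0.94
minicpm-v 6 0.78
EOF
}

# Example: require at least 0.90 F1 on component detection.
# fastest_meeting_f1 0.90    # prints qwen2.5vl
```

With a 0.90 threshold the rule selects `qwen2.5vl`, which matches the "Recommended for most cases" choice earlier in this guide.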

Custom Model Integration

Add Custom Model

# Download and configure
ollama pull your-custom-model:latest

# Register with NetIntel-OCR
netintel-ocr model register \
  --name custom-model \
  --type vision-language \
  --capabilities "network,flow,text"

Model Evaluation

# Test model performance
netintel-ocr model evaluate custom-model:latest \
  --test-set /path/to/test/documents \
  --metrics "accuracy,speed,memory"

Troubleshooting Models

Model Not Found

# List available models
ollama list

# Pull missing model
ollama pull qwen2.5vl:7b
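
The two commands above can be combined so that a model is pulled only when it is missing. This sketch relies on `ollama list` printing a header line followed by one row per model with the name in the first column. `model_installed` and `ensure_model` are hypothetical helper names.

```shell
# Read `ollama list` output on stdin and succeed if MODEL appears in the
# first (NAME) column. The first line is a header, so it is skipped.
model_installed() {
  awk -v want="$1" 'NR > 1 && $1 == want { found = 1 } END { exit !found }'
}

# Pull the model only when it is not already installed.
ensure_model() {
  if ollama list | model_installed "$1"; then
    echo "$1 already installed"
  else
    ollama pull "$1"
  fi
}

# Example:
# ensure_model qwen2.5vl:7b
```

The match is exact, so the tag must be included (`qwen2.5vl:7b`, not `qwen2.5vl`).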

Out of Memory

# Use smaller model
netintel-ocr process file document.pdf --network-model minicpm-v:latest

# Reduce context size
netintel-ocr process file document.pdf --max-context 2048

Slow Processing

# Use faster model
netintel-ocr process file document.pdf --network-model bakllava:latest

# Enable GPU
netintel-ocr process file document.pdf --gpu

# Reduce quality for speed
netintel-ocr process file document.pdf --fast-mode

Next Steps