Multi-Model Selection Guide

Overview

NetIntel-OCR supports multiple vision-language models, each optimized for different tasks. Matching the model to the task and to the available hardware improves accuracy and throughput.

Model Categories

OCR-Optimized Models

Best for text extraction from documents.

Model	Speed	Accuracy	Memory	Use Case
`Nanonets-OCR-s:latest`	⚡⚡⚡	High	4GB	Default OCR
`moondream:latest`	⚡⚡	Medium	3GB	Fast processing
`NetIntelOCR-7B-0925`	⚡⚡	Very High	8GB	Default (v0.1.16)

Vision-Language Models

Best for diagram understanding and component extraction.

Model	Speed	Accuracy	Memory	Use Case
`qwen2.5vl:7b`	⚡⚡	Very High	8GB	Recommended
`llava:13b`	⚡	Highest	16GB	Complex diagrams
`cogvlm:latest`	Slow	Highest	32GB	Critical accuracy
`minicpm-v:latest`	⚡⚡⚡	Medium	4GB	Quick preview
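
The memory column above can drive a simple selection rule. The following is a minimal sketch, assuming you want the largest listed vision-language model that fits a given memory budget; the function name `pick_network_model` and the "largest that fits" rule are illustrative assumptions, not part of NetIntel-OCR.

```shell
# Sketch: choose a vision-language model from the table by memory budget (in GB).
# Memory figures are copied from the table; the selection rule is an assumption.
pick_network_model() {
  mem_gb=$1
  if [ "$mem_gb" -ge 32 ]; then
    echo "cogvlm:latest"
  elif [ "$mem_gb" -ge 16 ]; then
    echo "llava:13b"
  elif [ "$mem_gb" -ge 8 ]; then
    echo "qwen2.5vl:7b"
  elif [ "$mem_gb" -ge 4 ]; then
    echo "minicpm-v:latest"
  else
    echo "no listed vision-language model fits in ${mem_gb}GB" >&2
    return 1
  fi
}
```

For example, `pick_network_model 12` prints `qwen2.5vl:7b`, which can then be passed to `--network-model`.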

Lightweight Models

Best for quick detection and simple diagrams.

Model	Speed	Accuracy	Memory	Use Case
`bakllava:latest`	⚡⚡⚡	Medium	4GB	Fast detection
`llava-phi3:latest`	⚡⚡⚡	Medium	3GB	Edge deployment
`llama3.2-vision:11b`	⚡	High	12GB	Balanced

Task-Specific Recommendations

Text Extraction

# Fast text extraction
netintel-ocr process file document.pdf --model moondream:latest

# High accuracy OCR
netintel-ocr process file document.pdf --model Nanonets-OCR-s:latest

# Default balanced approach
netintel-ocr process file document.pdf --model NetIntelOCR-7B-0925

Network Diagrams

# Simple network topology
netintel-ocr process file network.pdf --network-model minicpm-v:latest

# Complex architecture
netintel-ocr process file architecture.pdf --network-model llava:13b

# Recommended for most cases
netintel-ocr process file design.pdf --network-model qwen2.5vl:7b

Flow Diagrams

# Business process flows
netintel-ocr process file process.pdf --flow-model qwen2.5vl:7b

# Complex decision trees
netintel-ocr process file workflow.pdf --flow-model llava:13b

# Quick extraction
netintel-ocr process file simple-flow.pdf --flow-model bakllava:latest
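
The diagram recommendations above can be wrapped in a small helper. This is a sketch only: the function name `suggest_command` and the priority labels `fast`, `balanced`, and `accurate` are assumptions introduced here, while the models and flags are the ones shown in the commands above. It prints the command rather than running it.

```shell
# Sketch: map a diagram type and a speed/accuracy priority to a suggested command.
# Prints the command instead of executing it, so no models need to be installed.
suggest_command() {
  kind=$1      # "network" or "flow"
  priority=$2  # "fast", "balanced", or "accurate" (labels are assumptions)
  file=$3

  case "$kind:$priority" in
    network:fast)                   model="minicpm-v:latest" ;;
    flow:fast)                      model="bakllava:latest" ;;
    network:balanced|flow:balanced) model="qwen2.5vl:7b" ;;
    network:accurate|flow:accurate) model="llava:13b" ;;
    *) echo "unknown combination: $kind/$priority" >&2; return 1 ;;
  esac

  echo "netintel-ocr process file $file --${kind}-model $model"
}
```

Calling `suggest_command flow fast simple-flow.pdf` prints the same quick-extraction command listed under Flow Diagrams.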

Model Selection Strategy

By Document Type

Technical Specifications

netintel-ocr process file technical-spec.pdf \
  --model Nanonets-OCR-s:latest \
  --network-model cogvlm:latest \
  --flow-model llava:13b

Marketing Materials

netintel-ocr process file brochure.pdf \
  --model moondream:latest \
  --network-model minicpm-v:latest \
  --flow-model bakllava:latest

Security Documentation

netintel-ocr process file security-guide.pdf \
  --model NetIntelOCR-7B-0925 \
  --network-model qwen2.5vl:7b \
  --security-focus

By Resource Constraints

Limited Memory (4GB)

netintel-ocr process file document.pdf \
  --model moondream:latest \
  --network-model minicpm-v:latest \
  --low-memory

GPU Available

netintel-ocr process file document.pdf \
  --model NetIntelOCR-7B-0925 \
  --network-model llava:13b \
  --gpu

CPU Only

netintel-ocr process file document.pdf \
  --model Nanonets-OCR-s:latest \
  --network-model bakllava:latest \
  --cpu-optimized
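
Choosing between the GPU and CPU-only settings can be scripted. The sketch below assumes that finding `nvidia-smi` on the `PATH` is a good enough sign of a usable GPU, which is a simplification; the function name `hardware_flag` is hypothetical, and the two flags it prints are the ones used above.

```shell
# Sketch: print --gpu if a GPU management tool is on PATH, else --cpu-optimized.
# Assumption: the presence of nvidia-smi implies a usable NVIDIA GPU.
hardware_flag() {
  probe=${1:-nvidia-smi}   # command to look for; overridable for other vendors
  if command -v "$probe" >/dev/null 2>&1; then
    echo "--gpu"
  else
    echo "--cpu-optimized"
  fi
}
```

The result can be spliced into a command line, for example `netintel-ocr process file document.pdf $(hardware_flag)`.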

Model Configuration

Default Models

Set default models, and optional fallbacks, in the configuration file:

# config.yaml
models:
  text_extraction: NetIntelOCR-7B-0925
  network_detection: qwen2.5vl:7b
  flow_detection: qwen2.5vl:7b
  component_extraction: qwen2.5vl:7b

fallbacks:
  text_extraction: Nanonets-OCR-s:latest
  network_detection: minicpm-v:latest
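
How NetIntel-OCR applies the `fallbacks` entries internally is not described here. As a manual stand-in, the sketch below retries text extraction with a fallback model when the first run fails. It assumes that `netintel-ocr` exits with a non-zero status on failure; `run_with_fallback` and the `NETINTEL_CMD` override are hypothetical names used only for illustration.

```shell
# Sketch: retry text extraction with a fallback model if the primary run fails.
# Assumption: netintel-ocr returns a non-zero exit status on failure.
# NETINTEL_CMD is a hypothetical override that allows dry runs with a stub.
run_with_fallback() {
  file=$1
  primary=$2
  fallback=$3
  cmd=${NETINTEL_CMD:-netintel-ocr}

  if "$cmd" process file "$file" --model "$primary"; then
    return 0
  fi
  echo "primary model $primary failed, retrying with $fallback" >&2
  "$cmd" process file "$file" --model "$fallback"
}
```

A call such as `run_with_fallback document.pdf NetIntelOCR-7B-0925 Nanonets-OCR-s:latest` mirrors the `text_extraction` pair in the configuration above.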

Model-Specific Parameters

model_configs:
  qwen2.5vl:
    temperature: 0.3
    max_tokens: 4096
    top_p: 0.9

  llava:
    temperature: 0.5
    max_tokens: 8192
    num_predict: 2048

  NetIntelOCR-7B-0925:
    temperature: 0.2
    repeat_penalty: 1.1

Performance Optimization

Batch Processing

# Use fast models for batch
netintel-ocr process batch \
  --model moondream:latest \
  --network-model minicpm-v:latest \
  *.pdf

Multi-Pass Strategy

# First pass: quick detection with a fast model
netintel-ocr process file document.pdf \
  --detect-only \
  --network-model bakllava:latest

# Second pass: detailed extraction on the pages flagged in the first pass
netintel-ocr process file document.pdf \
  --pages 5,12,18 \
  --network-model llava:13b
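
The two passes can be connected by turning the detected page numbers into the comma-separated form that `--pages` takes. The sketch below assumes the page numbers are available one per line; the actual output format of `--detect-only` is not documented here, so that input format and the function name `pages_arg` are assumptions.

```shell
# Sketch: convert page numbers on stdin (one per line) into "5,12,18" form.
# Assumption: the first pass has already been reduced to one page number per line.
pages_arg() {
  # Sort numerically, drop duplicates, then join the lines with commas.
  sort -n | uniq | paste -sd, -
}
```

With a hypothetical file `detected-pages.txt` holding one number per line, `--pages "$(pages_arg < detected-pages.txt)"` would supply the second pass with its page list.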

Model Caching

# Preload models
netintel-ocr model preload \
  "qwen2.5vl:7b,Nanonets-OCR-s:latest"

# Keep models in memory
netintel-ocr model keep-loaded \
  --model-cache-ttl 3600 \
  document.pdf

Model Benchmarks

Processing Speed (pages/minute)

Task	Nanonets	qwen2.5vl	llava	minicpm-v
Text Only	12	8	4	15
Simple Diagram	8	6	3	10
Complex Diagram	4	4	2	6
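
The throughput figures translate directly into a rough time estimate: pages divided by pages per minute, rounded up. The helper below is a minimal sketch of that arithmetic; the name `estimate_minutes` is illustrative, and real runs will vary with hardware and page content.

```shell
# Sketch: estimate processing time in whole minutes from a pages/minute figure.
estimate_minutes() {
  pages=$1
  pages_per_minute=$2
  # Integer ceiling division: (a + b - 1) / b
  echo $(( (pages + pages_per_minute - 1) / pages_per_minute ))
}
```

For example, 50 pages of complex diagrams with llava at 2 pages/minute gives `estimate_minutes 50 2`, which prints `25`, while the same pages with minicpm-v at 6 pages/minute gives `9`.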

Accuracy Scores (F1)

Task	Nanonets	qwen2.5vl	llava	minicpm-v
Text OCR	0.95	0.92	0.94	0.85
Component Detection	0.82	0.91	0.94	0.78
Connection Tracing	0.75	0.88	0.92	0.72
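
The speed and accuracy tables can be read together: pick the fastest model that still meets a minimum F1 score. The sketch below does this for component detection on complex diagrams, using figures copied from the two tables above. The function name `fastest_accurate_model` and the tie-breaking rule (first listed model wins) are assumptions.

```shell
# Sketch: among models meeting a minimum component-detection F1,
# print the one with the highest complex-diagram throughput.
fastest_accurate_model() {
  min_f1=$1
  awk -v min="$min_f1" '
    # columns: model, component-detection F1, complex-diagram pages/minute
    $2 + 0 >= min + 0 && $3 + 0 > best { best = $3 + 0; name = $1 }
    END { if (name == "") exit 1; print name }
  ' <<'EOF'
Nanonets   0.82 4
qwen2.5vl  0.91 4
llava      0.94 2
minicpm-v  0.78 6
EOF
}
```

With a threshold of 0.90, only qwen2.5vl and llava qualify, and qwen2.5vl is the faster of the two, which matches its "Recommended" label in the model tables. The function returns a non-zero status when no model meets the threshold.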

Custom Model Integration

Add Custom Model

# Download and configure
ollama pull your-custom-model:latest

# Register with NetIntel-OCR
netintel-ocr model register \
  --name custom-model \
  --type vision-language \
  --capabilities "network,flow,text"

Model Evaluation

# Test model performance
netintel-ocr model evaluate custom-model:latest \
  --test-set /path/to/test/documents \
  --metrics "accuracy,speed,memory"

Troubleshooting Models

Model Not Found

# List available models
ollama list

# Pull missing model
ollama pull qwen2.5vl:7b
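
The two commands above can be combined so that a model is pulled only when it is missing. This is a sketch, assuming `ollama list` prints a header row followed by one model per line with the model name in the first column; the function name `ensure_model` is hypothetical.

```shell
# Sketch: pull a model with Ollama only if it is not already installed.
ensure_model() {
  model=$1
  # Skip the header row and compare the first column against the exact name.
  if ollama list | awk 'NR > 1 { print $1 }' | grep -Fxq "$model"; then
    return 0
  fi
  echo "model $model not found locally, pulling" >&2
  ollama pull "$model"
}
```

Running `ensure_model qwen2.5vl:7b` before a processing job avoids the "model not found" error without pulling models that are already present.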

Out of Memory

# Use smaller model
netintel-ocr process file document.pdf --network-model minicpm-v:latest

# Reduce context size
netintel-ocr process file document.pdf --max-context 2048

Slow Processing

# Use faster model
netintel-ocr process file document.pdf --network-model bakllava:latest

# Enable GPU
netintel-ocr process file document.pdf --gpu

# Reduce quality for speed
netintel-ocr process file document.pdf --fast-mode

Next Steps

Customization Guide - Fine-tune model parameters
Performance Guide - Optimize for large batches
Troubleshooting - Common issues and solutions