Multi-Model Selection Guide¶
Overview¶
NetIntel-OCR supports multiple vision-language models, each suited to a different task: text extraction, network diagrams, or flow diagrams. Matching the model to the task and to the available memory improves both accuracy and processing speed.
Model Categories¶
OCR-Optimized Models¶
Best for text extraction from documents.
| Model | Speed | Accuracy | Memory | Use Case |
|---|---|---|---|---|
| Nanonets-OCR-s:latest | ⚡⚡⚡ | High | 4GB | Default OCR |
| moondream:latest | ⚡⚡ | Medium | 3GB | Fast processing |
| NetIntelOCR-7B-0925 | ⚡⚡ | Very High | 8GB | Default (v0.1.16) |
Vision-Language Models¶
Best for diagram understanding and component extraction.
| Model | Speed | Accuracy | Memory | Use Case |
|---|---|---|---|---|
| qwen2.5vl:7b | ⚡⚡ | Very High | 8GB | Recommended |
| llava:13b | ⚡ | Highest | 16GB | Complex diagrams |
| cogvlm:latest | Slow | Highest | 32GB | Critical accuracy |
| minicpm-v:latest | ⚡⚡⚡ | Medium | 4GB | Quick preview |
Lightweight Models¶
Best for quick detection and simple diagrams.
| Model | Speed | Accuracy | Memory | Use Case |
|---|---|---|---|---|
| bakllava:latest | ⚡⚡⚡ | Medium | 4GB | Fast detection |
| llava-phi3:latest | ⚡⚡⚡ | Medium | 3GB | Edge deployment |
| llama3.2-vision:11b | ⚡ | High | 12GB | Balanced |
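The three tables above can be read as a small catalog: pick a category, then take the most accurate model that fits in memory. The following Python sketch illustrates that reading. It is not part of NetIntel-OCR; the category labels (`ocr`, `vision`, `lightweight`), the `ModelInfo` type, and the `best_fit` helper are assumptions introduced for illustration, while the model names, accuracy labels, and memory figures are copied from the tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    name: str
    category: str   # "ocr", "vision", or "lightweight" (labels invented for this sketch)
    accuracy: str   # accuracy label as printed in the tables
    memory_gb: int  # memory figure as printed in the tables


# Rank the accuracy labels so they can be compared.
ACCURACY_RANK = {"Medium": 1, "High": 2, "Very High": 3, "Highest": 4}

CATALOG = [
    ModelInfo("Nanonets-OCR-s:latest", "ocr", "High", 4),
    ModelInfo("moondream:latest", "ocr", "Medium", 3),
    ModelInfo("NetIntelOCR-7B-0925", "ocr", "Very High", 8),
    ModelInfo("qwen2.5vl:7b", "vision", "Very High", 8),
    ModelInfo("llava:13b", "vision", "Highest", 16),
    ModelInfo("cogvlm:latest", "vision", "Highest", 32),
    ModelInfo("minicpm-v:latest", "vision", "Medium", 4),
    ModelInfo("bakllava:latest", "lightweight", "Medium", 4),
    ModelInfo("llava-phi3:latest", "lightweight", "Medium", 3),
    ModelInfo("llama3.2-vision:11b", "lightweight", "High", 12),
]


def best_fit(category, memory_gb):
    """Return the most accurate model in `category` that fits in `memory_gb`, or None."""
    candidates = [
        m for m in CATALOG
        if m.category == category and m.memory_gb <= memory_gb
    ]
    if not candidates:
        return None
    # Prefer higher accuracy; break ties with the smaller memory footprint.
    return max(candidates, key=lambda m: (ACCURACY_RANK[m.accuracy], -m.memory_gb))
```

For example, `best_fit("vision", 8)` selects `qwen2.5vl:7b`, which matches the "Recommended" entry in the vision-language table.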
Task-Specific Recommendations¶
Text Extraction¶
# Fast text extraction
netintel-ocr process file document.pdf --model moondream:latest
# High accuracy OCR
netintel-ocr process file document.pdf --model Nanonets-OCR-s:latest
# Default balanced approach
netintel-ocr process file document.pdf --model NetIntelOCR-7B-0925
Network Diagrams¶
# Simple network topology
netintel-ocr process file network.pdf --network-model minicpm-v:latest
# Complex architecture
netintel-ocr process file architecture.pdf --network-model llava:13b
# Recommended for most cases
netintel-ocr process file design.pdf --network-model qwen2.5vl:7b
Flow Diagrams¶
# Business process flows
netintel-ocr process file process.pdf --flow-model qwen2.5vl:7b
# Complex decision trees
netintel-ocr process file workflow.pdf --flow-model llava:13b
# Quick extraction
netintel-ocr process file simple-flow.pdf --flow-model bakllava:latest
Model Selection Strategy¶
By Document Type¶
Technical Specifications¶
netintel-ocr \
--model Nanonets-OCR-s:latest \
--network-model cogvlm:latest \
--flow-model llava:13b \
technical-spec.pdf
Marketing Materials¶
netintel-ocr \
--model moondream:latest \
--network-model minicpm-v:latest \
--flow-model bakllava:latest \
brochure.pdf
Security Documentation¶
netintel-ocr \
--model NetIntelOCR-7B-0925 \
--network-model qwen2.5vl:7b \
--security-focus \
security-guide.pdf
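When the same document types recur, the flag combinations above can be kept in one place and turned into a command on demand. The sketch below is a hypothetical wrapper, not a NetIntel-OCR feature: the profile names and the `build_command` helper are invented for illustration, and the flags are copied verbatim from the three commands in this section.

```python
import shlex

# Flag sets copied from the document-type commands in this section.
# The profile names are labels chosen for this sketch.
PROFILES = {
    "technical-spec": [
        "--model", "Nanonets-OCR-s:latest",
        "--network-model", "cogvlm:latest",
        "--flow-model", "llava:13b",
    ],
    "marketing": [
        "--model", "moondream:latest",
        "--network-model", "minicpm-v:latest",
        "--flow-model", "bakllava:latest",
    ],
    "security": [
        "--model", "NetIntelOCR-7B-0925",
        "--network-model", "qwen2.5vl:7b",
        "--security-focus",
    ],
}


def build_command(doc_type, pdf_path):
    """Return the argv list for `doc_type`, mirroring the commands shown above."""
    if doc_type not in PROFILES:
        raise ValueError(f"unknown document type: {doc_type!r}")
    return ["netintel-ocr", *PROFILES[doc_type], pdf_path]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(build_command("security", "security-guide.pdf")))
```

Returning an argument list rather than a string keeps file names with spaces intact if the list is later passed to `subprocess.run`.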
By Resource Constraints¶
Limited Memory (4GB)¶
netintel-ocr \
--model moondream:latest \
--network-model minicpm-v:latest \
--low-memory \
document.pdf
GPU Available¶
CPU Only¶
netintel-ocr \
--model Nanonets-OCR-s:latest \
--network-model bakllava:latest \
--cpu-optimized \
document.pdf
Model Configuration¶
Default Models¶
Set the default model for each task, and optional fallbacks, in the configuration file:
# config.yaml
models:
  text_extraction: NetIntelOCR-7B-0925
  network_detection: qwen2.5vl:7b
  flow_detection: qwen2.5vl:7b
  component_extraction: qwen2.5vl:7b
  fallbacks:
    text_extraction: Nanonets-OCR-s:latest
    network_detection: minicpm-v:latest
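How NetIntel-OCR applies the fallback entries is not described here. One plausible reading is "use the primary model if it is installed, otherwise the fallback for the same task". The Python sketch below illustrates only that reading. The `resolve_model` helper and the `installed` set are hypothetical, and the primary and fallback mappings are held in two plain dicts regardless of how they are nested in the file.

```python
# Values copied from the configuration above.
PRIMARY = {
    "text_extraction": "NetIntelOCR-7B-0925",
    "network_detection": "qwen2.5vl:7b",
    "flow_detection": "qwen2.5vl:7b",
    "component_extraction": "qwen2.5vl:7b",
}
FALLBACKS = {
    "text_extraction": "Nanonets-OCR-s:latest",
    "network_detection": "minicpm-v:latest",
}


def resolve_model(task, installed):
    """Pick the primary model for `task` if installed, else its fallback.

    `installed` is a set of model names assumed to be available locally.
    """
    if task not in PRIMARY:
        raise KeyError(f"no default model configured for task {task!r}")
    primary = PRIMARY[task]
    if primary in installed:
        return primary
    fallback = FALLBACKS.get(task)
    if fallback is not None and fallback in installed:
        return fallback
    raise LookupError(f"neither the primary nor a fallback model for {task!r} is installed")
```

Note that `flow_detection` and `component_extraction` have no fallback entry, so under this reading they fail outright when `qwen2.5vl:7b` is missing.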
Model-Specific Parameters¶
model_configs:
  qwen2.5vl:
    temperature: 0.3
    max_tokens: 4096
    top_p: 0.9
  llava:
    temperature: 0.5
    max_tokens: 8192
    num_predict: 2048
  NetIntelOCR-7B-0925:
    temperature: 0.2
    repeat_penalty: 1.1
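The keys in `model_configs` omit the tag (`qwen2.5vl`, `llava`), while the commands elsewhere in this guide use tagged names (`qwen2.5vl:7b`, `llava:13b`). The sketch below assumes, without confirmation from the source, that parameters are matched on the part of the name before the colon. The `params_for` helper is hypothetical and only shows how such a lookup could work.

```python
# Parameter values copied from the configuration above.
MODEL_CONFIGS = {
    "qwen2.5vl": {"temperature": 0.3, "max_tokens": 4096, "top_p": 0.9},
    "llava": {"temperature": 0.5, "max_tokens": 8192, "num_predict": 2048},
    "NetIntelOCR-7B-0925": {"temperature": 0.2, "repeat_penalty": 1.1},
}


def params_for(model):
    """Return a copy of the parameters for `model`, matching on the name before ':'.

    Assumption: the tag (for example ':7b') is ignored when matching.
    Models without an entry get an empty dict.
    """
    family = model.split(":", 1)[0]
    return dict(MODEL_CONFIGS.get(family, {}))
```

Returning a copy means a caller can override a value for one run without changing the shared defaults.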
Performance Optimization¶
Batch Processing¶
# Use fast models for batch
netintel-ocr process batch \
--model moondream:latest \
--network-model minicpm-v:latest \
*.pdf
Multi-Pass Strategy¶
# First pass: Quick detection
netintel-ocr --detect-only \
--network-model bakllava:latest \
document.pdf
# Second pass: Detailed extraction on detected pages
netintel-ocr --pages 5,12,18 \
--network-model llava:13b \
document.pdf
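The two passes can be chained by feeding the pages flagged in the first pass into the `--pages` argument of the second. The source does not show how the detect-only pass reports its results, so the sketch below takes the page numbers as a plain list (a hypothetical input) and only builds the second command. The `second_pass_command` helper is invented for illustration.

```python
def second_pass_command(pages, pdf_path, model="llava:13b"):
    """Build the detailed-extraction command for the pages found in the first pass.

    `pages` is a list of page numbers assumed to come from the detect-only run.
    Returns None when no pages were flagged, so the second pass can be skipped.
    """
    if not pages:
        return None
    # Remove duplicates and sort so the argument is stable, e.g. "5,12,18".
    page_arg = ",".join(str(p) for p in sorted(set(pages)))
    return ["netintel-ocr", "--pages", page_arg, "--network-model", model, pdf_path]
```

With pages 5, 12, and 18 this reproduces the second command shown above; the slower model then runs only on pages that contain diagrams.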
Model Caching¶
# Preload models
netintel-ocr model preload \
"qwen2.5vl:7b,Nanonets-OCR-s:latest"
# Keep models in memory
netintel-ocr model keep-loaded \
--model-cache-ttl 3600 \
document.pdf
Model Benchmarks¶
Processing Speed (pages/minute)¶
| Task | Nanonets | qwen2.5vl | llava | minicpm-v |
|---|---|---|---|---|
| Text Only | 12 | 8 | 4 | 15 |
| Simple Diagram | 8 | 6 | 3 | 10 |
| Complex Diagram | 4 | 4 | 2 | 6 |
Accuracy Scores (F1)¶
| Task | Nanonets | qwen2.5vl | llava | minicpm-v |
|---|---|---|---|---|
| Text OCR | 0.95 | 0.92 | 0.94 | 0.85 |
| Component Detection | 0.82 | 0.91 | 0.94 | 0.78 |
| Connection Tracing | 0.75 | 0.88 | 0.92 | 0.72 |
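The two benchmark tables can be combined to answer a practical question: which model is fastest among those that meet a minimum F1 score? The Python sketch below works only from the numbers in the tables. The helper names are hypothetical, and pairing an accuracy task with a speed task (for example "Component Detection" with "Complex Diagram") is a judgment made by the caller, not something the tables define.

```python
# Pages per minute, copied from the processing speed table.
SPEED_PPM = {
    "Text Only": {"Nanonets": 12, "qwen2.5vl": 8, "llava": 4, "minicpm-v": 15},
    "Simple Diagram": {"Nanonets": 8, "qwen2.5vl": 6, "llava": 3, "minicpm-v": 10},
    "Complex Diagram": {"Nanonets": 4, "qwen2.5vl": 4, "llava": 2, "minicpm-v": 6},
}

# F1 scores, copied from the accuracy table.
F1_SCORES = {
    "Text OCR": {"Nanonets": 0.95, "qwen2.5vl": 0.92, "llava": 0.94, "minicpm-v": 0.85},
    "Component Detection": {"Nanonets": 0.82, "qwen2.5vl": 0.91, "llava": 0.94, "minicpm-v": 0.78},
    "Connection Tracing": {"Nanonets": 0.75, "qwen2.5vl": 0.88, "llava": 0.92, "minicpm-v": 0.72},
}


def fastest_meeting(accuracy_task, min_f1, speed_task):
    """Return the fastest model whose F1 on `accuracy_task` is at least `min_f1`, or None."""
    eligible = [m for m, score in F1_SCORES[accuracy_task].items() if score >= min_f1]
    if not eligible:
        return None
    return max(eligible, key=lambda m: SPEED_PPM[speed_task][m])


def estimated_minutes(model, speed_task, pages):
    """Estimate processing time in minutes from the pages-per-minute figure."""
    return pages / SPEED_PPM[speed_task][model]
```

For component detection with a 0.90 F1 floor on complex diagrams, this picks `qwen2.5vl`: it processes 4 pages per minute against 2 for `llava`, so a 100-page document takes an estimated 25 minutes instead of 50.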
Custom Model Integration¶
Add Custom Model¶
# Download and configure
ollama pull your-custom-model:latest
# Register with NetIntel-OCR
netintel-ocr model register \
--name custom-model \
--type vision-language \
--capabilities "network,flow,text"
Model Evaluation¶
# Test model performance
netintel-ocr model evaluate custom-model:latest \
--test-set /path/to/test/documents \
--metrics "accuracy,speed,memory"
Troubleshooting Models¶
Model Not Found¶
Out of Memory¶
# Use smaller model
netintel-ocr process file document.pdf --network-model minicpm-v:latest
# Reduce context size
netintel-ocr process file document.pdf --max-context 2048
Slow Processing¶
# Use faster model
netintel-ocr process file document.pdf --network-model bakllava:latest
# Enable GPU
netintel-ocr process file document.pdf --gpu
# Reduce quality for speed
netintel-ocr process file document.pdf --fast-mode
Next Steps¶
- Customization Guide - Fine-tune model parameters
- Performance Guide - Optimize for large batches
- Troubleshooting - Common issues and solutions