API Integration Guide¶
REST API Server¶
NetIntel-OCR provides a REST API for programmatic document processing and integration with external systems.
Starting the API Server¶
# Start API server on port 8000
netintel-ocr server api
# Custom port and host
netintel-ocr server api --port 8080 --host 0.0.0.0
# With authentication
netintel-ocr server api --api-key YOUR_SECRET_KEY
Docker API Mode¶
API Endpoints¶
Health Check¶
GET /health
Response:
{
  "status": "healthy",
  "version": "0.1.16.15",
  "models_available": ["qwen2.5vl:7b", "Nanonets-OCR-s:latest"],
  "milvus_connected": true
}
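A deployment script may want to gate on this response before submitting work. The following is a minimal sketch that only inspects the fields shown in the sample response above; the helper name `is_ready` is hypothetical and not part of any NetIntel-OCR package.

```python
from typing import Any, Mapping


def is_ready(health: Mapping[str, Any], required_model: str) -> bool:
    """Return True when the decoded /health body reports a usable server.

    Hypothetical helper: it checks only the fields visible in the sample
    response (status, models_available, milvus_connected).
    """
    return (
        health.get("status") == "healthy"
        and required_model in health.get("models_available", [])
        and bool(health.get("milvus_connected"))
    )


# Decoded form of the sample response shown above
sample_health = {
    "status": "healthy",
    "version": "0.1.16.15",
    "models_available": ["qwen2.5vl:7b", "Nanonets-OCR-s:latest"],
    "milvus_connected": True,
}
```

Calling `is_ready(sample_health, "qwen2.5vl:7b")` passes all three checks, while a model missing from `models_available` fails the second one.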
Process Document¶
POST /process
Content-Type: multipart/form-data
Parameters:
- file: PDF file (required)
- model: OCR model (optional)
- network_model: Network diagram model (optional)
- start_page: Starting page (optional)
- end_page: Ending page (optional)
- confidence_threshold: Detection threshold (optional)
Response:
{
  "job_id": "uuid-12345",
  "status": "processing",
  "estimated_time": 45
}
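For callers that do not use the client library, the request body can be assembled with the standard library alone. This is a sketch of generic multipart/form-data encoding under the parameter names listed above; `build_process_request` is a hypothetical helper, and the server's exact expectations (for example the part content type) are assumed rather than confirmed.

```python
import uuid
from typing import Dict, Optional, Tuple


def build_process_request(
    pdf_bytes: bytes,
    filename: str,
    fields: Optional[Dict[str, object]] = None,
    boundary: Optional[str] = None,
) -> Tuple[bytes, str]:
    """Encode a PDF plus optional form fields as multipart/form-data.

    Returns (body, content_type). Hypothetical helper; field names such as
    "model" or "start_page" come from the parameter list above.
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    # Plain text fields (model, start_page, end_page, ...)
    for name, value in (fields or {}).items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    # The required "file" part carrying the raw PDF bytes
    header = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        'Content-Type: application/pdf\r\n\r\n'
    ).encode()
    parts.append(header + pdf_bytes + b"\r\n")
    # Closing boundary
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

The returned body and content type can then be passed to `urllib.request.Request(url, data=body, headers={"Content-Type": content_type}, method="POST")`, together with whichever authentication header the server is configured for.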
Get Job Status¶
GET /status/{job_id}
Response:
{
  "job_id": "uuid-12345",
  "status": "completed",
  "progress": 100,
  "pages_processed": 10,
  "diagrams_found": 3
}
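Because processing is asynchronous, a caller typically polls this endpoint until the job finishes. Below is a minimal polling sketch with the HTTP call injected as a function, so it can be exercised without a server. The name `wait_for_completion` is hypothetical, and the terminal status `"failed"` is an assumption inferred from the `job.failed` webhook event; only `"processing"` and `"completed"` appear in the sample responses.

```python
import time
from typing import Any, Callable, Dict

# "failed" is assumed from the job.failed webhook event, not from a sample response
TERMINAL_STATUSES = ("completed", "failed")


def wait_for_completion(
    fetch_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll fetch_status(job_id) until the job reaches a terminal status.

    fetch_status should return the decoded JSON body of GET /status/{job_id}.
    Raises TimeoutError if no terminal status is seen within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status(job_id)
        if status.get("status") in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} s")
        sleep(interval)
```

Injecting `sleep` and `clock` keeps the loop testable; in production the defaults are sufficient.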
Get Results¶
GET /results/{job_id}
Response:
{
  "job_id": "uuid-12345",
  "pages": [
    {
      "page_number": 1,
      "type": "text",
      "content": "..."
    },
    {
      "page_number": 2,
      "type": "network_diagram",
      "mermaid": "graph TB...",
      "components": [...],
      "context": {...}
    }
  ],
  "summary": {...}
}
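Each page in the result is tagged with a `type`, so diagram pages can be separated from text pages with a simple filter. The sketch below works on the decoded JSON body and uses only the keys shown in the sample; `extract_diagrams` is a hypothetical helper name.

```python
from typing import Any, Dict, List


def extract_diagrams(result: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return page number and Mermaid source for every network-diagram page."""
    return [
        {"page_number": page["page_number"], "mermaid": page.get("mermaid", "")}
        for page in result.get("pages", [])
        if page.get("type") == "network_diagram"
    ]


# Trimmed-down version of the sample response above
sample_result = {
    "job_id": "uuid-12345",
    "pages": [
        {"page_number": 1, "type": "text", "content": "..."},
        {"page_number": 2, "type": "network_diagram", "mermaid": "graph TB..."},
    ],
}
diagrams = extract_diagrams(sample_result)
```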
Search Documents¶
POST /search
Content-Type: application/json
{
  "query": "firewall configuration",
  "collection": "network_docs",
  "limit": 10,
  "filters": {
    "document_type": "network"
  }
}
Response:
{
  "results": [
    {
      "document": "firewall-guide.pdf",
      "page": 5,
      "score": 0.92,
      "content": "...",
      "metadata": {...}
    }
  ]
}
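The request and response shapes above can be handled with two small helpers: one that serializes the search body and one that keeps only sufficiently relevant hits. Both names (`build_search_body`, `top_hits`) are hypothetical, and the sketch assumes that a higher `score` means a better match, which the sample suggests but does not state.

```python
import json
from typing import Any, Dict, List, Optional


def build_search_body(
    query: str,
    collection: str,
    limit: int = 10,
    filters: Optional[Dict[str, Any]] = None,
) -> bytes:
    """Serialize a POST /search body using the keys from the sample request."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    payload: Dict[str, Any] = {"query": query, "collection": collection, "limit": limit}
    if filters:
        payload["filters"] = filters
    return json.dumps(payload).encode("utf-8")


def top_hits(response: Dict[str, Any], min_score: float = 0.0) -> List[Dict[str, Any]]:
    """Keep results scoring at least min_score, best first (assumes higher is better)."""
    hits = [r for r in response.get("results", []) if r.get("score", 0.0) >= min_score]
    return sorted(hits, key=lambda r: r.get("score", 0.0), reverse=True)
```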
Python Client¶
Installation¶
# Install client library
pip install netintel-ocr-client
# Or install full package with client
pip install "netintel-ocr[client]"
Package Details
Main package: https://pypi.org/project/netintel-ocr/
Client library: https://pypi.org/project/netintel-ocr-client/
Basic Usage¶
from netintel_client import NetIntelClient

# Initialize client
client = NetIntelClient(
    host="http://localhost:8000",
    api_key="your-secret-key"
)

# Process document
job = client.process_document(
    file_path="network-design.pdf",
    model="qwen2.5vl:7b",
    start_page=1,
    end_page=10
)

# Wait for completion
result = client.wait_for_job(job.job_id)

# Get results
pages = result.pages
diagrams = [p for p in pages if p.type == "network_diagram"]
Async Processing¶
import asyncio
from netintel_client import AsyncNetIntelClient

async def process_documents(pdf_files):
    client = AsyncNetIntelClient("http://localhost:8000")
    # Submit every document; each call returns a job handle
    jobs = []
    for pdf in pdf_files:
        job = await client.process_document(pdf)
        jobs.append(job)
    # Wait for all jobs concurrently
    results = await asyncio.gather(
        *[client.wait_for_job(j.job_id) for j in jobs]
    )
    return results

# Run the coroutine from synchronous code
results = asyncio.run(process_documents(["network-design.pdf"]))
JavaScript/TypeScript Client¶
Installation¶
Usage¶
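const fs = require('fs');
const { NetIntelClient } = require('netintel-ocr-client');

const client = new NetIntelClient({
  host: 'http://localhost:8000',
  apiKey: 'your-secret-key'
});

// CommonJS modules do not allow top-level await, so wrap the calls in an async function
async function main() {
  // Read the PDF into a buffer
  const fileBuffer = fs.readFileSync('network-design.pdf');

  // Process document
  const job = await client.processDocument({
    file: fileBuffer,
    model: 'qwen2.5vl:7b'
  });

  // Get results
  const result = await client.waitForJob(job.jobId);
  console.log(`Found ${result.diagramsFound} diagrams`);
}

main().catch(console.error);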
Webhook Integration¶
Configure Webhooks¶
POST /webhooks
Content-Type: application/json
{
  "url": "https://your-server.com/webhook",
  "events": ["job.completed", "job.failed"],
  "secret": "webhook-secret"
}
Webhook Payload¶
{
  "event": "job.completed",
  "job_id": "uuid-12345",
  "timestamp": "2024-01-15T10:30:00Z",
  "data": {
    "pages_processed": 10,
    "diagrams_found": 3,
    "processing_time": 45.2
  }
}
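On the receiving side, a handler usually authenticates the delivery and then dispatches on the `event` field. This guide does not specify how the webhook `secret` is applied, so the sketch below assumes a common convention: the sender computes an HMAC-SHA256 hex digest of the raw request body and passes it in a header. That signing scheme, and the helper names `verify_signature` and `handle_event`, are assumptions and should be checked against the server's actual behavior.

```python
import hashlib
import hmac
import json


def verify_signature(secret: str, raw_body: bytes, signature_hex: str) -> bool:
    """Check an HMAC-SHA256 hex digest of the raw body (assumed signing scheme)."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing
    return hmac.compare_digest(expected, signature_hex)


def handle_event(raw_body: bytes) -> str:
    """Dispatch on the event name using the payload fields shown above."""
    event = json.loads(raw_body)
    name = event.get("event")
    job_id = event.get("job_id")
    if name == "job.completed":
        found = event.get("data", {}).get("diagrams_found", 0)
        return f"job {job_id} completed: {found} diagrams"
    if name == "job.failed":
        return f"job {job_id} failed"
    return f"ignored event {name}"
```

Verification must run on the raw bytes as received, before any JSON parsing or re-serialization, since re-encoding can change the byte sequence and invalidate the digest.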
Rate Limiting¶
Default limits:
- 100 requests per minute per API key
- 10 concurrent jobs per API key
- 100 MB max file size
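A client can stay under the per-minute request limit by throttling itself before sending. The following sliding-window limiter is a client-side sketch only; how the server itself counts requests (fixed or sliding window) is not documented here, and `SlidingWindowLimiter` is a hypothetical class name.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls within any window_seconds span."""

    def __init__(
        self,
        max_requests: int = 100,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._sent: Deque[float] = deque()  # timestamps of recent requests

    def try_acquire(self) -> bool:
        """Record a request and return True if it fits in the current window."""
        now = self._clock()
        # Drop timestamps that have aged out of the window
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()
        if len(self._sent) < self._max:
            self._sent.append(now)
            return True
        return False
```

A caller would check `limiter.try_acquire()` before each request and wait briefly when it returns `False`.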
Configure custom limits:
Authentication¶
API Key Authentication¶
# Set API key
export NETINTEL_API_KEY=your-secret-key
# Or in request header
curl -H "X-API-Key: your-secret-key" \
http://localhost:8000/process
JWT Authentication¶
# Get token
POST /auth/token
{
  "username": "user",
  "password": "pass"
}
# Use token
curl -H "Authorization: Bearer jwt-token" \
http://localhost:8000/process
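The two schemes differ only in the header they send, so a raw HTTP caller can choose between them in one place. This sketch builds the header dictionary from the names used in the curl examples above; `auth_headers` is a hypothetical helper, and falling back to the `NETINTEL_API_KEY` environment variable is this helper's own choice, not documented client behavior.

```python
import os
from typing import Dict, Mapping, Optional


def auth_headers(
    api_key: Optional[str] = None,
    jwt_token: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Dict[str, str]:
    """Build request headers for JWT or API-key authentication.

    A JWT takes precedence when both are given (an arbitrary choice made
    for this sketch). Without an explicit key, NETINTEL_API_KEY is used.
    """
    if jwt_token:
        return {"Authorization": f"Bearer {jwt_token}"}
    key = api_key or env.get("NETINTEL_API_KEY")
    return {"X-API-Key": key} if key else {}
```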
Error Handling¶
Error Response Format¶
{
  "error": {
    "code": "INVALID_MODEL",
    "message": "Model 'unknown-model' not found",
    "details": {
      "available_models": ["qwen2.5vl:7b", "llava:13b"]
    }
  },
  "request_id": "req-12345"
}
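A caller working with raw HTTP responses may find it convenient to turn this envelope into a typed exception. The sketch below relies only on the fields shown in the sample; `NetIntelAPIError` and `parse_error` are hypothetical names, not classes exported by the client library.

```python
import json
from typing import Any, Dict, Optional, Union


class NetIntelAPIError(Exception):
    """Hypothetical exception carrying the fields of the error envelope."""

    def __init__(
        self,
        code: str,
        message: str,
        details: Optional[Dict[str, Any]] = None,
        request_id: Optional[str] = None,
    ) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.details = details or {}
        self.request_id = request_id


def parse_error(body: Union[str, bytes]) -> NetIntelAPIError:
    """Decode an error response body into a NetIntelAPIError."""
    payload = json.loads(body)
    err = payload.get("error", {})
    return NetIntelAPIError(
        code=err.get("code", "UNKNOWN"),
        message=err.get("message", ""),
        details=err.get("details"),
        request_id=payload.get("request_id"),
    )
```

Keeping `request_id` on the exception makes it easy to include in log messages and support requests.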
Common Error Codes¶
| Code | Description | Solution |
|---|---|---|
| INVALID_FILE | PDF file corrupt or invalid | Verify PDF file |
| MODEL_NOT_FOUND | Requested model unavailable | Check available models |
| RATE_LIMITED | Too many requests | Retry after delay |
| PROCESSING_FAILED | Internal processing error | Check logs |
| TIMEOUT | Processing timeout | Reduce page range |
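Of the codes above, only `RATE_LIMITED` is resolved by simply retrying; the others need a change to the request or a look at the logs. The following sketch retries a call with exponential backoff while the response carries that code. The function name `call_with_retry`, the backoff parameters, and the choice of retryable codes are assumptions; the guide does not say whether the server returns a suggested delay.

```python
import time
from typing import Any, Callable, Dict

# Only RATE_LIMITED lists "Retry after delay" as its solution in the table above
RETRYABLE_CODES = frozenset({"RATE_LIMITED"})


def call_with_retry(
    send: Callable[[], Dict[str, Any]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call send() until it returns a body without a retryable error code.

    send() should return the decoded JSON response. The last body is
    returned as-is once max_attempts is reached.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        body = send()
        attempt += 1
        code = body.get("error", {}).get("code")
        if code not in RETRYABLE_CODES or attempt >= max_attempts:
            return body
        # Exponential backoff: base_delay, 2x, 4x, ... capped at max_delay
        sleep(min(base_delay * 2 ** (attempt - 1), max_delay))
```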
Monitoring¶
Metrics Endpoint¶
GET /metrics
Response (Prometheus format):
netintel_requests_total{method="POST",endpoint="/process"} 1234
netintel_processing_duration_seconds{quantile="0.99"} 45.2
netintel_active_jobs 5
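For quick checks without a Prometheus server, the text output can be parsed directly. This is a simplified sketch of the exposition format that handles only the line shapes shown above (metric name, optional labels, numeric value); it skips comment lines and any line it cannot match, including samples with a trailing timestamp. `parse_metrics` is a hypothetical helper.

```python
import re
from typing import Dict, List, Tuple

# name, optional {labels}, then a single value token
_SAMPLE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)
_LABEL = re.compile(r'(\w+)="([^"]*)"')

Sample = Tuple[str, Dict[str, str], float]


def parse_metrics(text: str) -> List[Sample]:
    """Parse simple Prometheus text lines into (name, labels, value) tuples."""
    samples: List[Sample] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if match is None:
            continue  # unsupported line shape
        labels = dict(_LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


sample_metrics = """\
netintel_requests_total{method="POST",endpoint="/process"} 1234
netintel_processing_duration_seconds{quantile="0.99"} 45.2
netintel_active_jobs 5
"""
parsed = parse_metrics(sample_metrics)
```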
Logging¶
Next Steps¶
- MCP Server Guide - Model Context Protocol integration
- Batch Processing - Process multiple documents
- Deployment Guide - Production setup