API v2 Complete User Guide¶

Overview¶

NetIntel-OCR v0.1.18.1 introduces API v2 with 100% feature parity to the CLI, providing enterprise-grade document intelligence capabilities through RESTful endpoints, GraphQL queries, and WebSocket real-time updates. All 30+ CLI options are now available programmatically!

Getting Started¶

Starting the API Server¶

# Start the API server
netintel-ocr server api --port 8000

# Start with authentication enabled
netintel-ocr server api --auth-enabled --port 8000

# Start with all features
netintel-ocr server all --port 8000

Base URL¶

All API v2 endpoints are prefixed with /api/v2:

http://localhost:8000/api/v2

Health Check¶

# Check API health
curl http://localhost:8000/api/v2/health

# Response
{
  "status": "healthy",
  "version": "0.1.18.1",
  "services": {
    "milvus": "connected",
    "falkordb": "connected",
    "redis": "connected",
    "ollama": "connected"
  }
}
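
A client can inspect the `services` map in this response before sending work to the API. The following is a minimal sketch, assuming any value other than "connected" means the dependency is unavailable; the "disconnected" value in the sample is hypothetical and not taken from the API:

```python
from __future__ import annotations


def unhealthy_services(health: dict) -> list[str]:
    """Return the names of services whose state is not "connected"."""
    services = health.get("services", {})
    return sorted(name for name, state in services.items() if state != "connected")


# Sample payload shaped like the health response above.
# The "disconnected" value is an assumption used for illustration.
sample = {
    "status": "healthy",
    "services": {
        "milvus": "connected",
        "falkordb": "connected",
        "redis": "disconnected",
        "ollama": "connected",
    },
}
print(unhealthy_services(sample))
```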

Authentication¶

OAuth2 Authentication¶

NetIntel-OCR supports OAuth2/OIDC authentication with JWT tokens.

# Login to get access token
curl -X POST http://localhost:8000/api/v2/auth/login \
  -H "Content-Type: application/json" \
  -d '{
    "username": "admin",
    "password": "your-password"
  }'

# Response
{
  "access_token": "eyJhbGciOiJIUzI1NiIs...",
  "token_type": "Bearer",
  "expires_in": 3600,
  "refresh_token": "refresh-token-here"
}

Using the Token¶

Include the token in the Authorization header for all subsequent requests:

curl http://localhost:8000/api/v2/documents \
  -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIs..."

RBAC (Role-Based Access Control)¶

The API supports different roles with varying permissions:

admin: Full access to all endpoints
analyst: Read/write access to documents and search
viewer: Read-only access
operator: System management access

🆕 Complete Feature Parity (v0.1.18.1)¶

All CLI Options Now Available in API¶

NetIntel-OCR v0.1.18.1 achieves 100% feature parity between CLI and API. Every single CLI option is now available through the API:

Feature Category	CLI Options	API Fields	Status
Multi-Model	`--model`, `--network-model`, `--flow-model`	✅ Complete	100%
Processing	30+ options including all modes, extraction settings	✅ Complete	100%
Vector	Milvus default, chunking strategies	✅ Complete	100%
Knowledge Graph	Full KG extraction with embeddings	✅ Complete	100%

Example: Full-Featured Document Upload¶

# Upload with ALL options (NEW in v0.1.18.1!)
curl -X POST http://localhost:8000/api/v2/documents/upload \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@document.pdf" \
  -F 'options={
    "model": "nanonets-ocr-s",
    "network_model": "qwen2.5vl",
    "flow_model": "custom-flow",
    "pages": "1-50",
    "confidence": 0.8,
    "fast_extraction": true,
    "table_method": "hybrid",
    "with_kg": true,
    "vector_format": "milvus",
    "chunk_strategy": "semantic"
  }'
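
When calling the upload endpoint from Python, the `options` form field can be produced with `json.dumps` rather than written by hand. This is a sketch only; `build_upload_form` is a hypothetical helper, and the option names are the ones shown in the curl example above:

```python
import json


def build_upload_form(options: dict) -> dict:
    """Serialize processing options into the single 'options' form field."""
    return {"options": json.dumps(options)}


form = build_upload_form({
    "model": "nanonets-ocr-s",
    "pages": "1-50",
    "confidence": 0.8,
    "with_kg": True,
})

# With the requests library, the form could then be sent roughly like this:
#   with open("document.pdf", "rb") as f:
#       requests.post(url, headers=headers, files={"file": f}, data=form)
print(form["options"])
```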

Document Management¶

Upload Document¶

Standard Upload with Multi-Model Support¶

# Upload a PDF document with multi-model configuration
curl -X POST http://localhost:8000/api/v2/documents/upload \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@document.pdf" \
  -F 'options={
    "model": "nanonets-ocr-s",
    "network_model": "qwen2.5vl",
    "extract_tables": true,
    "with_kg": true,
    "vector_format": "milvus"
  }'

# Response
{
  "document_id": "doc_123456",
  "status": "processing",
  "filename": "document.pdf",
  "size": 5242880,
  "pages": 50,
  "processing_options": {
    "ocr_enabled": true,
    "kg_enabled": true,
    "vector_enabled": true
  }
}

Streaming Upload (Large Files)¶

For files larger than 100MB, use streaming upload:

import os

import requests

# Access token obtained from /api/v2/auth/login (placeholder value)
token = "your-jwt-token"

# Initialize streaming session
def init_streaming_upload(filename, file_size):
    response = requests.post(
        "http://localhost:8000/api/v2/documents/upload/stream/init",
        headers={"Authorization": f"Bearer {token}"},
        json={
            "filename": filename,
            "file_size": file_size,
            "chunk_size": 5 * 1024 * 1024  # 5MB chunks
        }
    )
    return response.json()["session_id"]

# Upload chunks
def upload_chunk(session_id, chunk_number, chunk_data):
    response = requests.post(
        f"http://localhost:8000/api/v2/documents/upload/stream/{session_id}/chunk",
        headers={"Authorization": f"Bearer {token}"},
        files={"chunk": chunk_data},
        data={"chunk_number": chunk_number}
    )
    return response.json()

# Complete upload
def complete_upload(session_id):
    response = requests.post(
        f"http://localhost:8000/api/v2/documents/upload/stream/{session_id}/complete",
        headers={"Authorization": f"Bearer {token}"}
    )
    return response.json()

# Example usage
file_path = "large_document.pdf"
file_size = os.path.getsize(file_path)
chunk_size = 5 * 1024 * 1024  # 5MB

session_id = init_streaming_upload("large_document.pdf", file_size)

with open(file_path, 'rb') as f:
    chunk_number = 0
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        upload_chunk(session_id, chunk_number, chunk)
        chunk_number += 1

result = complete_upload(session_id)
print(f"Document uploaded: {result['document_id']}")
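
The chunking loop above can also be written as a generator, which makes it easy to check how many chunks a file will produce before uploading. A small sketch, with `iter_chunks` and `expected_chunks` as hypothetical helper names and an in-memory stream standing in for the file:

```python
import io
import math


def iter_chunks(stream, chunk_size):
    """Yield (chunk_number, chunk_bytes) pairs until the stream is exhausted."""
    chunk_number = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk_number, chunk
        chunk_number += 1


def expected_chunks(file_size, chunk_size):
    """Number of chunks needed to cover file_size bytes."""
    return math.ceil(file_size / chunk_size)


# 12 bytes split into 5-byte chunks: two full chunks and one 2-byte remainder.
stream = io.BytesIO(b"x" * 12)
chunks = list(iter_chunks(stream, 5))
print(len(chunks), expected_chunks(12, 5))
```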

Document Versioning¶

# Get document versions
curl http://localhost:8000/api/v2/documents/doc_123456/versions \
  -H "Authorization: Bearer $TOKEN"

# Create a new version
curl -X POST http://localhost:8000/api/v2/documents/doc_123456/versions \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@updated_document.pdf" \
  -F "comment=Updated section 3"

# Compare versions
curl "http://localhost:8000/api/v2/documents/doc_123456/versions/compare?v1=1&v2=2" \
  -H "Authorization: Bearer $TOKEN"

Batch Processing¶

# Submit batch processing job
curl -X POST http://localhost:8000/api/v2/documents/batch \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "input_path": "/data/documents/",
    "output_path": "/data/processed/",
    "parallel_workers": 8,
    "processing_options": {
      "enable_ocr": true,
      "enable_kg": true,
      "enable_vector": true,
      "enable_dedup": true
    }
  }'

# Check batch status
curl http://localhost:8000/api/v2/batch/batch_123/progress \
  -H "Authorization: Bearer $TOKEN"

Milvus Vector Operations¶

Collection Management¶

# Create a collection
curl -X POST http://localhost:8000/api/v2/milvus/collections \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "collection_name": "documents",
    "fields": [
      {
        "name": "id",
        "type": "int64",
        "is_primary": true,
        "auto_id": true
      },
      {
        "name": "embedding",
        "type": "float_vector",
        "dim": 768
      },
      {
        "name": "content",
        "type": "varchar",
        "max_length": 65535
      }
    ],
    "enable_dynamic_field": true
  }'

# List collections
curl http://localhost:8000/api/v2/milvus/collections \
  -H "Authorization: Bearer $TOKEN"

# Get collection details
curl http://localhost:8000/api/v2/milvus/collections/documents \
  -H "Authorization: Bearer $TOKEN"

# Load collection into memory
curl -X POST http://localhost:8000/api/v2/milvus/collections/documents/load \
  -H "Authorization: Bearer $TOKEN"

Index Management¶

# Create index
curl -X POST http://localhost:8000/api/v2/milvus/collections/documents/indexes \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "field_name": "embedding",
    "index_type": "IVF_FLAT",
    "metric_type": "L2",
    "params": {
      "nlist": 1024
    }
  }'

# Check index building progress
curl http://localhost:8000/api/v2/milvus/collections/documents/indexes/embedding/progress \
  -H "Authorization: Bearer $TOKEN"

Vector Operations¶

# Insert vectors
curl -X POST http://localhost:8000/api/v2/milvus/collections/documents/insert \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "data": [
      {
        "embedding": [0.1, 0.2, 0.3, ...],
        "content": "Document content here",
        "metadata": {
          "source": "document.pdf",
          "page": 1
        }
      }
    ]
  }'

# Search vectors
curl -X POST http://localhost:8000/api/v2/milvus/collections/documents/search \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "vector": [0.1, 0.2, 0.3, ...],
    "top_k": 10,
    "filter": "page > 0",
    "output_fields": ["content", "metadata"]
  }'

Search and Query¶

Advanced Search¶

# Advanced multi-field search
curl -X POST http://localhost:8000/api/v2/search/advanced \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "network security architecture",
    "filters": {
      "document_type": ["pdf", "docx"],
      "date_range": {
        "start": "2024-01-01",
        "end": "2024-12-31"
      },
      "confidence_min": 0.8
    },
    "search_options": {
      "enable_semantic": true,
      "enable_kg": true,
      "rerank_strategy": "cross_encoder",
      "max_results": 20
    }
  }'

Hybrid Search¶

# Hybrid vector + keyword search
curl -X POST http://localhost:8000/api/v2/search/hybrid \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text_query": "firewall configuration",
    "vector_query": [0.1, 0.2, 0.3, ...],
    "alpha": 0.7,
    "filters": {
      "document_type": "network_diagram"
    },
    "top_k": 15
  }'
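
The `alpha` parameter weights the two halves of a hybrid query. How the server combines scores is not specified here; a common convention, assumed in this sketch, is a linear blend in which `alpha` weights the vector score and `1 - alpha` weights the keyword score:

```python
def blend_scores(vector_score, keyword_score, alpha):
    """Linear blend of two normalized scores (assumed convention, not the server's code)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * vector_score + (1.0 - alpha) * keyword_score


# With alpha = 0.7, the vector score contributes 70% of the final score.
score = blend_scores(vector_score=0.9, keyword_score=0.5, alpha=0.7)
print(round(score, 2))
```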

Result Reranking¶

The API supports multiple reranking strategies:

# Search with reranking
curl -X POST http://localhost:8000/api/v2/search/similarity \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "VPN setup guide",
    "rerank_config": {
      "strategy": "cross_encoder",
      "model": "ms-marco-MiniLM-L-12-v2",
      "top_k": 10
    }
  }'

Available reranking strategies:

cross_encoder: Cross-encoder neural reranking
feature_based: Feature-based scoring
reciprocal_rank_fusion: RRF for combining multiple rankings
mmr: Maximal Marginal Relevance for diversity
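
To illustrate what reciprocal rank fusion does, here is a generic sketch of the technique. It is not the server's implementation; the constant `k = 60` is a commonly used default and is an assumption here, as are the document IDs:

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists: each item scores sum(1 / (k + rank))."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_ranking = ["doc_a", "doc_b", "doc_c"]
keyword_ranking = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([vector_ranking, keyword_ranking])
print(fused)
```

Documents that appear near the top of several lists rise in the fused ranking even if no single list ranks them first.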

Knowledge Graph¶

Initialize Knowledge Graph¶

# Initialize FalkorDB
curl -X POST http://localhost:8000/api/v2/kg/initialize \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "graph_name": "netintel_kg",
    "clear_existing": false
  }'

Cypher Queries¶

# Execute Cypher query
curl -X POST http://localhost:8000/api/v2/kg/cypher \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "MATCH (n:NetworkDevice)-[r:CONNECTS_TO]->(m:NetworkDevice) WHERE n.type = \"firewall\" RETURN n, r, m LIMIT 10"
  }'
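
The escaped quotes in the curl example are easy to get wrong by hand. From Python, `json.dumps` handles the escaping; a small sketch, where `cypher_payload` is a hypothetical helper name:

```python
import json


def cypher_payload(query: str) -> str:
    """Wrap a Cypher query in the JSON body expected by the /kg/cypher example."""
    return json.dumps({"query": query})


query = (
    "MATCH (n:NetworkDevice)-[r:CONNECTS_TO]->(m:NetworkDevice) "
    'WHERE n.type = "firewall" RETURN n, r, m LIMIT 10'
)
body = cypher_payload(query)
print(body)
```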

Hybrid KG Search¶

# Hybrid search with Knowledge Graph
curl -X POST http://localhost:8000/api/v2/kg/hybrid-search \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Show me all security devices in DMZ",
    "strategy": "adaptive",
    "include_embeddings": true,
    "max_hops": 2
  }'

Path Finding¶

# Find paths between entities
curl -X POST http://localhost:8000/api/v2/kg/paths \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "start_entity": "firewall_01",
    "end_entity": "database_server",
    "max_length": 5,
    "relationship_types": ["CONNECTS_TO", "ROUTES_THROUGH"]
  }'

Enterprise Features¶

Deduplication¶

# Check for duplicates
curl -X POST http://localhost:8000/api/v2/deduplication/check \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "document_path": "/data/document.pdf",
    "dedup_mode": "hybrid",
    "simhash_bits": 128,
    "hamming_threshold": 5
  }'
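
The `simhash_bits` and `hamming_threshold` parameters refer to SimHash fingerprints compared by Hamming distance. The sketch below shows the general technique only; the tokenization (whitespace split) and hash function (BLAKE2b) are assumptions, and the service's actual algorithm may differ:

```python
import hashlib


def simhash(text: str, bits: int = 128) -> int:
    """Compute a SimHash fingerprint over whitespace-separated tokens."""
    if bits % 8 != 0 or not 8 <= bits <= 512:
        raise ValueError("bits must be a multiple of 8 between 8 and 512")
    weights = [0] * bits
    for token in text.lower().split():
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=bits // 8).digest()
        value = int.from_bytes(digest, "big")
        for i in range(bits):
            # Each token votes +1 or -1 on every bit position.
            weights[i] += 1 if (value >> i) & 1 else -1
    return sum(1 << i for i, weight in enumerate(weights) if weight > 0)


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(text_a: str, text_b: str, threshold: int = 5) -> bool:
    """Treat two texts as near-duplicates if their fingerprints differ in few bits."""
    return hamming_distance(simhash(text_a), simhash(text_b)) <= threshold


print(is_near_duplicate("firewall rule set for the DMZ", "firewall rule set for the DMZ"))
```

A lower threshold makes the check stricter: identical texts always have distance 0, while unrelated texts tend to differ in roughly half of the bits.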

# Find similar documents
curl -X POST http://localhost:8000/api/v2/deduplication/find-similar \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "document_id": "doc_123456",
    "similarity_threshold": 0.85,
    "include_cdc_analysis": true,
    "limit": 20
  }'

Performance Monitoring¶

# Get performance metrics
curl http://localhost:8000/api/v2/performance/metrics \
  -H "Authorization: Bearer $TOKEN"

# Run benchmark
curl -X POST http://localhost:8000/api/v2/performance/benchmark \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "test_type": "vector_search",
    "dataset_size": 10000,
    "iterations": 5
  }'

Module Management¶

# Get module status
curl http://localhost:8000/api/v2/modules/status \
  -H "Authorization: Bearer $TOKEN"

# Configure modules
curl -X POST http://localhost:8000/api/v2/modules/configure \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "enable_kg": true,
    "enable_dedup": true,
    "enable_c_extensions": true,
    "vector_backend": "milvus"
  }'

Configuration Templates¶

# Get available templates
curl http://localhost:8000/api/v2/config/templates \
  -H "Authorization: Bearer $TOKEN"

# Apply template
curl -X POST http://localhost:8000/api/v2/config/apply-template \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "template": "enterprise",
    "customize": {
      "max_workers": 16,
      "cache_size": "10GB"
    }
  }'

GraphQL API¶

GraphQL Endpoint¶

The GraphQL endpoint is available at:

http://localhost:8000/api/v2/graphql

Query Examples¶

# Search documents
query SearchDocuments {
  searchDocuments(
    query: "network security",
    filters: {
      documentType: ["pdf"],
      dateRange: {
        start: "2024-01-01",
        end: "2024-12-31"
      }
    },
    limit: 10
  ) {
    id
    filename
    content
    metadata {
      pages
      size
      processingTime
    }
    entities {
      type
      value
      confidence
    }
  }
}

# Get document with versions
query GetDocument {
  document(id: "doc_123456") {
    id
    filename
    currentVersion
    versions {
      version
      createdAt
      comment
      size
    }
    chunks {
      id
      content
      embedding
      metadata
    }
  }
}

Mutations¶

# Process document
mutation ProcessDocument {
  processDocument(
    file: "document.pdf",
    options: {
      enableOCR: true,
      enableKG: true,
      enableVector: true
    }
  ) {
    documentId
    status
    estimatedTime
  }
}

# Update document metadata
mutation UpdateMetadata {
  updateDocumentMetadata(
    id: "doc_123456",
    metadata: {
      tags: ["security", "network"],
      category: "architecture",
      confidential: true
    }
  ) {
    success
    document {
      id
      metadata
    }
  }
}

Subscriptions¶

# Subscribe to processing updates
subscription ProcessingUpdates {
  documentProcessing(documentId: "doc_123456") {
    status
    progress
    currentStep
    errors
    warnings
  }
}

# Subscribe to search updates
subscription SearchUpdates {
  searchResults(sessionId: "search_789") {
    newResults {
      id
      content
      score
    }
    totalFound
    processingTime
  }
}

WebSocket Real-time¶

Connecting to WebSocket¶

// JavaScript WebSocket client
const ws = new WebSocket('ws://localhost:8000/api/v2/ws');

ws.onopen = () => {
  console.log('Connected to WebSocket');

  // Authenticate
  ws.send(JSON.stringify({
    type: 'auth',
    token: 'your-jwt-token'
  }));
};

ws.onmessage = (event) => {
  const message = JSON.parse(event.data);

  switch(message.type) {
    case 'processing_progress':
      console.log(`Processing: ${message.progress}%`);
      break;
    case 'search_result':
      console.log(`Found: ${message.result}`);
      break;
    case 'error':
      console.error(`Error: ${message.error}`);
      break;
  }
};

WebSocket Events¶

Available WebSocket event types:

processing_started: Document processing started
processing_progress: Processing progress update
processing_completed: Processing completed
processing_error: Processing error occurred
search_started: Search initiated
search_result: New search result
search_completed: Search completed
kg_update: Knowledge graph updated
vector_indexed: Vectors indexed in Milvus
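
In a Python client, these event types can be routed with a small dispatch table instead of a long if/else chain. This is a sketch; `make_dispatcher` is a hypothetical helper, and the `progress` field is assumed from the client examples in this guide:

```python
def make_dispatcher(handlers, default=None):
    """Return a function that routes a message dict to the handler for its type."""
    def dispatch(message):
        handler = handlers.get(message.get("type"), default)
        return handler(message) if handler else None
    return dispatch


seen = []
dispatch = make_dispatcher({
    "processing_progress": lambda m: seen.append(("progress", m.get("progress"))),
    "processing_completed": lambda m: seen.append(("done", None)),
})

dispatch({"type": "processing_progress", "progress": 40})
dispatch({"type": "processing_completed"})
dispatch({"type": "kg_update"})  # no handler registered, so it is ignored
print(seen)
```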

Subscribing to Events¶

// Subscribe to specific document processing
ws.send(JSON.stringify({
  type: 'subscribe',
  channel: 'document',
  documentId: 'doc_123456'
}));

// Subscribe to search results
ws.send(JSON.stringify({
  type: 'subscribe',
  channel: 'search',
  searchId: 'search_789'
}));

// Unsubscribe
ws.send(JSON.stringify({
  type: 'unsubscribe',
  channel: 'document',
  documentId: 'doc_123456'
}));

Error Handling¶

Error Response Format¶

All API errors follow a consistent format:

{
  "error": {
    "code": "ERR_2001",
    "message": "Document not found",
    "details": {
      "document_id": "doc_123456",
      "suggestion": "Check if the document ID is correct"
    },
    "timestamp": "2024-09-22T10:30:00Z",
    "request_id": "req_abc123"
  }
}

Error Codes¶

Code Range	Category	Description
ERR_1xxx	Authentication	Auth/permission errors
ERR_2xxx	Document	Document processing errors
ERR_3xxx	Vector/Milvus	Vector database errors
ERR_4xxx	Knowledge Graph	KG operation errors
ERR_5xxx	Search	Search/query errors
ERR_6xxx	System	System/configuration errors

Common Error Codes¶

ERR_1001: Invalid authentication token
ERR_1002: Token expired
ERR_1003: Insufficient permissions
ERR_2001: Document not found
ERR_2002: Document processing failed
ERR_2003: Invalid document format
ERR_3001: Milvus connection failed
ERR_3002: Collection not found
ERR_3003: Vector dimension mismatch
ERR_4001: FalkorDB connection failed
ERR_4002: Invalid Cypher query
ERR_5001: Search query invalid
ERR_5002: No results found
ERR_6001: Service unavailable
ERR_6002: Rate limit exceeded
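
Because the first digit of each code identifies its category, a client can classify errors without listing every code. A minimal sketch based on the code ranges in the table above; `error_category` is a hypothetical helper name:

```python
import re

# Categories keyed by the first digit of the numeric part (from the table above).
CATEGORIES = {
    1: "Authentication",
    2: "Document",
    3: "Vector/Milvus",
    4: "Knowledge Graph",
    5: "Search",
    6: "System",
}


def error_category(code: str) -> str:
    """Map an error code such as 'ERR_2001' to its category name."""
    match = re.fullmatch(r"ERR_(\d)\d{3}", code)
    if not match:
        return "Unknown"
    return CATEGORIES.get(int(match.group(1)), "Unknown")


print(error_category("ERR_2001"))
```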

Handling Errors in Code¶

import time

import requests

def safe_api_call(url, headers=None, json=None, max_retries=3):
    """POST to the API, retrying a bounded number of times on 401 and 429."""
    for attempt in range(max_retries + 1):
        try:
            response = requests.post(url, headers=headers, json=json)
            response.raise_for_status()
            return response.json()
        except requests.exceptions.HTTPError as e:
            status = e.response.status_code
            if status == 401 and attempt < max_retries:
                # refresh_token() is a placeholder: it should obtain a new
                # access token and return the updated request headers
                headers = refresh_token()
            elif status == 429 and attempt < max_retries:
                # Rate limited, wait and retry
                time.sleep(60)
            else:
                error_data = e.response.json().get('error', {})
                print(f"Error {error_data.get('code')}: {error_data.get('message')}")
                raise
        except requests.exceptions.RequestException as e:
            print(f"Network error: {e}")
            raise

Rate Limiting¶

The API implements multiple rate limiting strategies:

Rate Limit Headers¶

All responses include rate limit information:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1695384000
X-RateLimit-Strategy: sliding_window

Rate Limit Strategies¶

Fixed Window: Fixed time windows (e.g., 100 requests per minute)
Sliding Window: Rolling time window
Token Bucket: Burst capacity with refill rate
Leaky Bucket: Smooth rate limiting
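
To make the token bucket strategy concrete, here is a generic sketch of the algorithm. It illustrates the idea only and is not the server's implementation; the clock is passed in explicitly so the behavior is deterministic:

```python
class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_rate` tokens per second."""

    def __init__(self, capacity, refill_rate, now=0.0):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)
        self.updated = now

    def allow(self, now):
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(capacity=2, refill_rate=1.0)
# Two requests fit in the burst, the third is rejected, and one second
# later a single token has been refilled.
results = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)]
print(results)
```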

Handling Rate Limits¶

import time

def handle_rate_limit(response):
    if response.status_code == 429:
        reset_time = int(response.headers.get('X-RateLimit-Reset', 0))
        wait_time = max(0, reset_time - time.time())
        print(f"Rate limited. Waiting {wait_time} seconds...")
        time.sleep(wait_time)
        return True
    return False

Best Practices¶

1. Use Batch Operations¶

Instead of individual requests, batch operations when possible:

# Good: Batch insert
vectors = [generate_embedding(doc) for doc in documents]
response = api.insert_batch(vectors)

# Avoid: Individual inserts
for doc in documents:
    vector = generate_embedding(doc)
    api.insert_single(vector)  # Multiple API calls

2. Implement Exponential Backoff¶

import time

def exponential_backoff(func, max_retries=5):
    for attempt in range(max_retries):
        try:
            return func()
        except Exception as e:
            if attempt == max_retries - 1:
                raise
            wait_time = 2 ** attempt
            time.sleep(wait_time)

3. Use WebSocket for Real-time Updates¶

For long-running operations, use WebSocket instead of polling:

# Good: WebSocket subscription
ws.subscribe('document', document_id)

# Avoid: Polling
while True:
    status = api.get_status(document_id)
    if status == 'completed':
        break
    time.sleep(5)  # Polling every 5 seconds

4. Cache Authentication Tokens¶

import time

class APIClient:
    def __init__(self):
        self._token = None
        self._token_expiry = 0

    def get_token(self):
        if time.time() >= self._token_expiry:
            self._refresh_token()
        return self._token

    def _refresh_token(self):
        # login() is assumed to POST to /api/v2/auth/login and return the parsed JSON
        response = self.login()
        self._token = response['access_token']
        self._token_expiry = time.time() + response['expires_in'] - 60

5. Use Appropriate Search Strategy¶

Choose the right search strategy based on your use case:

Vector Search: For semantic similarity
Keyword Search: For exact matches
Hybrid Search: For best of both worlds
Knowledge Graph: For relationship queries

Examples and Use Cases¶

Example 1: Complete Document Processing Pipeline¶

import asyncio
import aiohttp

# Access token obtained from /api/v2/auth/login (placeholder value)
token = "your-jwt-token"

async def process_document_pipeline(file_path):
    async with aiohttp.ClientSession() as session:
        # 1. Upload document
        with open(file_path, 'rb') as f:
            data = aiohttp.FormData()
            data.add_field('file', f, filename='document.pdf')
            data.add_field('enable_ocr', 'true')
            data.add_field('enable_kg', 'true')
            data.add_field('enable_vector', 'true')

            async with session.post(
                'http://localhost:8000/api/v2/documents/upload',
                headers={'Authorization': f'Bearer {token}'},
                data=data
            ) as resp:
                result = await resp.json()
                document_id = result['document_id']

        # 2. Monitor processing via WebSocket
        async with session.ws_connect('ws://localhost:8000/api/v2/ws') as ws:
            await ws.send_json({'type': 'auth', 'token': token})
            await ws.send_json({
                'type': 'subscribe',
                'channel': 'document',
                'documentId': document_id
            })

            async for msg in ws:
                if msg.type == aiohttp.WSMsgType.TEXT:
                    data = msg.json()
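                    if data['type'] == 'processing_completed':
                        break
                    if data['type'] == 'processing_error':
                        # Stop waiting if processing fails
                        raise RuntimeError(f"Processing failed: {data}")
                    print(f"Progress: {data.get('progress', 0)}%")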

        # 3. Search the processed document
        async with session.post(
            'http://localhost:8000/api/v2/search/advanced',
            headers={'Authorization': f'Bearer {token}'},
            json={
                'query': 'network architecture',
                'filters': {'document_id': document_id},
                'search_options': {
                    'enable_semantic': True,
                    'enable_kg': True
                }
            }
        ) as resp:
            search_results = await resp.json()

        return search_results

# Run the pipeline
results = asyncio.run(process_document_pipeline('document.pdf'))

Example 2: Knowledge Graph Analysis¶

def analyze_network_topology(api_client):
    # 1. Initialize KG if needed
    api_client.kg_initialize(graph_name='network_topology')

    # 2. Find all critical paths
    critical_paths = api_client.kg_cypher(
        """
        MATCH p = (s:Server)-[*]->(d:Database)
        WHERE s.critical = true AND d.sensitive = true
        RETURN p
        ORDER BY length(p)
        LIMIT 10
        """
    )

    # 3. Identify security vulnerabilities
    vulnerabilities = api_client.kg_cypher(
        """
        MATCH (n:NetworkDevice)
        WHERE NOT (n)-[:PROTECTED_BY]->(:Firewall)
        RETURN n.name, n.type, n.ip_address
        """
    )

    # 4. Find single points of failure
    spof = api_client.kg_cypher(
        """
        MATCH (n:NetworkDevice)
        WHERE size((n)-[:CONNECTS_TO]-()) > 5
        AND NOT exists(n.redundancy)
        RETURN n
        """
    )

    return {
        'critical_paths': critical_paths,
        'vulnerabilities': vulnerabilities,
        'single_points_of_failure': spof
    }

Troubleshooting¶

Common Issues and Solutions¶

1. Milvus Connection Failed¶

# Check Milvus status
curl http://localhost:8000/api/v2/health/dependencies

# Solution: Ensure Milvus is running
docker run -d --name milvus \
  -p 19530:19530 \
  -p 9091:9091 \
  milvusdb/milvus:latest

2. Authentication Issues¶

# Test authentication
curl -X POST http://localhost:8000/api/v2/auth/verify \
  -H "Authorization: Bearer $TOKEN"

# Solution: Refresh token if expired
curl -X POST http://localhost:8000/api/v2/auth/refresh \
  -d "refresh_token=$REFRESH_TOKEN"

3. Slow Search Performance¶

# Optimize search with proper indexing
api_client.create_index(
    collection='documents',
    field='embedding',
    index_type='IVF_SQ8',  # Use for large datasets
    params={'nlist': 2048}
)

# Use filters to reduce search space
results = api_client.search(
    query='security',
    filters='date >= "2024-01-01" AND type == "network_diagram"',
    top_k=10
)

Performance Optimization¶

1. Caching Configuration¶

Configure multi-tier caching for better performance:

# config.yml
cache:
  memory:
    enabled: true
    size: 1GB
    ttl: 3600
  redis:
    enabled: true
    host: localhost
    port: 6379
    ttl: 86400
  strategy: hybrid  # memory -> redis -> source
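
The `hybrid` strategy reads from memory first, then Redis, then the original source. The lookup order can be sketched as follows; this is an illustration only, with plain dictionaries standing in for both cache tiers and TTL handling omitted:

```python
class TieredCache:
    """Look up memory first, then a remote tier, then the source of truth."""

    def __init__(self, load_from_source):
        self.memory = {}
        self.remote = {}  # stand-in for Redis in this sketch
        self.load_from_source = load_from_source

    def get(self, key):
        """Return (value, tier) where tier names the layer that answered."""
        if key in self.memory:
            return self.memory[key], "memory"
        if key in self.remote:
            value = self.remote[key]
            self.memory[key] = value  # promote to the faster tier
            return value, "redis"
        value = self.load_from_source(key)
        self.remote[key] = value
        self.memory[key] = value
        return value, "source"


cache = TieredCache(load_from_source=lambda key: f"content of {key}")
first = cache.get("doc_123456")
second = cache.get("doc_123456")
cache.memory.clear()  # simulate memory eviction
third = cache.get("doc_123456")
print(first[1], second[1], third[1])
```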

2. Connection Pooling¶

# Use connection pooling for better performance
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
retry = Retry(total=3, backoff_factor=0.3)
adapter = HTTPAdapter(max_retries=retry, pool_connections=10, pool_maxsize=10)
session.mount('http://', adapter)
session.mount('https://', adapter)

3. Batch Processing Settings¶

# Optimal batch settings for large datasets
curl -X POST http://localhost:8000/api/v2/documents/batch \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "parallel_workers": 16,
    "batch_size": 100,
    "checkpoint_interval": 500,
    "memory_limit": "8GB"
  }'

Security Best Practices¶

1. API Key Rotation¶

Regularly rotate API keys and tokens:

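def rotate_api_key(old_key):
    # generate_api_key, update_application_configs and schedule_revocation
    # are placeholders for your own client and deployment tooling

    # Generate new API key
    new_key = api_client.generate_api_key()

    # Update applications
    update_application_configs(new_key)

    # Revoke old key after grace period
    schedule_revocation(old_key, delay_hours=24)

    return new_key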

2. Request Signing¶

Sign sensitive requests:

import hashlib
import hmac
import json

def sign_request(payload, secret):
    message = json.dumps(payload, sort_keys=True)
    signature = hmac.new(
        secret.encode(),
        message.encode(),
        hashlib.sha256
    ).hexdigest()
    return signature
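
On the receiving side, a signature should be checked with a constant-time comparison. The sketch below repeats the signing function so that it runs on its own; how the signature is transmitted (for example, in a header) is not specified in this guide, so only the verification step is shown:

```python
import hashlib
import hmac
import json


def sign_request(payload, secret):
    """HMAC-SHA256 over the canonical (sorted-key) JSON form of the payload."""
    message = json.dumps(payload, sort_keys=True)
    return hmac.new(secret.encode(), message.encode(), hashlib.sha256).hexdigest()


def verify_signature(payload, secret, signature):
    """Recompute the signature and compare in constant time."""
    expected = sign_request(payload, secret)
    return hmac.compare_digest(expected, signature)


payload = {"document_id": "doc_123456", "action": "delete"}
signature = sign_request(payload, "example-secret")
print(verify_signature(payload, "example-secret", signature))
```

Sorting the keys before signing matters: both sides must serialize the payload identically, or the signatures will not match.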

3. Audit Logging¶

Enable comprehensive audit logging:

# Configure audit logging
curl -X POST http://localhost:8000/api/v2/admin/audit/configure \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": true,
    "log_level": "detailed",
    "include_request_body": true,
    "include_response_body": false,
    "retention_days": 90
  }'

Next Steps¶

Explore MCP Integration: See the MCP Integration Guide
Learn about Milvus: Read the Milvus Vector Database Guide
Deploy to Production: Follow the Production Deployment Guide
Configure Authentication: See the Authentication & Security Guide

API Reference¶

For complete API reference documentation, visit:

Swagger UI: http://localhost:8000/docs
ReDoc: http://localhost:8000/redoc
OpenAPI Schema: http://localhost:8000/openapi.json