API v2 Complete User Guide

Overview

NetIntel-OCR v0.1.18.1 introduces API v2, which has 100% feature parity with the CLI and provides enterprise-grade document intelligence capabilities through RESTful endpoints, GraphQL queries, and real-time WebSocket updates. All 30+ CLI options are now available programmatically.

Table of Contents

  1. Getting Started
  2. Authentication
  3. Document Management
  4. Milvus Vector Operations
  5. Search and Query
  6. Knowledge Graph
  7. Enterprise Features
  8. GraphQL API
  9. WebSocket Real-time
  10. Error Handling

Getting Started

Starting the API Server

# Start the API server
netintel-ocr server api --port 8000

# Start with authentication enabled
netintel-ocr server api --auth-enabled --port 8000

# Start with all features
netintel-ocr server all --port 8000

Base URL

All API v2 endpoints are prefixed with /api/v2:

http://localhost:8000/api/v2

Health Check

# Check API health
curl http://localhost:8000/api/v2/health

# Response
{
  "status": "healthy",
  "version": "0.1.18.0",
  "services": {
    "milvus": "connected",
    "falkordb": "connected",
    "redis": "connected",
    "ollama": "connected"
  }
}
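
A client can use this response to decide whether it is safe to start submitting work. The following is a minimal sketch that assumes the response shape shown above; the helper name `unavailable_services` and the "disconnected" state value are illustrative and are not part of NetIntel-OCR.

```python
import json

def unavailable_services(health: dict) -> list[str]:
    """Return the names of backing services whose state is not "connected"."""
    services = health.get("services", {})
    return sorted(name for name, state in services.items() if state != "connected")

# Payload modeled on the health check response above. One service is set to a
# hypothetical "disconnected" state purely for illustration.
sample = json.loads("""
{
  "status": "healthy",
  "services": {
    "milvus": "connected",
    "falkordb": "connected",
    "redis": "disconnected",
    "ollama": "connected"
  }
}
""")

print(unavailable_services(sample))
```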

Authentication

OAuth2 Authentication

NetIntel-OCR supports OAuth2/OIDC authentication with JWT tokens.

Login

# Login to get access token
curl -X POST http://localhost:8000/api/v2/auth/login \
  -H "Content-Type: application/json" \
  -d '{
    "username": "admin",
    "password": "your-password"
  }'

# Response
{
  "access_token": "eyJhbGciOiJIUzI1NiIs...",
  "token_type": "Bearer",
  "expires_in": 3600,
  "refresh_token": "refresh-token-here"
}

Using the Token

Include the token in the Authorization header for all subsequent requests:

curl http://localhost:8000/api/v2/documents \
  -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIs..."
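
In client code, the header can be built directly from the login response. This is a small sketch assuming the login response fields shown earlier (`access_token`, `token_type`, `expires_in`); the function names and the 60-second safety margin are illustrative choices.

```python
def auth_headers(login_response: dict) -> dict:
    """Build the Authorization header from a login response."""
    token_type = login_response.get("token_type", "Bearer")
    return {"Authorization": f"{token_type} {login_response['access_token']}"}

def refresh_deadline(login_response: dict, issued_at: float, margin: float = 60.0) -> float:
    """Time (in the same clock as issued_at) after which the token should be refreshed."""
    return issued_at + login_response["expires_in"] - margin

login = {"access_token": "abc", "token_type": "Bearer", "expires_in": 3600}
print(auth_headers(login))
print(refresh_deadline(login, issued_at=1000.0))
```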

RBAC (Role-Based Access Control)

The API supports different roles with varying permissions:

  • admin: Full access to all endpoints
  • analyst: Read/write access to documents and search
  • viewer: Read-only access
  • operator: System management access

🆕 Complete Feature Parity (v0.1.18.1)

All CLI Options Now Available in API

NetIntel-OCR v0.1.18.1 achieves 100% feature parity between CLI and API. Every single CLI option is now available through the API:

| Feature Category | CLI Options | API Fields | Status |
|---|---|---|---|
| Multi-Model | --model, --network-model, --flow-model | ✅ Complete | 100% |
| Processing | 30+ options including all modes, extraction settings | ✅ Complete | 100% |
| Vector | Milvus default, chunking strategies | ✅ Complete | 100% |
| Knowledge Graph | Full KG extraction with embeddings | ✅ Complete | 100% |

# Upload with ALL options (NEW in v0.1.18.1!)
curl -X POST http://localhost:8000/api/v2/documents/upload \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@document.pdf" \
  -F 'options={
    "model": "nanonets-ocr-s",
    "network_model": "qwen2.5vl",
    "flow_model": "custom-flow",
    "pages": "1-50",
    "confidence": 0.8,
    "fast_extraction": true,
    "table_method": "hybrid",
    "with_kg": true,
    "vector_format": "milvus",
    "chunk_strategy": "semantic"
  }'

Document Management

Upload Document

Standard Upload with Multi-Model Support

# Upload a PDF document with multi-model configuration
curl -X POST http://localhost:8000/api/v2/documents/upload \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@document.pdf" \
  -F 'options={
    "model": "nanonets-ocr-s",
    "network_model": "qwen2.5vl",
    "extract_tables": true,
    "with_kg": true,
    "vector_format": "milvus"
  }'

# Response
{
  "document_id": "doc_123456",
  "status": "processing",
  "filename": "document.pdf",
  "size": 5242880,
  "pages": 50,
  "processing_options": {
    "ocr_enabled": true,
    "kg_enabled": true,
    "vector_enabled": true
  }
}
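
The `options` form field in the upload requests above is a JSON string. A minimal sketch of building it in Python follows. The set of option names is taken only from the examples in this guide (the server may accept more), and the 0 to 1 range check on `confidence` is an assumption rather than documented behavior; `build_options_field` is a hypothetical helper name.

```python
import json

# Option names that appear in this guide's upload examples.
KNOWN_OPTIONS = {
    "model", "network_model", "flow_model", "pages", "confidence",
    "fast_extraction", "table_method", "extract_tables", "with_kg",
    "vector_format", "chunk_strategy",
}

def build_options_field(**options) -> str:
    """Serialize upload options to the JSON string sent in the `options` field."""
    unknown = set(options) - KNOWN_OPTIONS
    if unknown:
        raise ValueError(f"unrecognized option(s): {sorted(unknown)}")
    confidence = options.get("confidence")
    # Assumption: confidence is a fraction between 0 and 1.
    if confidence is not None and not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return json.dumps(options, sort_keys=True)

print(build_options_field(model="nanonets-ocr-s", with_kg=True, vector_format="milvus"))
```

Catching typos in option names on the client side gives a clearer error than waiting for the server to reject or silently ignore the field.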

Streaming Upload (Large Files)

For files larger than 100MB, use streaming upload:

import os

import requests

# Access token obtained from /api/v2/auth/login (placeholder value)
token = "your-jwt-token"

# Initialize streaming session
def init_streaming_upload(filename, file_size):
    response = requests.post(
        "http://localhost:8000/api/v2/documents/upload/stream/init",
        headers={"Authorization": f"Bearer {token}"},
        json={
            "filename": filename,
            "file_size": file_size,
            "chunk_size": 5 * 1024 * 1024  # 5MB chunks
        }
    )
    return response.json()["session_id"]

# Upload chunks
def upload_chunk(session_id, chunk_number, chunk_data):
    response = requests.post(
        f"http://localhost:8000/api/v2/documents/upload/stream/{session_id}/chunk",
        headers={"Authorization": f"Bearer {token}"},
        files={"chunk": chunk_data},
        data={"chunk_number": chunk_number}
    )
    return response.json()

# Complete upload
def complete_upload(session_id):
    response = requests.post(
        f"http://localhost:8000/api/v2/documents/upload/stream/{session_id}/complete",
        headers={"Authorization": f"Bearer {token}"}
    )
    return response.json()

# Example usage
file_path = "large_document.pdf"
file_size = os.path.getsize(file_path)
chunk_size = 5 * 1024 * 1024  # 5MB

session_id = init_streaming_upload("large_document.pdf", file_size)

with open(file_path, 'rb') as f:
    chunk_number = 0
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        upload_chunk(session_id, chunk_number, chunk)
        chunk_number += 1

result = complete_upload(session_id)
print(f"Document uploaded: {result['document_id']}")
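
The chunking loop in the example above can be factored into a reusable generator, which also makes it easy to check the chunk count before starting an upload. This is a sketch using only the standard library; `iter_chunks` and `expected_chunks` are illustrative names, and the tiny 5-byte chunk size is used only to keep the demonstration readable.

```python
import io
import math

def expected_chunks(file_size: int, chunk_size: int) -> int:
    """Number of chunks needed to cover file_size bytes."""
    return math.ceil(file_size / chunk_size)

def iter_chunks(stream, chunk_size: int):
    """Yield (chunk_number, chunk_bytes) pairs until the stream is exhausted."""
    chunk_number = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk_number, chunk
        chunk_number += 1

# A 12-byte in-memory "file" split into 5-byte chunks: sizes 5, 5, and 2.
data = io.BytesIO(b"x" * 12)
chunks = list(iter_chunks(data, 5))
print([(number, len(chunk)) for number, chunk in chunks])
print(expected_chunks(12, 5))
```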

Document Versioning

# Get document versions
curl http://localhost:8000/api/v2/documents/doc_123456/versions \
  -H "Authorization: Bearer $TOKEN"

# Create a new version
curl -X POST http://localhost:8000/api/v2/documents/doc_123456/versions \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@updated_document.pdf" \
  -F "comment=Updated section 3"

# Compare versions
curl "http://localhost:8000/api/v2/documents/doc_123456/versions/compare?v1=1&v2=2" \
  -H "Authorization: Bearer $TOKEN"
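
When calling the compare endpoint from code rather than a shell, building the query string with a URL library avoids quoting problems with `&`. A minimal sketch, assuming the `v1` and `v2` query parameters shown above; `compare_versions_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000/api/v2"

def compare_versions_url(document_id: str, v1: int, v2: int) -> str:
    """Build the version comparison URL with a properly encoded query string."""
    query = urlencode({"v1": v1, "v2": v2})
    return f"{BASE_URL}/documents/{document_id}/versions/compare?{query}"

print(compare_versions_url("doc_123456", 1, 2))
```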

Batch Processing

# Submit batch processing job
curl -X POST http://localhost:8000/api/v2/documents/batch \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "input_path": "/data/documents/",
    "output_path": "/data/processed/",
    "parallel_workers": 8,
    "processing_options": {
      "enable_ocr": true,
      "enable_kg": true,
      "enable_vector": true,
      "enable_dedup": true
    }
  }'

# Check batch status
curl http://localhost:8000/api/v2/batch/batch_123/progress \
  -H "Authorization: Bearer $TOKEN"

Milvus Vector Operations

Collection Management

# Create a collection
curl -X POST http://localhost:8000/api/v2/milvus/collections \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "collection_name": "documents",
    "fields": [
      {
        "name": "id",
        "type": "int64",
        "is_primary": true,
        "auto_id": true
      },
      {
        "name": "embedding",
        "type": "float_vector",
        "dim": 768
      },
      {
        "name": "content",
        "type": "varchar",
        "max_length": 65535
      }
    ],
    "enable_dynamic_field": true
  }'

# List collections
curl http://localhost:8000/api/v2/milvus/collections \
  -H "Authorization: Bearer $TOKEN"

# Get collection details
curl http://localhost:8000/api/v2/milvus/collections/documents \
  -H "Authorization: Bearer $TOKEN"

# Load collection into memory
curl -X POST http://localhost:8000/api/v2/milvus/collections/documents/load \
  -H "Authorization: Bearer $TOKEN"

Index Management

# Create index
curl -X POST http://localhost:8000/api/v2/milvus/collections/documents/indexes \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "field_name": "embedding",
    "index_type": "IVF_FLAT",
    "metric_type": "L2",
    "params": {
      "nlist": 1024
    }
  }'

# Check index building progress
curl http://localhost:8000/api/v2/milvus/collections/documents/indexes/embedding/progress \
  -H "Authorization: Bearer $TOKEN"

Vector Operations

# Insert vectors
curl -X POST http://localhost:8000/api/v2/milvus/collections/documents/insert \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "data": [
      {
        "embedding": [0.1, 0.2, 0.3, ...],
        "content": "Document content here",
        "metadata": {
          "source": "document.pdf",
          "page": 1
        }
      }
    ]
  }'

# Search vectors
curl -X POST http://localhost:8000/api/v2/milvus/collections/documents/search \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "vector": [0.1, 0.2, 0.3, ...],
    "top_k": 10,
    "filter": "page > 0",
    "output_fields": ["content", "metadata"]
  }'
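
Because the collection schema fixes the embedding dimension (768 in the example above), it can help to validate rows on the client before inserting them, rather than waiting for a dimension mismatch error from the server. The following is a sketch only; `validate_rows` is a hypothetical helper and the field name `embedding` matches the example schema.

```python
def validate_rows(rows: list[dict], dim: int, field: str = "embedding") -> list[str]:
    """Return a list of human-readable problems; an empty list means all rows pass."""
    problems = []
    for index, row in enumerate(rows):
        vector = row.get(field)
        if vector is None:
            problems.append(f"row {index}: missing '{field}'")
        elif len(vector) != dim:
            problems.append(f"row {index}: expected {dim} dimensions, got {len(vector)}")
    return problems

rows = [
    {"embedding": [0.0] * 768, "content": "ok"},
    {"embedding": [0.1, 0.2, 0.3], "content": "too short"},
]
print(validate_rows(rows, dim=768))
```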

Search and Query

# Advanced multi-field search
curl -X POST http://localhost:8000/api/v2/search/advanced \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "network security architecture",
    "filters": {
      "document_type": ["pdf", "docx"],
      "date_range": {
        "start": "2024-01-01",
        "end": "2024-12-31"
      },
      "confidence_min": 0.8
    },
    "search_options": {
      "enable_semantic": true,
      "enable_kg": true,
      "rerank_strategy": "cross_encoder",
      "max_results": 20
    }
  }'

# Hybrid vector + keyword search
curl -X POST http://localhost:8000/api/v2/search/hybrid \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text_query": "firewall configuration",
    "vector_query": [0.1, 0.2, 0.3, ...],
    "alpha": 0.7,
    "filters": {
      "document_type": "network_diagram"
    },
    "top_k": 15
  }'
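
This guide does not state how `alpha` is applied. A common convention for hybrid search is linear interpolation, where `alpha` weights the vector score and `1 - alpha` weights the keyword score. The sketch below illustrates that convention under the assumption that both scores are already normalized to the range 0 to 1; it is not a description of the server's actual scoring.

```python
def fuse_scores(vector_scores: dict, keyword_scores: dict, alpha: float = 0.7) -> list[tuple[str, float]]:
    """Combine two score maps as alpha * vector + (1 - alpha) * keyword."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    doc_ids = set(vector_scores) | set(keyword_scores)
    fused = {
        doc_id: alpha * vector_scores.get(doc_id, 0.0)
        + (1.0 - alpha) * keyword_scores.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    # Highest score first; ties broken by document id for stable output.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))

ranked = fuse_scores({"a": 0.9, "b": 0.4}, {"b": 1.0, "c": 0.5}, alpha=0.7)
print([doc_id for doc_id, _ in ranked])
```

With `alpha = 0.7`, a document found only by keyword match needs a much higher keyword score to outrank one found by vector similarity.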

Result Reranking

The API supports multiple reranking strategies:

# Search with reranking
curl -X POST http://localhost:8000/api/v2/search/similarity \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "VPN setup guide",
    "rerank_config": {
      "strategy": "cross_encoder",
      "model": "ms-marco-MiniLM-L-12-v2",
      "top_k": 10
    }
  }'

Available reranking strategies:

  • cross_encoder: Cross-encoder neural reranking
  • feature_based: Feature-based scoring
  • reciprocal_rank_fusion: RRF for combining multiple rankings
  • mmr: Maximal Marginal Relevance for diversity
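
To make the `reciprocal_rank_fusion` strategy concrete, here is a minimal standalone sketch of the general RRF technique: each document scores the sum of `1 / (k + rank)` over every ranking it appears in. The constant `k = 60` is a commonly used default; the value used by the API is not documented here, and the document ids are made up.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document ids into one scored ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

vector_ranking = ["doc_a", "doc_b", "doc_c"]
keyword_ranking = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([vector_ranking, keyword_ranking])
print([doc_id for doc_id, _ in fused])
```

RRF uses only rank positions, so it can combine rankings whose raw scores are on different scales.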

Knowledge Graph

Initialize Knowledge Graph

# Initialize FalkorDB
curl -X POST http://localhost:8000/api/v2/kg/initialize \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "graph_name": "netintel_kg",
    "clear_existing": false
  }'

Cypher Queries

# Execute Cypher query
curl -X POST http://localhost:8000/api/v2/kg/cypher \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "MATCH (n:NetworkDevice)-[r:CONNECTS_TO]->(m:NetworkDevice) WHERE n.type = \"firewall\" RETURN n, r, m LIMIT 10"
  }'

# Hybrid search with Knowledge Graph
curl -X POST http://localhost:8000/api/v2/kg/hybrid-search \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Show me all security devices in DMZ",
    "strategy": "adaptive",
    "include_embeddings": true,
    "max_hops": 2
  }'

Path Finding

# Find paths between entities
curl -X POST http://localhost:8000/api/v2/kg/paths \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "start_entity": "firewall_01",
    "end_entity": "database_server",
    "max_length": 5,
    "relationship_types": ["CONNECTS_TO", "ROUTES_THROUGH"]
  }'
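
To illustrate what `max_length` and `relationship_types` mean in the request above, the sketch below runs a bounded breadth-first search over a small in-memory edge list. It is only an illustration of the parameters, not the server's implementation. It assumes edges are directed and that `max_length` counts hops; every node name other than `firewall_01` and `database_server` is made up.

```python
from collections import deque

def find_paths(edges, start, end, max_length, relationship_types=None):
    """Enumerate simple paths from start to end with at most max_length hops."""
    adjacency = {}
    for source, relationship, target in edges:
        if relationship_types is None or relationship in relationship_types:
            adjacency.setdefault(source, []).append(target)
    paths = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == end:
            paths.append(path)
            continue
        if len(path) - 1 >= max_length:
            continue  # hop budget used up
        for neighbor in adjacency.get(node, []):
            if neighbor not in path:  # keep paths simple (no revisits)
                queue.append(path + [neighbor])
    return paths

edges = [
    ("firewall_01", "CONNECTS_TO", "switch_01"),
    ("switch_01", "CONNECTS_TO", "database_server"),
    ("firewall_01", "ROUTES_THROUGH", "router_01"),
    ("router_01", "ROUTES_THROUGH", "switch_01"),
    ("firewall_01", "MANAGED_BY", "admin_console"),
    ("admin_console", "CONNECTS_TO", "database_server"),
]
allowed = {"CONNECTS_TO", "ROUTES_THROUGH"}
for path in find_paths(edges, "firewall_01", "database_server", 5, allowed):
    print(" -> ".join(path))
```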

Enterprise Features

Deduplication

# Check for duplicates
curl -X POST http://localhost:8000/api/v2/deduplication/check \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "document_path": "/data/document.pdf",
    "dedup_mode": "hybrid",
    "simhash_bits": 128,
    "hamming_threshold": 5
  }'

# Find similar documents
curl -X POST http://localhost:8000/api/v2/deduplication/find-similar \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "document_id": "doc_123456",
    "similarity_threshold": 0.85,
    "include_cdc_analysis": true,
    "limit": 20
  }'
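
The `simhash_bits` and `hamming_threshold` parameters relate to comparing document fingerprints by Hamming distance, meaning the number of differing bits. The following sketch shows that comparison on 128-bit integers. Treating a distance at or below the threshold as a duplicate is an assumption, and the fingerprints are made-up values.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")

def is_near_duplicate(fp_a: int, fp_b: int, hamming_threshold: int = 5) -> bool:
    """Assumption: distance <= threshold counts as a near-duplicate."""
    return hamming_distance(fp_a, fp_b) <= hamming_threshold

base = int("a5" * 16, 16)   # an arbitrary 128-bit fingerprint
near = base ^ 0b111         # differs from base in 3 bits
far = base ^ 0xFF           # differs from base in 8 bits

print(hamming_distance(base, near), is_near_duplicate(base, near))
print(hamming_distance(base, far), is_near_duplicate(base, far))
```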

Performance Monitoring

# Get performance metrics
curl http://localhost:8000/api/v2/performance/metrics \
  -H "Authorization: Bearer $TOKEN"

# Run benchmark
curl -X POST http://localhost:8000/api/v2/performance/benchmark \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "test_type": "vector_search",
    "dataset_size": 10000,
    "iterations": 5
  }'

Module Management

# Get module status
curl http://localhost:8000/api/v2/modules/status \
  -H "Authorization: Bearer $TOKEN"

# Configure modules
curl -X POST http://localhost:8000/api/v2/modules/configure \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "enable_kg": true,
    "enable_dedup": true,
    "enable_c_extensions": true,
    "vector_backend": "milvus"
  }'

Configuration Templates

# Get available templates
curl http://localhost:8000/api/v2/config/templates \
  -H "Authorization: Bearer $TOKEN"

# Apply template
curl -X POST http://localhost:8000/api/v2/config/apply-template \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "template": "enterprise",
    "customize": {
      "max_workers": 16,
      "cache_size": "10GB"
    }
  }'

GraphQL API

GraphQL Endpoint

The GraphQL endpoint is available at:

http://localhost:8000/api/v2/graphql

Query Examples

# Search documents
query SearchDocuments {
  searchDocuments(
    query: "network security",
    filters: {
      documentType: ["pdf"],
      dateRange: {
        start: "2024-01-01",
        end: "2024-12-31"
      }
    },
    limit: 10
  ) {
    id
    filename
    content
    metadata {
      pages
      size
      processingTime
    }
    entities {
      type
      value
      confidence
    }
  }
}

# Get document with versions
query GetDocument {
  document(id: "doc_123456") {
    id
    filename
    currentVersion
    versions {
      version
      createdAt
      comment
      size
    }
    chunks {
      id
      content
      embedding
      metadata
    }
  }
}

Mutations

# Process document
mutation ProcessDocument {
  processDocument(
    file: "document.pdf",
    options: {
      enableOCR: true,
      enableKG: true,
      enableVector: true
    }
  ) {
    documentId
    status
    estimatedTime
  }
}

# Update document metadata
mutation UpdateMetadata {
  updateDocumentMetadata(
    id: "doc_123456",
    metadata: {
      tags: ["security", "network"],
      category: "architecture",
      confidential: true
    }
  ) {
    success
    document {
      id
      metadata
    }
  }
}

Subscriptions

# Subscribe to processing updates
subscription ProcessingUpdates {
  documentProcessing(documentId: "doc_123456") {
    status
    progress
    currentStep
    errors
    warnings
  }
}

# Subscribe to search updates
subscription SearchUpdates {
  searchResults(sessionId: "search_789") {
    newResults {
      id
      content
      score
    }
    totalFound
    processingTime
  }
}
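
GraphQL operations are typically sent as an HTTP POST whose JSON body contains `query` and `variables`. The sketch below builds such a request with the standard library, assuming this endpoint follows that convention and accepts the same Bearer token as the REST endpoints. The variable type `ID!` is a guess about the schema, and `build_graphql_request` is a hypothetical helper. The request is constructed but not sent.

```python
import json
import urllib.request

GRAPHQL_URL = "http://localhost:8000/api/v2/graphql"

# Assumption: the `id` argument is declared as ID! in the schema.
QUERY = """
query GetDocument($id: ID!) {
  document(id: $id) {
    id
    filename
    currentVersion
  }
}
"""

def build_graphql_request(query: str, variables: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) a GraphQL POST request."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        GRAPHQL_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )

request = build_graphql_request(QUERY, {"id": "doc_123456"}, token="your-jwt-token")
# To send it against a running server: urllib.request.urlopen(request)
print(request.get_method(), request.full_url)
```

Passing the document id as a variable, rather than formatting it into the query string, keeps the query text constant and avoids escaping mistakes.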

WebSocket Real-time

Connecting to WebSocket

// JavaScript WebSocket client
const ws = new WebSocket('ws://localhost:8000/api/v2/ws');

ws.onopen = () => {
  console.log('Connected to WebSocket');

  // Authenticate
  ws.send(JSON.stringify({
    type: 'auth',
    token: 'your-jwt-token'
  }));
};

ws.onmessage = (event) => {
  const message = JSON.parse(event.data);

  switch(message.type) {
    case 'processing_update':
      console.log(`Processing: ${message.progress}%`);
      break;
    case 'search_result':
      console.log(`Found: ${message.result}`);
      break;
    case 'error':
      console.error(`Error: ${message.error}`);
      break;
  }
};

WebSocket Events

Available WebSocket event types:

  • processing_started: Document processing started
  • processing_progress: Processing progress update
  • processing_completed: Processing completed
  • processing_error: Processing error occurred
  • search_started: Search initiated
  • search_result: New search result
  • search_completed: Search completed
  • kg_update: Knowledge graph updated
  • vector_indexed: Vectors indexed in Milvus

Subscribing to Events

// Subscribe to specific document processing
ws.send(JSON.stringify({
  type: 'subscribe',
  channel: 'document',
  documentId: 'doc_123456'
}));

// Subscribe to search results
ws.send(JSON.stringify({
  type: 'subscribe',
  channel: 'search',
  searchId: 'search_789'
}));

// Unsubscribe
ws.send(JSON.stringify({
  type: 'unsubscribe',
  channel: 'document',
  documentId: 'doc_123456'
}));
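
The same messages can be produced and consumed from Python. The sketch below is transport-agnostic: it only builds the JSON text to send and routes incoming JSON text to handlers by its `type` field, so it can sit on top of any WebSocket client library. `subscribe_message` and `EventRouter` are illustrative names, and the message fields follow the JavaScript examples above.

```python
import json

def subscribe_message(channel: str, **ids) -> str:
    """Build a subscribe message, e.g. subscribe_message('document', documentId='doc_1')."""
    return json.dumps({"type": "subscribe", "channel": channel, **ids})

class EventRouter:
    """Route incoming WebSocket messages to handlers keyed by event type."""

    def __init__(self):
        self._handlers = {}

    def on(self, event_type: str, handler) -> None:
        self._handlers.setdefault(event_type, []).append(handler)

    def dispatch(self, raw: str) -> int:
        """Call every handler registered for the message type; return how many ran."""
        message = json.loads(raw)
        handlers = self._handlers.get(message.get("type"), [])
        for handler in handlers:
            handler(message)
        return len(handlers)

seen = []
router = EventRouter()
router.on("processing_completed", lambda message: seen.append(message["documentId"]))

print(subscribe_message("document", documentId="doc_123456"))
router.dispatch('{"type": "processing_completed", "documentId": "doc_123456"}')
print(seen)
```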

Error Handling

Error Response Format

All API errors follow a consistent format:

{
  "error": {
    "code": "ERR_2001",
    "message": "Document not found",
    "details": {
      "document_id": "doc_123456",
      "suggestion": "Check if the document ID is correct"
    },
    "timestamp": "2024-09-22T10:30:00Z",
    "request_id": "req_abc123"
  }
}

Error Codes

Code Range Category Description
ERR_1xxx Authentication Auth/permission errors
ERR_2xxx Document Document processing errors
ERR_3xxx Vector/Milvus Vector database errors
ERR_4xxx Knowledge Graph KG operation errors
ERR_5xxx Search Search/query errors
ERR_6xxx System System/configuration errors

Common Error Codes

  • ERR_1001: Invalid authentication token
  • ERR_1002: Token expired
  • ERR_1003: Insufficient permissions
  • ERR_2001: Document not found
  • ERR_2002: Document processing failed
  • ERR_2003: Invalid document format
  • ERR_3001: Milvus connection failed
  • ERR_3002: Collection not found
  • ERR_3003: Vector dimension mismatch
  • ERR_4001: FalkorDB connection failed
  • ERR_4002: Invalid Cypher query
  • ERR_5001: Search query invalid
  • ERR_5002: No results found
  • ERR_6001: Service unavailable
  • ERR_6002: Rate limit exceeded
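
Since the first digit of the code identifies the category, client code can classify errors without listing every code. A minimal sketch based on the code ranges in the table above; `error_category` is a hypothetical helper.

```python
import re

# Categories from the error code range table in this guide.
CATEGORIES = {
    1: "Authentication",
    2: "Document",
    3: "Vector/Milvus",
    4: "Knowledge Graph",
    5: "Search",
    6: "System",
}

def error_category(code: str) -> str:
    """Map a code such as 'ERR_2001' to its category, or 'Unknown'."""
    match = re.fullmatch(r"ERR_(\d)\d{3}", code)
    if not match:
        return "Unknown"
    return CATEGORIES.get(int(match.group(1)), "Unknown")

print(error_category("ERR_2001"))
print(error_category("ERR_6002"))
```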

Handling Errors in Code

import time

import requests

def safe_api_call(url, headers=None, json=None, retries=3):
    try:
        response = requests.post(url, headers=headers, json=json)
        response.raise_for_status()
        return response.json()
    except requests.exceptions.HTTPError:
        if response.status_code == 401 and retries > 0:
            # Refresh the token and retry. refresh_token() is an application-defined
            # helper; it must also update the Authorization value in `headers`.
            refresh_token()
            return safe_api_call(url, headers, json, retries - 1)
        elif response.status_code == 429 and retries > 0:
            # Rate limited, wait and retry
            time.sleep(60)
            return safe_api_call(url, headers, json, retries - 1)
        else:
            # Out of retries or a non-retryable status: report and re-raise
            error_data = response.json().get('error', {})
            print(f"Error {error_data.get('code')}: {error_data.get('message')}")
            raise
    except requests.exceptions.RequestException as e:
        print(f"Network error: {e}")
        raise

Rate Limiting

The API implements multiple rate limiting strategies:

Rate Limit Headers

All responses include rate limit information:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1695384000
X-RateLimit-Strategy: sliding_window

Rate Limit Strategies

  1. Fixed Window: Fixed time windows (e.g., 100 requests per minute)
  2. Sliding Window: Rolling time window
  3. Token Bucket: Burst capacity with refill rate
  4. Leaky Bucket: Smooth rate limiting

Handling Rate Limits

import time

def handle_rate_limit(response):
    if response.status_code == 429:
        # X-RateLimit-Reset is treated as a Unix timestamp in seconds
        reset_time = int(response.headers.get('X-RateLimit-Reset', 0))
        wait_time = max(0, reset_time - time.time())
        print(f"Rate limited. Waiting {wait_time:.0f} seconds...")
        time.sleep(wait_time)
        return True
    return False

Best Practices

1. Use Batch Operations

Instead of individual requests, batch operations when possible:

# Good: Batch insert
vectors = [generate_embedding(doc) for doc in documents]
response = api.insert_batch(vectors)

# Avoid: Individual inserts
for doc in documents:
    vector = generate_embedding(doc)
    api.insert_single(vector)  # Multiple API calls

2. Implement Exponential Backoff

import time

def exponential_backoff(func, max_retries=5):
    for attempt in range(max_retries):
        try:
            return func()
        except Exception as e:
            if attempt == max_retries - 1:
                raise
            wait_time = 2 ** attempt
            time.sleep(wait_time)

3. Use WebSocket for Real-time Updates

For long-running operations, use WebSocket instead of polling:

# Good: WebSocket subscription
ws.subscribe('document', document_id)

# Avoid: Polling
while True:
    status = api.get_status(document_id)
    if status == 'completed':
        break
    time.sleep(5)  # Polling every 5 seconds

4. Cache Authentication Tokens

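import time

class APIClient:
    def __init__(self):
        self._token = None
        self._token_expiry = 0

    def get_token(self):
        # Refresh when no token is cached or the cached one is about to expire
        if time.time() >= self._token_expiry:
            self._refresh_token()
        return self._token

    def _refresh_token(self):
        # self.login() is assumed to call /api/v2/auth/login and return the parsed JSON
        response = self.login()
        self._token = response['access_token']
        # Expire 60 seconds early to avoid using a token that lapses mid-request
        self._token_expiry = time.time() + response['expires_in'] - 60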

5. Use Appropriate Search Strategy

Choose the right search strategy based on your use case:

  • Vector Search: For semantic similarity
  • Keyword Search: For exact matches
  • Hybrid Search: For best of both worlds
  • Knowledge Graph: For relationship queries

Examples and Use Cases

Example 1: Complete Document Processing Pipeline

import asyncio
import aiohttp

# Access token obtained from /api/v2/auth/login (placeholder value)
token = 'your-jwt-token'

async def process_document_pipeline(file_path):
    async with aiohttp.ClientSession() as session:
        # 1. Upload document
        with open(file_path, 'rb') as f:
            data = aiohttp.FormData()
            data.add_field('file', f, filename='document.pdf')
            data.add_field('enable_ocr', 'true')
            data.add_field('enable_kg', 'true')
            data.add_field('enable_vector', 'true')

            async with session.post(
                'http://localhost:8000/api/v2/documents/upload',
                headers={'Authorization': f'Bearer {token}'},
                data=data
            ) as resp:
                result = await resp.json()
                document_id = result['document_id']

        # 2. Monitor processing via WebSocket
        async with session.ws_connect('ws://localhost:8000/api/v2/ws') as ws:
            await ws.send_json({'type': 'auth', 'token': token})
            await ws.send_json({
                'type': 'subscribe',
                'channel': 'document',
                'documentId': document_id
            })

            async for msg in ws:
                if msg.type == aiohttp.WSMsgType.TEXT:
                    data = msg.json()
                    if data['type'] == 'processing_completed':
                        break
                    print(f"Progress: {data.get('progress', 0)}%")

        # 3. Search the processed document
        async with session.post(
            'http://localhost:8000/api/v2/search/advanced',
            headers={'Authorization': f'Bearer {token}'},
            json={
                'query': 'network architecture',
                'filters': {'document_id': document_id},
                'search_options': {
                    'enable_semantic': True,
                    'enable_kg': True
                }
            }
        ) as resp:
            search_results = await resp.json()

        return search_results

# Run the pipeline
results = asyncio.run(process_document_pipeline('document.pdf'))

Example 2: Knowledge Graph Analysis

def analyze_network_topology(api_client):
    # 1. Initialize KG if needed
    api_client.kg_initialize(graph_name='network_topology')

    # 2. Find all critical paths
    critical_paths = api_client.kg_cypher(
        """
        MATCH p = (s:Server)-[*]->(d:Database)
        WHERE s.critical = true AND d.sensitive = true
        RETURN p
        ORDER BY length(p)
        LIMIT 10
        """
    )

    # 3. Identify security vulnerabilities
    vulnerabilities = api_client.kg_cypher(
        """
        MATCH (n:NetworkDevice)
        WHERE NOT (n)-[:PROTECTED_BY]->(:Firewall)
        RETURN n.name, n.type, n.ip_address
        """
    )

    # 4. Find single points of failure
    spof = api_client.kg_cypher(
        """
        MATCH (n:NetworkDevice)
        WHERE size((n)-[:CONNECTS_TO]-()) > 5
        AND NOT exists(n.redundancy)
        RETURN n
        """
    )

    return {
        'critical_paths': critical_paths,
        'vulnerabilities': vulnerabilities,
        'single_points_of_failure': spof
    }

Troubleshooting

Common Issues and Solutions

1. Milvus Connection Failed

# Check Milvus status
curl http://localhost:8000/api/v2/health/dependencies

# Solution: Ensure Milvus is running
docker run -d --name milvus \
  -p 19530:19530 \
  -p 9091:9091 \
  milvusdb/milvus:latest

2. Authentication Issues

# Test authentication
curl -X POST http://localhost:8000/api/v2/auth/verify \
  -H "Authorization: Bearer $TOKEN"

# Solution: Refresh token if expired
curl -X POST http://localhost:8000/api/v2/auth/refresh \
  -d "refresh_token=$REFRESH_TOKEN"

3. Slow Search Performance

# Optimize search with proper indexing
api_client.create_index(
    collection='documents',
    field='embedding',
    index_type='IVF_SQ8',  # Use for large datasets
    params={'nlist': 2048}
)

# Use filters to reduce search space
results = api_client.search(
    query='security',
    filters='date >= "2024-01-01" AND type == "network_diagram"',
    top_k=10
)

Performance Optimization

1. Caching Configuration

Configure multi-tier caching for better performance:

# config.yml
cache:
  memory:
    enabled: true
    size: 1GB
    ttl: 3600
  redis:
    enabled: true
    host: localhost
    port: 6379
    ttl: 86400
  strategy: hybrid  # memory -> redis -> source
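
The lookup order in the `strategy: hybrid` comment (memory, then Redis, then the source) can be sketched as follows. This is a simplified illustration, not the product's cache: a plain dictionary stands in for Redis, TTL handling is omitted, and `TieredCache` is a hypothetical name.

```python
class TieredCache:
    """Look up memory first, then the remote tier, then the source of truth."""

    def __init__(self, load_from_source):
        self.memory = {}
        self.remote = {}  # stand-in for Redis in this sketch
        self._load = load_from_source

    def get(self, key):
        """Return (value, tier) where tier names the level that served the value."""
        if key in self.memory:
            return self.memory[key], "memory"
        if key in self.remote:
            self.memory[key] = self.remote[key]  # promote to the faster tier
            return self.memory[key], "redis"
        value = self._load(key)                  # miss in both tiers
        self.remote[key] = value
        self.memory[key] = value
        return value, "source"

cache = TieredCache(load_from_source=lambda key: f"value-for-{key}")
first = cache.get("doc_123456")
second = cache.get("doc_123456")
cache.memory.clear()  # simulate memory eviction or a process restart
third = cache.get("doc_123456")
print(first, second, third)
```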

2. Connection Pooling

# Use connection pooling for better performance
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
retry = Retry(total=3, backoff_factor=0.3)
adapter = HTTPAdapter(max_retries=retry, pool_connections=10, pool_maxsize=10)
session.mount('http://', adapter)
session.mount('https://', adapter)

3. Batch Processing Settings

# Optimal batch settings for large datasets
curl -X POST http://localhost:8000/api/v2/documents/batch \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "parallel_workers": 16,
    "batch_size": 100,
    "checkpoint_interval": 500,
    "memory_limit": "8GB"
  }'

Security Best Practices

1. API Key Rotation

Regularly rotate API keys and tokens:

def rotate_api_key(old_key):
    # Generate new API key
    new_key = api_client.generate_api_key()

    # Update applications
    update_application_configs(new_key)

    # Revoke old key after grace period
    schedule_revocation(old_key, delay_hours=24)

2. Request Signing

Sign sensitive requests:

import hashlib
import hmac
import json

def sign_request(payload, secret):
    message = json.dumps(payload, sort_keys=True)
    signature = hmac.new(
        secret.encode(),
        message.encode(),
        hashlib.sha256
    ).hexdigest()
    return signature
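
On the receiving side, the signature is recomputed from the payload and compared in constant time. The sketch below repeats the signing function so that it runs on its own and adds a `verify_signature` helper, which is an illustrative name. How the signature is transmitted (for example, in a header) is not specified in this guide.

```python
import hashlib
import hmac
import json

def sign_request(payload: dict, secret: str) -> str:
    """HMAC-SHA256 over the payload serialized with sorted keys."""
    message = json.dumps(payload, sort_keys=True)
    return hmac.new(secret.encode(), message.encode(), hashlib.sha256).hexdigest()

def verify_signature(payload: dict, secret: str, signature: str) -> bool:
    """Recompute the signature and compare without leaking timing information."""
    expected = sign_request(payload, secret)
    return hmac.compare_digest(expected, signature)

secret = "example-secret"  # placeholder value
signature = sign_request({"document_id": "doc_123456", "action": "delete"}, secret)

# Key order does not matter because the payload is serialized with sort_keys=True.
print(verify_signature({"action": "delete", "document_id": "doc_123456"}, secret, signature))
# Any change to the payload invalidates the signature.
print(verify_signature({"action": "delete", "document_id": "doc_999999"}, secret, signature))
```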

3. Audit Logging

Enable comprehensive audit logging:

# Configure audit logging
curl -X POST http://localhost:8000/api/v2/admin/audit/configure \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": true,
    "log_level": "detailed",
    "include_request_body": true,
    "include_response_body": false,
    "retention_days": 90
  }'

Next Steps

  1. Explore MCP Integration: See the MCP Integration Guide
  2. Learn about Milvus: Read the Milvus Vector Database Guide
  3. Deploy to Production: Follow the Production Deployment Guide
  4. Configure Authentication: See the Authentication & Security Guide

API Reference

For complete API reference documentation, visit:

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc
  • OpenAPI Schema: http://localhost:8000/openapi.json