
Changelog

Version 0.1.17 - Latest Release - Hierarchical CLI & Knowledge Graph

Released: 2025-09-12

🎯 Major Feature - Hierarchical CLI Structure

  • NEW Command Structure: Complete redesign with 8 command groups for better organization
  • Command Groups:
      • process - Document processing (pdf, batch, watch)
      • server - Server operations (api, mcp, worker, all, dev, health)
      • db - Database management (query, merge, stats, cleanup, import, migrate)
      • kg - Knowledge Graph (18+ commands for graph operations)
      • model - Model management (list, set-default, preload, ollama)
      • project - Project initialization (templates: small, medium, large, enterprise)
      • config - Configuration management (profiles, templates, environment variables)
      • system - System utilities (check, diagnose, version, health, metrics)
  • Breaking Change: Old syntax netintel-ocr document.pdf → New syntax netintel-ocr process pdf document.pdf
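
For scripts that still use the old invocation, the breaking change above can be handled with a small wrapper. The following is a minimal sketch, not part of NetIntel-OCR: the function name migrate_argv is hypothetical, and it covers only the one mapping documented in this changelog (a bare PDF path becomes process pdf). Any other legacy flags would need their own mapping from the migration guide.

```python
from typing import List

# The 8 command groups listed in the 0.1.17 changelog.
COMMAND_GROUPS = {"process", "server", "db", "kg", "model", "project", "config", "system"}


def migrate_argv(argv: List[str]) -> List[str]:
    """Rewrite a legacy `netintel-ocr <file.pdf>` call to the grouped form.

    Hypothetical helper: only the single mapping documented in the changelog
    is handled; every other invocation is returned unchanged.
    """
    if (
        len(argv) >= 2
        and argv[1] not in COMMAND_GROUPS
        and argv[1].lower().endswith(".pdf")
    ):
        return [argv[0], "process", "pdf", *argv[1:]]
    return list(argv)


print(migrate_argv(["netintel-ocr", "document.pdf"]))
# -> ['netintel-ocr', 'process', 'pdf', 'document.pdf']
```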

🧠 Knowledge Graph System (Major Enhancement)

Core KG Features

  • FalkorDB Integration: Redis-based graph database for storing entities and relationships
  • Automatic Entity Extraction: Identifies network components, flow elements, and their relationships
  • PyKEEN Embeddings: 8 models for knowledge graph embeddings (200-dim):
      • TransE (fast, simple relationships)
      • RotatE (complex relationships, default)
      • ComplEx (symmetric and antisymmetric relationships)
      • DistMult, ConvE, TuckER, HolE, RESCAL
  • Default Enabled: KG features active by default in v0.1.17, use --no-kg to disable
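
To make the difference between the embedding models more concrete, here is a minimal pure-Python sketch of the TransE and RotatE scoring functions as they are usually defined in the literature. It is an illustration only: it does not use PyKEEN, the function names are hypothetical, and the toy vectors have 1 or 2 dimensions rather than the 200 used by this release.

```python
import cmath
import math
from typing import Sequence


def transe_score(h: Sequence[float], r: Sequence[float], t: Sequence[float]) -> float:
    """TransE: a relation is a translation, so a true triple has h + r close to t.

    Returns the negative L2 distance (higher means more plausible).
    """
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rotate_score(h: Sequence[complex], phases: Sequence[float], t: Sequence[complex]) -> float:
    """RotatE: a relation rotates each complex head component by a phase angle.

    Returns the negative sum of component-wise moduli of (h * e^{i*phase} - t).
    """
    return -sum(abs(hi * cmath.exp(1j * p) - ti) for hi, p, ti in zip(h, phases, t))


# A triple that fits TransE exactly: h + r == t, so the distance is zero.
print(transe_score([1.0, 0.0], [0.5, 1.0], [1.5, 1.0]))

# A triple that fits RotatE up to floating-point error: rotating 1 by 90 degrees gives i.
print(rotate_score([1 + 0j], [math.pi / 2], [1j]))
```

Because RotatE composes rotations, it can represent symmetric, inverse, and composed relations that a pure translation cannot, which is consistent with it being the default here.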

Hybrid Retrieval System

  • 4 Retrieval Strategies:
      • Vector-first: Start with Milvus, expand with graph
      • Graph-first: Start with FalkorDB, enhance with vectors
      • Parallel: Execute both simultaneously with RRF
      • Adaptive: Auto-select based on query classification
  • Query Intent Classification: 6 query types for optimal routing:
      • Entity-centric, Relational, Topological
      • Semantic, Analytical, Exploratory
  • Reciprocal Rank Fusion (RRF): Merges the ranked result lists from the parallel vector and graph searches into a single ranking
  • Performance Metrics:
      • 92% query accuracy (vs 72% vector-only)
      • <150ms response time for hybrid queries
      • 25% storage reduction with unified storage
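
The Parallel strategy above relies on Reciprocal Rank Fusion. The sketch below shows the standard RRF formula, where each result scores the sum of 1 / (k + rank) over every list it appears in. The function name, the sample IDs, and k = 60 (the constant commonly used in the RRF literature) are assumptions; the changelog does not state which value NetIntel-OCR uses.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple


def reciprocal_rank_fusion(
    rankings: Iterable[Sequence[str]], k: int = 60
) -> List[Tuple[str, float]]:
    """Fuse several ranked lists of IDs into one list sorted by RRF score."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based; items missing from a list contribute nothing.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from the vector store and the graph store.
vector_hits = ["fw-01", "router-a", "db-server"]
graph_hits = ["router-a", "switch-3", "fw-01"]

fused = reciprocal_rank_fusion([vector_hits, graph_hits])
print([doc_id for doc_id, _ in fused])
# -> ['router-a', 'fw-01', 'switch-3', 'db-server']
```

RRF only needs ranks, not raw scores, so it avoids having to normalize vector distances against graph relevance measures.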

Enhanced MiniRAG Integration

  • 3 Query Modes:
      • minirag_only: Traditional RAG with vector search
      • kg_embedding_only: Pure KG embedding similarity
      • hybrid: Combined graph + vector context
  • Context Enrichment: Graph traversal adds related entities to context
  • Answer Generation: LLM with graph-aware context for better accuracy
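
One way to picture the three query modes is as a dispatch over two retrievers. This is a rough sketch under stated assumptions, not the MiniRAG integration itself: build_context, vector_search, and kg_search are hypothetical names, and the hybrid merge order (graph passages first, then vector passages, with duplicates removed) is an illustrative choice.

```python
from enum import Enum
from typing import Callable, List

Retriever = Callable[[str], List[str]]


class QueryMode(str, Enum):
    # Mode names as listed in the changelog.
    MINIRAG_ONLY = "minirag_only"
    KG_EMBEDDING_ONLY = "kg_embedding_only"
    HYBRID = "hybrid"


def build_context(
    query: str, mode: QueryMode, vector_search: Retriever, kg_search: Retriever
) -> List[str]:
    """Collect context passages for the LLM according to the query mode."""
    if mode is QueryMode.MINIRAG_ONLY:
        return vector_search(query)
    if mode is QueryMode.KG_EMBEDDING_ONLY:
        return kg_search(query)
    # Hybrid: graph context first, then vector context, dropping duplicates
    # while preserving order (assumed merge policy).
    merged: List[str] = []
    seen = set()
    for passage in [*kg_search(query), *vector_search(query)]:
        if passage not in seen:
            seen.add(passage)
            merged.append(passage)
    return merged


# Stub retrievers standing in for the real vector and graph back ends.
def fake_vector(query: str) -> List[str]:
    return ["chunk: firewall rules", "chunk: DMZ layout"]


def fake_kg(query: str) -> List[str]:
    return ["Router-A CONNECTS_TO Firewall-1", "chunk: DMZ layout"]


print(build_context("dmz", QueryMode("hybrid"), fake_vector, fake_kg))
```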

KG CLI Commands (18+ new commands)

  • Initialization: kg init, kg check-requirements
  • Processing: kg process, kg train-embeddings
  • Querying: kg query, kg rag-query, kg hybrid-search
  • Analysis: kg path-find, kg find-similar, kg cluster
  • Visualization: kg visualize, kg embedding-stats
  • Management: kg export, kg batch-query, kg stats

✨ New Features

  • Configuration Templates: 6 pre-built templates (minimal, development, staging, production, enterprise, cloud)
  • Profile Management: Multiple configuration profiles with easy switching
  • Environment Variables: Complete configuration override capability
  • 18+ KG Commands: Including kg init, kg train-embeddings, kg hybrid-search, kg path-find
  • Visualization Tools: 2D/3D embedding visualization and clustering
  • Batch KG Processing: Automatic KG extraction during batch operations

🛠️ Technical Improvements

  • 50+ New Commands: Organized into intuitive hierarchical structure
  • Click Framework: Modern CLI framework for better command organization
  • Template System: Pre-configured templates for different deployment scenarios
  • Configuration Validation: Comprehensive validation with helpful error messages
  • Better Error Handling: Improved error messages and recovery

📚 Documentation

  • Complete CLI reference with all new commands
  • Migration guide from v0.1.16 to v0.1.17
  • Configuration template documentation
  • Knowledge Graph implementation guides
  • Hybrid retrieval architecture documentation
  • PyKEEN model selection guide

🔄 KG Implementation Details

What Gets Extracted

  • Network Components: Routers, switches, firewalls, servers, load balancers
  • Flow Elements: Process steps, decision points, data stores
  • Relationships: CONNECTS_TO, DEPENDS_ON, ROUTES_THROUGH, CONTAINS
  • Properties: IP addresses, VLANs, protocols, ports, bandwidth
  • Context: Security zones, business services, applications
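
As an illustration of the shape of the extracted data, the categories above could be modeled roughly as follows. The class and field names are hypothetical and do not reflect the project's internal schema; only the four relationship types are taken from the list above.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Relationship types named in the changelog.
RELATIONSHIP_TYPES = frozenset({"CONNECTS_TO", "DEPENDS_ON", "ROUTES_THROUGH", "CONTAINS"})


@dataclass
class Entity:
    name: str
    kind: str  # e.g. "router", "firewall", "decision_point", "data_store"
    properties: Dict[str, str] = field(default_factory=dict)  # IP, VLAN, ports, ...
    context: Dict[str, str] = field(default_factory=dict)  # security zone, service, ...


@dataclass(frozen=True)
class Relationship:
    source: str
    rel_type: str
    target: str

    def __post_init__(self) -> None:
        # Reject relationship types outside the documented set.
        if self.rel_type not in RELATIONSHIP_TYPES:
            raise ValueError(f"unknown relationship type: {self.rel_type}")

    def as_triple(self) -> Tuple[str, str, str]:
        """(head, relation, tail) form, the usual input for KG embedding training."""
        return (self.source, self.rel_type, self.target)


router = Entity("Router-A", "router", {"ip": "10.0.0.1", "vlan": "10"}, {"security_zone": "dmz"})
link = Relationship("Router-A", "ROUTES_THROUGH", "Firewall-1")
print(link.as_triple())
# -> ('Router-A', 'ROUTES_THROUGH', 'Firewall-1')
```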

Storage Architecture

  • FalkorDB: Graph structure + 200D KG embeddings as node properties
  • Milvus: 4096D text embeddings for semantic search
  • Unified Interface: Single query API for both graph and vector search

Example Usage

# Process with KG (default in v0.1.17)
netintel-ocr process pdf network-architecture.pdf

# Query the knowledge graph
netintel-ocr kg query "MATCH (n:NetworkDevice) RETURN n"

# Natural language query with MiniRAG
netintel-ocr kg rag-query "What are the security vulnerabilities?"

# Find paths between entities
netintel-ocr kg path-find "Router-A" "Database-Server"

# Visualize embeddings
netintel-ocr kg visualize --method tsne --output network-graph.html

Version 0.1.16.15

Released: 2025-09-01

🐛 Bug Fixes

  • Fixed DEFAULT token parser error by replacing 'default' with 'DefaultZone'
  • Enhanced connection handling to replace 'default' in arrow connections
  • Improved keyword conflict resolution for Mermaid reserved words
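
The kind of rewrite described above can be sketched with a whole-word substitution. This is not the project's actual fixer: rename_default_ids is a hypothetical name, and the sketch deliberately skips classDef and linkStyle lines, where Mermaid uses default legitimately. It makes no attempt to protect the word inside label text.

```python
import re

_DEFAULT_WORD = re.compile(r"\bdefault\b")
# Statements where Mermaid itself expects the keyword `default`.
_KEEP_PREFIXES = ("classDef ", "linkStyle ")


def rename_default_ids(mermaid: str, replacement: str = "DefaultZone") -> str:
    """Replace the bare identifier `default` in subgraph names and connections."""
    fixed = []
    for line in mermaid.splitlines():
        if line.strip().startswith(_KEEP_PREFIXES):
            fixed.append(line)  # leave legitimate keyword usage untouched
        else:
            fixed.append(_DEFAULT_WORD.sub(replacement, line))
    return "\n".join(fixed)


source = "\n".join(
    [
        "graph TD",
        "    subgraph default",
        "        A --> B",
        "    end",
        "    default --> C",
        "    classDef default fill:#fff",
    ]
)
print(rename_default_ids(source))
```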

Version 0.1.16.14

Released: 2025-09-01

🐛 Bug Fixes

  • Fixed flow diagram parse errors with node-subgraph concatenation patterns
  • Enhanced Mermaid fixer to handle 'default' keyword issues in subgraphs
  • Improved preprocessing to separate concatenated node and subgraph definitions

Version 0.1.16.13

Released: 2025-09-01

✨ Features

  • Applied Mermaid validation fixes to flow diagrams
  • Added context extraction to flow diagrams using surrounding text
  • Enhanced flow processor with RobustMermaidValidator for auto-correction

Version 0.1.16.12

Released: 2025-09-01

✨ Features

  • Added context extraction for diagrams using surrounding text paragraphs
  • Enhanced validation to auto-correct LLM-generated Mermaid syntax issues

🐛 Bug Fixes

  • Fixed Mermaid diagram parsing errors with malformed subgraph/zone syntax

Version 0.1.16 - Major Release

Released: 2025-08-31

🎯 Major Features

  • Unified Diagram Detection: Automatic detection for network/flow/hybrid diagrams
  • Comprehensive Flow Processing: Full Mermaid generation for flow diagrams
  • Context-Aware Analysis: Uses surrounding text (2 paragraphs before/after)
  • Prompt Management System: Full customization without code changes
  • Default Model Update: NetIntelOCR-7B-0925 as default vision model

✨ New Capabilities

  • Flow diagram element extraction and Mermaid generation
  • Context extraction using document surrounding text
  • Complete prompt import/export system
  • Enhanced syntax validation and auto-correction
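
The "2 paragraphs before/after" window used by Context-Aware Analysis can be sketched as a simple slice around the diagram's position. The function name and the assumption that the document is already split into a list of paragraphs are hypothetical.

```python
from typing import List


def surrounding_context(paragraphs: List[str], index: int, window: int = 2) -> List[str]:
    """Return up to `window` paragraphs before and after position `index`.

    `index` is where the diagram sits in the paragraph list; the diagram
    entry itself is excluded, and the window is clipped at document edges.
    """
    before = paragraphs[max(0, index - window):index]
    after = paragraphs[index + 1:index + 1 + window]
    return before + after


doc = ["p0", "p1", "p2", "[diagram]", "p4", "p5", "p6"]
print(surrounding_context(doc, 3))
# -> ['p1', 'p2', 'p4', 'p5']
```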

Version 0.1.15

Released: 2025-08-30

🚀 Performance Improvements

  • Milvus Integration: 20-60x faster search, 70% less memory usage
  • Qwen3-8B Embeddings: 4096-dimensional vectors via Ollama
  • Binary Vectors: Enhanced deduplication with SimHash
  • Simplified Deployment: One-command initialization with scales

🔧 Infrastructure

  • IVF_SQ8 indexing for CPU-optimized search
  • Distributed architecture support
  • Enhanced C++ deduplication core with AVX2 SIMD
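
For readers unfamiliar with SimHash, the deduplication idea mentioned above works roughly as follows: each document is reduced to a fixed-width binary fingerprint, and near-duplicates are found by a small Hamming distance between fingerprints. This is a pure-Python sketch of the general technique, not the project's AVX2 C++ core; the 64-bit width and the use of BLAKE2b as the token hash are assumptions.

```python
import hashlib
from typing import Iterable


def simhash(tokens: Iterable[str], bits: int = 64) -> int:
    """Compute a SimHash fingerprint; `bits` must be a multiple of 8, at most 512."""
    weights = [0] * bits
    for token in tokens:
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=bits // 8).digest()
        value = int.from_bytes(digest, "big")
        # Each token votes +1 or -1 on every bit position.
        for i in range(bits):
            weights[i] += 1 if (value >> i) & 1 else -1
    fingerprint = 0
    for i, weight in enumerate(weights):
        if weight > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


doc_a = "core router connects to the edge firewall".split()
doc_b = "core router connects to the edge firewall".split()
print(hamming_distance(simhash(doc_a), simhash(doc_b)))
# -> 0
```

In practice a threshold on the Hamming distance (a few bits out of 64 is a common starting point) decides whether two chunks count as duplicates.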

Version 0.1.13

Released: 2025-08-25

✨ New Features

  • REST API Server: Full API mode with --api flag
  • MCP Server: Model Context Protocol support with --mcp
  • All-in-One Mode: Combined services with --all-in-one
  • Deployment Scales: Small/medium/large/enterprise configurations
  • Kubernetes Support: Helm charts and manifests generation

Version 0.1.12

Released: 2025-08-20

🎯 Major Features

  • Centralized Database: Unified LanceDB management
  • Advanced Query Engine: Multi-field filtering and reranking
  • Parallel Batch Processing: Progress tracking and resumability
  • Cloud Storage: S3/MinIO integration
  • Enhanced Embeddings: Multiple providers with intelligent caching

Version 0.1.10

Released: 2025-08-15

✨ Features

  • Hybrid Detection: Automatic network/flow diagram classification
  • Improved Accuracy: Enhanced component extraction algorithms
  • Better Error Handling: Graceful fallbacks for processing failures

Version 0.1.7

Released: 2025-08-10

🎯 Major Features

  • Vector Database: Automatic LanceDB file generation
  • RAG Optimization: Minimal metadata for optimal search
  • Chunk Management: Intelligent document chunking

Version 0.1.4

Released: 2025-08-01

✨ New Features

  • Multi-Model Support: Different models for different tasks
  • Model Optimization: Task-specific model selection
  • Performance Modes: Fast/balanced/accurate processing

Version 0.1.0

Released: 2025-07-01

🎉 Initial Release

  • Network diagram detection and extraction
  • Mermaid.js generation
  • PDF processing with OCR
  • Basic CLI interface
  • Ollama integration

Upcoming Features

Version 0.2.0 (Planned)

  • Web UI interface
  • Real-time collaboration
  • Custom model training
  • Enterprise SSO integration
  • Advanced analytics dashboard

Version 0.3.0 (Planned)

  • AutoML for model selection
  • Federated learning support
  • Multi-language support
  • Graph database integration
  • Compliance reporting

Migration Guides

From 0.1.15 to 0.1.16

  • Update default model to NetIntelOCR-7B-0925
  • Export and update prompts using new management system
  • Test flow diagram processing with new validator

From 0.1.12 to 0.1.15

  • Migrate from LanceDB to Milvus
  • Update embedding dimensions to 4096
  • Regenerate vector indices

From 0.1.7 to 0.1.12

  • Update batch processing scripts
  • Configure cloud storage backends
  • Migrate to centralized database

Deprecation Notices

Deprecated in 0.1.16

  • Old flow diagram processor (use enhanced version)
  • Manual prompt editing in code (use prompt management)

Deprecated in 0.1.15

  • LanceDB backend (use Milvus)
  • 768-dimension embeddings (use 4096)

Will be removed in 0.2.0

  • Legacy CLI arguments
  • Old configuration format
  • Direct Ollama API calls

Support

For issues and questions:

  • GitHub: https://github.com/VisionMLNet/NetIntelOCR/issues
  • Documentation: https://visionml.net/docs
  • PyPI: https://pypi.org/project/netintel-ocr/
  • Discord: https://discord.gg/netintel-ocr