Status Tracking

Operational Visibility

Status tracking in MatsuDB provides visibility into the processing state of nodes throughout their lifecycle. Each node can have multiple status records, one per operation, enabling fine-grained tracking of different processing workflows.

Definition

Status tracking operates at the operation level: each operation (for example parsing, embedding, or augmentation) maintains its own status record for each node it processes. This granularity lets operators follow processing progress, identify bottlenecks, diagnose failures, and monitor system health. Each record carries the current state, any error message, the processing time, and timestamps for when processing launched and completed.

The status system enables proactive monitoring and rapid response to processing issues. Operators can query node statuses to understand which nodes are being processed, which have completed, and which have failed. This visibility enables targeted fixes rather than blanket reprocessing, supporting the bonsai philosophy of selective, attentive processing.

Core Philosophy: Operational Visibility

Status tracking embodies the principle that processing should be observable and debuggable. Rather than treating document processing as a black box, the status system provides real-time visibility into what is happening at every moment. This visibility enables operators to understand system behavior, diagnose problems, and optimize performance.

The status system recognizes that processing involves multiple independent operations that can succeed or fail independently. A node might have completed parsing but failed embedding, or completed embedding but failed augmentation. Status tracking maintains separate records for each operation, enabling operators to understand the complete processing state without conflating different operation outcomes.

Status information serves multiple purposes: monitoring system health, debugging processing failures, optimizing performance, and providing user feedback. The system updates status information throughout processing, ensuring that visibility remains current and actionable rather than historical and retrospective.

Status States

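Status tracking uses five distinct states that represent the complete lifecycle of a processing operation:

  • PENDING: The node awaits processing for a specific operation. It has been queued but not yet started, so counting PENDING records shows processing backlogs and queue depths.
  • RUNNING: A workflow has begun processing the node for this operation.
  • COMPLETED: The operation finished successfully for the node.
  • FAILED: The operation encountered an error; the record includes an error message describing what went wrong.
  • CANCELLED: The workflow was cancelled, which distinguishes intentional termination from failure.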
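
The state names used throughout this page (PENDING, RUNNING, COMPLETED, FAILED, CANCELLED) can be modelled as an enumeration. The sketch below is illustrative only: `OperationState`, `TERMINAL_STATES`, and `is_terminal` are hypothetical names, not part of a documented MatsuDB API, and treating the last three states as terminal is an assumption.

```python
from enum import Enum


class OperationState(Enum):
    """Hypothetical enumeration of the five lifecycle states."""

    PENDING = "PENDING"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"


# Assumption: an operation in one of these states has stopped and will not
# change state again unless it is explicitly reprocessed.
TERMINAL_STATES = frozenset(
    {OperationState.COMPLETED, OperationState.FAILED, OperationState.CANCELLED}
)


def is_terminal(state: OperationState) -> bool:
    """Return True when the operation has stopped, successfully or not."""
    return state in TERMINAL_STATES
```

A monitoring script could use `is_terminal` to separate work that is still queued or in flight from work that has finished one way or another.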

Status Identity

Each status record is uniquely identified by four coordinates that together specify a particular operation on a particular node:

  1. Namespace identifier: Ensures that status records remain isolated within namespace boundaries, maintaining multi-tenant separation
  2. Root node identifier: Groups status records by document, enabling queries that understand processing state at the document level
  3. Node identifier: Specifies which node the status applies to, enabling node-level status queries
  4. Operation identifier: Specifies which operation the status tracks, enabling operation-level status queries

This four-coordinate identity enables flexible querying patterns. Operators can query all statuses for a document, all statuses for a node across operations, all statuses for an operation across nodes, or specific status records for precise operations. This flexibility enables comprehensive monitoring and targeted debugging.
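
To make the four-coordinate identity concrete, the sketch below models it as an immutable key. `StatusKey` and its field names are assumptions chosen for illustration; the actual storage layout is not described here. Two keys with the same coordinates compare equal, so they address the same status record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatusKey:
    """Hypothetical composite key for one operation on one node."""

    namespace_id: str
    root_node_id: str
    node_id: str
    operation_id: str


# A plain dict stands in for the status store in this sketch.
statuses = {}

key = StatusKey("acme", "doc-1", "node-7", "embedding")
statuses[key] = "PENDING"

# Writing with identical coordinates replaces the record, not adds a second.
statuses[StatusKey("acme", "doc-1", "node-7", "embedding")] = "RUNNING"

# A different operation on the same node is a separate record.
statuses[StatusKey("acme", "doc-1", "node-7", "parsing")] = "COMPLETED"
```

Because the key is frozen, it is hashable and cannot be mutated after a record has been stored under it.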

Status Information

Status records carry comprehensive information beyond the state itself:

  • Error messages: Provide detailed information about failures, enabling operators to understand what went wrong and take corrective action
  • Processing time metrics: Track how long operations take, providing visibility into performance and enabling optimization
  • Timestamps: Provide temporal context for status transitions
    • Launched timestamp: Records when processing began
    • Completed timestamp: Records when processing finished
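
The fields listed above could be represented roughly as follows. This is a minimal sketch: the class name `StatusRecord`, the field names, and the millisecond unit for processing time are all assumptions, and `elapsed_seconds` is a hypothetical helper that derives a duration from the two timestamps.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class StatusRecord:
    """Hypothetical shape of the information carried by a status record."""

    state: str
    error_message: Optional[str] = None
    processing_time_ms: Optional[int] = None  # unit is an assumption
    launched_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None

    def elapsed_seconds(self) -> Optional[float]:
        """Wall-clock time between launch and completion, if both are known."""
        if self.launched_at is None or self.completed_at is None:
            return None
        return (self.completed_at - self.launched_at).total_seconds()


record = StatusRecord(
    state="COMPLETED",
    launched_at=datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
    completed_at=datetime(2024, 1, 1, 12, 0, 2, 500000, tzinfo=timezone.utc),
)
```

A record that is still PENDING or RUNNING would have no completed timestamp, which is why the helper returns `None` in that case.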

Status Updates

Status information is updated automatically by workflows throughout processing execution. When a workflow begins processing nodes, it updates their statuses to RUNNING, providing immediate visibility into active processing. As workflows complete activities, they update statuses to reflect progress, ensuring that status information remains current.

When workflows encounter errors, they update statuses to FAILED and include error messages that describe what went wrong. This automatic error capture ensures that failure information is preserved and accessible, enabling operators to diagnose problems without requiring manual error logging or investigation.

When workflows are cancelled, they update statuses to CANCELLED, distinguishing intentional termination from failures. This distinction enables operators to understand whether processing stopped due to errors or intentional cancellation, supporting different response strategies for different scenarios.

Status updates are written in bulk, so a single call can update many nodes at once. This keeps status writes from becoming a performance bottleneck, even when processing large numbers of nodes. The system deduplicates status updates so that each node-operation pair has a single current status record.
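
The combination of bulk writes and deduplication can be sketched as a last-write-wins upsert. The function `bulk_upsert`, the tuple layout, and the use of a dict as the store are assumptions made for illustration; the real write path is not documented here.

```python
def bulk_upsert(store, updates):
    """Apply a batch of status updates to an in-memory store.

    `updates` is an ordered list of (node_id, operation_id, state) tuples.
    When the batch holds several updates for the same node-operation pair,
    only the last one is kept. Returns the number of distinct pairs written.
    """
    latest = {}
    for node_id, operation_id, state in updates:
        # Later entries overwrite earlier ones for the same pair.
        latest[(node_id, operation_id)] = state
    store.update(latest)
    return len(latest)


store = {("node-1", "parsing"): "PENDING"}
written = bulk_upsert(
    store,
    [
        ("node-1", "parsing", "RUNNING"),
        ("node-2", "parsing", "RUNNING"),
        ("node-1", "parsing", "COMPLETED"),  # supersedes the first update
    ],
)
```

Collapsing the batch before writing means the store is touched once per node-operation pair, no matter how many intermediate updates the batch contained.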

Multiple Operations

Nodes can have multiple status records simultaneously, one for each operation that processes them. A node might have a COMPLETED status for parsing, a RUNNING status for embedding, and a PENDING status for augmentation. This multi-operation tracking enables operators to understand the complete processing state without conflating different operation outcomes.

Because each operation keeps its own record, a failure in one does not obscure the outcome of the others: parsing might complete successfully while embedding fails, or embedding might complete while augmentation is still pending. Operators can see which operations have completed and which are still in progress.

Targeted Reprocessing

Keeping a separate status record per operation enables targeted reprocessing. If embedding fails, operators can reprocess only the embedding operation without re-running parsing. If augmentation fails, operators can reprocess only augmentation without re-running parsing or embedding. This targeted approach supports efficient error recovery and selective reprocessing.
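
As a rough illustration of selecting work for targeted reprocessing, the sketch below scans per-node, per-operation states and returns only the failed pairs. The function name `failed_pairs` and the nested-dict layout are hypothetical; how a reprocessing run is actually triggered is outside the scope of this sketch.

```python
def failed_pairs(statuses):
    """Return sorted (node_id, operation_id) pairs whose state is FAILED.

    `statuses` maps node_id -> {operation_id: state}.
    """
    return sorted(
        (node_id, operation_id)
        for node_id, operations in statuses.items()
        for operation_id, state in operations.items()
        if state == "FAILED"
    )


statuses = {
    "node-1": {"parsing": "COMPLETED", "embedding": "FAILED"},
    "node-2": {"parsing": "COMPLETED", "embedding": "COMPLETED"},
    "node-3": {"parsing": "COMPLETED", "embedding": "COMPLETED",
               "augmentation": "FAILED"},
}
to_retry = failed_pairs(statuses)
```

Only the embedding operation on `node-1` and the augmentation operation on `node-3` are selected; the completed parsing work is left alone.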

Status Queries

Status information is queryable through status APIs. Operators can query statuses by namespace to understand processing state across an organization, by root node to understand document-level processing, by node to understand element-level processing, or by operation to understand workflow-level processing.

Status queries support filtering by state, enabling operators to find all pending operations, all running operations, all completed operations, or all failed operations. This filtering capability enables targeted monitoring that focuses on specific processing states rather than requiring operators to filter results manually.

Status queries can combine multiple filters, enabling complex monitoring patterns. Operators might query for all failed embedding operations in a specific namespace, or all running parsing operations for a specific document. This flexibility enables comprehensive monitoring and targeted debugging.
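
The filter combinations described above can be sketched as a single function in which every supplied filter must match. This is an in-memory illustration only: `Status`, `query_statuses`, and the parameter names are assumptions and do not describe the signature of the actual status APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Status:
    """Hypothetical flattened status record used for filtering."""

    namespace_id: str
    root_node_id: str
    node_id: str
    operation_id: str
    state: str


def query_statuses(records, *, namespace_id=None, root_node_id=None,
                   node_id=None, operation_id=None, state=None):
    """Return records matching every filter that is not None."""
    filters = {
        "namespace_id": namespace_id,
        "root_node_id": root_node_id,
        "node_id": node_id,
        "operation_id": operation_id,
        "state": state,
    }
    active = {name: value for name, value in filters.items() if value is not None}
    return [
        record for record in records
        if all(getattr(record, name) == value for name, value in active.items())
    ]


records = [
    Status("acme", "doc-1", "n1", "parsing", "COMPLETED"),
    Status("acme", "doc-1", "n1", "embedding", "FAILED"),
    Status("acme", "doc-2", "n5", "embedding", "FAILED"),
    Status("other", "doc-9", "n8", "embedding", "FAILED"),
]

# All failed embedding operations in one namespace.
failed_embeddings = query_statuses(
    records, namespace_id="acme", operation_id="embedding", state="FAILED"
)
```

Omitting a filter widens the query along that coordinate, so the same function covers namespace-level, document-level, node-level, and operation-level queries.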

Error Handling

Status tracking provides detailed error information that enables effective failure diagnosis. When operations fail, status records capture error messages that describe what went wrong, enabling operators to understand failure causes without requiring manual investigation or log analysis.

Error messages include exception details, service errors, and processing failures that provide actionable information for debugging. These messages enable operators to identify root causes, understand failure patterns, and take corrective action. Error information is preserved in status records, ensuring that failure details remain accessible even after processing completes.
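
One simple way to surface failure patterns from stored error messages is to tally them. The sketch below assumes status records have already been fetched as (state, error message) pairs; `tally_errors` is a hypothetical helper and the example messages are invented for illustration.

```python
from collections import Counter


def tally_errors(records):
    """Count error messages across FAILED records.

    `records` is an iterable of (state, error_message) pairs; records that
    are not FAILED, or that carry no message, are ignored.
    """
    return Counter(
        message for state, message in records
        if state == "FAILED" and message
    )


records = [
    ("FAILED", "embedding service timeout"),
    ("COMPLETED", None),
    ("FAILED", "unsupported content type"),
    ("FAILED", "embedding service timeout"),
    ("CANCELLED", None),
]
counts = tally_errors(records)
most_common_message, occurrences = counts.most_common(1)[0]
```

A message that dominates the tally is a reasonable first candidate when looking for a shared root cause.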

Error Analysis

The error handling system distinguishes between different failure types, enabling operators to understand whether failures are transient or permanent, configuration-related or content-related, service-related or processing-related. This distinction enables targeted response strategies that match failure characteristics.

Performance Monitoring

Status tracking provides performance metrics that enable system optimization. Processing time measurements track operation durations, enabling operators to identify slow operations, understand performance bottlenecks, and optimize workflow performance. These metrics enable data-driven optimization that focuses on actual performance characteristics rather than assumptions.

Processing time metrics enable performance analysis across different dimensions. Operators can compare processing times across operations, identify operations that consistently take longer than expected, and understand performance variations across different content types or document characteristics. This analysis enables targeted optimization that improves overall system performance.
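
Comparing processing times across operations amounts to grouping the recorded durations by operation and summarising each group. The sketch below is a minimal example under assumed inputs: `summarize_times` is a hypothetical helper, and durations are taken to be in milliseconds.

```python
from collections import defaultdict
from statistics import mean


def summarize_times(samples):
    """Group (operation_id, duration_ms) samples and summarise each group.

    Returns {operation_id: {"count": n, "mean_ms": m, "max_ms": x}}.
    """
    grouped = defaultdict(list)
    for operation_id, duration_ms in samples:
        grouped[operation_id].append(duration_ms)
    return {
        operation_id: {
            "count": len(durations),
            "mean_ms": mean(durations),
            "max_ms": max(durations),
        }
        for operation_id, durations in grouped.items()
    }


samples = [
    ("parsing", 100),
    ("parsing", 300),
    ("embedding", 1000),
]
summary = summarize_times(samples)
```

Reporting the maximum alongside the mean helps separate an operation that is uniformly slow from one that is usually fast but has occasional outliers.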

The performance monitoring system enables capacity planning and resource allocation. By understanding processing times and operation frequencies, operators can plan resource requirements, identify scaling needs, and optimize system configuration. This planning capability ensures that systems can handle processing loads efficiently.

Effective Usage Principles

Status monitoring should be integrated into operational workflows to enable proactive problem identification. Regular status queries enable operators to identify processing issues before they become critical, understand system health, and respond to problems rapidly. Monitoring dashboards that aggregate status information provide comprehensive visibility into system behavior.

Error messages should be reviewed regularly to identify failure patterns and root causes. Systematic error analysis enables operators to understand common failure modes, identify configuration issues, and improve system reliability. Error information enables targeted fixes that address root causes rather than symptoms.

Performance metrics should be analyzed to identify optimization opportunities. Processing time analysis enables operators to understand performance characteristics, identify bottlenecks, and optimize workflow performance. This analysis enables data-driven optimization that improves system efficiency.

Relationship to Other Concepts

Status records are updated by workflows throughout processing execution, providing visibility into workflow progress and outcomes. Status information enables operators to monitor workflow execution, diagnose workflow failures, and optimize workflow performance. The status system transforms workflows from opaque batch operations into monitorable, debuggable processes.

Status tracking operates within namespace boundaries, ensuring that status information remains isolated per organization. Status queries respect namespace isolation, enabling operators to monitor processing within their namespace without accessing other organizations' status information. This isolation maintains multi-tenant separation while providing comprehensive visibility.

Status tracking provides the operational visibility that enables the bonsai philosophy of selective, attentive processing. By understanding processing state, operators can identify nodes that need attention, prioritize processing resources, and apply selective treatment based on status information. This visibility enables the sophisticated processing strategies that make MatsuDB a knowledge platform.

Learn More

To understand how workflows update status information, see the Workflows concept documentation. For information about how workflows are triggered, see Rules & Triggers.