Rules & Triggers

Event-Driven Automation

Rules and triggers form MatsuDB's event-driven automation system, enabling automatic actions in response to system events. Triggers are built-in actions that execute at specific extension points in the entity lifecycle, while rules are user-configured filters that determine when and how triggers should execute within a namespace.

Definition

This system enables organizations to customize their document processing pipelines without modifying core system code. When a document is uploaded, a trigger can automatically initiate parsing. When nodes are created, triggers can start embedding workflows. When nodes are updated, triggers can initiate augmentation processes. Rules allow fine-grained control over which events trigger which actions, creating flexible, namespace-specific processing configurations.

Core Philosophy: Event-Driven Automation

MatsuDB operates on an event-driven architecture where significant operations emit events at predefined extension points. Triggers listen to these events and execute actions automatically. This design separates core system operations from processing workflows, enabling extensibility without core system modifications.

Rules provide the configuration layer that makes triggers adaptable to different organizational needs. A research institution might configure parsing triggers to process only scientific papers, while a legal firm might configure embedding triggers to process only contract documents. The same trigger infrastructure serves both, with rules determining behavior.

Hook Points

Hook points represent extension points in the entity lifecycle where events occur and triggers can execute. These points are fixed and part of the system architecture, defining when automation can occur. Common hook points include:

Root node creation: When a new document is uploaded and a corpus node is created
Node batch creation: When multiple nodes are created during parsing
Node updates: When node content or attributes are modified
Corpus deletion: When a document is removed from the system

Each hook point carries contextual information about the event, such as the node that was created or the updates that were applied. Triggers receive this context and use it to determine whether to execute and what actions to perform. The hook point system ensures that automation occurs at the right moment in the processing lifecycle.
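As a rough illustration of the context a hook point might carry, the sketch below models the four hook points listed above and a small event record. The names `HookPoint`, `HookEvent`, and every field are assumptions made for this example and are not taken from MatsuDB's actual interfaces.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class HookPoint(Enum):
    """Hypothetical identifiers for the hook points listed above."""
    ROOT_NODE_CREATED = "root_node_created"
    NODE_BATCH_CREATED = "node_batch_created"
    NODE_UPDATED = "node_updated"
    CORPUS_DELETED = "corpus_deleted"


@dataclass(frozen=True)
class HookEvent:
    """Illustrative shape of the context handed to triggers."""
    hook_point: HookPoint
    namespace: str
    # Nodes touched by the event, e.g. the newly created corpus node.
    nodes: tuple[dict[str, Any], ...] = ()
    # Applied changes; only meaningful for update events.
    updates: dict[str, Any] = field(default_factory=dict)


# Example: the event emitted when a PDF is uploaded (values are made up).
event = HookEvent(
    hook_point=HookPoint.ROOT_NODE_CREATED,
    namespace="research-lab",
    nodes=({"id": "n1", "type": "corpus", "document_type": "pdf"},),
)
```

A trigger receiving `event` could inspect `event.nodes` to decide whether to act, which is the role the context plays in the description above.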

Triggers

Triggers are built-in actions that execute automatically at hook points. Each trigger has a unique identifier, registers for specific hook points, and defines execution behavior. Triggers encapsulate common processing operations such as initiating parsing workflows, starting embedding processes, or triggering augmentation tasks.

Triggers exist as part of the system rather than as user configuration, and by default they execute whenever their built-in conditions are met. Rules can then filter trigger execution, determining which events actually cause the trigger to run. This separation allows the system to provide sensible defaults while enabling customization through rules.

Each trigger defines its own filtering logic, examining event context to determine relevance. A parsing trigger might check whether the created node is a corpus node. An embedding trigger might verify that the node contains text content. This built-in filtering ensures triggers execute only when appropriate, even without rule configuration.
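The built-in filtering described above can be pictured as a predicate attached to each trigger. The following is a minimal sketch under assumed names: the `Trigger` class, the trigger identifiers, and the event layout are all hypothetical and only mirror the two examples in the paragraph.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical event context: {"hook_point": str, "node": {...}}
Event = dict[str, Any]


@dataclass(frozen=True)
class Trigger:
    trigger_id: str
    hook_points: frozenset[str]           # hook points this trigger registers for
    is_relevant: Callable[[Event], bool]  # the trigger's own built-in filter

    def should_run(self, event: Event) -> bool:
        # Run only for registered hook points and only when the filter accepts.
        return event["hook_point"] in self.hook_points and self.is_relevant(event)


# A parsing trigger that only reacts to corpus nodes.
parse_trigger = Trigger(
    trigger_id="parse_corpus",
    hook_points=frozenset({"root_node_created"}),
    is_relevant=lambda e: e["node"].get("type") == "corpus",
)

# An embedding trigger that only reacts to nodes with text content.
embed_trigger = Trigger(
    trigger_id="embed_nodes",
    hook_points=frozenset({"node_batch_created"}),
    is_relevant=lambda e: bool(e["node"].get("text")),
)
```

Keeping the relevance check inside the trigger means it behaves sensibly before any rule-level filtering is applied on top.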

Execution Modes

Triggers execute in one of two modes:

Critical Mode: Synchronous execution within the operation that triggered the event. If a critical trigger fails, the entire operation fails, ensuring data consistency. Critical triggers are used for work that must complete before the operation succeeds, such as initiating parsing workflows for new corpus nodes.
Asynchronous Mode: Background execution that does not block the operation that triggered the event. Asynchronous triggers suit processing that can continue after the operation has returned.

This dual-mode system balances reliability with performance. Critical operations ensure data integrity, while asynchronous operations enable responsive system behavior without blocking user operations.
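To make the difference between the two modes concrete, here is a small sketch of a dispatcher. It is not MatsuDB's implementation: `dispatch`, `create_corpus_node`, and the rollback step are hypothetical, and the sketch assumes that a failure in a background action is captured in its future rather than failing the operation.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable

Action = Callable[[dict], None]

# Hypothetical background pool used for asynchronous triggers.
_background = ThreadPoolExecutor(max_workers=2)


def dispatch(event: dict, critical: list[Action], asynchronous: list[Action]) -> list[Future]:
    """Run critical actions inline, then queue the asynchronous ones."""
    for action in critical:
        action(event)  # an exception here propagates to the caller
    return [_background.submit(action, event) for action in asynchronous]


def create_corpus_node(store: list, node: dict,
                       critical: list[Action], asynchronous: list[Action]) -> list[Future]:
    """Hypothetical operation: the node is kept only if critical triggers succeed."""
    store.append(node)
    try:
        return dispatch({"node": node}, critical, asynchronous)
    except Exception:
        store.remove(node)  # undo the write so the failed operation leaves no trace
        raise
```

In this sketch a caller of `create_corpus_node` gets control back as soon as the critical actions finish, and can optionally wait on the returned futures to observe the background work.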

Rules

Rules are user-configured filters that customize trigger behavior within a namespace. A rule associates a trigger with a namespace and provides filter configuration that determines when the trigger should execute. Rules enable organizations to adapt the same trigger infrastructure to their specific needs.

Each rule specifies a trigger identifier and filter configuration. The filter configuration is trigger-specific, allowing fine-grained control over trigger execution. For example, a parsing rule might filter by document type, ensuring only PDFs trigger parsing. An embedding rule might filter by node type, ensuring only text nodes receive embeddings.

Rules are namespace-scoped, meaning each namespace can configure its own trigger behavior independently. This enables different organizations using the same MatsuDB instance to apply different processing strategies without interference.
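One way to picture a namespace-scoped rule is as a record holding a namespace, a trigger identifier, and a filter. The sketch below is illustrative only: the `Rule` class, the filter shape (attribute name mapped to allowed values), and the namespace and trigger names are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule record: a trigger, a namespace, and a filter."""
    namespace: str
    trigger_id: str
    # Assumed filter shape: attribute name -> list of allowed values.
    filter_config: dict[str, list[Any]] = field(default_factory=dict)

    def matches(self, node: dict[str, Any]) -> bool:
        # Every filtered attribute must have one of its allowed values.
        return all(
            node.get(attr) in allowed
            for attr, allowed in self.filter_config.items()
        )


RULES = [
    Rule("research-lab", "parse_corpus", {"document_type": ["pdf"]}),
    Rule("legal-firm", "embed_nodes", {"node_type": ["text"]}),
]


def rules_for(namespace: str, trigger_id: str) -> list[Rule]:
    """Only rules from the event's own namespace are ever consulted."""
    return [r for r in RULES if r.namespace == namespace and r.trigger_id == trigger_id]
```

Because lookup is keyed on the namespace, the research lab's parsing rule never influences events raised in the legal firm's namespace.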

API Operations

The rules API provides operations for managing trigger configurations:

Listing triggers: Returns metadata about all available triggers, including their identifiers and filter configuration schemas
Creating or updating rules: Associates a trigger with a namespace and provides filter configuration
Retrieving rules: Returns rule configuration for a specific trigger and namespace
Deleting rules: Removes rule configuration, disabling trigger execution for that namespace

When a rule is created or updated, the system validates its filter configuration against the trigger's schema, ensuring compatibility. Because rules can be changed or removed at any time, processing pipelines can be reconfigured dynamically.
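The four operations above can be modeled with a small in-memory store. This is a sketch, not the real rules API: the `RuleStore` class, its method names, and the schema format (field name mapped to an expected Python type) are hypothetical, and storing one rule per namespace and trigger is a simplifying assumption.

```python
from typing import Any, Optional


class RuleStore:
    """In-memory stand-in for the rules API (illustrative only)."""

    def __init__(self, trigger_schemas: dict[str, dict[str, type]]):
        self._schemas = trigger_schemas
        self._rules: dict[tuple[str, str], dict[str, Any]] = {}

    def list_triggers(self) -> list[dict[str, Any]]:
        # Expose each trigger's identifier and a readable filter schema.
        return [
            {"trigger_id": tid, "filter_schema": {k: t.__name__ for k, t in schema.items()}}
            for tid, schema in self._schemas.items()
        ]

    def put_rule(self, namespace: str, trigger_id: str, filter_config: dict[str, Any]) -> None:
        if trigger_id not in self._schemas:
            raise KeyError(f"unknown trigger: {trigger_id}")
        schema = self._schemas[trigger_id]
        # Validate before storing so an invalid rule never becomes active.
        for key, value in filter_config.items():
            if key not in schema:
                raise ValueError(f"unexpected filter field: {key}")
            if not isinstance(value, schema[key]):
                raise ValueError(f"{key} must be of type {schema[key].__name__}")
        self._rules[(namespace, trigger_id)] = dict(filter_config)

    def get_rule(self, namespace: str, trigger_id: str) -> Optional[dict[str, Any]]:
        return self._rules.get((namespace, trigger_id))

    def delete_rule(self, namespace: str, trigger_id: str) -> bool:
        # Returns True if a rule existed and was removed.
        return self._rules.pop((namespace, trigger_id), None) is not None
```

A typical session would call `list_triggers()` to discover the schema, then `put_rule(...)` with a configuration that satisfies it; a configuration with an unknown field is rejected with `ValueError` before anything is stored.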

Examples

Consider a research institution that wants to automatically parse PDF documents but skip Word documents. They configure a parsing rule with a filter that matches only PDF document types. When a PDF is uploaded and its corpus node is created, the parsing trigger executes and initiates the parsing workflow. When a Word document is uploaded, the rule's filter does not match, so the trigger does not execute and no parsing occurs.

A legal firm might configure an embedding rule that processes only contract nodes. The rule specifies filters for node type and classification, ensuring embeddings are generated only for contract content. Other node types are ignored, optimizing processing resources for the firm's specific needs.

An organization might configure multiple rules for the same trigger, each with different filter criteria. The system evaluates all rules, and the trigger executes if any rule's filters match. This enables complex processing strategies where different document types receive different treatment.
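The "any rule matches" behavior described above can be sketched in a few lines. The filter shape (attribute name mapped to allowed values), the function names, and the sample attributes are assumptions for illustration only.

```python
from typing import Any

# Assumed filter shape: attribute name -> list of allowed values.
Filter = dict[str, list[Any]]


def filter_matches(rule_filter: Filter, node: dict[str, Any]) -> bool:
    """A single rule matches when every filtered attribute has an allowed value."""
    return all(node.get(attr) in allowed for attr, allowed in rule_filter.items())


def trigger_should_run(rule_filters: list[Filter], node: dict[str, Any]) -> bool:
    """The trigger runs if any one of the configured rules matches.

    With an empty rule list this returns False; how the real system treats a
    namespace with no rules is not something this sketch settles.
    """
    return any(filter_matches(f, node) for f in rule_filters)


# Two rules for the same parsing trigger, each with different criteria.
parsing_rules: list[Filter] = [
    {"document_type": ["pdf"]},
    {"document_type": ["html"], "language": ["en"]},
]

pdf = {"document_type": "pdf"}
word = {"document_type": "docx"}
english_html = {"document_type": "html", "language": "en"}
```

Here `pdf` and `english_html` each satisfy one of the two rules, while `word` satisfies neither, so only the first two would cause the trigger to run.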

Effective Usage Principles

Rule configuration should align with organizational processing needs. Understanding which triggers are available and what they do enables effective rule design. Filter configurations should be specific enough to avoid unnecessary processing while remaining flexible enough to handle organizational variety.

Critical triggers should be configured carefully, as their failure affects operation outcomes. Asynchronous triggers provide more flexibility, allowing background processing without blocking operations. Understanding execution modes helps design robust processing pipelines.

Rules should be tested with representative data to ensure filter configurations work as intended. The system provides trigger discovery and schema information to aid in rule design, enabling informed configuration decisions.

Relationship to Other Concepts

Rules and triggers operate within namespace boundaries, ensuring that automation configurations are isolated per organization. Triggers initiate workflows that process nodes, creating a connection between the event-driven automation system and the processing infrastructure. The hook point system integrates with the node lifecycle, ensuring automation occurs at appropriate moments.

The rules system enables customization without core system modification, supporting the bonsai philosophy of selective, configurable processing. Different namespaces can apply different processing strategies through rule configuration, enabling organizational independence within a shared infrastructure.

Learn More

To understand how workflows are triggered and executed, see the Workflows concept documentation. For information about monitoring processing operations, see Status Tracking.
