What is MatsuDB?

Getting Started

This guide introduces MatsuDB and its core concepts. For hands-on examples, see the Getting Started guide.

The Problem with Traditional Document Storage

Traditional document storage treats files as black boxes. When you upload a PDF, it remains a single, indivisible unit. To find information, you must download the entire file and search manually. There's no way to link a specific paragraph to related content in another document, or to search across thousands of documents semantically.

MatsuDB solves this problem.

How MatsuDB Works

MatsuDB uses a Document Parsing and Knowledge Graph Engine to transform your documents into a structured, searchable knowledge tree. When you upload a document, MatsuDB automatically:

  1. Decomposes the document into atomic units called nodes
  2. Preserves the hierarchical structure (sections, paragraphs, tables)
  3. Generates semantic embeddings for intelligent search
  4. Connects related content across document boundaries (knowledge graph)

Each piece of information (a paragraph, an image, a table cell) becomes an independent, searchable entity while keeping its relationship to the original context.

From Document to Knowledge Tree

Here's how a simple research paper transforms into a MatsuDB knowledge structure:
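As a rough illustration, the sketch below models such a tree in Python. It is not MatsuDB's actual schema: the `Node` class, its fields, the sample labels, and the nesting order (a CORPUS node containing an ARTIFACT, which contains SECTION nodes holding TEXT, TABLE, and IMAGE nodes) are all assumptions made for the example. Only the node type names come from this guide.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass(eq=False)
class Node:
    """Hypothetical in-memory model of a node; not MatsuDB's real schema."""
    node_type: str                      # e.g. "CORPUS", "SECTION", "TEXT"
    label: str                          # short human-readable description
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: "Node") -> "Node":
        """Attach a child and record its parent so context is preserved."""
        child.parent = self
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Node"]]:
        """Yield (depth, node) pairs in document order (pre-order)."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)

    def path(self) -> str:
        """Return the chain of labels from the root down to this node."""
        parts = []
        node: Optional["Node"] = self
        while node is not None:
            parts.append(node.label)
            node = node.parent
        return " > ".join(reversed(parts))


def build_example_tree() -> Node:
    """Build an assumed tree for a small research paper."""
    corpus = Node("CORPUS", "paper.pdf")
    artifact = corpus.add(Node("ARTIFACT", "parsed paper"))
    intro = artifact.add(Node("SECTION", "1. Introduction"))
    intro.add(Node("TEXT", "opening paragraph"))
    results = artifact.add(Node("SECTION", "2. Results"))
    results.add(Node("TEXT", "summary paragraph"))
    results.add(Node("TABLE", "Table 1"))
    results.add(Node("IMAGE", "Figure 1"))
    return corpus


if __name__ == "__main__":
    # Print the tree as an indented outline, one node per line.
    for depth, node in build_example_tree().walk():
        print("  " * depth + f"{node.node_type}: {node.label}")
```

Because each node keeps a reference to its parent, a single table or paragraph can be returned on its own and still be traced back to its section and source document with `path()`.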

Node Types

For a complete reference of all node types (CORPUS, ARTIFACT, SECTION, TEXT, IMAGE, TABLE, FORMULA, FORM, etc.), see the Node Types documentation.

What This Enables

Because every node is an independent, searchable entity, this atomic approach enables capabilities such as semantic search:

Search across all your documents by meaning, not just keywords. Find "climate impacts on marine ecosystems" even when documents use different terminology like "ocean warming effects."
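To make "search by meaning" concrete, here is a minimal sketch of ranking nodes by cosine similarity between embedding vectors. This is a common way to implement semantic search, but this guide does not say which similarity measure or embedding model MatsuDB uses, so treat the approach as an assumption. The node IDs and the three-dimensional vectors are invented for the example; real embeddings come from a model and have many more dimensions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_nodes(
    query_vec: Sequence[float],
    node_vecs: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return up to top_k (node_id, score) pairs, best match first."""
    scored = [
        (node_id, cosine_similarity(query_vec, vec))
        for node_id, vec in node_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Invented toy vectors: the two marine-climate nodes point in a similar
# direction, while the unrelated sales report points elsewhere.
NODE_VECTORS = {
    "ocean-warming-effects": [0.9, 0.8, 0.1],
    "quarterly-sales-report": [0.1, 0.0, 0.9],
    "coral-reef-decline": [0.8, 0.9, 0.2],
}

# Stand-in for the embedding of "climate impacts on marine ecosystems".
QUERY_VECTOR = [0.85, 0.85, 0.1]

if __name__ == "__main__":
    for node_id, score in rank_nodes(QUERY_VECTOR, NODE_VECTORS):
        print(f"{score:.3f}  {node_id}")
```

The query shares no keywords with "ocean-warming-effects", yet that node ranks near the top because its vector lies close to the query vector.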

The REST API

The MatsuDB REST API provides programmatic access to these capabilities:

Operation | Description
--- | ---
Upload documents | Create corpus nodes that trigger automatic parsing
Search content | Find nodes using semantic, lexical, or exact search
Navigate structures | Traverse hierarchical relationships between nodes
Configure automation | Set up rules and triggers for custom workflows

All operations authenticate with a Bearer token. The token encapsulates your namespace context, so every request is scoped to your namespace and your data stays isolated from other namespaces.
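The sketch below shows what attaching a Bearer token to a request might look like using only Python's standard library. The base URL, the `/search` path, and the JSON fields `query` and `mode` are hypothetical placeholders, not documented MatsuDB endpoints; consult the API Reference for the real ones. The function only builds the request object and does not send it.

```python
import json
import urllib.request


def build_search_request(
    base_url: str, token: str, query: str
) -> urllib.request.Request:
    """Build (but do not send) an authenticated search request.

    The "/search" path and the payload fields are assumptions made for
    illustration; they are not taken from the MatsuDB API Reference.
    """
    body = json.dumps({"query": query, "mode": "semantic"}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/search",
        data=body,
        method="POST",
        headers={
            # Standard Bearer scheme: "Authorization: Bearer <token>".
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    request = build_search_request(
        "https://matsudb.example.com",  # placeholder host
        "YOUR_API_TOKEN",               # placeholder token
        "climate impacts on marine ecosystems",
    )
    print(request.get_method(), request.full_url)
    print("Authorization:", request.get_header("Authorization"))
```

Passing the returned object to `urllib.request.urlopen` would send it. Since the token carries the namespace context, no separate namespace parameter appears in the request.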

API Reference

For complete endpoint documentation, see the API Reference.

Next Steps

  1. Upload your first document and see the transformation in action
  2. Perform your first search to discover content semantically
  3. Navigate document structures to explore the knowledge tree