
File Indexation

From Raw Files to Indexed Documents

This guide covers creating EliseFiles from provider items, understanding the indexation lifecycle, and working with AI-extracted ontologies. Make sure you have files in a provider first (see Managing Provider Items).

The Indexation Flow

Creating an EliseFile is the step that transforms a raw file in storage into an intelligent, searchable document. The flow looks like this:

  1. You create an EliseFile by referencing a provider item (its provider ID and its key, which identifies the file's path and name within the provider)
  2. The platform indexes the file asynchronously (parses content, extracts structure)
  3. Ontologies are automatically computed (AI-extracted concepts and metadata)

Creating an EliseFile

Create an EliseFile from a file that already exists in a provider. The request references the source provider and file location.

curl -X POST "https://<api-domain>/api/core/files" \
-H "Authorization: Bearer <your-pat>" \
-H "Content-Type: application/json" \
-d '{
  "providerId": "550e8400-e29b-41d4-a716-446655440000",
  "key": "reports/research-paper.pdf"
}'

Supported Extensions

The file must have one of the supported extensions: pdf, txt, html, htm, doc, docx, csv, xlsx, xls, ods, jpg, jpeg, png, webp, bmp, tiff, json, xml. Unsupported extensions will return a 400 error.
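
A client-side check can catch an unsupported extension before the request is sent. The function below is a minimal sketch, not part of the Elise API: the name is_supported_extension is hypothetical, and matching the extension case-insensitively is an assumption, since this guide does not say how the platform treats upper-case extensions.

```shell
# Hypothetical helper: succeeds (exit status 0) when the key ends in one of
# the extensions listed in this guide, fails (exit status 1) otherwise.
# ASSUMPTION: extensions are compared case-insensitively.
is_supported_extension() {
  local name ext
  name="${1##*/}"                 # drop any directory part of the key
  case "$name" in
    *.*) ext="${name##*.}" ;;     # text after the last dot
    *)   return 1 ;;              # no dot, so no extension
  esac
  ext="$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')"
  case "$ext" in
    pdf|txt|html|htm|doc|docx|csv|xlsx|xls|ods|jpg|jpeg|png|webp|bmp|tiff|json|xml)
      return 0 ;;
    *)
      return 1 ;;
  esac
}
```

For example, `is_supported_extension "reports/research-paper.pdf" || echo "skipping unsupported file"` prints nothing, while the same call with `archive.zip` prints the message. The server remains the authority: a key that passes this check can still be rejected for other reasons.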

Getting File Information

Retrieve the current state of an EliseFile, including its indexation status.

curl -s "https://<api-domain>/api/core/files/${FILE_ID}" \
-H "Authorization: Bearer <your-pat>"

Indexation Status

The lastIndexationInfos.status field tracks the indexation progress:

Status    Description
PENDING   Indexation has been queued but not started
RUNNING   Indexation is in progress
SUCCESS   Indexation completed successfully
FAILED    Indexation encountered an error
ABORTED   Indexation was cancelled

When the status is FAILED, the errorMessage field provides details about what went wrong.
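
Because indexation is asynchronous, a script usually has to wait until the status reaches SUCCESS, FAILED, or ABORTED. The following is a sketch of one way to do that, under stated assumptions: the environment variable names API_DOMAIN and ELISE_PAT and the function names are hypothetical, jq is assumed to be installed, and the response is assumed to be a JSON object with lastIndexationInfos at its top level. The attempt count and delay are arbitrary defaults, not platform recommendations.

```shell
# ASSUMPTIONS: API_DOMAIN and ELISE_PAT are hypothetical variable names;
# the response body is JSON with a top-level lastIndexationInfos object.
# Requires curl and jq.

# Print the current indexation status of one file.
fetch_status() {
  curl -s "https://${API_DOMAIN}/api/core/files/$1" \
    -H "Authorization: Bearer ${ELISE_PAT}" \
    | jq -r '.lastIndexationInfos.status'
}

# Poll until the status is terminal.
# Usage: wait_for_indexation FILE_ID [MAX_ATTEMPTS] [DELAY_SECONDS]
# Returns 0 on SUCCESS, 1 on FAILED or ABORTED, 2 when attempts run out.
wait_for_indexation() {
  local file_id max_attempts delay attempt status
  file_id="$1"
  max_attempts="${2:-60}"
  delay="${3:-5}"
  attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    status="$(fetch_status "$file_id")"
    case "$status" in
      SUCCESS)
        return 0 ;;
      FAILED|ABORTED)
        echo "indexation ended with status ${status}" >&2
        return 1 ;;
    esac
    # PENDING, RUNNING, or an unexpected value: wait and try again
    attempt=$((attempt + 1))
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
  done
  echo "gave up after ${max_attempts} attempts" >&2
  return 2
}
```

A call such as `wait_for_indexation "$FILE_ID" 30 10` checks at most 30 times, 10 seconds apart. Keeping fetch_status separate from the loop makes it easy to swap in a different HTTP client or to stub the call when testing the loop.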

Retrieving Ontologies

Ontologies are AI-extracted metadata and concepts computed automatically during indexation. They represent structured knowledge extracted from the file content: topics, entities, categories, and other semantic information.

curl -s "https://<api-domain>/api/core/files/${FILE_ID}/ontologies" \
-H "Authorization: Bearer <your-pat>"

Reindexing a File

Force reindexation of an already indexed file. This is useful when you want to re-process the document, for example after platform updates that improve parsing quality.

curl -X POST "https://<api-domain>/api/core/files/${FILE_ID}/reindex" \
-H "Authorization: Bearer <your-pat>"

Asynchronous Operation

Reindexation runs asynchronously. The endpoint returns 202 Accepted immediately. Poll the file info endpoint to monitor progress via lastIndexationInfos.status.
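
In a script, it can help to confirm that the reindex request was actually accepted before starting to poll. The sketch below checks for the 202 status code using curl's standard --write-out option. The function name request_reindex and the variable names API_DOMAIN and ELISE_PAT are hypothetical; treating every other status code as a failure is an assumption about how a caller would want to react.

```shell
# ASSUMPTIONS: API_DOMAIN and ELISE_PAT are hypothetical variable names.
# Succeeds only when the API answers 202 Accepted.
request_reindex() {
  local file_id code
  file_id="$1"
  # -o /dev/null discards the body; -w prints only the HTTP status code
  code="$(curl -s -o /dev/null -w '%{http_code}' -X POST \
    "https://${API_DOMAIN}/api/core/files/${file_id}/reindex" \
    -H "Authorization: Bearer ${ELISE_PAT}")"
  if [ "$code" = "202" ]; then
    return 0
  fi
  echo "reindex request for ${file_id} returned HTTP ${code}" >&2
  return 1
}
```

Once `request_reindex "$FILE_ID"` succeeds, monitoring continues through lastIndexationInfos.status on the file info endpoint.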

Recomputing Ontologies

Force recomputation of AI-extracted ontologies and metadata without reindexing the full file. This is useful when ontology models have been updated.

curl -X POST "https://<api-domain>/api/core/files/${FILE_ID}/recompute-ontologies" \
-H "Authorization: Bearer <your-pat>"

Deleting a File

Delete an EliseFile. This removes the indexed document from the Elise system. The original file in the provider is not affected.

curl -X DELETE "https://<api-domain>/api/core/files/${FILE_ID}" \
-H "Authorization: Bearer <your-pat>"

Provider Item Not Deleted

Deleting an EliseFile removes it from the Elise index only. The underlying file in the storage provider remains untouched. To also remove the file from storage, use the Delete Item endpoint on the provider.

Listing Indexed Files

List all EliseFiles registered for a given provider, with pagination and optional sorting. This returns only files that have already been created as EliseFiles, not raw provider items.

curl -s "https://<api-domain>/api/core/files?providerId=${PROVIDER_ID}&page=0&pageSize=20" \
-H "Authorization: Bearer <your-pat>"

Query Parameters

Parameter      Required   Description
providerId     Yes        UUID of the provider to list files for
page           Yes        Zero-based page number
pageSize       Yes        Number of results per page
sortProperty   No         Field to sort by (e.g. name, path)
sortOrder      No         ASC or DESC

EliseFiles only

This endpoint lists files that have been created as EliseFiles via POST /api/core/files. Raw files that exist in the provider but have not been indexed do not appear here. To browse raw provider content, use the Provider Items endpoints.
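
The query parameters for this endpoint can be assembled by a small helper so that the required and optional ones are handled in one place. This is a sketch only: build_list_url and API_DOMAIN are hypothetical names, and the function does no URL encoding, on the assumption that the values are a UUID, integers, and simple field names.

```shell
# ASSUMPTION: API_DOMAIN is a hypothetical variable name.
# Usage: build_list_url PROVIDER_ID PAGE PAGE_SIZE [SORT_PROPERTY [SORT_ORDER]]
# Prints the listing URL; fails if SORT_ORDER is neither ASC nor DESC.
build_list_url() {
  local url
  # Reject any sort order other than the two documented values
  case "${5:-}" in
    ""|ASC|DESC) ;;
    *)
      echo "sortOrder must be ASC or DESC" >&2
      return 1 ;;
  esac
  url="https://${API_DOMAIN}/api/core/files?providerId=$1&page=$2&pageSize=$3"
  # Optional parameters are appended only when provided
  if [ -n "${4:-}" ]; then
    url="${url}&sortProperty=$4"
  fi
  if [ -n "${5:-}" ]; then
    url="${url}&sortOrder=$5"
  fi
  printf '%s\n' "$url"
}
```

The result plugs into the request shown earlier, for example `curl -s "$(build_list_url "$PROVIDER_ID" 0 20 name ASC)" -H "Authorization: Bearer <your-pat>"`.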

Next Steps

Now that you can index files and extract metadata:

  1. Organize into collections to group your indexed documents
  2. Common Patterns for pagination and error handling

API Reference

See the Files endpoints in the API Reference for complete request/response schemas.