# Data Lake

> How ingested data is stored, deduplicated, and monitored in the Evermuse Data Lake.

The Data Lake is where all your ingested data lives. Every record you send to the [Ingestion API](/api-reference) is validated, stored, and made available for processing by Evermuse.

## How It Works

When you send records to the Ingestion API, each batch goes through the following steps:

<Steps>
  <Step title="Validated" icon="check">
    Each record is checked against the Integration Envelope schema. Records that pass are accepted; records that fail
    are rejected with error details so you can fix and resend them.
  </Step>

  <Step title="Normalized" icon="sliders">
    Accepted records are cleaned up for consistency. For example, `_type` is lowercased and timestamps
    are rounded to the nearest second.
  </Step>

  <Step title="Deduplicated" icon="copy">
    Each record is uniquely identified by the combination of its `_type`, `_vendor_ids`, and `_event_at` fields. If you
    send a record with the same identity again, it overwrites the previous version instead of creating a duplicate.
  </Step>

  <Step title="Stored" icon="database">
    Your data is written to secure cloud storage, organized by workspace, source, type, and date.
  </Step>

  <Step title="Processed" icon="microchip">
    Evermuse's AI workflows pick up your data and turn it into product signals such as user needs, feedback,
    and feature requests.
  </Step>
</Steps>
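
The normalization and deduplication steps above can be pictured with a short client-side sketch. This is an illustration only, not Evermuse's implementation: the function names, the in-memory `store`, the assumption that `_vendor_ids` is a flat mapping of strings, the assumption that `_event_at` is an ISO 8601 string, and the choice to round an exact half second upward are all hypothetical.

```python
from datetime import datetime, timedelta


def normalize(record: dict) -> dict:
    """Lowercase `_type` and round `_event_at` to the nearest second."""
    out = dict(record)
    out["_type"] = record["_type"].lower()
    # Assumption: `_event_at` is an ISO 8601 string with a numeric UTC offset.
    ts = datetime.fromisoformat(record["_event_at"])
    if ts.microsecond >= 500_000:
        ts += timedelta(seconds=1)  # Assumption: an exact half second rounds up.
    out["_event_at"] = ts.replace(microsecond=0).isoformat()
    return out


def dedup_key(record: dict) -> tuple:
    """Build the record identity from `_type`, `_vendor_ids`, and `_event_at`."""
    # Assumption: `_vendor_ids` is a flat mapping such as {"zendesk": "42"}.
    vendor_ids = tuple(sorted(record["_vendor_ids"].items()))
    return (record["_type"], vendor_ids, record["_event_at"])


def upsert(store: dict, record: dict) -> None:
    """Write the record, overwriting any earlier version with the same identity."""
    normalized = normalize(record)
    store[dedup_key(normalized)] = normalized


store: dict = {}
upsert(store, {"_type": "Ticket", "_vendor_ids": {"zendesk": "42"},
               "_event_at": "2024-05-01T12:00:00.600000+00:00", "body": "v1"})
upsert(store, {"_type": "ticket", "_vendor_ids": {"zendesk": "42"},
               "_event_at": "2024-05-01T12:00:01+00:00", "body": "v2"})
```

Because both records normalize to the same `_type` and the same rounded timestamp, the second call replaces the first and `store` ends up holding a single record.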

## Attachments

Records can include `_attachments` that reference vendor-hosted files. After a batch lands, Evermuse asynchronously downloads each attachment to secure cloud storage. Download progress is tracked per batch and is visible in the monitoring dashboard.

## Monitoring

The Data Lake dashboard provides real-time visibility into ingestion activity. You can view all batches with their status, source, type, record counts, and error details.

<Frame caption="The Data Lake dashboard shows ingestion batches in real time with status, source, and record counts.">
  <img src="https://mintcdn.com/evermuse/vemGUR1Q1H4K7tq7/images/data-lake-light.png?fit=max&auto=format&n=vemGUR1Q1H4K7tq7&q=85&s=366d0d76c6a28ad9c670082e44c44e92" alt="Data Lake dashboard" className="block dark:hidden" width="2982" height="1858" data-path="images/data-lake-light.png" />

  <img src="https://mintcdn.com/evermuse/vemGUR1Q1H4K7tq7/images/data-lake-dark.png?fit=max&auto=format&n=vemGUR1Q1H4K7tq7&q=85&s=248c3c652cb388cfc8e7d1d30226625b" alt="Data Lake dashboard" className="hidden dark:block" width="2982" height="1854" data-path="images/data-lake-dark.png" />
</Frame>

Clicking a batch opens a detail panel with:

* **Batch metadata** — Status, source, type, timestamps, and storage URIs.
* **Failed records** — Individual validation errors with the rejected payload.
* **Attachment downloads** — Status of each attachment download attempt.

<Frame caption="The batch detail panel shows metadata, failed records, and attachment download status.">
  <img src="https://mintcdn.com/evermuse/vemGUR1Q1H4K7tq7/images/batch-details-light.png?fit=max&auto=format&n=vemGUR1Q1H4K7tq7&q=85&s=5713e67f428efbf0bb3cf90a4b74b9f5" alt="Batch details panel" className="block dark:hidden" width="2986" height="1858" data-path="images/batch-details-light.png" />

  <img src="https://mintcdn.com/evermuse/vemGUR1Q1H4K7tq7/images/batch-details-dark.png?fit=max&auto=format&n=vemGUR1Q1H4K7tq7&q=85&s=85df37fc903ab4d20e9fc10b6113b7e7" alt="Batch details panel" className="hidden dark:block" width="2984" height="1860" data-path="images/batch-details-dark.png" />
</Frame>

You can also check batch status programmatically via the [batch status endpoint](/api-reference/ingestion/get-batch-status).

## Batch Lifecycle

Each ingestion request creates a batch that progresses through the following statuses:

| Status       | Description                                    |
| ------------ | ---------------------------------------------- |
| `LANDING`    | Batch received and being written to storage.   |
| `LANDED`     | Records archived and indexed successfully.     |
| `PROCESSING` | Downstream processors are consuming the batch. |
| `COMPLETE`   | All processing finished successfully.          |
| `FAILED`     | Processing encountered an unrecoverable error. |
