Detailed Explanation: The retention period of aggregated (asynchronous) materialized view data cannot exceed that of the base table?

Background

A detailed table was created to log API call details, and a materialized view was created based on the detailed table. The materialized view aggregates metrics like failed calls and average latency at the api + hour granularity. The detailed table contains a column event_time, which is partitioned by day. The materialized view is aggregated at the event_time hour level.

The table and materialized view are configured as follows:

Detailed Table Partition Configuration:
“dynamic_partition.time_unit” = “DAY”, – Partitioned by day
“dynamic_partition.start” = “-2”, – Retain data for the last 2 days

Materialized View TTL Setting:

“partition_ttl” = “30 DAY”,

Issue

The retention period of the materialized view aligns with that of the detailed table (2 days), rather than the configured 30-day TTL. The desired outcome is for the materialized view to retain data for a longer period (30 days) without being constrained by the base table’s retention period.

Simplified Table and Materialized View Models

Detailed Table (table_A):
CREATE TABLE table_A (
api_id bigint(20) NULL COMMENT “API ID”,
event_time datetime NULL COMMENT “Event Time”,
trace_id varchar(65533) NULL COMMENT “Trace ID”,
latency bigint(20) NULL COMMENT “”
) ENGINE=OLAP
DUPLICATE KEY(api_id, event_time)
PARTITION BY RANGE (event_time) (
PARTITION p20250110 VALUES LESS THAN (“2025-01-10”)
)
DISTRIBUTED BY HASH (trace_id) – Default distribution strategy by trace_id
PROPERTIES (
“replication_num” = “2”, – Set replication factor to 2
“dynamic_partition.enable” = “true”, – Enable dynamic partitioning
“dynamic_partition.time_unit” = “DAY”, – Partitioned by day
“dynamic_partition.start” = “-2”, – Retain data for the last 2 days
“dynamic_partition.end” = “1”, – Current day
“dynamic_partition.prefix” = “p” – Partition prefix
);
Materialized View (view_A):
CREATE MATERIALIZED VIEW view_A (api_id, event_hour, total_count, avg_latency)
PARTITION BY (event_hour)
DISTRIBUTED BY RANDOM
REFRESH ASYNC START(“2025-01-01 00:00:00”) EVERY(INTERVAL 10 SECOND)
PROPERTIES (
“replicated_storage” = “true”,
“partition_ttl” = “30 DAY”, – Retain aggregated data for 30 days
“replication_num” = “2”
)
AS SELECT
api_id,
date_trunc(‘hour’, event_time) AS event_hour,
count(*) AS total_count,
avg(latency) AS avg_latency
FROM table_A
GROUP BY api_id, event_hour;

Environment Details

Cluster type: Compute and storage separation

StarRocks version: 3.2.9

Cluster size: FE (4 cores, 8 GB) x 1 + BE (4 cores, 8 GB) x 3

Contact: (gaogeng581@gmail.com)

It’s not supported yet. The retention between MV and base table need to be the same.