StarRocks Consuming Massive Amounts of Network Bandwidth When Idle

We’re using a small StarRocks cluster to query an Iceberg table set up with around 200GB of data, via an Iceberg REST catalog.
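
For reference, the external catalog is defined roughly along these lines; the catalog name, REST URI, S3 endpoint, and credentials below are placeholders rather than our actual values:

# Rough shape of our Iceberg REST catalog definition (all values are placeholders)
mysql -h 127.0.0.1 -P 9030 -u root <<'SQL'
CREATE EXTERNAL CATALOG iceberg_rest
PROPERTIES (
    "type" = "iceberg",
    "iceberg.catalog.type" = "rest",
    "iceberg.catalog.uri" = "http://rest-catalog:8181",
    "aws.s3.endpoint" = "https://us-east-1.linodeobjects.com",
    "aws.s3.access_key" = "<access key>",
    "aws.s3.secret_key" = "<secret key>",
    "aws.s3.enable_path_style_access" = "true"
);
SQL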

The cluster is containerized and runs on a single server. It consists of 1 FE node and a pair of BE nodes (our non-production test configuration).

We have been firing single queries at StarRocks with large gaps in time between them. Every time a query is presented to StarRocks (a simple single-table SELECT), it answers with the correct results and then idles. Approximately 5 seconds later, each of the 2 backends pegs a pair of CPU cores at 100% and starts consuming massive amounts of network bandwidth. In our case we’re seeing a total of around 10GB of network traffic consumed over several minutes by each BE, followed by a return to idle. Everything remains quiescent until the next query is fired, and then the whole process repeats. I want to emphasize that this greedy behavior starts AFTER the query has returned its results.
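
For anyone wanting to reproduce the measurement, a per-process bandwidth tool such as nethogs makes it easy to attribute the bursts to the BE PIDs (eth0 below is a placeholder for whatever interface carries the S3 traffic):

# Per-process network throughput; watch the two BE processes during a burst
sudo nethogs eth0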

Note that neither of the BE processes (nor the FE process) is memory, CPU, or network constrained. There is no swapping or anything like that going on (in fact, disk/block storage use is minimal).
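
(The no-swapping claim is easy to verify with standard Linux tools during one of the bursts:)

vmstat 1 10    # si/so (swap-in/swap-out) stay at 0 throughout
free -h        # swap usage stays near zero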

The configuration is shared-nothing, running on a single 16-CPU server (AMD EPYC) with 32GB RAM and a 500GB SSD. The Iceberg table is hosted on S3-compatible storage (Linode).

Can anyone tell us what might be going on, and what settings we should be looking at to prevent this from happening?

Our FE is using all default settings except the usual suspects for S3 configuration, and:

default_replication = 1

The BEs have all default settings, with the exception of separate network ports.
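
Since both BEs share one host, the only per-BE difference is the port assignments; the second BE gets something like the following (the port numbers and path are examples, while the config names are the standard BE port settings):

# be.conf for the second BE -- shift each port off the defaults so the two BEs don't collide
cat <<'EOF' >> /opt/starrocks/be2/conf/be.conf
be_port = 9061
heartbeat_service_port = 9051
brpc_port = 8061
webserver_port = 8041
EOF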

Thanks for any assistance.

You can capture a flame graph to see where the CPU time is going (replace 54614 with the PID of one of your BE processes):

# Sample the BE process at 99 Hz for 30 seconds, then render the flame graph
perf record -F 99 -ag -p 54614 -- sleep 30
perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > perf-kernel.svg
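
The stackcollapse-perf.pl and flamegraph.pl scripts come from Brendan Gregg's FlameGraph repository, so grab those first and find the PID of the BE you want to profile (the starrocks_be process name and the paths here are assumptions about your container layout):

# Fetch the flame graph tooling
git clone https://github.com/brendangregg/FlameGraph
cd FlameGraph

# Find the PID of a BE process to pass to perf -p
pidof starrocks_be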