Hello, I’m testing StarRocks as a potential replacement for Spark.
At first glance, StarRocks’ performance has been impressive. However, I encountered a specific query where StarRocks performs significantly slower.
Can someone point me out how to improve?
1) query pattern and etc
- query pattern
SELECT {dimension}, COUNT(*) FROM iceber.db.tab1 GROUP BY {dimension}
- Iceberg catalog with parquet file
tab1
has 1,413,914,103 (1.4B) rows- all fields are of type
STRING
- StarRocks version:
3.3.8-e3816ec
2) test result
case | type | dimension | num output rows | Spark (sec) | StarRocks (sec) |
---|---|---|---|---|---|
case 1 | high cardinality | c,u |
121,374,662 (1.2B) rows | 44 | 36 |
case 2 | low cardinality | ec,f,g,a,d1,d2,...,d10 |
11,563 rows | 28 | 12 |
case 3 | high + low cardinality | c,u,ec,f,g,a,d1,d2,...,d10 |
328,798,206 (3.2B) rows | 111 | 178 |
3) problem
StarRocks outperforms Spark when grouping by either high-cardinality fields or low-cardinality fields individually (case 1 and case 2).
However, when grouping by a combination of “high + low cardinality” fields in a single query (case 3), StarRocks shows poor performance.
I don’t understand why this happen
4) question
Is there a way to optimize the query for case 3 to improve performance?
5) query profiles
- case 1
| TotalTime: 35s890ms | | ExecutionTime: 35s835ms [Scan: 1s472ms (4.11%), Network: 38.923ms (0.11%), ResultDeliverTime: 0ns (0.00%), ScheduleTime: 10s0ms (27.91%)] | | CollectProfileTime: 1ms | | FrontendProfileMergeTime: 17.345ms | | QueryPeakMemoryUsage: ?, QueryAllocatedMemoryUsage: 1.491 TB | | Top Most Time-consuming Nodes: | | 1. AGGREGATION (id=4) [finalize, merge]: 24s688ms (47.23%) | | 2. AGGREGATION (id=2) [serialize, update]: 19s136ms (36.61%) | | 3. EXCHANGE (id=3) [SHUFFLE]: 5s730ms (10.96%) | | 4. ICEBERG_SCAN (id=0) : 2s640ms (5.05%) | | 5. PROJECT (id=1) : 60.589ms (0.12%) | | 6. PROJECT (id=5) : 5.855ms (0.01%) | | 7. AGGREGATION (id=6) [serialize, update]: 3.469ms (0.01%) | | 8. EXCHANGE (id=7) [GATHER]: 1.102ms (0.00%) | | 9. RESULT_SINK: 113.661us (0.00%) | | 10. AGGREGATION (id=8) [finalize, merge]: 82.216us (0.00%) | | Top Most Memory-consuming Nodes: | | NonDefaultVariables: | | connector_sink_compression_codec: uncompressed -> snappy | | enable_adaptive_sink_dop: false -> true | | enable_profile: false -> true | | enable_spill: false -> true | | query_mem_limit: 0 -> 6442450944 | | query_timeout: 300 -> 3600 | | spill_mode: auto -> force |
- case 2
| TotalTime: 12s317ms | | ExecutionTime: 12s249ms [Scan: 1s678ms (13.70%), Network: 4.231ms (0.03%), ResultDeliverTime: 0ns (0.00%), ScheduleTime: 9s998ms (81.62%)] | | CollectProfileTime: 2ms | | FrontendProfileMergeTime: 19.762ms | | QueryPeakMemoryUsage: ?, QueryAllocatedMemoryUsage: 398.331 GB | | Top Most Time-consuming Nodes: | | 1. AGGREGATION (id=2) [serialize, update]: 8s884ms (74.05%) | | 2. ICEBERG_SCAN (id=0) : 2s947ms (24.56%) | | 3. PROJECT (id=1) : 119.493ms (1.00%) | | 4. AGGREGATION (id=4) [finalize, merge]: 33.417ms (0.28%) | | 5. EXCHANGE (id=3) [SHUFFLE]: 10.724ms (0.09%) | | 6. EXCHANGE (id=7) [GATHER]: 802.820us (0.01%) | | 7. RESULT_SINK: 85.428us (0.00%) | | 8. AGGREGATION (id=6) [serialize, update]: 63.621us (0.00%) | | 9. AGGREGATION (id=8) [finalize, merge]: 50.800us (0.00%) | | 10. PROJECT (id=5) : 16.366us (0.00%) | | Top Most Memory-consuming Nodes: | | NonDefaultVariables: | | connector_sink_compression_codec: uncompressed -> snappy | | enable_adaptive_sink_dop: false -> true | | enable_profile: false -> true | | enable_spill: false -> true | | query_mem_limit: 0 -> 6442450944 | | query_timeout: 300 -> 3600 |
- case 3
| TotalTime: 2m58s | | ExecutionTime: 2m58s [Scan: 3s235ms (1.82%), Network: 9.028ms (0.01%), ResultDeliverTime: 0ns (0.00%), ScheduleTime: 234.810ms (0.13%)] | | CollectProfileTime: 0 | | FrontendProfileMergeTime: 12.752ms | | QueryPeakMemoryUsage: ?, QueryAllocatedMemoryUsage: 5.194 TB | | Top Most Time-consuming Nodes: | | 1. AGGREGATION (id=4) [finalize, merge]: 2m42s (69.59%) | | 2. AGGREGATION (id=2) [serialize, update]: 44s999ms (19.23%) | | 3. EXCHANGE (id=3) [SHUFFLE]: 20s640ms (8.82%) | | 4. ICEBERG_SCAN (id=0) : 5s400ms (2.31%) | | 5. PROJECT (id=1) : 227.832ms (0.10%) | | 6. PROJECT (id=5) : 31.266ms (0.01%) | | 7. AGGREGATION (id=6) [serialize, update]: 14.065ms (0.01%) | | 8. EXCHANGE (id=7) [GATHER]: 1.577ms (0.00%) | | 9. RESULT_SINK: 100.160us (0.00%) | | 10. AGGREGATION (id=8) [finalize, merge]: 74.986us (0.00%) | | Top Most Memory-consuming Nodes: | | NonDefaultVariables: | | connector_sink_compression_codec: uncompressed -> snappy | | enable_adaptive_sink_dop: false -> true | | enable_profile: false -> true | | enable_spill: false -> true | | query_mem_limit: 0 -> 6442450944 | | query_timeout: 300 -> 3600 | | spill_mode: auto -> force |