StarRocks version 3.2.6-2585333
I'm running this locally in Docker on a machine with 64 GB of RAM. During loading, all of the containers together use around 15 GB of RAM and only about 13% of the available 1600% CPU (16 cores).
I have a table with around 40 million rows, and I'm using the StarRocks sink connector to load data from Postgres to StarRocks.
This specific table is loading at around 2,000 rows per second, which is way too slow (another table that is even larger loaded at almost 50k rows per second).
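To put the two rates in perspective, here is a rough back-of-the-envelope calculation. It assumes the ingest rate stays constant for the whole load and uses the approximate figures quoted above; applying the faster rate to this table is hypothetical, since that rate was measured on a different table.

```python
# Back-of-the-envelope load times, assuming a constant ingest rate
# and the approximate figures quoted in the post.
ROWS = 40_000_000      # approximate row count of the slow table
SLOW_RATE = 2_000      # rows/second observed for this table
FAST_RATE = 50_000     # rows/second observed for the other table


def load_seconds(rows: int, rows_per_second: int) -> float:
    """Return the time in seconds to load `rows` at a constant rate."""
    return rows / rows_per_second


slow = load_seconds(ROWS, SLOW_RATE)
fast = load_seconds(ROWS, FAST_RATE)

print(f"at {SLOW_RATE} rows/s: {slow / 3600:.1f} hours")
print(f"at {FAST_RATE} rows/s: {fast / 60:.1f} minutes")
print(f"slowdown factor: {FAST_RATE / SLOW_RATE:.0f}x")
```

At the observed rate the full load takes several hours, whereas at the other table's rate it would finish in well under half an hour.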
Schema of the table that I'm loading into:
CREATE TABLE IF NOT EXISTS statement(
id BIGINT,
statement_date int,
vat_percentage FLOAT,
amount DOUBLE,
unallocated_amount DOUBLE,
allocation_status BIGINT,
branch_id BIGINT,
provider_id BIGINT,
statement_mapping_id BIGINT,
description VARCHAR(255),
imported CHAR,
source BIGINT,
payment_reference VARCHAR(255),
created_at int,
updated_at int,
created_by_id BIGINT,
updated_by_id BIGINT,
remittance_date int,
division VARCHAR(255),
financial_source_id BIGINT,
bank_recon_id BIGINT,
source_currency_code VARCHAR(3),
source_currency_exchange_rate DOUBLE,
payment_currency_code VARCHAR(3),
reference_number VARCHAR(50),
bank_recon_ignored CHAR,
ignore_item_vat_errors CHAR,
ignore_item_vat_errors_reason VARCHAR(255),
statement_date2 date NULL AS str_to_date(from_unixtime(statement_date * 86400), '%Y-%m-%d') COMMENT ""
) PRIMARY KEY (id)
DISTRIBUTED BY HASH(id)
order by (branch_id)
PROPERTIES (
"replication_num" = "1",
"in_memory" = "false",
"storage_format" = "DEFAULT"
);
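For anyone reading the generated column, here is a minimal Python sketch of what the statement_date2 expression appears to compute. It assumes statement_date holds days since the Unix epoch (inferred from the multiplication by 86400) and it works in UTC, whereas from_unixtime in StarRocks uses the session time zone, so results can differ by a day in some zones. The function name is only for illustration.

```python
from datetime import date, datetime, timezone


def statement_date2(statement_date: int) -> date:
    """Approximate str_to_date(from_unixtime(statement_date * 86400), '%Y-%m-%d').

    Assumption: statement_date is a day count since 1970-01-01, evaluated in UTC.
    """
    seconds = statement_date * 86400  # days -> seconds since the epoch
    # from_unixtime step: format the timestamp as text
    text = datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%d")
    # str_to_date step: parse the text back into a date
    return datetime.strptime(text, "%Y-%m-%d").date()


print(statement_date2(19723))  # 2024-01-01
```

The column does this integer to string to date round trip for every row written, which is one of the differences from the faster table.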
(The table that loaded faster does not include a generated column or an ORDER BY clause.)
My question then: is this a configuration on the StarRocks side that needs to be tweaked, or is this a Kafka issue?