Failure to BROKER LOAD

Hi All,

I installed a shared-nothing internal cluster using StarRocks 3.2.6 on Ubuntu 24.04 with JDK 11, and now I am trying to load Parquet data into a table, either from a local file or from Azure Blob.

Steps to reproduce:

create database test;
use test;
create table test_table (
[FIELD LIST]
)
PARTITION BY date_trunc("month", TimeStamp)
properties (
"replication_num" = "1" );

load label test.test_table_load
(
DATA INFILE ("file:///home/biuser/test-data.parquet")
INTO TABLE test_table
FORMAT AS Parquet
( [FIELD_LIST] )
)
WITH BROKER
properties ( "timeout" = "3600" );

Expected behavior

Broker load is queued and eventually data is loaded into table test_table;

Real behavior

The BE node crashes and no data is loaded.

version number

StarRocks 3.2.6-2585333

BE/FE logs

be.out:
unsupported call jni function, JAVA_HOME is required
F0603 16:35:03.934142 47828 threadpool.cpp:257] Check failed: 1 == _tokens.size() (1 vs. 2) Threadpool automatic_partition destroyed with 2 allocated tokens
*** Check failure stack trace: ***
3.2.6 RELEASE (build 2585333)
query_id:21c54cc3-2cc4-4f89-a7ee-6c736ead136c, fragment_instance:21c54cc3-2cc4-4f89-a7ee-6c736ead136d
Load file path: file:/home/biuser/episode_ia_0-20240115.parquet
tracker:process consumption: 13695824
tracker:query_pool consumption: 0
tracker:query_pool/connector_scan consumption: 78319616
tracker:load consumption: 0
tracker:metadata consumption: 89789
tracker:tablet_metadata consumption: 0
tracker:rowset_metadata consumption: 0
tracker:segment_metadata consumption: 0
tracker:column_metadata consumption: 0
tracker:tablet_schema consumption: 0
tracker:segment_zonemap consumption: 0
tracker:short_key_index consumption: 0
tracker:column_zonemap_index consumption: 0
tracker:ordinal_index consumption: 0
tracker:bitmap_index consumption: 0
tracker:bloom_filter_index consumption: 0
tracker:compaction consumption: 0
tracker:column_pool consumption: 0
tracker:page_cache consumption: 0
tracker:update consumption: 0
tracker:chunk_allocator consumption: 0
tracker:clone consumption: 0
tracker:consistency consumption: 0
tracker:datacache consumption: 0
tracker:replication consumption: 0
*** Aborted at 1717443303 (unix time) try "date -d @1717443303" if you are using GNU date ***
PC: @ 0xe144d29eb1c pthread_kill
*** SIGABRT (@0x3eb0000b9ed) received by PID 47597 (TID 0xe141c0006c0) from PID 47597; stack trace: ***
@ 0x7fb5eda google::(anonymous namespace)::FailureSignalHandler()
@ 0xe144d245320 (unknown)
@ 0xe144d29eb1c pthread_kill
@ 0xe144d24526e gsignal
@ 0xe144d2288ff abort
@ 0x3d7bbd2 starrocks::failure_function()
@ 0x7fa4471 google::LogMessage::Fail()
@ 0x7fa6acf google::LogMessage::SendToLog()
@ 0x7fa3fb0 google::LogMessage::Flush()
@ 0x7fa713d google::LogMessageFatal::~LogMessageFatal()
@ 0x6b7a0ee starrocks::ThreadPool::~ThreadPool()
@ 0x3ebaab9 starrocks::ExecEnv::~ExecEnv()
@ 0xe144d247a66 (unknown)
@ 0xe144d247bae exit
@ 0xe144d6fa912 (unknown)
@ 0x787ce8b getJNIEnv
@ 0x787fb47 hdfsBuilderConnect
@ 0x3e527c5 starrocks::HdfsFsCache::get_connection()
@ 0x3e4f690 starrocks::GetHdfsFileReadOnlyHandle::getOrCreateFS()
@ 0x3e4d594 _ZNSt17_Function_handlerIFN9starrocks6StatusEvEZNS0_15HdfsInputStream8get_sizeEvEUlvE_E9_M_invokeERKSt9_Any_data
@ 0x6a712ac starrocks::call_hdfs_scan_function_in_pthread()
@ 0x3e4dab0 starrocks::HdfsInputStream::get_size()
@ 0x5e56391 starrocks::ParquetChunkFile::GetSize()
@ 0x5e545c9 starrocks::ParquetReaderWrap::size()
@ 0x5e4cb88 starrocks::ParquetScanner::open_next_reader()
@ 0x5e4d131 starrocks::ParquetScanner::next_batch()
@ 0x5e5105f starrocks::ParquetScanner::get_next()
@ 0x6786539 starrocks::connector::FileDataSource::get_next()
@ 0x5fdb59f starrocks::pipeline::ConnectorChunkSource::_read_chunk()
@ 0x5fe1d46 starrocks::pipeline::ChunkSource::buffer_next_batch_chunks_blocking()
@ 0x5f55406 _ZZN9starrocks8pipeline12ScanOperator18_trigger_next_scanEPNS_12RuntimeStateEiENKUlvE_clEv
@ 0x6275d1c starrocks::workgroup::ScanExecutor::worker_thread()

I have already forced JAVA_HOME in /etc/profile. Setting JAVA_HOME in be.conf, as per the installation guide, yields "Ignored unknown config: JAVA_HOME". I have even tried forcing JAVA_HOME inside the start_backend.sh script, to no avail.

Can someone give me a hint here?

Thanks!!!

Mind sharing your start_backend.sh? How did you force JAVA_HOME to be set?

Hi Zhang,

I just added the line in bold (the first line below, which prepends JAVA_HOME to START_BE_CMD). All the "echo" commands were only there to make sure my changes were actually applied. :)

START_BE_CMD="env JAVA_HOME=/usr/lib/jvm/jdk-11-oracle-x64 ${START_BE_CMD}"
echo "----> $START_BE_CMD"

if [ ${RUN_LOG_CONSOLE} -eq 1 ] ; then
    # force glog output to console (stderr)
    export GLOG_logtostderr=1
else
    # redirect stdout/stderr to ${LOG_FILE}
    exec &>> ${LOG_FILE}
fi

echo "----CMD---> ${START_BE_CMD}"
echo "start time: "$(date)

if [ ${RUN_DAEMON} -eq 1 ]; then
    echo "--1-> $@"
    nohup ${START_BE_CMD} "$@" </dev/null &
    #nohup ${START_BE_CMD} "$@" &
else
    echo "--2-> $@"
    exec ${START_BE_CMD} "$@" </dev/null
fi

I'm not sure that works.

How about adding a single line before restarting the BE:

"export JAVA_HOME=xxx"?
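For readers trying the same suggestion, here is a minimal sketch of exporting JAVA_HOME only after checking that the directory really contains a JDK. The helper name looks_like_java_home is made up for illustration, and the path is the one quoted earlier in this thread, so adjust it to your own installation.

```shell
# Hypothetical helper: return 0 if the given directory looks like a usable
# JDK home, i.e. it is non-empty and contains an executable bin/java.
looks_like_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Path taken from this thread; it is an assumption for any other machine.
candidate=/usr/lib/jvm/jdk-11-oracle-x64

if looks_like_java_home "$candidate"; then
    export JAVA_HOME="$candidate"
    echo "JAVA_HOME=$JAVA_HOME"
else
    echo "no executable bin/java under $candidate" >&2
fi
```

The export only helps if it is run in the same shell session that later launches the BE start script, because a child process inherits the environment of the shell that starts it.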

Tried that as well… :(

In fact, it seems to me that the actual binary (starrocks_be) is not receiving this variable at all, and I don’t know why.
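One way to test that suspicion on Linux is to read the environment the running process was started with, which the kernel exposes as a NUL-separated file at /proc/<pid>/environ. The sketch below is only an illustration: the function name env_var_from_environ is hypothetical, and it assumes you can read that file (same user as the BE, or root).

```shell
# Hypothetical helper: print the value of a variable from a NUL-separated
# environ file such as /proc/<pid>/environ.
#   $1: path to the environ file
#   $2: variable name
# Returns non-zero if the variable is absent or has an empty value.
env_var_from_environ() {
    tr '\0' '\n' < "$1" | sed -n "s/^$2=//p" | grep .
}

# Example usage against a running BE (left commented out so the snippet
# is harmless on a machine without StarRocks):
# be_pid=$(pgrep -o -x starrocks_be)
# env_var_from_environ "/proc/${be_pid}/environ" JAVA_HOME \
#     || echo "JAVA_HOME is not set in the BE process environment"
```

Note that this file reflects the environment at the moment the process was started, so a JAVA_HOME exported afterwards in some other shell will not show up there.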

One doubt that has just hit me as I write this: is it OK to run the BE node as a non-privileged user, or am I forced to run it as root? The documentation doesn’t say anything specific.

Yes, it’s okay to run it as a non-root user.

So I upgraded to 3.2.7 and the problem sorted itself out.

I don’t know whether other people will face this, but it seems that BROKER LOAD is somewhat broken with 3.2.6.

Regards!