Tested on versions 3.3.9 and 3.4.1, in a setup with a single FE and a single BE in shared-nothing mode, with about 160 GB of database data, zstd-compressed.
This setup runs on x86 with no particular issue. After moving to an ARM machine, I observe that the BE crashes regularly with this stack trace:
W20250406 17:07:33.542100 281456209117312 stack_util.cpp:347] 2025-04-06 17:07:33.227865, query_id=00000000-0000-0000-0000-000000000000, fragment_instance_id=00000000-0000-0000-0000-000000000000 throws exception: std::bad_alloc, trace:
@ 0x6549740 __wrap___cxa_throw
@ 0xd7d7298 operator new(unsigned long)
@ 0x6d5ae2c starrocks::FixedMutableIndex<20ul>::load_snapshot(phmap::BinaryInputArchive&)
@ 0x6d17698 starrocks::ShardByLengthMutableIndex::load_snapshot(phmap::BinaryInputArchive&, std::set<unsigned int, std::less<unsigned int>, std::allocator<unsigned int> > const&)
@ 0x6d2f948 starrocks::ShardByLengthMutableIndex::load(starrocks::MutableIndexMetaPB const&)
@ 0x6d3a4c4 starrocks::PersistentIndex::_load(starrocks::PersistentIndexMetaPB const&, bool)
@ 0x6d3b06c starrocks::PersistentIndex::load(starrocks::PersistentIndexMetaPB const&)
@ 0x6d42acc starrocks::PersistentIndex::_load_by_loader(starrocks::TabletLoader*)
@ 0x6d441f0 starrocks::PersistentIndex::load_from_tablet(starrocks::Tablet*)
@ 0x6867f2c starrocks::PrimaryIndex::_do_load(starrocks::Tablet*)
@ 0x6868bd8 starrocks::PrimaryIndex::load(starrocks::Tablet*)
@ 0x69373a8 starrocks::UpdateManager::on_rowset_finished(starrocks::Tablet*, starrocks::Rowset*)
@ 0x71ca5b8 starrocks::DeltaWriter::commit()
@ 0x8136388 starrocks::AsyncDeltaWriter::_execute(void*, bthread::TaskIterator<starrocks::AsyncDeltaWriter::Task>&)
@ 0x989e6ac bthread::ExecutionQueueBase::_execute(bthread::TaskNode*, bool, int*)
@ 0x989f638 bthread::ExecutionQueueBase::_execute_tasks(void*)
@ 0x83236a4 starrocks::ThreadPool::dispatch_thread()
@ 0x831b69c starrocks::Thread::supervise_thread(void*)
@ 0xffff9455d5c8 (/usr/lib/aarch64-linux-gnu/libc.so.6+0x7d5c7)
@ 0xffff945c5edc (/usr/lib/aarch64-linux-gnu/libc.so.6+0xe5edb)
Before it crashes, I can access all the data, and there is no visible corruption. But after about 10 minutes, it crashes like this.
One thing to note: the BE's data volume was copied as-is from the x86 machine to the ARM machine. Is the data architecture-specific? Should I rebuild the data from scratch instead?