How to connect to Hive metastore using external catalog

I am trying to connect to hive tables from StraRocks using external catalog.

StartRocks version
version info
Version: 3.1.5
Git: 5d8438a
Build Info: StarRocks@localhost
Build Time: 2023-11-24 03:58:58

Note: our hadoop infrastructure using kerberos authentication and we are using a service ID to connect to the database/tables in Hive.
We are successfully generated kerberos ticket on bottn fe and be nodes.

Steps performed: Ran below queries from MySQL client after connecting the DB using admin ID

CREATE EXTERNAL CATALOG OPRHQA PROPERTIES
( “type”=“hive”,
“hive.metastore.uris”=“thrift://XXXX:9083,thrift://$$$$$$:9083” );

post this i am unable to create any database under this catalog (OPRHQA).

But able create databases on “default_catalog”. So tried using “default_catalog” instead of “OPRHQA”.
executed below command before creating the external tables.

SET CATALOG default_catalog;

USE default_catalog.starrocks_test;

After this we trued to create external tables by executing below query, but it fails.

CREATE EXTERNAL TABLE hive_external_table (
ctry_cd int,
ctry_nm string,
sregn_cd string,
sregn_nm string,
regn_cd string,
regn_nm string,
ctry_cd_alpha_cd string,
zone_cd int,
actv_ind string,
insrt_ts string,
upd_ts string
)
ENGINE=HIVE
PROPERTIES (
“resource” = “hive”,
“database” = “opcode”,
“table” = “tedc_ctry”,
“hive.metastore.uris” = “thrift://XXXXXX:10000”
);

Error:-

[Code: 1064, SQL State: 42000] Unexpected exception: HIVE resource [hive] not exists

Could you please guide me what i am doing wrong ?

Thanks !
Vineeth

Hive catalog | StarRocks You can refer this link. The catalog framework is default now.

Thank you for the reply Allen. I tried what you suggested but getting below message

“Can not find hive <db_name>.<table_name> from the resource hive0”
am I missing some configuration ?

Thanks !
Vineeth

Did few changes in configuration like changing the port number for metastore uri. Now i am seeing the below error

2024-04-10 11:53:08,883 ERROR (starrocks-mysql-nio-pool-0|150) [HiveMetadata.getTable():115] Failed to get hive table [resource_mapping_inside_catalog_hive_hive_zk.opcode.tedc_ctry]
com.starrocks.connector.exception.StarRocksConnectorException: Failed to get table [opcode.tedc_ctry], msg: Could not read password file: /etc/security/cert/ldaptruststorepwd.txt

I am trying to connect to a Kerberos enabled Hive cluster

There is not file in my system like “/etc/security/cert/ldaptruststorepwd.txt”

Am i missing some configuration ??

Below are the entries in the fe.conf file
JAVA_OPTS = “-Djava.security.krb5.conf:/etc/krb5.conf -Dlog4j2.formatMsgNoLookups=true -Xmx8192m -XX:+UseMembar -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=7 -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:-CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:SoftRefLRUPolicyMSPerMB=0 -Xloggc:${LOG_DIR}/fe.gc.log.$DATE -XX:+PrintConcurrentLocks”

JAVA_OPTS=“-Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/usr/mware/mstr/StarRocks-3.1.5/fe/conf/jaas.conf $JAVA_OPTS”

#JAVA_OPTS = “-Djava.security.krb5.conf=/etc/krb5.conf”
hive_meta_store_uris = “thrift://XXXXX:10000,thrift://XXXXX:10000,thrift://XXXX:9083,thrift:/.XXXX:9083,thrift://XXXXX:9083,thrift://XXXXX:9083,thrift://XXXX:9083”

Please advise.

Thanks !
Vineeth

@allen could you please help me with above query ?

Thanks !
Vineeth

The hive catalog doesn’t support olap yet.

Yes but I am not using LDAP, my metastore is enabled with Kerberos and I am using Kerberos related config, you can refer my above posts

Hello

first, don’t use external table any more. if you can create catalog, use catalog directly.

when access hadoop ecosystem, it will load xml files like hive-site.xml/core-site.xml/hdfs-site.xml from system. So make sure those files are conrrectly configured.

if you hit " /etc/security/cert/ldaptruststorepwd.txt" not found problem, that means " /etc/security/cert/ldaptruststorepwd.txt" must be speicifed in those xml files. Find it and check if it’s correctly configured.

1 Like

@yan_zhang thank you for your input. I am able to connect to Hive metastore after commenting the below entries from the core-site.xml file

hadoop.security.group.mapping.ldap.ssl.truststore /etc/security/cert/ldap_truststore.jks
<property>
  <name>hadoop.security.group.mapping.ldap.ssl.truststore.password.file</name>
  <value>/etc/security/cert/ldaptruststorepwd.txt</value>
</property>

Thanks !
Vineeth

Now the connectivity issue is resolved, but not able to query any tables, getting error like hdfsOpenFile failed. Simple select statement is also getting failed.

[Code: 1064, SQL State: 42000] hdfsOpenFile failed, file=hdfs://XXXXX/data/publish/OPCODE/TEDC_CTRY_hst/2024/03/21/1711034937919_0001. err_msg: error=Error(255): Unknown error 255, root_cause=NullPointerException:

Thanks !
Vineeth

could you check what’s the output of be.out?

do you update core-site.xml in be/conf? and after you update that, probably have to restart be too.

@yan_zhang, good catch I missed updating the core-site.xml filing be node like I did in fe node. Post update and restart I am able to query the tables.

Thanks !
Vineeth