Data Load from S3 - Error message: failed to get file schema

The first query below fails in StarRocks, while the second one works fine. The only difference between the two queries is the path: in the first query the path contains /year=2023/month=5/day=1/hour=0/.

Could someone please help resolve this issue?

StarRocks version: Latest-3.3

-- Query 1 (fails): the path contains Hive-style key=value partition directories
SELECT * FROM FILES
(
"path" = "s3://bucket-abc01-data/somepath/data/table/year=2023/month=5/day=1/hour=0/000000_0",
"format" = "parquet",
"aws.s3.access_key" = "AAAAAAAAAAAAAAAAA",
"aws.s3.secret_key" = "BBBBBBBBBBBBBBBBBBBBBBB",
"aws.s3.region" = "us-west-1",
"aws.s3.endpoint" = "https://some_endpoint.com:9021",
"aws.s3.enable_path_style_access" = "true",
"aws.s3.enable_ssl" = "true"
)
LIMIT 3;

-- Query 2 (works): the same file, at a path with no "=" characters
SELECT * FROM FILES
(
"path" = "s3://bucket-abc01-data/somepath/data/table/000000_0",
"format" = "parquet",
"aws.s3.access_key" = "AAAAAAAAAAAAAAAAA",
"aws.s3.secret_key" = "BBBBBBBBBBBBBBBBBBBBBBB",
"aws.s3.region" = "us-west-1",
"aws.s3.endpoint" = "https://some_endpoint.com:9021",
"aws.s3.enable_path_style_access" = "true",
"aws.s3.enable_ssl" = "true"
)
LIMIT 3;

ERROR:
SQL Error [1064] [42000]: Access storage error. Error message: failed to get file schema, path: s3://bucket-abc01-data/somepath/data/table/year=2023/month=5/day=1/hour=0/000000_0, error: [Init parquet reader fail. IOError: BE access S3 file failed, SdkResponseCode=403, SdkErrorType=15, SdkErrorMessage=No response body., filename: s3://bucket-abc01-data/somepath/data/table/year=2023/month=5/day=1/hour=0/000000_0]

Would you mind sharing the Parquet file with us? I can try to reproduce the issue.

Thank you, Allen, for the reply.
You can create any sample Parquet file. The point is that when I copied the same Parquet file to another location whose path does not contain any "=" character, the query started working.

For example, suppose the path is /table_name/year=2024/month=9/day=1/file.snappy.parquet

The path above does not work and throws the error mentioned earlier, but the path below works fine:

/table_name/2024/9/1/file.snappy.parquet

As a workaround, we have copied the data to a path with the structure above for now so that we can carry on with the same tests. I believe this is a bug, though, and it needs to be fixed.
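
For anyone who needs the same workaround, here is a minimal sketch of how the target paths could be derived when copying the files. It only rewrites the path string; the copy itself is left to whatever tool you use. The function name strip_partition_keys is hypothetical, and the sketch assumes every directory in key=value form is a partition directory. The file name itself is left unchanged.

```python
import re

# Matches one Hive-style partition directory such as "year=2024"
# and captures the value part ("2024").
_PARTITION_DIR = re.compile(r"^[^=/]+=([^/]*)$")


def strip_partition_keys(path: str) -> str:
    """Rewrite key=value directories to just their values.

    Hypothetical helper: only directory segments are rewritten,
    the last segment (the file name) is kept as-is.
    """
    segments = path.split("/")
    directories, file_name = segments[:-1], segments[-1]
    rewritten = []
    for directory in directories:
        match = _PARTITION_DIR.match(directory)
        rewritten.append(match.group(1) if match else directory)
    return "/".join(rewritten + [file_name])


print(strip_partition_keys(
    "/table_name/year=2024/month=9/day=1/file.snappy.parquet"
))
# prints /table_name/2024/9/1/file.snappy.parquet
```

Note that this drops the partition column names from the path, so anything that relies on the key=value layout will no longer see them.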

Best,
Rahul Vishwakarma