opened 11:08AM - 22 Jan 24 UTC
type/feature-request

> Refer to roadmap [2023](https://github.com/StarRocks/starrocks/issues/16445) … [2022](https://github.com/StarRocks/starrocks/issues/1244)

# Shared-data & StarOS
- Align all functionality with shared-nothing
    - [ ] Sync materialized view
    - [ ] Generated column
    - [ ] Partial update with column mode
    - [ ] Optimize table and manual compaction
- Better cache system 
    - [ ] Multi-layer cache
    - [ ] Global cache
    - [ ] Automatic cache warmup
    - [ ] Cache blacklist/whitelist
    - [ ] Refine eviction algorithm
- StarOS internal optimization
    - [ ] Multiple replicas for shard management
    - [ ] Shard scheduling optimization for large scale (more than 10M shards)
    - [ ] Local storage for StarOS
- [ ] Open API for StarRocks table format (sink and source)
- [ ] Time Travel
- [ ] Backup support 
   
# Performance 
- [ ] Full columnar JSON index
- [ ] Cost model with primary key and foreign key constraints
- [ ] ARM optimization for codecs
- [ ] Adaptive DOP and adaptive query engine
- [ ] Global dictionary encoding
- [ ] Enhance I/O scheduling framework
- [ ] JIT / Codegen
# Easy to use
- [ ] List partition optimization
- Improve `files` table function 
    - [ ] Improve schema inference
    - [ ] CSV and JSON format support
    - [ ] Other formats: Avro, Arrow, Protobuf
    - [ ] Better read performance, predicate pushdown
- Insert statement improvement (on duplicate key, insert properties)
- Unified data ingestion with Pipe
    - [ ] Pipe for continuous ingestion from Kafka
    - [ ] Read from external stream table (Kafka)
    - [ ] Continuous data ingestion from SQS with Pipe
- [ ] Out-of-the-box parameters
# Data lake analytics
- Better lake format support

| Lake | Query | Insert | DDL | Update/Delete/Merge into | MV |
| --- | --- | --- | --- | --- | --- |
| Hive | 1.18 | 3.2 |  |  | 2.5 |
| Iceberg | 2.1 | 3.1 | 3.3 | 3.3 | 3.0 |
| Hudi | 2.2 |  |  |  | 3.0 |
| Paimon | 3.0 |  |  |  | 3.2 |
| Delta Lake | 3.0 |  |  |  | 3.2 |

- Materialized view improvement 
  - [ ] Improve partition mapping (list partition, expression partition)
  - [ ] Task scheduler DAG & Lineage
  - [ ] Better query rewrite 
- [ ] JDBC catalog improvement
- [ ] Enhance JNI reader and implement JNI writer
- [ ] Text file format support
- [ ] Presto/Trino/Spark/Hive SQL compatibility 
- [ ] Presto/Trino/Spark/Hive UDF compatibility
- [ ] Automatic cooldown to lake format
- [ ] Lake metadata optimization for Iceberg / Hudi
# Data warehousing (batch and streaming)
## Batch processing & ETL improvement 
- [ ] Enable spilling by default globally
- [ ] Multi-statement transaction
- [ ] Temporary table
- [ ] Group execution
- [ ] Task auto retry
## Stream processing & real-time update
- [ ] Schemaless partial update
- [ ] Merge into statement
- [ ] Binlog to Flink and Spark Streaming
- [ ] Transaction-level incremental refresh in materialized views (aggregation, join, functions)
- [ ] Incremental refresh for Iceberg/Hudi/Paimon materialized views
## Metadata 
- [ ] Fine-grained FE lock (from database level to table level)
- [ ] Decoupled storage for FE (KV store)
# All-in-one scenarios
- [ ] Search: Optimize full text inverted index
- [ ] Row store: Optimize row store for high-concurrency point lookups
- [ ] Time series database: ASOF join, high-concurrency ingestion
- [ ] Vector database: vector index
# Release 
- 3.3 release plan
- 3.4 release plan