implement the data branch #22636
User description
What type of PR is this?
Which issue(s) this PR fixes:
issue #21979
What this PR does / why we need it:
Implement the data branch.
PR Type
Enhancement, Feature, Tests
Description
Implements comprehensive data branch functionality for snapshot-based table versioning and management
Adds core data branch operations: diff detection between snapshots, merge with conflict resolution (FAIL, SKIP, ACCEPT), and branch hierarchy management
Implements adaptive hashmap with disk spillover support for efficient change tracking and storage
Adds branch metadata system table (mo_branch_metadata) to track clone operations and branch relationships
Introduces DAG (Directed Acyclic Graph) structure with LCA (Lowest Common Ancestor) algorithm for branch hierarchy
Extends SQL grammar and parser to support DATA BRANCH statements (CREATE TABLE, CREATE DATABASE, DELETE TABLE, DELETE DATABASE, DIFF, MERGE)
Refactors clone operations to track branch metadata and support data branch workflows
Adds comprehensive test coverage for diff, merge, and conflict handling scenarios with various data types and dataset sizes
Updates privilege logic to support data branch operations with appropriate access controls
Standardizes keyword capitalization throughout codebase (Exists vs exists) for consistency
Adds support for commit timestamp column tracking in distributed TAE engine
Extends type system with generic value comparison function and decimal/timestamp support in result sets
File Walkthrough
7 files
data_branch.go
Implement data branch diff and merge operations
pkg/frontend/data_branch.go
merge operations between table snapshots
constructing change handles, and managing branch metadata
ACCEPT)
branch DAG construction
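The three conflict options named in this PR (FAIL, SKIP, ACCEPT) can be illustrated with a small sketch. This is not the code in data_branch.go: `mergeRows`, `conflictPolicy`, and the map-based row model are hypothetical names, and the semantics are an assumption (FAIL aborts the merge, SKIP keeps the destination row, ACCEPT takes the source row). A conflict here is simply "same primary key, different value"; the real implementation may decide conflicts relative to the branches' common ancestor.

```go
package main

import "fmt"

// conflictPolicy is a hypothetical enum mirroring the FAIL / SKIP / ACCEPT
// options mentioned in the PR description.
type conflictPolicy int

const (
	conflictFail   conflictPolicy = iota // assumed: abort on the first conflict
	conflictSkip                         // assumed: keep the destination row
	conflictAccept                       // assumed: take the source row
)

// mergeRows merges src into a copy of dst, keyed by primary key.
// A conflict is a key present on both sides with different values.
func mergeRows(dst, src map[int]string, policy conflictPolicy) (map[int]string, error) {
	out := make(map[int]string, len(dst)+len(src))
	for k, v := range dst {
		out[k] = v
	}
	for k, v := range src {
		cur, exists := out[k]
		if !exists || cur == v {
			out[k] = v // new row, or identical on both sides: no conflict
			continue
		}
		switch policy {
		case conflictFail:
			return nil, fmt.Errorf("conflict on primary key %d", k)
		case conflictSkip:
			// leave the destination value in place
		case conflictAccept:
			out[k] = v
		}
	}
	return out, nil
}
```

With dst = {1: "a", 2: "b"} and src = {2: "B", 3: "c"}, the sketch returns an error under conflictFail, keeps "b" for key 2 under conflictSkip, and writes "B" under conflictAccept; key 3 is added in the latter two cases.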
branch_hashmap.go
Implement adaptive hashmap with disk spillover support
pkg/frontend/databranchutils/branch_hashmap.go
memory allocation fails
records with key-value semantics
management
resource cleanup
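To make the "spill to disk when memory runs out" idea concrete, here is a minimal sketch under stated assumptions. It is not the BranchHashmap from this PR: `spillMap`, its byte budget, and the single temp-file layout are all hypothetical, and a fixed budget stands in for the allocator failure that triggers spilling in the real code. Values that fit the budget stay in memory; the rest are appended to a temp file and located through an in-memory offset index.

```go
package main

import "os"

// entryLoc records where a spilled value lives in the temp file.
type entryLoc struct {
	off int64
	n   int
}

// spillMap is a hypothetical key-value store with a fixed in-memory budget.
type spillMap struct {
	limit   int // max bytes of values held in memory (assumed budget)
	used    int
	mem     map[string][]byte
	onDisk  map[string]entryLoc
	file    *os.File
	fileEnd int64
}

func newSpillMap(limit int) *spillMap {
	return &spillMap{limit: limit, mem: map[string][]byte{}, onDisk: map[string]entryLoc{}}
}

// put stores val under key, spilling to disk once the budget is exhausted.
func (m *spillMap) put(key string, val []byte) error {
	if old, ok := m.mem[key]; ok { // drop any previous in-memory copy
		m.used -= len(old)
		delete(m.mem, key)
	}
	delete(m.onDisk, key) // old disk bytes become garbage; the sketch never compacts
	if m.used+len(val) <= m.limit {
		m.mem[key] = append([]byte(nil), val...)
		m.used += len(val)
		return nil
	}
	if m.file == nil {
		f, err := os.CreateTemp("", "spillmap-*")
		if err != nil {
			return err
		}
		m.file = f
	}
	if _, err := m.file.WriteAt(val, m.fileEnd); err != nil {
		return err
	}
	m.onDisk[key] = entryLoc{off: m.fileEnd, n: len(val)}
	m.fileEnd += int64(len(val))
	return nil
}

// get looks in memory first, then reads the value back from the temp file.
func (m *spillMap) get(key string) ([]byte, bool, error) {
	if v, ok := m.mem[key]; ok {
		return v, true, nil
	}
	loc, ok := m.onDisk[key]
	if !ok {
		return nil, false, nil
	}
	buf := make([]byte, loc.n)
	if _, err := m.file.ReadAt(buf, loc.off); err != nil {
		return nil, false, err
	}
	return buf, true, nil
}

// close removes the temp file, if one was created.
func (m *spillMap) close() error {
	if m.file == nil {
		return nil
	}
	name := m.file.Name()
	if err := m.file.Close(); err != nil {
		return err
	}
	return os.Remove(name)
}
```

Keeping only offsets in memory bounds the resident size by the number of keys rather than the size of the values, which is the property a change-tracking map over large diffs needs.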
data_branch.go
Add data branch statement type definitions and implementations
pkg/sql/parsers/tree/data_branch.go
DataBranchCreateTable, DataBranchDeleteTable, DataBranchCreateDatabase, DataBranchDeleteDatabase
DataBranchDiff and DataBranchMerge statement types for branch operations
using reuse pattern
branch_dag.go
Implement data branch DAG with LCA algorithm
pkg/frontend/databranchutils/branch_dag.go
DataBranchDAG for managing branch hierarchy relationships
parent branches
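A lowest-common-ancestor lookup over clone relationships can be sketched as follows. This is an illustration, not the DataBranchDAG type from this PR: `branchTree`, `addClone`, and `lca` are hypothetical names, table IDs are plain integers, and the sketch assumes every table has at most one parent (the table it was cloned from) and that the relationships contain no cycles.

```go
package main

// branchTree is a hypothetical stand-in for a branch hierarchy: each table ID
// maps to the table it was cloned from. Root tables have no entry.
type branchTree struct {
	parent map[uint64]uint64
}

func newBranchTree() *branchTree {
	return &branchTree{parent: make(map[uint64]uint64)}
}

// addClone records that child was cloned from parent.
func (t *branchTree) addClone(child, parent uint64) {
	t.parent[child] = parent
}

// depth counts the edges from id up to the root of its tree.
func (t *branchTree) depth(id uint64) int {
	d := 0
	for {
		p, ok := t.parent[id]
		if !ok {
			return d
		}
		id = p
		d++
	}
}

// lca returns the lowest common ancestor of a and b. The second result is
// false when the two tables belong to unrelated trees.
func (t *branchTree) lca(a, b uint64) (uint64, bool) {
	da, db := t.depth(a), t.depth(b)
	// Lift the deeper node until both sit at the same depth.
	for da > db {
		a = t.parent[a]
		da--
	}
	for db > da {
		b = t.parent[b]
		db--
	}
	// Walk both nodes upward in lockstep until they meet.
	for a != b {
		pa, okA := t.parent[a]
		pb, okB := t.parent[b]
		if !okA || !okB {
			return 0, false // reached two different roots
		}
		a, b = pa, pb
	}
	return a, true
}
```

For a hierarchy where tables 2 and 3 are cloned from 1, 4 from 2, and 5 from 4, `lca(5, 3)` yields 1 and `lca(5, 4)` yields 4, since a table counts as its own ancestor. A diff between two branches can then be taken relative to that common ancestor.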
branch_change_handle.go
Add branch change handle for data collection and filtering
pkg/frontend/databranchutils/branch_change_handle.go
BranchChangeHandle wrapper for engine changes handling
CollectChanges function for gathering changes between timestamps
self_handle.go
Add data branch statement execution handler
pkg/frontend/self_handle.go
CreateTable, DeleteTable, DeleteDatabase, CreateDatabase)
handleDataBranch function call with proper frontend print tracking
keywords.go
Add data branch and conflict handling keywords
pkg/sql/parsers/dialect/mysql/keywords.go
branch, diff, conflict
fail, skip, accept
15 files
pitr.go
Standardize capitalization of "Exists" in PITR operations
pkg/frontend/pitr.go
Exists instead of lowercase exists
doCreatePitr, doDropPitr, doAlterPitr, doRestorePitr
(Point-In-Time Recovery) operations
publication_subscription_test.go
Standardize capitalization in publication subscription tests
pkg/frontend/publication_subscription_test.go
Exists instead of lowercase exists
Test_doAlterPublication, Test_doAlterPublication2, and Test_doDropPublication
snapshot_restore_with_ts.go
Standardize capitalization of "Exists" in snapshot restore operations
pkg/frontend/snapshot_restore_with_ts.go
Exists instead of lowercase exists
restoreToAccountFromTS, recreateTableFromTS, and restoreViewsFromTS
snapshot.go
Standardize capitalization in snapshot rebuild logging
pkg/vm/engine/tae/logtail/snapshot.go
RebuildAObjectDel function to use lowercase exists instead of capitalized Exists
snapshot.go
Standardize keyword capitalization in snapshot operations
pkg/frontend/snapshot.go
Exists keyword
Exists instead of exists
authenticate_test.go
Update test descriptions with standardized keyword capitalization
pkg/frontend/authenticate_test.go
Exists keyword
session.go
Standardize keyword capitalization in session authentication
pkg/frontend/session.go
Exists keyword
steps
query_result.go
Standardize keyword capitalization in query result handling
pkg/frontend/query_result.go
Exists keyword for consistency
system_initialize.go
Standardize keyword capitalization in system initialization
pkg/frontend/system_initialize.go
Exists keyword for consistency
mysql_protocol_predefines.go
Standardize keyword capitalization in protocol definitions
pkg/frontend/mysql_protocol_predefines.go
Exists keyword for consistency
publication_subscription.go
Standardize keyword capitalization in publication subscription
pkg/frontend/publication_subscription.go
Exists keyword for consistency
cdc_options.go
Standardize keyword capitalization in CDC options
pkg/frontend/cdc_options.go
Exists keyword for consistency
txn.go
Fix comment capitalization in TxnHandler
pkg/frontend/txn.go
always"
util.go
Fix comment capitalization in utility functions
pkg/frontend/util.go
Exists" and "true/false - exists" to "true/false - Exists"
pitr_test.go
Fix test case name capitalization
pkg/frontend/pitr_test.go
"SubscriptionOption Exists"
14 files
authenticate.go
Add branch metadata table support and update privilege logic
pkg/frontend/authenticate.go
catalog.MO_BRANCH_METADATA to system account tables and privilege maps
Exists keyword for consistency
MoCatalogBranchMetadataDDL to catalog initialization
(CreateTable, DeleteTable, Merge, Diff)
PrivilegeTypeTableAll instead of PrivilegeTypeInsert
clone.go
Implement data branch metadata tracking for clone operations
pkg/frontend/clone.go
cloneReceipt struct
getOpAndToAccountId to use ToAccountOpt instead of string parameter
resolveSnapshot helper function for timestamp resolution
updateBranchMetaTable to record clone operations in branch metadata table
database context
encoding.go
Add generic value comparison function for all types
pkg/container/types/encoding.go
CompareValues function to compare values of different types
and special types
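A generic comparison of dynamically typed values might look like the sketch below. It is not the CompareValues function added in encoding.go, whose signature is not shown on this page: `compareValues` and `compareOrdered` are hypothetical names, only four Go types are handled, and both operands are assumed to have the same dynamic type.

```go
package main

import (
	"bytes"
	"fmt"
)

// compareOrdered returns -1, 0, or +1 for the ordered types listed in its
// constraint. NaN floats compare as equal here, which a real implementation
// would need to treat explicitly.
func compareOrdered[T int64 | float64 | string](a, b T) int {
	switch {
	case a < b:
		return -1
	case a > b:
		return 1
	default:
		return 0
	}
}

// compareValues is a hypothetical dispatcher: it compares a and b when both
// hold the same supported type and reports an error otherwise.
func compareValues(a, b any) (int, error) {
	switch x := a.(type) {
	case int64:
		if y, ok := b.(int64); ok {
			return compareOrdered(x, y), nil
		}
	case float64:
		if y, ok := b.(float64); ok {
			return compareOrdered(x, y), nil
		}
	case string:
		if y, ok := b.(string); ok {
			return compareOrdered(x, y), nil
		}
	case []byte:
		if y, ok := b.([]byte); ok {
			return bytes.Compare(x, y), nil
		}
	}
	return 0, fmt.Errorf("cannot compare %T with %T", a, b)
}
```

A diff over rows with mixed column types needs exactly this kind of single entry point, so that primary keys of any supported type can be ordered and matched with one call.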
build_alter_table.go
Improve variable naming for fake primary key tracking
pkg/sql/plan/build_alter_table.go
hasFakePK to copyFakePKCol for better clarity
during alter table copy operations
local_disttae_datasource.go
Add commit timestamp column support to disttae datasource
pkg/vm/engine/disttae/local_disttae_datasource.go
DefaultCommitTS_Attr column in filtered in-memory committed inserts
when present
ddl.go
Mark deleted tables in branch metadata during drop operations
pkg/sql/compile/ddl.go
table_deleted flag in mo_branch_metadata table
reader.go
Add commit timestamp attribute support to reader utility
pkg/vm/engine/readutil/reader.go
DefaultCommitTS_Attr column in column update logic
type
clone.go
Refactor clone statement to use ToAccountOpt structure
pkg/sql/parsers/tree/clone.go
ToAccountOpt struct to encapsulate account option information
CloneTable and CloneDatabase to use ToAccountOpt instead of direct identifier
operations
stmt_kind.go
Enable data branch statements in uncommitted transactions
pkg/frontend/stmt_kind.go
statementCanBeExecutedInUncommittedTransaction function
DataBranchDeleteTable, and DataBranchDeleteDatabase to execute in
uncommitted transactions
types.go
Add data branch frontend print type and standardize keywords
pkg/frontend/types.go
FPDataBranch constant to frontend print types
Exists keyword
resultset.go
Add decimal and timestamp type support to result set
pkg/frontend/resultset.go
types in GetString method
type code
func_mo.go
Register MO_BRANCH_METADATA in function catalog
pkg/sql/plan/function/func_mo.go
catalog.MO_BRANCH_METADATA constant to the function registry with value 0
types.go
Define MO_BRANCH_METADATA system table constant
pkg/catalog/types.go
MO_BRANCH_METADATA = "mo_branch_metadata" to system table definitions
mysql_sql.y
Implement data branch SQL grammar and parsing
pkg/sql/parsers/dialect/mysql/mysql_sql.y
toAccountOpt, conflictOpt, and diffAsOpt for branch operations
BRANCH, LOG, REVERT, REBASE, DIFF, CONFLICT, CONFLICT_FAIL, CONFLICT_SKIP, CONFLICT_ACCEPT
branch_stmt grammar rule supporting DATA BRANCH operations (CREATE TABLE, CREATE DATABASE, DELETE TABLE, DELETE DATABASE, DIFF, MERGE)
to_account_opt, conflict_opt, diff_as_opt for optional parameters
create_table_stmt to use to_account_opt instead of hardcoded TO ACCOUNT syntax
DATA and BRANCH to algorithm_type alternatives
23 files
branch_hashmap_test.go
Add comprehensive tests for branch hashmap implementation
pkg/frontend/databranchutils/branch_hashmap_test.go
BranchHashmap functionality
disk, and partial deletion
allocator for testing spill behavior
branch_dag_test.go
Add tests for data branch DAG and LCA algorithm
pkg/frontend/databranchutils/branch_dag_test.go
supporting branch hierarchy
algorithm
diff_3.result
Add data branch diff test results for mixed types
test/distributed/cases/snapshot/branch/diff/diff_3.result
mixed data types
types, string primary keys with decimal/binary columns, and temporal
primary keys with float/text updates
structures
diff_3.sql
Add data branch diff test cases for mixed types
test/distributed/cases/snapshot/branch/diff/diff_3.sql
types
binary data, and sensor events with temporal keys
merge_1.result
Add data branch merge test results
test/distributed/cases/snapshot/branch/merge/merge_1.result
with updates, and merges with cloned tables
diff_1.result
Add data branch diff test results with snapshots
test/distributed/cases/snapshot/branch/diff/diff_1.result
same/different branch times, and LCA relationships
configurations
diff_2.result
Add data branch diff test results for large datasets
test/distributed/cases/snapshot/branch/diff/diff_2.result
and complex primary keys
key handling
diff_1.sql
Add data branch diff test cases with snapshots
test/distributed/cases/snapshot/branch/diff/diff_1.sql
Ancestor) scenarios
LCA relationships
patterns
merge_1.sql
Add data branch merge test cases
test/distributed/cases/snapshot/branch/merge/merge_1.sql
tables
diff_2.sql
Add data branch diff test cases for large datasets
test/distributed/cases/snapshot/branch/diff/diff_2.sql
scenarios
composite primary keys
merge_2.result
Add data branch merge conflict handling test results
test/distributed/cases/snapshot/branch/merge/merge_2.result
CONFLICT_ACCEPT options
merge_2.sql
Add data branch merge conflict handling test cases
test/distributed/cases/snapshot/branch/merge/merge_2.sql
(skip, accept)
conflicting changes
tenant.result
Update tenant test results for branch metadata table
test/distributed/cases/tenant/tenant.result
relname not like '__mo_index_unique__%' to relname not like '__mo_index%'
mo_branch_metadata in system table listings
starlark.result
Update starlark procedure test results with escaped keywords
test/distributed/cases/procedure/starlark.result
keywords: mo and mo.foo
sp_table.result
Update sp_table test results for branch metadata
test/distributed/cases/dml/select/sp_table.result
relname not like '__mo_index_unique__%' to relname not like '__mo_index%'
mo_branch_metadata to expected system table results
sp_table.sql
Update sp_table query filter for index matching
test/distributed/cases/dml/select/sp_table.sql
relname not like '__mo_index_unique__%' to relname not like '__mo_index%'
tenant.test
Update tenant test query filter for index matching
test/distributed/cases/tenant/tenant.test
relname not like '__mo_index_unique__%' to relname not like '__mo_index%'
starlark.sql
Update starlark procedure definitions with escaped keywords
test/distributed/cases/procedure/starlark.sql
keywords: mo and mo.foo
restore_cluster_table.result
Update cluster restore test results for branch metadata
test/distributed/cases/snapshot/cluster/restore_cluster_table.result
mo_branch_metadata to expected system table listings in two locations
table
cluster_level_snapshot_restore_cluster.result
Update cluster snapshot restore test results for branch metadata
test/distributed/cases/snapshot/cluster_level_snapshot_restore_cluster.result
mo_branch_metadata to expected system table listings in two locations
table
clone_sys_db_table_to_new_db_table.result
Update clone system table test results for branch metadata
test/distributed/cases/snapshot/clone/clone_sys_db_table_to_new_db_table.result
mo_branch_metadata to expected system table listings in two locations
show.result
Update show tables test results for branch metadata
test/distributed/cases/dml/show/show.result
mo_branch_metadata to expected system table listings
table
create_user_default_role.result
Update privilege test results for branch metadata
test/distributed/cases/tenant/privilege/create_user_default_role.result
mo_branch_metadata to expected system table listings
table
4 files
cluster_upgrade_list.go
Add branch metadata table to v4.0.0 cluster upgrade
pkg/bootstrap/versions/v4_0_0/cluster_upgrade_list.go
mo_branch_metadata table
cluster_upgrade_list.go
Add branch metadata table to v3.0.0 cluster upgrade
pkg/bootstrap/versions/v3_0_0/cluster_upgrade_list.go
mo_branch_metadata table
process
predefined.go
Add branch metadata table DDL definition
pkg/frontend/predefined.go
MoCatalogBranchMetadataDDL constant defining the branch metadata table schema
level, and table_deleted flag
lib.go
Add math library linking flag to CGO
cgo/lib.go
#cgo LDFLAGS: -lm compiler flag for linking the math library
1 files