Block Database #4027
Conversation
Pull Request Overview
This PR introduces BlockDB, a specialized on-disk database optimized for blockchain block storage with improved write performance and automatic recovery.
- Implements dedicated tests for writing, reading, concurrency, and error cases.
- Introduces recovery logic and index management for efficient block lookups, along with detailed documentation in the README.
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| x/blockdb/writeblock_test.go | Adds comprehensive tests verifying block writes, error conditions, and concurrency scenarios. |
| x/blockdb/recovery.go | Introduces recovery logic to reconcile the data and index file contents after crashes. |
| x/blockdb/readblock_test.go | Provides test coverage for reading full blocks, headers, and bodies in various conditions. |
| x/blockdb/index.go | Implements fixed-size index entries and header serialization/deserialization. |
| x/blockdb/database.go | Sets up file handling, header initialization, recovery trigger, and connection closure. |
| x/blockdb/block.go | Implements block header serialization, writing/reading blocks, and ensuring data integrity. |
| x/blockdb/config.go | Defines default and custom configuration options for the BlockDB. |
| x/blockdb/README.md | Documents design, file formats, recovery, and usage of BlockDB. |
Still unreviewed: recovery code and some of the block allocation logic, but there is enough here to get started with some changes.
x/blockdb/README.md
Outdated
```
│ Min Block Height      │ 8 bytes │
│ Max Contiguous Height │ 8 bytes │
│ Data File Size        │ 8 bytes │
│ Reserved              │ 24 bytes│
```
Why do we need a reserved area here?
I wanted to account for the possibility that future versions add features requiring us to store more data in the header. If that happens, we can add it here without needing to reindex.
x/blockdb/block.go
Outdated
```go
	}
}

if s.nextDataWriteOffset.CompareAndSwap(currentOffset, newOffset) {
```
Nice way of doing this! Presumably this is faster in the non-contention case than a mutex?
yeah, this should be more lightweight
x/blockdb/block.go
Outdated
```go
fileIndex := int(currentOffset / maxDataFileSize)
localOffset := currentOffset % maxDataFileSize

if localOffset+totalSize > maxDataFileSize {
	writeOffset = (uint64(fileIndex) + 1) * maxDataFileSize
	newOffset = writeOffset + totalSize
}
```
I think this means that files other than the first one will not contain a header. Is this intentional? If so, it means the first file is always going to be opened and can never be deleted which should be mentioned in the README.
Not every block will contain the header (blockSize includes the metadata header plus the block). We are only splitting the data files here, not the index file; this just calculates the global next write offset when the current data file cannot fit the block.
The variable names could have been better. I updated this method to be clearer about what it's doing.
x/blockdb/database.go
Outdated
```go
}

func (s *Database) getOrOpenDataFile(fileIndex int) (*os.File, error) {
	if handle, ok := s.fileCache.Load(fileIndex); ok {
```
I think we need some limit on the fileCache size, otherwise we could run out of file descriptors if the maxFileSize is pretty small and/or blocks are really big.
Good idea. I can set a 10k limit.
Why this should be merged
This PR introduces BlockDB, a specialized database optimized for block storage.
Avalanche VMs currently store blocks in a key-value database (LevelDB or PebbleDB). This approach is not optimal for block storage: large blocks trigger frequent compactions, causing write amplification that degrades performance as the database grows, and KV databases are designed for random key-value access rather than the sequential patterns typical of blockchain operations.
For how BlockDB works see README.md.
Changes
- Adds `blockdb` to `x/`.
- Adds an `onEvict` callback to the cache. This is used by the blockdb for storing opened file descriptors for the data files.

How this was tested
Unit tests for now.
Todos
- Split data files when `MaxDataFileSize` is reached