Compute the snapshot changes in parallel #1475
Conversation
Force-pushed: 7e9990a → d78e4e5 → eaf894b → ab3f124
Pull Request Overview
This PR refactors the incremental compilation system to compute snapshot changes in parallel by replacing regular map and slice operations with concurrent-safe data structures and introducing parallelization via work groups.
Key changes include:
- Conversion from regular maps to `SyncMap` types for thread-safe concurrent access (a minimal sketch of this pattern follows the list)
- Introduction of work groups to parallelize file processing during snapshot computation
- Migration from `ManyToManySet` to `SyncManyToManySet` for concurrent operations
- Addition of atomic operations for boolean flags like `buildInfoEmitPending`
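The repository's `SyncMap` and work-group types are not shown in this summary, so here is a minimal, self-contained sketch of the general pattern using only the standard library (`sync.Map` for the concurrent map, `sync.WaitGroup` for the work group); the `file` type, `computeHash`, and the paths are illustrative stand-ins rather than the project's actual API.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// file stands in for a parsed source file; the real snapshot code works on
// the compiler's source-file values, which are assumed here.
type file struct {
	path string
	text string
}

func computeHash(text string) string {
	sum := sha256.Sum256([]byte(text))
	return hex.EncodeToString(sum[:])
}

func main() {
	files := []file{{"a.ts", "export const a = 1"}, {"b.ts", "export const b = 2"}}

	// fileInfos plays the role of the snapshot's per-file map; sync.Map is
	// safe for concurrent Store calls from the worker goroutines.
	var fileInfos sync.Map
	var wg sync.WaitGroup

	for _, f := range files {
		wg.Add(1)
		go func(f file) { // each file's version is computed in its own goroutine
			defer wg.Done()
			fileInfos.Store(f.path, computeHash(f.text))
		}(f)
	}
	wg.Wait()

	fileInfos.Range(func(path, version any) bool {
		fmt.Println(path, version)
		return true
	})
}
```

The essential point is that each worker only writes into concurrency-safe containers, so no ordering between files has to be preserved.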
Reviewed Changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.
File | Description |
---|---|
internal/incremental/snapshot.go | Converts snapshot data structures to thread-safe sync types and removes the large newSnapshotForProgram function |
internal/incremental/programtosnapshot.go | New file containing the extracted and parallelized snapshot creation logic with work groups |
internal/incremental/snapshottobuildinfo.go | Updates all map accesses to use sync-safe Load() operations instead of direct indexing |
internal/incremental/program.go | Converts semantic diagnostics storage to use SyncMap and updates all access patterns |
internal/incremental/emitfileshandler.go | Replaces map iterations with Range() calls for thread-safe access |
internal/incremental/buildinfotosnapshot.go | Updates to use Store() operations for populating sync maps |
internal/incremental/buildInfo.go | Minor parameter type change for emit signature handling |
internal/incremental/affectedfileshandler.go | Updates to work with sync collections and adds parallelization |
internal/execute/testsys_test.go | Updates test code to use Load() operations |
internal/collections/syncset.go | Adds new methods (Size, Keys) to support the sync set functionality |
internal/collections/syncmap.go | Adds Keys() method returning an iterator |
internal/collections/syncmanytomanyset.go | New concurrent-safe implementation of many-to-many set |
internal/collections/manytomanyset.go | Removes the non-concurrent version |
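The summary gives no details on the new `syncmanytomanyset.go`, so the following is only a hypothetical illustration of what a concurrency-safe many-to-many relation can look like in Go: a single `RWMutex` guarding a forward and a reverse index so both views stay consistent. The type and method names are invented for the sketch and do not match the repository's implementation.

```go
package collectionsdemo

import "sync"

// SyncManyToManySetSketch is a hypothetical concurrency-safe many-to-many
// relation (key -> set of values, plus the reverse index). It is not the
// implementation in internal/collections/syncmanytomanyset.go.
type SyncManyToManySetSketch[K comparable, V comparable] struct {
	mu      sync.RWMutex
	forward map[K]map[V]struct{}
	reverse map[V]map[K]struct{}
}

func NewSyncManyToManySetSketch[K comparable, V comparable]() *SyncManyToManySetSketch[K, V] {
	return &SyncManyToManySetSketch[K, V]{
		forward: make(map[K]map[V]struct{}),
		reverse: make(map[V]map[K]struct{}),
	}
}

// Add records that key references value, updating both directions under one lock.
func (s *SyncManyToManySetSketch[K, V]) Add(key K, value V) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.forward[key] == nil {
		s.forward[key] = make(map[V]struct{})
	}
	s.forward[key][value] = struct{}{}
	if s.reverse[value] == nil {
		s.reverse[value] = make(map[K]struct{})
	}
	s.reverse[value][key] = struct{}{}
}

// GetKeys returns the keys that reference value, copied under the read lock.
func (s *SyncManyToManySetSketch[K, V]) GetKeys(value V) []K {
	s.mu.RLock()
	defer s.mu.RUnlock()
	keys := make([]K, 0, len(s.reverse[value]))
	for k := range s.reverse[value] {
		keys = append(keys, k)
	}
	return keys
}
```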
```go
func (t *toProgramSnapshot) handlePendingCheck() {
	if t.oldProgram != nil &&
		t.snapshot.semanticDiagnosticsPerFile.Size() != len(t.program.GetSourceFiles()) &&
```
The Size() method on SyncMap iterates through all entries to count them, which is O(n). This comparison is called in handlePendingCheck() and could be expensive for large programs. Consider caching the count or using a different approach to track completion.
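One way to address the O(n) `Size()` concern, sketched here under the assumption that entries are only added (never deleted) during snapshot computation, is to cache the count in an atomic counter next to the map. `CountedSyncMap` and its methods are hypothetical names for the sketch, not the project's `SyncMap` API.

```go
package countingmap

import (
	"sync"
	"sync/atomic"
)

// CountedSyncMap keeps a cached size so a completion check does not have to
// iterate the whole map. It only counts first-time Stores, which matches a
// "one entry per file" use case.
type CountedSyncMap[K comparable, V any] struct {
	m    sync.Map
	size atomic.Int64
}

func (c *CountedSyncMap[K, V]) Store(key K, value V) {
	if _, loaded := c.m.Swap(key, value); !loaded {
		c.size.Add(1) // only increment when a new key is inserted
	}
}

func (c *CountedSyncMap[K, V]) Load(key K) (V, bool) {
	v, ok := c.m.Load(key)
	if !ok {
		var zero V
		return zero, false
	}
	return v.(V), true
}

// Size is O(1) instead of an O(n) Range-based count.
func (c *CountedSyncMap[K, V]) Size() int {
	return int(c.size.Load())
}
```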
```go
	t.snapshot.addFileToChangeSet(file.Path())
} else if newReferences != nil {
	for refPath := range newReferences.Keys() {
		if t.program.GetSourceFileByPath(refPath) == nil {
```
This code is now running in parallel but calls GetSourceFileByPath() which may not be thread-safe. Verify that this method is safe to call concurrently from multiple goroutines, or consider pre-computing this information before the parallel section.
Suggested change:
```diff
- if t.program.GetSourceFileByPath(refPath) == nil {
+ if precomputedSourceFiles[refPath] == nil {
```
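A hedged sketch of the pre-computation idea from the suggestion above: build a read-only path index before any goroutines start, so workers never call back into program methods whose concurrency guarantees are unclear. `sourceFile`, `buildSourceFileIndex`, and `precomputedSourceFiles` are hypothetical names used only for illustration.

```go
package snapshotdemo

// sourceFile is a stand-in for the compiler's source-file type; the real code
// uses program.GetSourceFiles() and file.Path(), which are assumed here.
type sourceFile struct {
	path string
}

// buildSourceFileIndex builds a read-only lookup before the parallel section,
// so workers can answer "is this path part of the program?" with a plain map
// read instead of calling back into the program.
func buildSourceFileIndex(files []*sourceFile) map[string]*sourceFile {
	index := make(map[string]*sourceFile, len(files))
	for _, f := range files {
		index[f.path] = f
	}
	return index
}
```

Inside the workers, `t.program.GetSourceFileByPath(refPath) == nil` would then become a plain lookup such as `index[refPath] == nil`, which is safe to run concurrently because a map that is only read after construction can be shared across goroutines without locking.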
```go
for _, file := range files {
	wg.Queue(func() {
		version := t.snapshot.computeHash(file.Text())
		impliedNodeFormat := t.program.GetSourceFileMetaData(file.Path()).ImpliedNodeFormat
		affectsGlobalScope := fileAffectsGlobalScope(file)
		var signature string
		newReferences := getReferencedFiles(t.program, file)
		if newReferences != nil {
			t.snapshot.referencedMap.Store(file.Path(), newReferences)
		}
```
The getReferencedFiles() function calls GetTypeCheckerForFile() which may involve significant computation and potentially non-thread-safe operations. This is now running in parallel and could cause race conditions or performance issues. Verify thread safety of the type checker operations.
Suggested change:
```go
var mutex sync.Mutex // Mutex to ensure thread safety
for _, file := range files {
	wg.Queue(func() {
		version := t.snapshot.computeHash(file.Text())
		impliedNodeFormat := t.program.GetSourceFileMetaData(file.Path()).ImpliedNodeFormat
		affectsGlobalScope := fileAffectsGlobalScope(file)
		var signature string
		var newReferences *collections.Map
		mutex.Lock() // Protect access to getReferencedFiles and shared state
		newReferences = getReferencedFiles(t.program, file)
		if newReferences != nil {
			t.snapshot.referencedMap.Store(file.Path(), newReferences)
		}
		mutex.Unlock()
```
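As an alternative to locking around the whole loop body, a narrower option is to serialize only the call whose thread safety is in doubt while leaving the hashing and metadata work parallel. This is a sketch under that assumption; `collectFileInfo`, `fileInput`, and the `getReferences` callback are hypothetical stand-ins for the real snapshot code and `getReferencedFiles`.

```go
package snapshotdemo

import "sync"

type fileInput struct {
	path string
	text string
}

// collectFileInfo keeps the cheap work (hashing) fully parallel while
// checkerMu serializes only the possibly non-thread-safe reference lookup.
func collectFileInfo(files []fileInput, hash func(string) string, getReferences func(fileInput) []string) map[string][]string {
	var (
		wg        sync.WaitGroup
		checkerMu sync.Mutex // guards only the reference computation
		resultsMu sync.Mutex // guards the plain results map
		refs      = make(map[string][]string)
	)
	for _, f := range files {
		wg.Add(1)
		go func(f fileInput) {
			defer wg.Done()
			_ = hash(f.text) // parallel-friendly work stays outside the lock

			checkerMu.Lock()
			r := getReferences(f)
			checkerMu.Unlock()

			resultsMu.Lock()
			refs[f.path] = r
			resultsMu.Unlock()
		}(f)
	}
	wg.Wait()
	return refs
}
```

Whether this is worthwhile depends on how much of the per-file cost sits in the type-checker-backed reference computation; if most of it does, the loop degenerates to sequential anyway.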
Seems fine, though at some level I wonder if all of this atomic-ness is worth it or if just a couple of mutexes on a shared structure would be sufficient and less fiddly.
The computation happens in parallel, so we can't just put a mutex around that shared structure without defeating the parallelism; that's why a sync map is used.
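For the `buildInfoEmitPending`-style flags mentioned in the key changes, the atomic approach being discussed here can be as small as this sketch (the struct and method names are illustrative, not the repository's):

```go
package snapshotdemo

import "sync/atomic"

// pendingFlags illustrates the atomic.Bool pattern for flags like
// buildInfoEmitPending: workers can set the flag without holding a lock,
// and the final state is read once after the work group completes.
type pendingFlags struct {
	buildInfoEmitPending atomic.Bool
}

func (p *pendingFlags) markPending() { p.buildInfoEmitPending.Store(true) }

func (p *pendingFlags) pending() bool { return p.buildInfoEmitPending.Load() }
```

A mutex-guarded plain bool would behave the same; the atomic version simply avoids holding a lock in the hot path, which is the trade-off being weighed in the exchange above.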