Compute the snapshot changes in parallel #1475

Merged
merged 8 commits into from
Aug 4, 2025

Conversation

sheetalkamat
Member

No description provided.

@sheetalkamat sheetalkamat force-pushed the snapshotChangeParallel branch 2 times, most recently from 7e9990a to d78e4e5 Compare July 29, 2025 20:02
@sheetalkamat sheetalkamat force-pushed the snapshotChangeParallel branch from d78e4e5 to eaf894b Compare July 29, 2025 21:15
@sheetalkamat sheetalkamat force-pushed the snapshotChangeParallel branch from eaf894b to ab3f124 Compare July 29, 2025 21:31
@sheetalkamat sheetalkamat marked this pull request as ready for review July 29, 2025 21:36
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR refactors the incremental compilation system to compute snapshot changes in parallel by replacing regular map and slice operations with concurrent-safe data structures and introducing parallelization via work groups.

Key changes include:

  • Conversion from regular maps to SyncMap types for thread-safe concurrent access
  • Introduction of work groups to parallelize file processing during snapshot computation
  • Migration from ManyToManySet to SyncManyToManySet for concurrent operations
  • Addition of atomic operations for boolean flags like buildInfoEmitPending
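To make the pattern in the list above concrete, here is a minimal sketch using only the Go standard library. It is not the PR's code: `file`, `snapshot`, and `computeVersions` are hypothetical names, `sync.Map` stands in for the project's SyncMap type, and a plain `sync.WaitGroup` stands in for its work group.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
	"sync/atomic"
)

// file is a hypothetical stand-in for a source file (not the PR's type).
type file struct {
	path string
	text string
}

// snapshot is a hypothetical, simplified snapshot: a concurrent map from
// path to version hash, plus a flag that is updated atomically.
type snapshot struct {
	versions    sync.Map // string -> string
	emitPending atomic.Bool
}

// computeVersions hashes each file on its own goroutine and stores the
// result in the concurrent map, so no explicit lock is needed.
func computeVersions(files []file) *snapshot {
	s := &snapshot{}
	var wg sync.WaitGroup
	for _, f := range files {
		wg.Add(1)
		go func(f file) {
			defer wg.Done()
			sum := sha256.Sum256([]byte(f.text))
			s.versions.Store(f.path, hex.EncodeToString(sum[:]))
			s.emitPending.Store(true) // safe to set from many goroutines
		}(f)
	}
	wg.Wait() // all stores are visible after Wait returns
	return s
}

func main() {
	s := computeVersions([]file{{"a.ts", "let a = 1"}, {"b.ts", "let b = 2"}})
	v, _ := s.versions.Load("a.ts")
	fmt.Println(v, s.emitPending.Load())
}
```

Each goroutine writes a different key, which is one of the cases `sync.Map` is documented as being optimized for.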

Reviewed Changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| internal/incremental/snapshot.go | Converts snapshot data structures to thread-safe sync types and removes the large newSnapshotForProgram function |
| internal/incremental/programtosnapshot.go | New file containing the extracted and parallelized snapshot creation logic with work groups |
| internal/incremental/snapshottobuildinfo.go | Updates all map accesses to use sync-safe Load() operations instead of direct indexing |
| internal/incremental/program.go | Converts semantic diagnostics storage to use SyncMap and updates all access patterns |
| internal/incremental/emitfileshandler.go | Replaces map iterations with Range() calls for thread-safe access |
| internal/incremental/buildinfotosnapshot.go | Updates to use Store() operations for populating sync maps |
| internal/incremental/buildInfo.go | Minor parameter type change for emit signature handling |
| internal/incremental/affectedfileshandler.go | Updates to work with sync collections and adds parallelization |
| internal/execute/testsys_test.go | Updates test code to use Load() operations |
| internal/collections/syncset.go | Adds new methods (Size, Keys) to support the sync set functionality |
| internal/collections/syncmap.go | Adds Keys() method returning an iterator |
| internal/collections/syncmanytomanyset.go | New concurrent-safe implementation of many-to-many set |
| internal/collections/manytomanyset.go | Removes the non-concurrent version |


func (t *toProgramSnapshot) handlePendingCheck() {
    if t.oldProgram != nil &&
        t.snapshot.semanticDiagnosticsPerFile.Size() != len(t.program.GetSourceFiles()) &&

Copilot AI Jul 29, 2025

The Size() method on SyncMap iterates through all entries to count them, which is O(n). This comparison is called in handlePendingCheck() and could be expensive for large programs. Consider caching the count or using a different approach to track completion.

    t.snapshot.addFileToChangeSet(file.Path())
} else if newReferences != nil {
    for refPath := range newReferences.Keys() {
        if t.program.GetSourceFileByPath(refPath) == nil {

Copilot AI Jul 29, 2025

This code is now running in parallel but calls GetSourceFileByPath() which may not be thread-safe. Verify that this method is safe to call concurrently from multiple goroutines, or consider pre-computing this information before the parallel section.

Suggested change
- if t.program.GetSourceFileByPath(refPath) == nil {
+ if precomputedSourceFiles[refPath] == nil {
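The "pre-compute before the parallel section" idea can be sketched in isolation. A Go map that is fully built before any goroutine starts, and only read afterwards, is safe for concurrent reads. The function and parameter names below are hypothetical, and the sketch does not claim anything about whether GetSourceFileByPath itself is thread-safe.

```go
package main

import (
	"fmt"
	"sync"
)

// missingReferences reports, per file, the referenced paths that are not
// among the known source files. All names here are hypothetical.
func missingReferences(known []string, refsPerFile map[string][]string) map[string][]string {
	// Build the lookup once, before any goroutine starts. From here on the
	// map is only read, which is safe from multiple goroutines.
	index := make(map[string]struct{}, len(known))
	for _, p := range known {
		index[p] = struct{}{}
	}

	var (
		mu      sync.Mutex // guards missing, the only shared mutable state
		wg      sync.WaitGroup
		missing = make(map[string][]string)
	)
	for file, refs := range refsPerFile {
		wg.Add(1)
		go func(file string, refs []string) {
			defer wg.Done()
			var local []string
			for _, r := range refs {
				if _, ok := index[r]; !ok { // concurrent read only
					local = append(local, r)
				}
			}
			if len(local) > 0 {
				mu.Lock()
				missing[file] = local
				mu.Unlock()
			}
		}(file, refs)
	}
	wg.Wait()
	return missing
}

func main() {
	got := missingReferences(
		[]string{"a.ts", "b.ts"},
		map[string][]string{"a.ts": {"b.ts", "c.ts"}, "b.ts": {"a.ts"}},
	)
	fmt.Println(got)
}
```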

Comment on lines +88 to +97
for _, file := range files {
    wg.Queue(func() {
        version := t.snapshot.computeHash(file.Text())
        impliedNodeFormat := t.program.GetSourceFileMetaData(file.Path()).ImpliedNodeFormat
        affectsGlobalScope := fileAffectsGlobalScope(file)
        var signature string
        newReferences := getReferencedFiles(t.program, file)
        if newReferences != nil {
            t.snapshot.referencedMap.Store(file.Path(), newReferences)
        }

Copilot AI Jul 29, 2025

The getReferencedFiles() function calls GetTypeCheckerForFile() which may involve significant computation and potentially non-thread-safe operations. This is now running in parallel and could cause race conditions or performance issues. Verify thread safety of the type checker operations.

Suggested change
- for _, file := range files {
-     wg.Queue(func() {
-         version := t.snapshot.computeHash(file.Text())
-         impliedNodeFormat := t.program.GetSourceFileMetaData(file.Path()).ImpliedNodeFormat
-         affectsGlobalScope := fileAffectsGlobalScope(file)
-         var signature string
-         newReferences := getReferencedFiles(t.program, file)
-         if newReferences != nil {
-             t.snapshot.referencedMap.Store(file.Path(), newReferences)
-         }
+ var mutex sync.Mutex // Mutex to ensure thread safety
+ for _, file := range files {
+     wg.Queue(func() {
+         version := t.snapshot.computeHash(file.Text())
+         impliedNodeFormat := t.program.GetSourceFileMetaData(file.Path()).ImpliedNodeFormat
+         affectsGlobalScope := fileAffectsGlobalScope(file)
+         var signature string
+         var newReferences *collections.Map
+         mutex.Lock() // Protect access to getReferencedFiles and shared state
+         newReferences = getReferencedFiles(t.program, file)
+         if newReferences != nil {
+             t.snapshot.referencedMap.Store(file.Path(), newReferences)
+         }
+         mutex.Unlock()

Member

@jakebailey jakebailey left a comment

Seems fine, though at some level I wonder if all of this atomic-ness is worth it or if just a couple of mutexes on a shared structure would be sufficient and less fiddly.

@sheetalkamat
Member Author

The computation happens in parallel, so we can't use a mutex on that shared structure; that's why I used a sync map.
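For readers following this exchange, here is a rough sketch of what "a couple of mutexes on a shared structure" could look like in general: the expensive computation runs outside the lock and only the final store is guarded. Every name here (`refTable`, `collect`, `splitPath`) is hypothetical, and the thread does not settle whether this shape would suit the PR's data structures.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// refTable is a hypothetical shared structure guarded by a single mutex.
type refTable struct {
	mu   sync.Mutex
	refs map[string][]string
}

// set stores the references for one path under the lock.
func (t *refTable) set(path string, refs []string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.refs[path] = refs
}

// splitPath is a stand-in for an expensive per-file computation.
func splitPath(p string) []string { return strings.Split(p, "/") }

// collect runs compute for every path concurrently. The lock is held only
// for the short store, so the computations themselves still overlap.
func collect(paths []string, compute func(string) []string) map[string][]string {
	t := &refTable{refs: make(map[string][]string, len(paths))}
	var wg sync.WaitGroup
	for _, p := range paths {
		wg.Add(1)
		go func(p string) {
			defer wg.Done()
			r := compute(p) // no lock held during the computation
			t.set(p, r)     // short critical section
		}(p)
	}
	wg.Wait()
	return t.refs
}

func main() {
	fmt.Println(collect([]string{"src/a.ts", "b.ts"}, splitPath))
}
```

The trade-off in this sketch is contention: every store goes through one lock, whereas a concurrent map spreads that cost at the price of a more fiddly API.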

@sheetalkamat sheetalkamat added this pull request to the merge queue Aug 4, 2025
Merged via the queue into main with commit a57f4e0 Aug 4, 2025
22 checks passed
@sheetalkamat sheetalkamat deleted the snapshotChangeParallel branch August 4, 2025 22:59

2 participants