Skip to content

[client] Don't deactivate upstream resolvers on failure #4128

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

lixmal
Copy link
Collaborator

@lixmal lixmal commented Jul 9, 2025

Describe your changes

  • Improve upstream timeout based on the amount of upstream servers (raised from 2s default)
  • Add management resolver that caches management related domains (management, signal, relay, flow, turn, stun)
    • Populated before dns server start
    • Updated on network map change
  • Remove upstream resolver deactivation based on failed requests
  • Add debug output if upstream timed out: which peer routes this nameserver and what's its state

Issue ticket number and link

Stack

Checklist

  • Is it a bug fix
  • Is a typo/documentation fix
  • Is a feature enhancement
  • It is a refactor
  • Created tests that fail without the change (if possible)
  • Extended the README / documentation, if necessary

By submitting this pull request, you confirm that you have read and agree to the terms of the Contributor License Agreement.

@lixmal lixmal force-pushed the dns-fail-count branch 4 times, most recently from 2b66e95 to f95b367 Compare July 10, 2025 19:48
@lixmal lixmal marked this pull request as ready for review July 11, 2025 16:23
@Copilot Copilot AI review requested due to automatic review settings July 11, 2025 16:23
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors DNS handling to improve robustness, adds caching for management domains, and simplifies configuration management.

  • Introduces a fixed DNS timeout and detailed debug logging for upstream timeouts, removing dynamic deactivation of resolvers.
  • Extends login and engine startup flows to carry management configuration through DNS cache population.
  • Refactors DNS server and upstream resolver constructors to use structured config types and adds comprehensive management-domain caching and tests.

Reviewed Changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
client/internal/routemanager/dnsinterceptor/handler.go Use a constant DNS timeout, add context-based ExchangeWithFallback, and debugPeerTimeout logic
client/internal/login.go Updated doMgmLogin to return login response, adjusted callers to handle new return values
client/internal/engine.go (+ engine_test.go) Changed Engine.Start signature to accept NetbirdConfig and mgmtURL, added PopulateNetbirdConfig
client/internal/dns/upstream.go (+ upstream_*.go + tests) Refactored upstream resolver: unified timeouts, removed deactivation logic, added client timeouts
client/internal/dns/server.go (+ server_test.go) Switched NewDefaultServer to accept DefaultServerConfig, integrated management cache resolver
client/internal/dns/mgmt/mgmt.go (+ mgmt_test.go) Added management-domain caching resolver with full test coverage
client/internal/dns/config/domains.go (+ domains_test.go) Introduced ServerDomains extraction and robust URL-to-domain parsing with tests
client/internal/connect.go Updated ConnectClient.run to pass NetbirdConfig and management URL to engine.Start
Comments suppressed due to low confidence (1)

client/internal/routemanager/dnsinterceptor/handler.go:597

  • nbdns.FormatPeerStatus does not exist in the nbdns module; the FormatPeerStatus function lives in the internal dns package. Import and call the correct package (e.g., client/internal/dns) or move FormatPeerStatus into nbdns to prevent a missing-reference compilation error.
	return fmt.Sprintf(" (peer %s)", nbdns.FormatPeerStatus(&peerState))

@hurricanehrndz
Copy link
Contributor

Patch is very good, it does solve the issue but introduces system DNS cache poisoning. It might not be worth it at this time.

Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants