# GA changes to docs for Agent #2390

Draft, wants to merge 8 commits into base: main
@@ -1,7 +1,7 @@
---
layout: src/layouts/Default.astro
pubDate: 2024-05-08
modDate: 2024-05-30
modDate: 2024-08-05
title: Troubleshooting
description: How to troubleshoot common Kubernetes Agent issues
navOrder: 40
@@ -44,6 +44,14 @@ If the Agent install command fails with a timeout error, it could be that:
- (if using the NFS storage solution) The NFS CSI driver has not been installed
- (if using a custom Storage Class) the Storage Class name doesn't match

#### 404 error when setting up the NFS Pod.
Contributor Author:

@APErebus The 404 error when setting up the NFS pod should also just be a troubleshooting node.
Did you mean to put it here under Installation Issues?


Check if your version of Helm is up to date. In older versions, the error message you might be experiencing is [not shown](https://github.com/helm/helm/blob/1ec0aacb8865d5b1f7ef1cb884bbf9b12579ecef/pkg/action/install.go#L753-L769).

Once your version of Helm is up to date, run `helm repo update` and try again.
Contributor:

Suggested change:
- Once you version of help is up to date, run `helm repo update` and try again.
+ Once your version of help is up to date, run `helm repo update` and try again.


If you're still having issues where Helm fails to retrieve a remote chart when there are [local repos that are not cached](https://github.com/helm/helm/issues/11961), look at the workarounds provided on that Helm issue. If that doesn't help, please [get in touch](https://octopus.com/support).
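
A minimal sketch of the checks above (confirming the Helm version, refreshing the repo cache, retrying); the release and chart names in the last command are placeholders, not your actual install command:

```
# Check the installed Helm client version; newer releases surface this error message
helm version --short

# Refresh the locally cached repository indexes
helm repo update

# Then re-run your original install command (placeholder names shown)
helm install octopus-agent <chart-reference>
```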

## Script Execution Issues

### `Unexpected Script Pod log line number, expected: expected-line-no, actual: actual-line-no`
@@ -62,3 +70,43 @@ If you are using the default NFS storage however, then the script pod would be d

- being evicted due to exceeding its storage quota
- being moved or restarted as part of routine cluster operation
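
If you suspect one of the disruptions above, a hedged way to check is the cluster's event log; the namespace and pod name below are placeholders for your agent's namespace and the affected script pod:

```
# Recent events (evictions, quota issues, node drains) in the agent's namespace
kubectl get events --namespace octopus-agent-example --sort-by=.lastTimestamp

# Details and last state of a specific script pod, if it still exists
kubectl describe pod <script-pod-name> --namespace octopus-agent-example
```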

## Frequently Asked Questions {#FAQ}

### Can the agent work with Octopus running in an HA Cluster setup?
Contributor:

Suggested change:
- ### Can the agent work with Octopus running in an HA Cluster setup?
+ ### Can the agent work with Octopus running in a HA Cluster setup?

Yes! See the [Kubernetes agent HA Cluster Support](/docs/infrastructure/deployment-targets/kubernetes/kubernetes-agent/ha-cluster-support) page.


### Can a proxy be specified when setting up the Kubernetes agent?
Yes! Proxy servers for the polling connection that takes place between the agent and Octopus Server are supported. Their configuration can be set via the `agent.pollingProxy.*` helm values.
Contributor:

the second sentence feels incomplete - update to be:
Yes! Proxy servers for the polling connection that takes place between the agent and Octopus Server are supported - their configuration can be set via the .pollingProxy.* helm values.


Define the polling proxy server through the `agent.pollingProxy.host`, `agent.pollingProxy.port`, `agent.pollingProxy.username` and `agent.pollingProxy.password` values via the [octopusdeploy/kubernetes-agent](https://hub.docker.com/r/octopusdeploy/kubernetes-agent) helm chart.
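
As an illustrative sketch, assuming the chart is pulled from the Docker Hub OCI registry shown in the 401 example below; the release name, namespace, and proxy values are placeholders:

```
# Other required agent values (server URL, access token, space, etc.) omitted for brevity
helm upgrade --install octopus-agent \
  oci://registry-1.docker.io/octopusdeploy/kubernetes-agent \
  --namespace octopus-agent-example --create-namespace \
  --set agent.pollingProxy.host="proxy.internal.example" \
  --set agent.pollingProxy.port=3128 \
  --set agent.pollingProxy.username="proxy-user" \
  --set agent.pollingProxy.password="proxy-password"
```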

### When trying to install the Kubernetes Agent on an existing cluster, I get a 401: Unauthorized error.

```
Error: GET "https://registry-1.docker.io/v2/octopusdeploy/kubernetes-agent/tags/list":
GET "https://auth.docker.io/token?scope=repository%3Aoctopusdeploy%2Fkubernetes-agent%3Apull&service=registry.docker.io": unexpected status code 401: Unauthorized
```
1. If you are running this command locally, are you logged in to Docker Hub? (See the sketch below.)
2. If this is running from another automation, does that process have valid authentication and authorization?
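
A hedged example of re-authenticating locally before retrying; both commands prompt for your Docker Hub credentials:

```
# Log the local Docker client in to Docker Hub
docker login

# Helm keeps its own OCI registry credentials, so log it in to the same registry
helm registry login registry-1.docker.io
```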

### Do I need to have the NFS CSI Driver?
Not for all configurations. It depends on your setup; the installation wizard will guide you.

If you are using `azurefile`, then you don't need the NFS CSI driver.

If you're using a new/clean AKS instance, you will need to install the NFS CSI driver, as a fresh AKS instance will not have it installed.
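
One hedged way to add it is via the upstream kubernetes-csi project's published Helm chart; verify the repo URL and chart version against the csi-driver-nfs project's docs before relying on these exact commands:

```
# Add the upstream csi-driver-nfs chart repository and install the driver into kube-system
helm repo add csi-driver-nfs https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/charts
helm repo update
helm install csi-driver-nfs csi-driver-nfs/csi-driver-nfs --namespace kube-system
```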

### I have unexpected behavior with the polling endpoints in an HA configuration.
This could be one of a variety of issues. First, check that different ports and/or URLs are used for each node.

Check what was supplied to `agent.serverCommsAddresses`, as the addresses must be unique for [each Octopus node being registered against](https://octopus.com/docs/administration/high-availability/maintain/polling-tentacles-with-ha#connecting-polling-tentacles).
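
A hedged sketch of supplying one polling address per Octopus node at install or upgrade time; the hostnames and the port are placeholders for your own nodes' polling endpoints:

```
# One entry per Octopus Server node; note the curly-brace array syntax for --set
helm upgrade --install octopus-agent \
  oci://registry-1.docker.io/octopusdeploy/kubernetes-agent \
  --namespace octopus-agent-example \
  --set "agent.serverCommsAddresses={https://octopus-node1.example.com:10943,https://octopus-node2.example.com:10943}"
```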

If that doesn't help, please [get in touch](https://octopus.com/support).

### I'm having strange behavior relating to ingress in an HA configuration.
Carefully check which property you have set: `serverCommsAddress` exists for backwards compatibility, while `serverCommsAddresses` supports an array input (as in the example above), and mistyping one for the other is easy to do. This has presented itself as a variety of errors depending on the broader configuration, e.g. you may see "it failed to allocate the public ip" if using load balancers.
Contributor:

'Carefully look and see ...' felt odd to me - potential reword?
"Carefully check that both the 'serverCommsAddress' and 'ServerCommsAddress' properties exist. The latter supports an array input - errors in these fields may result in erroneous HA behaviour.


### The Script Pod seems to hang during a deployment
When this has been raised, it has been specific to the deployment process being executed. Run subsets of your process to narrow down the cause, or [get in touch](https://octopus.com/support) with info on how we can reproduce what you're seeing.