diff --git a/ref/general/reference/troubleshooting-mg.md b/ref/general/reference/troubleshooting-mg.md
new file mode 100644
index 0000000..202e689
--- /dev/null
+++ b/ref/general/reference/troubleshooting-mg.md
@@ -0,0 +1,51 @@
+---
+
+copyright:
+  years: 2019, 2020
+lastupdated: "2020-02-25"
+---
+layout: doc
+doc-category: Reference
+title: Using MustGather scripts to collect environment data
+---
+
+
+
+# Using MustGather scripts to collect environment data
+
+You can use the MustGather scripts to collect data about your environment. This information can be useful in troubleshooting and fixing problems.
+
+There are several MustGather scripts available in the `kabanero-foundation/scripts/mustgather` directory:
+
+* `appsody-mustgather.sh`
+* `che-mustgather.sh`
+* `istio-mustgather.sh`
+* `kabanero-mustgather.sh`
+* `kappnav-mustgather.sh`
+* `knative-mustgather.sh`
+* `mustgather-all.sh`
+* `tekton-mustgather.sh`
+
+## Running a MustGather script
+
+1. In a terminal session, navigate to the location where you cloned the *kabanero-foundation* repository and go to the `kabanero-foundation/scripts/mustgather` directory.
+
+1. Log in to your cluster:
+```
+oc login https://<cluster_URL> -u <username> -p <password>
+```
+
+1. Run a MustGather script. For example:
+```
+./mustgather-all.sh
+```
+**Tip**: You can set the `LOGS_DIR` environment variable to specify the data output location:
+```
+LOGS_DIR=<output_directory> ./mustgather-all.sh
+```
+By default, MustGather data is stored in the `kabanero-debug` directory. The script also stores the compressed data in a `kabanero-debug-<timestamp>.tar.gz` file.
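+
+For example, to review data that a script collected earlier, you can unpack the archive. A minimal sketch (the exact archive name depends on when the script ran):
+```
+# Unpack a MustGather archive; the contents typically expand into the kabanero-debug directory.
+tar -xzf kabanero-debug-<timestamp>.tar.gz
+ls kabanero-debug
+```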
diff --git a/ref/general/reference/troubleshooting.adoc b/ref/general/reference/troubleshooting.adoc
deleted file mode 100644
index cbd704b..0000000
--- a/ref/general/reference/troubleshooting.adoc
+++ /dev/null
@@ -1,381 +0,0 @@
-:page-layout: doc
-:page-doc-category: Reference
-:linkattrs:
-:sectanchors:
-= Troubleshooting
-
-Learn how to isolate and resolve problems with Kabanero.
-
-Make sure that your issues are not related to system resources, such as disk, memory, and CPU capacity. An OCP cluster should contain a minimum of three master and three worker nodes.
-
-* See https://github.com/kabanero-io/kabanero-foundation#cluster-hardware-capacity[Kabanero resource requirements, window="_blank"]
-
-== Using MustGather scripts to collect environment data
-
-You can use the MustGather scripts to collect data about your environment. This information can be useful in troubleshooting and fixing problems.
-
-There are several MustGather scripts available in the `kabanero-foundation/scripts/mustgather` directory:
-
-* `appsody-mustgather.sh`
-* `che-mustgather.sh`
-* `istio-mustgather.sh`
-* `kabanero-mustgather.sh`
-* `kappnav-mustgather.sh`
-* `knative-mustgather.sh`
-* `mustgather-all.sh`
-* `tekton-mustgather.sh`
-
-=== Run a MustGather script
-
-. In a terminal session, navigate to the location where you cloned the *kabanero-foundation* repository and go to the `kabanero-foundation/scripts/mustgather` directory.
-
-. Log in to your cluster.
-+
-[source,bash]
-----
-oc login https://<cluster_URL> -u <username> -p <password>
-----
-
-. Run a MustGather script.
-+
-For example:
-+
-[source,bash]
-----
-./mustgather-all.sh
-----
-
-**Tip**: You can set the `LOGS_DIR` environment variable to specify the data output location:
-[source,bash]
-----
-LOGS_DIR=<output_directory> ./mustgather-all.sh
-----
-
-By default, MustGather data is stored in the `kabanero-debug` directory. The script also stores the compressed data in a `kabanero-debug-<timestamp>.tar.gz` file.
-
-== Errors when using a webhook to trigger pipelines
-
-When using a webhook to trigger a default pipeline, the last task is the `monitor-result-task`, which might fail in some cases.
-This might also cause the webhook pod to be in the "Error" state. The following failures of `monitor-result-task` in the TaskRun view can be disregarded
-at this time; they do not impact the application deployment:
-
-* `error creating GitHub client: error parsing PR number: pulls`
-* `IOError: [Errno 2] No such file or directory: '/workspace/pull-request/pr.json'`
-
-== `Failed to create webhook` error
-
-An error occurs when the user creates a webhook in the extension and selects a pipeline that does not have the associated TriggerTemplate and TriggerBindings resources.
-Each pipeline must have a number of trigger resources for the Webhooks Extension to work properly, as described here: https://github.com/tektoncd/experimental/blob/master/webhooks-extension/docs/GettingStarted.md#pipeline
-
-----
-Select the pipeline in the target Namespace from which a PipelineRun (and accompanying PipelineResources) will be created, as described in the section
-'What does the Tekton Dashboard's Webhooks Extension do?', above. Note that you need to have related trigger resources installed in the install
-namespace (rather than the target namespace), these resources being; a triggertemplate called
--template and two triggerbindings --push-binding and -pullrequest-binding.
-----
-
-== `docker push` fails with gateway timeout error
-
-When using OpenShift Container Platform on a cloud with Kubernetes service and an internal Docker registry, performing a `docker push` into the internal Docker
-registry might result in a gateway time-out error. This happens in cases where the input-output operations per second (IOPS) setting for the backing storage
-of the registry's persistent volume (PV) is too low.
-
-To resolve this problem, change the IOPS setting of the PV's backing storage device.
-
-== Webhooks are not created in GitHub, even though the pipelines dashboard says webhook creation was successful
-
-When the webhooks extension sink does not run for extended periods of time, the KService might not be marked as `ready`, which can prevent a webhook from being created.
-
-=== To diagnose this problem:
-
-. Check that your pods in the `knative-serving`, `knative-sources`, `knative-eventing`, and `istio-system` namespaces are healthy, and that you have a `ServiceMeshMemberRoll` resource that includes your install namespace for the pipelines dashboard and webhooks extension.
-
-. Inspect the logs in the `github-controller-manager-0` pod in the `knative-sources` namespace. For example, run:
-+
-----
-oc logs github-controller-manager-0 -n knative-sources | grep -i 'creating GitHub webhook' -C 3
-----
-+
-. Check for multiple "Ready" KServices. For example, run:
-+
-----
-oc get ksvc -n <namespace>
-----
-+
-You want the output to resemble the following example, but with different pod names:
-+
-----
-mylaptop:config myusername$ oc get ksvc
-NAME                      URL                                                            LATESTCREATED                   LATESTREADY                     READY
-tekton-8hmwh-brxps        http://tekton-8hmwh-brxps.mynamespace.apps.mysystem.com        tekton-8hmwh-brxps-pj5jx        tekton-8hmwh-brxps-pj5jx        True
-webhooks-extension-sink   http://webhooks-extension-sink.mynamespace.apps.mysystem.com   webhooks-extension-sink-5tx7l   webhooks-extension-sink-5tx7l   True
-----
-
-If the output does not resemble the example, use `oc describe` to describe the KService that is not marked as `True` for READY, and your GitHubSource.
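For example (a sketch; substitute the namespace and the GitHubSource name from your own environment):

```
oc -n <namespace> describe ksvc webhooks-extension-sink
oc -n <namespace> describe githubsource <githubsource_name>
```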
-
-An ideal `oc describe githubsource` includes:
-
-----
-  Sink:
-    API Version:  serving.knative.dev/v1alpha1
-    Kind:         Service
-    Name:         webhooks-extension-sink
-Status:
-  Conditions:
-    Last Transition Time:  2019-11-15T14:19:51Z
-    Status:                True
-    Type:                  Ready
-    Last Transition Time:  2019-11-15T14:19:51Z
-    Status:                True
-    Type:                  SecretsProvided
-    Last Transition Time:  2019-11-15T14:19:51Z
-    Status:                True
-    Type:                  SinkProvided
-  Sink Uri:  http://webhooks-extension-sink.myinstallnamespace.svc.cluster.local
-----
-
-If the data does not show a `Ready` state and the problem is persistent, there might be an issue with the webhooks extension code itself.
-
-=== To resolve the problem:
-
-. Ensure that the `ServiceMeshControlPlane` resource is in the `Ready` state.
-  - Use the `oc -n istio-system describe ServiceMeshControlPlane basic-install` command.
-  - If the `ServiceMeshControlPlane` is not in the `Ready` state, restart all the pods in the `istio-system` namespace.
-. Ensure that the `KnativeServing` resource is in the `Ready` state.
-  - Use the `oc -n knative-serving describe KnativeServing knative-serving` command.
-  - If the `KnativeServing` is not in the `Ready` state, restart all the pods in the `knative-serving` namespace.
-. Ensure that the `webhooks-extension-sink` KService is in the `Ready` state.
-  - Use the `oc get kservice -n tekton-pipelines` command.
-  - If the `webhooks-extension-sink` KService is not in the `Ready` state, delete a revision object that is associated with the `webhooks-extension-sink`. For example, use the `oc delete rev -l serving.knative.dev/configuration=webhooks-extension-sink` command.
-
-== Changes to Kabanero CR instance are not persisted
-
-Avoid using the OpenShift Console to edit the Kabanero CR instance. The console may change the `apiVersion` of the Kabanero CR instance from `v1alpha2` to `v1alpha1`. The fields of the Kabanero CR instance that were not defined in the `v1alpha1` version may be deleted. There is a description of the issue link:https://github.com/openshift/console/issues/4444[here].
-
-== Debugging subcomponents
-
-You can use the `oc get` or the `oc describe` command to obtain the status of the subcomponents for the product operator.
-
-For example, run the following command:
-
-----
-oc -n kabanero get kabanero kabanero -o=yaml
-----
-
-If the command is successful, it returns information like the following example:
-
-----
-status:
-  admissionControllerWebhook:
-    ready: "True"
-  appsody:
-    ready: "True"
-    version: 0.3.0
-  codereadyWorkspaces:
-    ready: True
-    message: Error message here
-    operator:
-      version: 2.0.0
-    instance:
-      devfileRegistryImage: repository:tag
-      cheWorkspaceClusterRole: role-name
-      openShiftOAuth: false
-      selfSignedCert: false
-      tlsSupport: false
-  cli:
-    hostnames:
-    - kabanero-cli-kabanero.apps.mycluster.os.example.com
-    ready: "True"
-  collectionController:
-    ready: "True"
-    version: 0.6.0-alpha.1
-  kabaneroInstance:
-    errorMessage: One or more resource dependencies are not ready.
-    ready: "False"
-    version: 0.6.0
-  landing:
-    ready: "True"
-    version: 0.5.0
-  serverless:
-    knativeServing:
-      ready: "True"
-      version: 0.10.0
-    ready: "True"
-    version: 1.3.0
-  stackController:
-    ready: "True"
-    version: 0.6.0-alpha.1
-  tekton:
-    ready: "True"
-    version: v0.8.0
-----
-
-When a subcomponent is not ready or is failing, the product returns an error message, which could come directly
-from the subcomponent.
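To check a single subcomponent without reading the full status, you can pull individual fields with a jsonpath query. A minimal sketch, assuming the status layout shown in the example above:

```
oc -n kabanero get kabanero kabanero -o=jsonpath='{.status.kabaneroInstance.ready}{"\n"}{.status.kabaneroInstance.errorMessage}{"\n"}'
```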
-You can gather more information on why a subcomponent is not ready or is failing by referring to the description of each
-subcomponent that follows. Each description includes a command to determine the state of the component and a command to gather its logs.
-
-=== admissionControllerWebhook
-
-The admissionControllerWebhook component is part of the product operator and includes a deployment, a replicaset, and a pod.
-
-To obtain the state, enter the following command:
-
-----
-oc -n kabanero describe pods -l name=kabanero-operator-admission-webhook
-----
-
-To obtain the logs, enter the following command:
-
-----
-oc -n kabanero logs $(oc -n kabanero get pods -l name=kabanero-operator-admission-webhook -o=jsonpath={.items[0].metadata.name})
-----
-
-=== appsody
-
-The appsody component manages the deployment of the application container.
-
-To obtain the state, enter the following command:
-
-----
-oc -n openshift-operators describe pods -l name=appsody-operator
-----
-
-To obtain the logs, enter the following command:
-
-----
-oc -n openshift-operators logs $(oc -n openshift-operators get pods -l name=appsody-operator -o=jsonpath={.items[0].metadata.name})
-----
-
-=== CodeReady Workspaces
-
-The CodeReady Workspaces component provides a hosted Integrated Development Environment (IDE).
-
-To obtain the state, enter the following command:
-
-----
-oc -n kabanero describe pods -l app=codeready-operator
-----
-
-To obtain the logs, enter the following command:
-
-----
-oc -n kabanero logs $(oc -n kabanero get pods -l app=codeready-operator -o=jsonpath={.items[0].metadata.name})
-----
-
-=== cli
-
-The cli component provides a service for interacting with the product instance.
-
-To obtain the state, enter the following command:
-
-----
-oc -n kabanero describe pods -l app=kabanero-cli
-----
-
-To obtain the logs, enter the following command:
-
-----
-oc -n kabanero logs $(oc -n kabanero get pods -l app=kabanero-cli -o=jsonpath={.items[0].metadata.name})
-----
-
-=== collectionController
-
-The collectionController component manages product Collection resources.
-
-To obtain the state, enter the following command:
-
-----
-oc -n kabanero describe pods -l app=kabanero-operator-collection-controller
-----
-
-To obtain the logs, enter the following command:
-
-----
-oc -n kabanero logs $(oc -n kabanero get pods -l app=kabanero-operator-collection-controller -o=jsonpath={.items[0].metadata.name})
-----
-
-=== kabaneroInstance
-
-The kabaneroInstance component is the collective state of the product operator and its subcomponents.
-
-To obtain the state, enter the following command:
-
-----
-oc -n kabanero describe kabanero kabanero
-----
-
-To obtain the logs, enter the following command:
-
-----
-oc -n kabanero logs $(oc -n kabanero get pods -l name=kabanero-operator -o=jsonpath={.items[0].metadata.name})
-----
-
-=== landing
-
-The landing component provides the landing page for the console.
-
-To obtain the state, enter the following command:
-
-----
-oc -n kabanero describe pods -l app=kabanero-landing
-----
-
-To obtain the logs, enter the following command:
-
-----
-oc -n kabanero logs $(oc -n kabanero get pods -l app=kabanero-landing -o=jsonpath={.items[0].metadata.name})
-----
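The state and log command pairs above, and those that follow, all use the same label-based pattern, so a small shell helper can save retyping. This is a hypothetical convenience function, not part of the product scripts:

```
# Hypothetical helper: show pod details and the logs of the first pod
# that matches a label selector in the given namespace.
debug_component() {
  local ns=$1 selector=$2
  oc -n "$ns" describe pods -l "$selector"
  oc -n "$ns" logs "$(oc -n "$ns" get pods -l "$selector" -o=jsonpath={.items[0].metadata.name})"
}

# Example: the landing component described above.
debug_component kabanero app=kabanero-landing
```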
-
-=== serverless
-
-The serverless component manages the KnativeServing instances.
-
-To obtain the state, enter the following command:
-
-----
-oc -n openshift-operators describe pods -l name=knative-serving-operator
-----
-
-To obtain the state of the KnativeServing instance, enter the following command:
-
-----
-oc -n knative-serving describe knativeserving knative-serving
-----
-
-To obtain the logs, enter the following command:
-
-----
-oc -n openshift-operators logs $(oc -n openshift-operators get pods -l name=knative-serving-operator -o=jsonpath={.items[0].metadata.name})
-----
-
-=== stackController
-
-The stackController component manages the application stack resources.
-
-To obtain the state, enter the following command:
-
-----
-oc -n kabanero describe pods -l app=kabanero-operator-stack-controller
-----
-
-To obtain the logs, enter the following command:
-
-----
-oc -n kabanero logs $(oc -n kabanero get pods -l app=kabanero-operator-stack-controller -o=jsonpath={.items[0].metadata.name})
-----
-
-=== pipeline
-
-The pipeline component provides pipeline, task, and trigger resources used by stacks.
-
-To obtain the state, enter the following command:
-
-----
-oc -n openshift-operators describe pods -l name=openshift-pipelines-operator
-----
-
-To obtain the logs, enter the following command:
-
-----
-oc -n openshift-operators logs $(oc -n openshift-operators get pods -l name=openshift-pipelines-operator -o=jsonpath={.items[0].metadata.name})
-----
diff --git a/ref/general/reference/troubleshooting.md b/ref/general/reference/troubleshooting.md
new file mode 100644
index 0000000..7a41e79
--- /dev/null
+++ b/ref/general/reference/troubleshooting.md
@@ -0,0 +1,103 @@
+---
+
+copyright:
+  years: 2019, 2020
+lastupdated: "2020-03-03"
+---
+layout: doc
+doc-category: Reference
+title: Troubleshooting
+---
+
+
+
+# Troubleshooting
+{: #troubleshoot}
+
+Learn how to isolate and resolve problems.
+
+* [Errors when using a webhook to trigger pipelines](#errorwtp)
+* [`docker push` fails with gateway timeout error](#dockerpushfails)
+* [Webhooks are not created in GitHub, even though the pipelines dashboard says webhook creation was successful](#nowebhook)
+
+## Errors when using a webhook to trigger pipelines
+{: #errorwtp}
+
+When using a webhook to trigger installed pipelines, the last task is the `monitor-result-task`, which might fail in some cases. This might also cause the webhook pod to be in the "Error" state. The following failures of `monitor-result-task` in the TaskRun view can be disregarded at this time; they do not impact the application deployment:
+
+* `error creating GitHub client: error parsing PR number: pulls`
+* `IOError: [Errno 2] No such file or directory: '/workspace/pull-request/pr.json'`
+
+
+## `docker push` fails with gateway timeout error
+{: #dockerpushfails}
+
+When using OpenShift Container Platform on a cloud with Kubernetes service and an internal Docker registry, performing a `docker push` into the internal Docker registry might result in a gateway time-out error. This happens in cases where the input-output operations per second (IOPS) setting for the backing storage of the registry's persistent volume (PV) is too low.
+
+To resolve this problem, change the IOPS setting of the PV's backing storage device.
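+
+One way to locate the PV that backs the registry is to start from the registry's PersistentVolumeClaim. A sketch (assuming the registry runs in the `openshift-image-registry` namespace; the namespace and claim name vary by platform and release):
+```
+oc -n openshift-image-registry get pvc
+oc describe pv <pv_name>
+```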
+## Webhooks are not created in GitHub, even though the pipelines dashboard says webhook creation was successful
+{: #nowebhook}
+
+When the webhooks extension sink does not run for extended periods of time, the KService might not be marked as `ready`, which can prevent a webhook from being created.
+
+### To diagnose this problem:
+
+1. Check that your pods in the `knative-serving`, `knative-sources`, `knative-eventing`, and `istio-system` namespaces are healthy, and that you have a `ServiceMeshMemberRoll` resource that includes your install namespace for the pipelines dashboard and webhooks extension.
+
+1. Inspect the logs in the `github-controller-manager-0` pod in the `knative-sources` namespace. For example, run:
+```
+oc logs github-controller-manager-0 -n knative-sources | grep -i 'creating GitHub webhook' -C 3
+```
+1. Check for multiple "Ready" KServices. For example, run:
+```
+oc get ksvc -n <namespace>
+```
+You want the output to resemble the following example, but with different pod names:
+```
+mylaptop:config myusername$ oc get ksvc
+NAME                      URL                                                            LATESTCREATED                   LATESTREADY                     READY
+tekton-8hmwh-brxps        http://tekton-8hmwh-brxps.mynamespace.apps.mysystem.com        tekton-8hmwh-brxps-pj5jx        tekton-8hmwh-brxps-pj5jx        True
+webhooks-extension-sink   http://webhooks-extension-sink.mynamespace.apps.mysystem.com   webhooks-extension-sink-5tx7l   webhooks-extension-sink-5tx7l   True
+```
+If the output does not resemble the example, use `oc describe` to describe the KService that is not marked as `True` for READY, and your GitHubSource.
+
+An ideal `oc describe githubsource` includes:
+
+```
+  Sink:
+    API Version:  serving.knative.dev/v1alpha1
+    Kind:         Service
+    Name:         webhooks-extension-sink
+Status:
+  Conditions:
+    Last Transition Time:  2019-11-15T14:19:51Z
+    Status:                True
+    Type:                  Ready
+    Last Transition Time:  2019-11-15T14:19:51Z
+    Status:                True
+    Type:                  SecretsProvided
+    Last Transition Time:  2019-11-15T14:19:51Z
+    Status:                True
+    Type:                  SinkProvided
+  Sink Uri:  http://webhooks-extension-sink.myinstallnamespace.svc.cluster.local
+```
+
+If the data does not show a `Ready` state and the problem is persistent, there might be an issue with the webhooks extension code itself.
+
+### To resolve the problem:
+
+1. Ensure that the `ServiceMeshControlPlane` resource is in the `Ready` state.
+   - Use the `oc -n istio-system describe ServiceMeshControlPlane basic-install` command.
+   - If the `ServiceMeshControlPlane` is not in the `Ready` state, restart all the pods in the `istio-system` namespace (see the sketch after this list).
+1. Ensure that the `KnativeServing` resource is in the `Ready` state.
+   - Use the `oc -n knative-serving describe KnativeServing knative-serving` command.
+   - If the `KnativeServing` is not in the `Ready` state, restart all the pods in the `knative-serving` namespace.
+1. Ensure that the `webhooks-extension-sink` KService is in the `Ready` state.
+   - Use the `oc get kservice -n tekton-pipelines` command.
+   - If the `webhooks-extension-sink` KService is not in the `Ready` state, delete a revision object that is associated with the `webhooks-extension-sink`. For example, use the `oc delete rev -l serving.knative.dev/configuration=webhooks-extension-sink` command.
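+
+A quick way to restart all pods in a namespace is to delete them and let their controllers recreate them. A minimal sketch (assuming the pods are managed by deployments or operators that recreate them automatically):
+```
+oc -n istio-system delete pods --all
+oc -n knative-serving delete pods --all
+```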