sedkis
Contributor

@sedkis sedkis commented Aug 18, 2025

User description

…rove clarity on usage and configuration. Renamed section titles and expanded content to cover both liveness and readiness health checks for the Tyk Gateway.

Contributor checklist

  • Reviewed PR Code suggestions and updated accordingly
  • Tyklings: Labeled the PR with the relevant releases
  • Tyklings: Added Jira DX PR ticket to the subject

New Contributors



PR Type

Documentation


Description

  • Add readiness /ready endpoint documentation

  • Clarify liveness /hello behavior and use

  • Provide Kubernetes probe configurations

  • Add config keys and troubleshooting guidance


Diagram Walkthrough

flowchart LR
  GW["Tyk Gateway"]
  Hello["/hello (Liveness)"]
  Ready["/ready (Readiness)"]
  LB["Load Balancer / Basic Monitoring"]
  K8s["Kubernetes Probes"]
  Redis["Redis"]
  Dash["Dashboard (optional)"]
  RPC["RPC (MDCB) (optional)"]

  GW -- "exposes" --> Hello
  GW -- "exposes" --> Ready
  Hello -- "200 always; body shows status" --> LB
  Ready -- "200 when ready; 503 otherwise" --> K8s
  GW -- "monitors" --> Redis
  GW -- "monitors" --> Dash
  GW -- "monitors" --> RPC

File Walkthrough

Relevant files
Documentation
health-check.md
Document readiness and liveness endpoints with probes       

tyk-docs/content/planning-for-production/ensure-high-availability/health-check.md

  • Retitle page to "Health Checks" and broaden scope
  • Add /ready readiness endpoint with behavior and config
  • Refine /hello liveness endpoint usage and examples
  • Include Kubernetes probe YAML and troubleshooting section
+147/-133

Contributor

⚠️ Deploy preview for PR #6871 did not become live after 3 attempts.
Please check Netlify or try manually: Preview URL

Contributor

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Possible Inaccuracy

Stating that /hello always returns HTTP 200 OK may need nuance if the endpoint itself is unreachable or disabled via config; also verify that MDCB/RPC presence in /hello details is consistent across versions.
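To illustrate the distinction raised here, the following is a minimal Python sketch that separates "the endpoint answered 200" from "every component is healthy" by inspecting the response body. The body shape in `SAMPLE_BODY` (a top-level `status` plus a `details` map with a per-component `status`) is an assumption made for illustration and is not taken from this PR, and `failing_components` is a hypothetical helper; the field names should be checked against the Gateway version in use.

```python
import json

# Assumed /hello response body, for illustration only. The field names
# ("status", "details", per-component "status") are not confirmed by this PR.
SAMPLE_BODY = """
{
  "status": "warn",
  "details": {
    "redis": {"status": "pass"},
    "dashboard": {"status": "fail"}
  }
}
"""


def failing_components(body_text):
    """Return the sorted names of components whose status is not "pass"."""
    body = json.loads(body_text)
    details = body.get("details", {})
    return sorted(
        name
        for name, info in details.items()
        if info.get("status") != "pass"
    )


if __name__ == "__main__":
    # An HTTP 200 only shows the endpoint answered; the body shows what is failing.
    print(failing_components(SAMPLE_BODY))
```

A monitor built this way would still need a separate reachability check, since an unreachable or disabled endpoint never produces a body to parse.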

### How it responds
- **Always returns `HTTP 200 OK`** (even when components are failing).  
- **Check the response body** to see which components are healthy or failing

### When to use `/hello`
- **Load balancers** - Route traffic to instances that respond
- **Basic monitoring** - Simple uptime checks
- **MDCB setups** - Monitor both Management and Worker Gateways

### Configuration
The endpoint runs on `/hello` by default. To change it:

```yaml
health_check_endpoint_name: "status"
```

Config Ref

Important Notes

  • Updates every 10 seconds - Health status is cached and refreshed automatically
  • Always responds with 200 - Even when Redis or Dashboard are down (check response body for details)
  • Use for load balancers - Perfect for HAProxy, NGINX, AWS ALB health checks

</details>

<details><summary><a href='https://github.com/TykTechnologies/tyk-docs/pull/6871/files#diff-1f2dc0abe0799b41bbf67dc8ed7db54ea185dda0ed315dff044d7b6f3abcaef7R114-R144'><strong>Config Reference</strong></a>

Confirm the config keys `readiness_check_endpoint_name` and `health_check_endpoint_name` and linked anchors exactly match the current configuration docs and OSS/Pro scopes.
</summary>

```markdown
### Configuration
The endpoint runs on `/ready` by default. To change it:

```yaml
readiness_check_endpoint_name: "status-ready"
```

config ref

The /hello Endpoint (Liveness Check)

Use this endpoint for basic health monitoring and load balancer health checks. It returns 200 once the Gateway has started, whether it is still working toward a stable state or has already reached one.

How it responds

  • Always returns HTTP 200 OK (even when components are failing).
  • Check the response body to see which components are healthy or failing

When to use /hello

  • Load balancers - Route traffic to instances that respond
  • Basic monitoring - Simple uptime checks
  • MDCB setups - Monitor both Management and Worker Gateways

Configuration

The endpoint runs on /hello by default. To change it:

health_check_endpoint_name: "status"

Config Ref


</details>

<details><summary><a href='https://github.com/TykTechnologies/tyk-docs/pull/6871/files#diff-1f2dc0abe0799b41bbf67dc8ed7db54ea185dda0ed315dff044d7b6f3abcaef7R36-R49'><strong>K8s Probes</strong></a>

Kubernetes probe examples may need timeouts, initial delays, and failureThresholds for production; consider adding recommended values or a note.
</summary>

```markdown
### Kubernetes Deployments
```yaml
# Liveness probe - restarts pod if Gateway process is dead
livenessProbe:
  httpGet:
    path: /hello
    port: 8080

# Readiness probe - removes from service when not ready
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
```

</details>

</td></tr>
</table>

Contributor

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
General
Add robust Kubernetes probe timings

Provide sane probe timing defaults to avoid flapping during startup or transient
dependency issues. Include initialDelaySeconds, periodSeconds, and failureThreshold
to reduce false negatives.
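As a rough way to reason about these three settings, the Python sketch below estimates how long a probe takes to reach its failure threshold. It is a simplified model, not Kubernetes' exact scheduling: it ignores `timeoutSeconds` and kubelet jitter, and the `ProbeTiming` class and its method names are hypothetical. The numbers plugged in are the ones proposed in this suggestion (10/10/3 for liveness, 5/5/3 for readiness).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeTiming:
    """Hypothetical container for the three probe settings discussed here."""
    initial_delay_seconds: int
    period_seconds: int
    failure_threshold: int

    def seconds_until_threshold_from_start(self):
        # First probe runs after the initial delay; the Nth consecutive
        # failure lands (N - 1) periods later if every probe fails.
        return self.initial_delay_seconds + (self.failure_threshold - 1) * self.period_seconds

    def seconds_to_detect_outage(self):
        # Approximate upper bound for a running container: the outage may
        # begin just after a successful probe, so N full periods can pass.
        return self.failure_threshold * self.period_seconds


if __name__ == "__main__":
    liveness = ProbeTiming(initial_delay_seconds=10, period_seconds=10, failure_threshold=3)
    readiness = ProbeTiming(initial_delay_seconds=5, period_seconds=5, failure_threshold=3)
    for name, probe in (("liveness", liveness), ("readiness", readiness)):
        print(name, probe.seconds_until_threshold_from_start(), probe.seconds_to_detect_outage())
```

Under this model a Gateway that never passes its liveness probe would be restarted roughly 30 seconds after start, so a Gateway that needs longer than that to come up would call for a larger initial delay or threshold.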

tyk-docs/content/planning-for-production/ensure-high-availability/health-check.md [37-49]

```diff
 ### Kubernetes Deployments
 ```yaml
 # Liveness probe - restarts pod if Gateway process is dead
 livenessProbe:
   httpGet:
     path: /hello
     port: 8080
+  initialDelaySeconds: 10
+  periodSeconds: 10
+  failureThreshold: 3
 
 # Readiness probe - removes from service when not ready
 readinessProbe:
   httpGet:
     path: /ready
     port: 8080
+  initialDelaySeconds: 5
+  periodSeconds: 5
+  failureThreshold: 3
```
<details><summary>Suggestion importance[1-10]: 7</summary>

__

Why: Adding sane probe timing defaults reduces flapping and false negatives, improving deployability. It’s a practical enhancement consistent with the PR’s Kubernetes guidance, though not critical to correctness.


</details></details></td><td align=center>Medium

</td></tr><tr><td>



<details><summary>Clarify readiness 503 conditions</summary>

___


**Clarify that <code>/ready</code> may temporarily return 503 during startup, config reloads, or graceful shutdown, to prevent misconfiguration of probes with too-aggressive failure thresholds. Add guidance to set appropriate initialDelaySeconds and failureThreshold in Kubernetes examples.**

[tyk-docs/content/planning-for-production/ensure-high-availability/health-check.md [18-19]](https://github.com/TykTechnologies/tyk-docs/pull/6871/files#diff-1f2dc0abe0799b41bbf67dc8ed7db54ea185dda0ed315dff044d7b6f3abcaef7R18-R19)

```diff
 | `/hello` | **Liveness check** | Load balancers, basic monitoring | Always 200 OK |
-| `/ready` | **Readiness check** | Kubernetes, traffic routing decisions | 200 OK when ready, 503 when not |
+| `/ready` | **Readiness check** | Kubernetes, traffic routing decisions | 200 OK when ready, 503 during startup/reload/shutdown |
```
Suggestion importance[1-10]: 6

__

Why: The clarification is accurate and prevents misconfiguration by explaining normal 503 conditions for /ready. It’s a helpful doc improvement but moderate in impact as it refines wording in a reference table.

Low
Add endpoint reachability checks

Include a non-destructive HTTP check to verify the /hello and /ready endpoints
themselves, helping distinguish endpoint exposure issues (ports/ingress) from
backend dependency failures.
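A minimal Python sketch of such a check is shown below, using only the standard library. It treats "no HTTP response at all" as an exposure problem and a 503 as the Gateway reporting that it is not ready. The `classify` and `fetch_status` helpers and the `127.0.0.1:8080` address are hypothetical placeholders; substitute the pod IP and port used in the deployment.

```python
import urllib.error
import urllib.request


def classify(status_code):
    """Map an HTTP status (or None when no response arrived) to a coarse verdict."""
    if status_code is None:
        return "unreachable"   # port, ingress, or network problem
    if status_code == 200:
        return "ok"
    if status_code == 503:
        return "not ready"     # endpoint reachable, Gateway reports not ready
    return "unexpected"


def fetch_status(url, timeout=2.0):
    """Return the HTTP status code for url, or None if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code        # non-2xx responses such as 503 land here
    except OSError:
        return None            # URLError, timeouts, refused connections


if __name__ == "__main__":
    base = "http://127.0.0.1:8080"  # hypothetical address; substitute the pod IP
    for path in ("/hello", "/ready"):
        print(path, classify(fetch_status(base + path)))
```

An "unreachable" verdict points at ports or ingress, while "not ready" points at the Gateway's dependencies, which is the distinction this suggestion is after.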

tyk-docs/content/planning-for-production/ensure-high-availability/health-check.md [250-259]

```diff
 **Redis connection failed**:
 - Check Redis is running: `redis-cli ping`
 - Verify connection settings in Gateway config
 - Check network connectivity to Redis
+- Verify probes can reach endpoints: `curl -sS http://<pod-ip>:8080/hello` and `curl -i http://<pod-ip>:8080/ready`
```
Suggestion importance[1-10]: 5

__

Why: The added troubleshooting step is useful and accurate, helping distinguish ingress/port issues from backend failures. It’s a minor but relevant documentation enhancement.

Low


netlify bot commented Aug 18, 2025

PS. Add to the end of url /docs/nightly

Name Link
🔨 Latest commit 45f14e6
🔍 Latest deploy log https://app.netlify.com/projects/tyk-docs/deploys/68a382230396470008352d33
😎 Deploy Preview https://deploy-preview-6871--tyk-docs.netlify.app


netlify bot commented Aug 18, 2025

PS. Add to the end of url /docs/nightly

Name Link
🔨 Latest commit 52cbcc3
🔍 Latest deploy log https://app.netlify.com/projects/tyk-docs/deploys/68ad4cfc3dfb7800089c7c4b
😎 Deploy Preview https://deploy-preview-6871--tyk-docs.netlify.app

@sharadregoti
Contributor

@andyo-tyk I am marking this PR as a draft.

@sharadregoti sharadregoti marked this pull request as draft August 26, 2025 06:21