Implement comprehensive link checker with image focus and CI/CD integration #220


Draft: wants to merge 10 commits into base: master

Copilot AI (Contributor) commented on Jul 30, 2025

This PR implements a link checking system for the OrionRobots website that focuses on detecting broken image links and broken internal links, providing the automated link validation requested in issue #216.

Key Features

🎯 Image-Focused Link Checking

  • Prioritizes broken image links that affect visual content
  • Categorizes links by type: images (high priority), internal links (high priority), external links (medium priority), and email links (low priority)
  • Generates professional HTML reports with clear priority indicators
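
The categorization described above could look roughly like the following. This is a minimal sketch, not the PR's actual `filter_csv.py`: the extension list, the `SITE_HOST` constant, and the function name `categorize` are assumptions made for illustration. Only the four categories and their priorities come from the description.

```python
from urllib.parse import urlparse

# Assumed values for illustration; the real script may use different ones.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp")
SITE_HOST = "orionrobots.co.uk"

# Priorities as listed in the PR description.
PRIORITY = {"image": "high", "internal": "high", "external": "medium", "email": "low"}


def categorize(url: str) -> str:
    """Return 'email', 'image', 'internal', or 'external' for a link."""
    parsed = urlparse(url)
    if parsed.scheme == "mailto":
        return "email"
    # Images are detected by file extension, wherever they are hosted.
    if parsed.path.lower().endswith(IMAGE_EXTENSIONS):
        return "image"
    host = parsed.hostname  # lowercased, without port; None for relative links
    if host is None or host == SITE_HOST or host.endswith("." + SITE_HOST):
        return "internal"
    return "external"
```

Checking the image extension before the host means an externally hosted image still counts as an image, which matches the stated focus on broken images.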

🐳 Docker Integration

  • Enhanced Docker container based on Ubuntu 22.04 with linkchecker, Python 3, and Jinja2
  • Integrated with existing docker-compose.yml using the manual profile
  • Health checks and proper volume mounting for report generation
  • Robust error handling that continues checking even when broken links are found
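
One way to keep a run going when broken links are found is to classify the checker's exit status instead of treating every non-zero status as a failure. The sketch below assumes the convention documented for LinkChecker (status 1 for invalid links, 2 for a program error); the function name `run_checker` and the returned labels are hypothetical, and this is not necessarily how the PR's container handles it.

```python
import subprocess
import sys


def run_checker(command: list[str]) -> str:
    """Run a link checker command and classify its outcome without raising."""
    # check=False so a non-zero exit status does not raise CalledProcessError.
    completed = subprocess.run(command, check=False)
    if completed.returncode == 0:
        return "clean"
    if completed.returncode == 1:
        # Assumed convention: 1 means broken links were found, so reporting continues.
        return "broken-links"
    return "tool-error"


if __name__ == "__main__":
    # Stand-in command that exits with status 1, as a checker would on broken links.
    print(run_checker([sys.executable, "-c", "raise SystemExit(1)"]))
```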

🚀 Multiple Usage Options

  • Local development: Simple ./scripts/local_linkcheck.sh script for manual execution
  • PR validation: Triggered by adding the link-check label to pull requests
  • Production monitoring: Automated nightly runs at 2 AM UTC via GitHub Actions
  • Manual Docker: docker compose --profile manual up broken_links

📊 Professional Reporting

  • Styled HTML reports with categorized broken links
  • CSV output for programmatic analysis
  • Clear priority-based organization helping developers focus on critical issues first
  • Summary statistics showing counts by category
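
As a rough illustration of the summary statistics and CSV output, the sketch below counts broken links per category and serializes them with the standard library. The row shape (`category`, `url`, `parent`) and the function names are assumptions, not the PR's actual report format.

```python
import csv
import io
from collections import Counter

# Assumed column layout for illustration only.
FIELDS = ["category", "url", "parent"]


def summarize(rows: list[dict]) -> Counter:
    """Count broken links per category."""
    return Counter(row["category"] for row in rows)


def to_csv(rows: list[dict]) -> str:
    """Serialize broken-link rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```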

Implementation Details

The solution builds upon the existing draft linkchecker setup in _drafts/linkchecker/ and enhances it significantly:

  1. Enhanced Python filtering script (linkchecker/filter_csv.py) that categorizes links and generates rich HTML reports
  2. Comprehensive configuration (linkchecker/linkchecker.conf) optimized for the OrionRobots site
  3. GitHub Actions workflows for both nightly monitoring and PR-based validation
  4. Complete documentation including technical README and quick start guide
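
The filtering step in item 1 presumably starts by reading the checker's CSV output and keeping only failed links. The sketch below assumes a semicolon-delimited file with `#` comment lines and `urlname`, `parentname`, `result`, and `valid` columns, which is how LinkChecker's CSV logger is understood to write its output; the sample data and the function name `read_broken_links` are hypothetical.

```python
import csv

# Hypothetical sample in the assumed layout; real output has more columns.
SAMPLE = """# hypothetical sample data
urlname;parentname;result;valid
/images/robot.png;https://orionrobots.co.uk/;404 Not Found;False
/wiki/;https://orionrobots.co.uk/;200 OK;True
"""


def read_broken_links(csv_text: str) -> list[dict]:
    """Return the rows whose 'valid' column is 'False'."""
    # Drop blank lines and '#' comment lines before handing rows to the CSV reader.
    data_lines = [
        line for line in csv_text.splitlines()
        if line and not line.startswith("#")
    ]
    reader = csv.DictReader(data_lines, delimiter=";")
    return [row for row in reader if row.get("valid") == "False"]
```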

CI/CD Integration

Nightly Link Checking:

  • Runs automatically against the production site (https://orionrobots.co.uk)
  • Uploads detailed reports as GitHub Actions artifacts
  • Creates warnings when broken links are detected
  • Escalates to errors when many broken links are found
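
The warning-then-error escalation can be expressed with GitHub Actions workflow commands, which a step emits by printing `::warning ...::` or `::error ...::` lines. This is a sketch under assumptions: the threshold of 20 and the function name `annotation` are hypothetical, since the description does not say how many broken links count as "many".

```python
from typing import Optional

# Hypothetical cut-off; the PR description only says "many broken links".
ERROR_THRESHOLD = 20


def annotation(broken_count: int, error_threshold: int = ERROR_THRESHOLD) -> Optional[str]:
    """Return a workflow command line for the given count, or None if clean."""
    if broken_count <= 0:
        return None
    level = "error" if broken_count >= error_threshold else "warning"
    return f"::{level} title=Link check::{broken_count} broken link(s) found"
```

A workflow step would print the returned string to standard output so that it appears as an annotation on the run.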

PR-Based Link Checking:

  • Triggered when a PR is labeled with link-check
  • Builds the site locally and runs link checking against the build
  • Comments results directly on the PR with broken link details
  • Provides actionable feedback for developers
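
The PR comment could be assembled from the list of broken links before it is posted. The sketch below only builds the Markdown body; the wording, the `limit` of 10 entries, and the function name `build_comment` are assumptions, and posting the comment is left to the workflow.

```python
def build_comment(broken: list[tuple[str, str]], limit: int = 10) -> str:
    """Build a Markdown comment body from (url, parent_page) pairs."""
    if not broken:
        return "Link check passed: no broken links found."
    lines = [f"Link check found {len(broken)} broken link(s):", ""]
    for url, parent in broken[:limit]:
        lines.append(f"- `{url}` (linked from `{parent}`)")
    remaining = len(broken) - limit
    if remaining > 0:
        # Keep the comment short; the full list stays in the uploaded report.
        lines.append(f"- ...and {remaining} more")
    return "\n".join(lines)
```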

Usage Examples

# Local development - one command execution
./scripts/local_linkcheck.sh

# Manual Docker usage
docker compose --profile manual up broken_links
open link_reports/link_check_report.html  # macOS; on Linux use xdg-open

# PR validation - just add the 'link-check' label

The implementation addresses all requirements from the issue:

  • ✅ Wraps the link checking docker setup for internal broken links view
  • ✅ Focuses on image links to help deal with broken images
  • ✅ Provides nightly runs on the real site
  • ✅ Enables PR runs when link-check label is added with proper teardown
  • ✅ Offers optional/manual docker compose command locally

Fixes #216.



Copilot AI changed the title from "[WIP] Make use of linkchecker" to "Implement comprehensive link checker with image focus and CI/CD integration" on Jul 30, 2025
Copilot AI requested a review from dannystaple on July 30, 2025 at 13:55
Copilot finished work on behalf of dannystaple on July 30, 2025 at 13:55
github-actions bot (Contributor) commented on Aug 1, 2025

🦙 MegaLinter status: ❌ ERROR

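| Descriptor | Linter | Files | Fixed | Errors | Warnings | Elapsed time |
| --- | --- | --- | --- | --- | --- | --- |
| ❌ ACTION | actionlint | 7 | | 2 | 0 | 0.07s |
| ✅ BASH | bash-exec | 3 | | 0 | 0 | 0.01s |
| ❌ BASH | shellcheck | 3 | | 1 | 0 | 0.13s |
| ⚠️ BASH | shfmt | 3 | | 3 | 0 | 0.0s |
| ❌ C | cppcheck | 8 | | 8 | 0 | 0.27s |
| ❌ CPP | cppcheck | 8 | | 8 | 0 | 0.33s |
| ✅ JSON | npm-package-json-lint | yes | | no | no | 0.47s |
| ✅ JSON | v8r | 12 | | 0 | 0 | 9.42s |
| ❌ MARKDOWN | markdown-link-check | 738 | | 213 | 0 | 35.22s |
| ⚠️ PYTHON | black | 1 | | 1 | 0 | 0.59s |
| ❌ PYTHON | flake8 | 1 | | 11 | 0 | 0.52s |
| ✅ PYTHON | mypy | 1 | | 0 | 0 | 2.95s |
| ✅ REPOSITORY | gitleaks | yes | | no | no | 14.77s |
| ✅ REPOSITORY | git_diff | yes | | no | no | 1.16s |
| ✅ REPOSITORY | grype | yes | | no | no | 35.58s |
| ✅ REPOSITORY | secretlint | yes | | no | no | 43.76s |
| ✅ REPOSITORY | syft | yes | | no | no | 2.0s |
| ✅ REPOSITORY | trivy-sbom | yes | | no | no | 3.43s |
| ✅ REPOSITORY | trufflehog | yes | | no | no | 4.64s |
| ✅ XML | xmllint | 1 | | 0 | 0 | 188.67s |
| ✅ YAML | v8r | 14 | | 0 | 0 | 6.16s |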

See detailed report in MegaLinter reports

MegaLinter is graciously provided by OX Security

@@ -0,0 +1,15 @@
FROM ubuntu:22.04
Member:

Can we remove the copy of this linkchecker folder from the _drafts folder?

Contributor Author:

Done! Removed the outdated linkchecker copy from _drafts/linkchecker/ since the comprehensive implementation is now in .github/linkchecker/. Commit c0a7eed

Copilot AI requested a review from dannystaple on August 1, 2025 at 14:04
Copilot finished work on behalf of dannystaple on August 1, 2025 at 14:04
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make use of linkchecker
2 participants