feat: Enable custom skills in JSON scenarios for Browser Use Agent #645

Open: glglak wants to merge 2 commits into base: main

@glglak glglak commented Jun 20, 2025

This commit adds support for deterministic execution of JSON-based browser automation scenarios using custom skills in the Browser Use Agent.

Key changes:

  • Dynamically register custom skills from src.custom_skills as controller actions
  • Fix Pydantic JSON schema errors by adding proper type annotations
  • Avoid leading underscores in function names to comply with Pydantic field naming rules
  • Add Wikipedia example scenario demonstrating cross-page data extraction
  • Create documentation explaining the integration approach and usage
  • Add integration tests to verify custom skills functionality
  • Update requirements.txt with lxml[html_clean] dependency
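
A minimal sketch of the registration idea in the list above, using only the standard library. It does not reproduce the PR's code: `collect_skills` and the `demo_skills` module are hypothetical stand-ins (the real skills live in src.custom_skills), and the filtering rules are an assumption based on the bullets about type annotations and leading underscores.

```python
import inspect
import types
from typing import Any, Callable, Dict, get_type_hints


def collect_skills(module: types.ModuleType) -> Dict[str, Callable[..., Any]]:
    """Return public, fully annotated coroutine functions found in `module`."""
    skills: Dict[str, Callable[..., Any]] = {}
    for name, func in inspect.getmembers(module, inspect.iscoroutinefunction):
        if name.startswith("_"):
            continue  # names with leading underscores are skipped
        hints = get_type_hints(func)
        params = inspect.signature(func).parameters
        if all(param in hints for param in params):
            skills[name] = func  # every parameter carries a type annotation
    return skills


# Hypothetical skills, defined inline so the sketch runs on its own.
async def goto(url: str) -> str:
    return f"navigated to {url}"


async def _helper(x: int) -> int:
    return x


async def untyped(selector):
    return selector


# Hypothetical stand-in for the src.custom_skills module.
demo = types.ModuleType("demo_skills")
for func in (goto, _helper, untyped):
    setattr(demo, func.__name__, func)

registered = collect_skills(demo)
print(sorted(registered))
```

Only `goto` passes both checks here: `_helper` is dropped for its name and `untyped` for its missing annotation. In the PR itself, the surviving functions would presumably then be registered as controller actions.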

This enables users to create deterministic browser automation scenarios using skills like goto, clickCss, waitFor, and waitForUrl directly in JSON format, rather than relying solely on natural language instructions.
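
To illustrate what a deterministic scenario might look like, here is a rough sketch that parses a JSON scenario and dispatches each step to a named handler. The JSON layout (`steps`, `action`, `params`), the selectors, and the recording handlers are all assumptions made for illustration; the PR's actual schema is not shown in this conversation. Only the skill names goto, clickCss, waitFor, and waitForUrl come from the description.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable, Dict, List

# Hypothetical scenario layout; the real schema may differ.
SCENARIO_JSON = """
{
  "name": "demo",
  "steps": [
    {"action": "goto", "params": {"url": "https://example.com"}},
    {"action": "waitFor", "params": {"selector": "a"}},
    {"action": "clickCss", "params": {"selector": "a"}},
    {"action": "waitForUrl", "params": {"url": "**/domains/**"}}
  ]
}
"""

Handler = Callable[..., Awaitable[None]]


def make_recorder(name: str, log: List[str]) -> Handler:
    """Build a fake skill that only records how it was called."""
    async def handler(**params: Any) -> None:
        log.append(f"{name}({json.dumps(params, sort_keys=True)})")
    return handler


async def run_scenario(scenario: Dict[str, Any], handlers: Dict[str, Handler]) -> None:
    """Run steps strictly in order and fail fast on an unknown action."""
    for index, step in enumerate(scenario["steps"]):
        action = step["action"]
        if action not in handlers:
            raise ValueError(f"step {index}: unknown action {action!r}")
        await handlers[action](**step.get("params", {}))


log: List[str] = []
handlers = {
    name: make_recorder(name, log)
    for name in ("goto", "clickCss", "waitFor", "waitForUrl")
}
asyncio.run(run_scenario(json.loads(SCENARIO_JSON), handlers))
print("\n".join(log))
```

Because every step names its action and parameters explicitly, the same scenario yields the same sequence of calls on each run, which is the contrast with natural language instructions that the description draws.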


Summary by cubic

Added support for using custom skills in JSON-based browser automation scenarios with the Browser Use Agent. Users can now define steps like goto, clickCss, waitFor, and waitForUrl directly in JSON for more reliable and deterministic automation.

  • New Features

    • Dynamically register custom skills as controller actions for JSON scenarios.
    • Added example scenario, documentation, and integration tests for custom skills.
  • Dependencies

    • Added lxml[html_clean] to requirements.

@CLAassistant

CLAassistant commented Jun 20, 2025

CLA assistant check
All committers have signed the CLA.

@glglak
Author

glglak commented Jun 20, 2025

feat: Enable custom skills in JSON scenarios for Browser Use Agent

This commit adds support for deterministic execution of JSON-based browser automation scenarios using custom skills in the Browser Use Agent.

Key changes:

  • Dynamically register custom skills from src.custom_skills as controller actions
  • Fix Pydantic JSON schema errors by adding proper type annotations
  • Avoid leading underscores in function names to comply with Pydantic field naming rules
  • Add Wikipedia example scenario demonstrating cross-page data extraction
  • Create documentation explaining the integration approach and usage
  • Add integration tests to verify custom skills functionality
  • Update requirements.txt with lxml[html_clean] dependency

This enables users to create deterministic browser automation scenarios using skills like goto, clickCss, waitFor, and waitForUrl directly in JSON format, rather than relying solely on natural language instructions.

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

cubic found 5 issues across 7 files. Review them in cubic.dev

@glglak
Author

glglak commented Jun 20, 2025

fix: Address AI reviewer feedback on custom skills implementation

  • Pin lxml[html_clean] version to 4.9.3 in requirements.txt for deterministic builds
  • Fix timeout handling in wait_for_url and wait_for_selector (remove division by 1000)
  • Use correct state='detached' instead of 'hidden' for element removal detection
  • Add missing imports (BrowserContext, Dict) in documentation code snippet
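
The timeout and state fixes can be sketched as follows, assuming the skills wrap Playwright's `page.wait_for_url` and `page.wait_for_selector`, which take timeouts in milliseconds. The wrapper signatures, the `wait_for_removed` name, and the `FakePage` class are hypothetical; `FakePage` only records calls so the sketch runs without a browser.

```python
import asyncio
from typing import Any, Dict, List, Tuple


class FakePage:
    """Stand-in for a Playwright Page that records the calls it receives."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Dict[str, Any]]] = []

    async def wait_for_url(self, url: str, **kwargs: Any) -> None:
        self.calls.append(("wait_for_url", {"url": url, **kwargs}))

    async def wait_for_selector(self, selector: str, **kwargs: Any) -> None:
        self.calls.append(("wait_for_selector", {"selector": selector, **kwargs}))


async def wait_for_url(page: Any, url: str, timeout_ms: int = 30_000) -> None:
    # The value is already in milliseconds, so it is passed through unchanged.
    # Dividing by 1000 would shrink a 5000 ms timeout to 5 ms.
    await page.wait_for_url(url, timeout=timeout_ms)


async def wait_for_removed(page: Any, selector: str, timeout_ms: int = 30_000) -> None:
    # state="detached" resolves once the element is no longer in the DOM.
    # state="hidden" would also resolve for an element that is present but invisible.
    await page.wait_for_selector(selector, state="detached", timeout=timeout_ms)


async def demo() -> FakePage:
    page = FakePage()
    await wait_for_url(page, "**/done", timeout_ms=5000)
    await wait_for_removed(page, "#spinner", timeout_ms=5000)
    return page


page = asyncio.run(demo())
print(page.calls)
```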

2 participants