@xlang-ai

XLANG Lab

Developing embodied AI agents that empower users to use language to interact with digital and physical environments to carry out real-world tasks.

Welcome to the Executable Language Grounding (XLANG) Lab! We are part of the HKU NLP Group at the University of Hong Kong. XLang focuses on building language model agents that ground language instructions, that is, transform them into code or actions executable in real-world environments, including databases (data agents), web applications (plugins/web agents), and the physical world (robotic agents). This grounding lies at the heart of language model agents and natural language interfaces that can interact with and learn from these real-world environments, letting people carry out data analysis, use web applications, and instruct robots through conversation. Recent advances in XLang incorporate techniques such as LLMs with external tools, code generation, semantic parsing, and dialog or interactive systems.

Pinned

  1. OSWorld Public

    [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

    Python · 2.1k stars · 282 forks

  2. aguvis Public

    [ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

    Python · 352 stars · 23 forks

  3. OpenAgents Public

    [COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild

    Python · 4.4k stars · 489 forks

  4. instructor-embedding Public

    [ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings

    Python · 2k stars · 154 forks

  5. text2reward Public

    [ICLR 2024 Spotlight] Code for the paper "Text2Reward: Reward Shaping with Language Models for Reinforcement Learning"

    Jupyter Notebook · 173 stars · 10 forks

  6. DS-1000 Public

    [ICML 2023] Data and code release for the paper "DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation".

    Python · 253 stars · 26 forks

Repositories

Showing 10 of 24 repositories
  • xlang-ai.github.io Public

    The official website of xlang.ai

    TypeScript · 3 stars · 0 forks · 0 issues · 0 pull requests · Updated Aug 25, 2025
  • OSWorld Public

    [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

    Python · 2,107 stars · Apache-2.0 · 282 forks · 92 issues · 0 pull requests · Updated Aug 22, 2025
  • OpenCUA Public

    OpenCUA: Open Foundations for Computer-Use Agents

    Python · 366 stars · MIT · 44 forks · 8 issues · 0 pull requests · Updated Aug 21, 2025
  • Spider2 Public

    [ICLR 2025 Oral] Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

    HTML · 559 stars · MIT · 85 forks · 58 issues · 2 pull requests · Updated Aug 6, 2025
  • OSWorld-G Public

    Scaling Computer-Use Grounding via UI Decomposition and Synthesis

    TypeScript · 103 stars · 2 forks · 6 issues · 0 pull requests · Updated Jun 18, 2025
  • BRIGHT Public

    [ICLR 2025] BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval

    Python · 159 stars · CC-BY-4.0 · 16 forks · 6 issues · 0 pull requests · Updated May 19, 2025
  • computer-agent-arena Public

    Computer Agent Arena: Test & compare AI agents in real desktop apps & web environments. Code/data coming soon!

    50 stars · Apache-2.0 · 2 forks · 1 issue · 0 pull requests · Updated Apr 7, 2025
  • aguvis Public

    [ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

    Python · 352 stars · 23 forks · 23 issues · 0 pull requests · Updated Mar 7, 2025
  • AgentTrek Public

    [ICLR2025 Spotlight] Agent Trajectory Synthesis via Guiding Replay with Web Tutorials

    Python · 40 stars · 0 forks · 4 issues · 0 pull requests · Updated Feb 21, 2025
  • verl Public (forked from volcengine/verl)

    veRL: Volcano Engine Reinforcement Learning for LLM

    Python · 1 star · Apache-2.0 · 2,190 forks · 0 issues · 0 pull requests · Updated Jan 27, 2025
