[iree-benchmark] benchmark_module error with inputs as bfloat16 #22423

@oyazdanb

What happened?

iree.runtime.benchmark_module raises an error when one of the inputs is bfloat16.

Steps to reproduce:

  1. Toy model MLIR is here

  2. Compile the MLIR with:

iree-compile output.mlir \
        --iree-hip-target=gfx942 -o output.vmfb \
        --iree-hal-target-device=hip --iree-opt-level=O3 \
        --iree-hal-indirect-command-buffers=true \
        --iree-stream-resource-memory-model=discrete \
        --iree-hip-enable-tensor-ukernels \
        --iree-stream-affinity-solver-max-iterations=1024 \
        --iree-hal-memoization=true --iree-codegen-enable-default-tuning-specs=true \
        --iree-llvmgpu-test-combine-layout-transformation=false
  3. Test performance (on MI300-3) with:
iree-benchmark-module \
          --hip_use_streams=true \
          --module=output.vmfb \
          --parameters=model=/shark-dev/gpt-oss/20b/weights/gpt-oss-20b_model_bfp16.irpa \
          --device=hip \
          --function=prefill_bs4 \
          --input="4x4096xsi64" \
          --input="4xsi64" \
          --input="4x64xsi64" \
          --input="513x786432xbf16" \
          --benchmark_repetitions=3 \
          --benchmark_out_format=json \
          --benchmark_out=<benchmark_dir>/gpt-oss-20B-BFP16_prefill_bs4_isl_4096.json

The absolute paths above exist on the sharkmi300x-3 machine.
  4. The error is:

====================================================================
2025-10-27 15:00:15,466 - INFO - Appending Extra Benchmark Flags...
2025-10-27 15:00:15,466 - INFO - ['--iree-dispatch-creation-propagate-collapse-across-expands=true', '--iree-hip-specialize-dispatches']
2025-10-27 15:00:15,466 - INFO - 
[===================  Benchmark CMD] iree.runtime.benchmark_module(**{'module': '/home/sonbol/shark-ai/output_artifacts/output_gpt-oss-20b-bfp16/output.vmfb', 'entry_function': 'prefill_bs4', 'inputs': ['4x4096xsi64', '4xsi64', '4x64xsi64', '513x786432xbf16'], 'timeout': None, 'benchmark_repetitions': 3, 'benchmark_out_format': 'json', 'benchmark_out': '/home/sonbol/shark-ai/output_artifacts/output_gpt-oss-20b-bfp16/benchmark_module/gpt-oss-20b-bfp16_prefill_bs4_isl_4096.json', 'parameters': 'model=/shark-dev/gpt-oss/20b/weights/gpt-oss-20b_model_bfp16.irpa', 'device': 'hip://0', 'iree_dispatch_creation_propagate_collapse_across_expands=true': True, 'iree_hip_specialize_dispatches': True}) ====================

2025-10-27 15:01:13,287 - INFO - Benchmark results written to /home/sonbol/shark-ai/output_artifacts/output_gpt-oss-20b-bfp16/benchmark_module/gpt-oss-20b-bfp16_prefill_bs4_isl_4096.json
2025-10-27 15:01:13,287 - INFO - BenchmarkResult(benchmark_name='BM_prefill_bs4/process_time/real_time', time='16192 ms', cpu_time='16192 ms', iterations='1', user_counters='items_per_second=0.0617597/s')
2025-10-27 15:01:13,287 - INFO - BenchmarkResult(benchmark_name='BM_prefill_bs4/process_time/real_time', time='16310 ms', cpu_time='16311 ms', iterations='1', user_counters='items_per_second=0.0613127/s')
2025-10-27 15:01:13,288 - INFO - BenchmarkResult(benchmark_name='BM_prefill_bs4/process_time/real_time', time='16389 ms', cpu_time='21892 ms', iterations='1', user_counters='items_per_second=0.061015/s')
2025-10-27 15:01:13,288 - INFO - BenchmarkResult(benchmark_name='BM_prefill_bs4/process_time/real_time_mean', time='16297 ms', cpu_time='18132 ms', iterations='3', user_counters='items_per_second=0.0613625/s')
2025-10-27 15:01:13,288 - INFO - BenchmarkResult(benchmark_name='BM_prefill_bs4/process_time/real_time_median', time='16310 ms', cpu_time='16311 ms', iterations='3', user_counters='items_per_second=0.0613127/s')
2025-10-27 15:01:13,288 - INFO - BenchmarkResult(benchmark_name='BM_prefill_bs4/process_time/real_time_stddev', time='99.4 ms', cpu_time='3257 ms', iterations='3', user_counters='items_per_second=374.797u/s')
2025-10-27 15:01:13,288 - INFO - BenchmarkResult(benchmark_name='BM_prefill_bs4/process_time/real_time_cv', time='0.61 %', cpu_time='17.96 %', iterations='3', user_counters='items_per_second=0.61%')
2025-10-27 15:01:13,288 - INFO - Benchmark done
2025-10-27 15:01:13,288 - INFO - 
[===================  Benchmark CMD] iree.runtime.benchmark_module(**{'module': '/home/sonbol/shark-ai/output_artifacts/output_gpt-oss-20b-bfp16/output.vmfb', 'entry_function': 'decode_bs4', 'inputs': ['4x1xsi64', '4xsi64', '4xsi64', '4x65xsi64', '513x786432xbf16'], 'timeout': None, 'benchmark_repetitions': 3, 'benchmark_out_format': 'json', 'benchmark_out': '/home/sonbol/shark-ai/output_artifacts/output_gpt-oss-20b-bfp16/benchmark_module/gpt-oss-20b-bfp16_decode_bs4_isl_4096.json', 'parameters': 'model=/shark-dev/gpt-oss/20b/weights/gpt-oss-20b_model_bfp16.irpa', 'device': 'hip://0', 'iree_dispatch_creation_propagate_collapse_across_expands=true': True, 'iree_hip_specialize_dispatches': True}) ====================

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/sonbol/shark-ai/sharktank/sharktank/tools/e2e_model_test.py", line 595, in <module>
    main()
  File "/home/sonbol/shark-ai/sharktank/sharktank/tools/e2e_model_test.py", line 581, in main
    run_stage(
  File "/home/sonbol/shark-ai/sharktank/sharktank/tools/e2e_model_test.py", line 312, in run_stage
    results = iree.runtime.benchmark_module(**kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sonbol/venvs/sharkai_312/lib/python3.12/site-packages/iree/runtime/benchmark.py", line 133, in benchmark_module
    raise BenchmarkToolError(f"stderr:\n{err}\nstdout:\n{out}")
iree.runtime.benchmark.BenchmarkToolError: stderr:
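
The traceback above shows no stderr content, so it may help to rerun the failing decode_bs4 call directly from the shell and read the tool's own output. Below is a minimal sketch, assuming the keyword arguments in the log map one-to-one onto iree-benchmark-module flags. The helper kwargs_to_cli is hypothetical (it is not part of iree.runtime), and the kwargs are a subset of those logged for the decode_bs4 call.

```python
import shlex


def kwargs_to_cli(kwargs, exe="iree-benchmark-module"):
    """Render benchmark_module-style kwargs as an argument list.

    Hypothetical helper: the kwarg-to-flag mapping below is an assumption
    for illustration, not the implementation used by iree.runtime.
    """
    args = [exe]
    for key, value in kwargs.items():
        if key == "entry_function":
            args.append(f"--function={value}")
        elif key == "inputs":
            # One --input flag per shape/dtype spec, in order.
            args.extend(f"--input={spec}" for spec in value)
        elif key == "timeout":
            # Assumed to be a Python-side setting with no CLI counterpart.
            continue
        else:
            args.append(f"--{key}={value}")
    return args


# Subset of the kwargs logged for the failing decode_bs4 call.
decode_kwargs = {
    "module": "/home/sonbol/shark-ai/output_artifacts/output_gpt-oss-20b-bfp16/output.vmfb",
    "entry_function": "decode_bs4",
    "inputs": ["4x1xsi64", "4xsi64", "4xsi64", "4x65xsi64", "513x786432xbf16"],
    "timeout": None,
    "benchmark_repetitions": 3,
    "parameters": "model=/shark-dev/gpt-oss/20b/weights/gpt-oss-20b_model_bfp16.irpa",
    "device": "hip://0",
}

cmd = kwargs_to_cli(decode_kwargs)
# Print a shell-quoted command line that can be pasted into a terminal.
print(shlex.join(cmd))
```

Running the printed command by hand should show whatever the tool writes to stderr for decode_bs4, which the Python wrapper's exception message did not include here.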

Steps to reproduce your issue

What component(s) does this issue relate to?

No response

Version information

No response

Additional context

No response

Labels: bug (Something isn't working)