OSS LLM Tools for Conversion, Evaluation, Numerical Debugging, and Benchmarking
Find the first bad commit that dropped the accuracy of a model.
cd <target repo>
python ../oss-llm-tools/bisect_accuracy.py --good <Good Commit> --bad <Bad Commit> --model google/gemma-3-12b-it --task gsm8k --target 0.4 --limit 100 --bisect_log_file --model_args '{"tensor_parallel_size": 4}' --eval_args '{"num_fewshot":5}' --stop_with_exception --bisect_log /tmp/bisect.log
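Conceptually, this kind of accuracy bisection follows the standard `git bisect run` pattern: check out a candidate commit, run the eval task, compare the score against `--target`, and report good or bad via the exit code. A minimal sketch of that driver logic (illustrative only; `is_good`, `check`, and the stubbed measurement are assumptions, not the tool's actual source):

```python
# Sketch of the per-commit check a `git bisect run` driver relies on:
# exit code 0 marks the commit good, codes 1-124 mark it bad.

def is_good(accuracy: float, target: float) -> bool:
    # A commit is "good" when the measured accuracy still meets the target.
    return accuracy >= target

def check(measure, target: float = 0.4) -> int:
    accuracy = measure()  # run the eval task (e.g. gsm8k) on this checkout
    print(f"accuracy={accuracy:.3f} target={target}")
    return 0 if is_good(accuracy, target) else 1

# Stubbed measurement standing in for a real eval run:
exit_code = check(lambda: 0.35)  # below the 0.4 target, so "bad"
```

In a real bisection, the measurement step would rebuild the checkout and run the configured eval task instead of returning a constant.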
Create a dummy transformer model with optional custom weights. This script combines model initialization from a config file or Hugging Face model with the ability to add or update specific tensors. This is useful for:
- Testing code that requires a model with a specific architecture without needing the actual trained weights
- Creating smaller test models by overriding parameters like number of layers
- Generating placeholder models with specific tensor shapes for development and testing
- Creating sharded safetensors files for large model testing
# Basic usage with local model directory path
python create_dummy_model.py --model_path /path/to/model/dir --output_dir /path/to/output
# Using a Hugging Face model ID directly (downloads necessary files automatically)
python create_dummy_model.py --model_path meta-llama/Llama-3-8B --output_dir ./llama3_dummy
# Create a smaller model with only 3 hidden layers
python create_dummy_model.py --model_path /path/to/model/dir --output_dir /path/to/output \
--config_override '{"num_hidden_layers": 3}'
# Create a model with custom weights from a JSON file
python create_dummy_model.py --model_path /path/to/model/dir --output_dir /path/to/output \
--weights_json example_weights.json
# Create a sharded model with custom weights
python create_dummy_model.py --model_path /path/to/model/dir --output_dir /path/to/output \
--weights_json example_weights.json --max_shard_size "500MB"
- Python 3.6+
- PyTorch
- Hugging Face Transformers
- huggingface_hub
- safetensors
- --model_path: Path to a model directory or Hugging Face model ID (e.g., 'meta-llama/Llama-3-8B')
- --output_dir: Directory to save the model
- --config_override: (Optional) JSON string with config parameters to override
- --weights_json: (Optional) JSON file containing weights info (name, shape, dtype)
- --max_shard_size: (Optional) Maximum size of each shard (e.g., '2GB', '500MB')
When using --weights_json, the weights can be specified in several formats:
- Full format with shape and dtype specified:
"model.embed_tokens.weight": {
"shape": [151552, 5120],
"dtype": "float16"
}
- Simple format with just the shape as a list:
"model.layers.0.input_layernorm.weight": [5120]
- String format for shapes using 'x' as separator:
"model.layers.0.self_attn.q_proj.weight": "12288x5120"
Create safetensors files with specified tensor names and shapes. This is useful for:
- Creating dummy models with specific tensor shapes and dtypes
- Testing model loading and processing code without real weights
- Generating sharded model files for large model testing
- Creating placeholder weights for development and testing
# Basic usage with weights specified as a JSON string
python create_safetensors.py --weights_dict '{"model.layers.0.self_attn.q_proj.weight": [1024, 1024], "model.layers.0.self_attn.k_proj.weight": [1024, 1024]}'
# Using a JSON file containing weights information
python create_safetensors.py --weights_json example_weights.json --output_dir ./dummy_weights
# Creating sharded safetensors files for large models
python create_safetensors.py --weights_json example_weights.json --output_dir ./sharded_model --max_shard_size "2GB"
- Python 3.6+
- PyTorch
- safetensors
- --output_dir: Directory to save the safetensors file(s)
- --weights_json: JSON file containing weights info (name, shape, dtype)
- --weights_dict: JSON string with weights dictionary (name: shape)
- --max_shard_size: Maximum size of each shard (e.g., '2GB', '500MB')
The weights can be specified in several formats as shown in the example_weights.json file:
- Full format with shape and dtype specified:
"model.embed_tokens.weight": {
"shape": [151552, 5120],
"dtype": "float16"
}
- Simple format with just the shape as a list:
"model.layers.0.input_layernorm.weight": [5120]
- String format for shapes using 'x' as separator:
"model.layers.0.self_attn.q_proj.weight": "12288x5120"
The repository includes an example weights JSON file (example_weights.json) that demonstrates all supported formats for specifying tensor shapes and dtypes.