[P1] AttributeError: 'CausalLMOutputWithPast' object has no attribute 'mean' #147

Open
@krishnardt

Description

Hi,

I am finetuning "meta-llama/Llama-2-7b-hf" on a dataset.
Without an eval strategy, training works fine.
But with an eval strategy enabled, I hit the error mentioned in the title.
Could anyone help me fix this issue?

This is the code I am using as reference:
https://github.com/stanfordnlp/pyreft/blob/main/examples/alpaca/train.py

Below is the error trace:

*(error trace attached as a screenshot; it ends in `AttributeError: 'CausalLMOutputWithPast' object has no attribute 'mean'`)*

Below are the training arguments I am passing:

```python
training_args = TrainArguments(
    # output_dir=f"./checkpoints/rank{rank}",
    output_dir="./trained_model_wt_eval",
    learning_rate=3e-5,
    num_train_epochs=2,
    # evaluation_strategy="epoch",
    eval_strategy="steps",
    # do_eval=False,
    lr_scheduler_type="linear",
    # warmup_steps=warmup_steps,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    # gradient_accumulation_steps=grad_acc,
    weight_decay=0.01,
    logging_dir=f"./logs/reft_rank{rank}",
    logging_strategy="steps",
    logging_steps=2,
    save_strategy="steps",
    save_steps=500,
    save_total_limit=3,
    load_best_model_at_end=False,
    # fp16=True,
    bf16=True,
    remove_unused_columns=False,
)
```
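One detail worth double-checking in this config (an observation about `transformers.TrainingArguments` defaults, not a confirmed fix for the error): when `eval_strategy="steps"` is set without an explicit `eval_steps`, the evaluation interval defaults to `logging_steps`, which is 2 here, so evaluation would run every 2 training steps. A hedged sketch of decoupling the two, assuming a recent transformers version where `eval_strategy` is accepted:

```python
from transformers import TrainingArguments

# Config fragment (sketch): set eval_steps explicitly so the evaluation
# interval no longer falls back to logging_steps.
args = TrainingArguments(
    output_dir="./trained_model_wt_eval",
    eval_strategy="steps",
    eval_steps=500,     # evaluate every 500 steps instead of every 2
    logging_steps=2,    # keep frequent logging independent of eval
    per_device_eval_batch_size=2,
)
```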

This is how I build the data module for training:

```python
def make_supervised_data_module(
    tokenizer: transformers.PreTrainedTokenizer, model, layers, training_args, data_args
) -> Dict:
    train_dataset = ReftSupervisedDataset(
        "alpaca", data_args.data_path, tokenizer, data_split="train", seed=training_args.seed,
        max_n_example=training_args.max_n_train_example,
        input_field="input", instruction_field="instruction", output_field="output",
        **{"num_interventions": len(layers), "position": training_args.position,
           "share_weights": training_args.share_weights}
    )
    eval_dataset = ReftSupervisedDataset(
        "alpaca", data_args.eval_path, tokenizer, data_split="test", seed=training_args.seed,
        max_n_example=training_args.max_n_train_example,
        input_field="input", instruction_field="instruction", output_field="output",
        **{"num_interventions": len(layers), "position": training_args.position,
           "share_weights": training_args.share_weights}
    )
    print(train_dataset)
    print(eval_dataset)
    data_collator_fn = transformers.DataCollatorForSeq2Seq(
        tokenizer=tokenizer,
        model=model,
        label_pad_token_id=-100,
        padding="longest"
    )
    data_collator = ReftDataCollator(data_collator=data_collator_fn)
    return dict(train_dataset=train_dataset, eval_dataset=eval_dataset, data_collator=data_collator)


data_module = make_supervised_data_module(
    tokenizer=tokenizer, model=model, layers=layers,
    training_args=training_args, data_args=data_args)
```
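For readers hitting the same trace: this error usually means some evaluation code received the full `CausalLMOutputWithPast` object where a plain loss tensor was expected, and then called `.mean()` on it. A minimal, self-contained sketch of that failure mode and the usual unwrap, where `FakeCausalLMOutput`, `compute_loss`, and `dummy_model` are hypothetical stand-ins for illustration, not pyreft or transformers APIs:

```python
from dataclasses import dataclass

# Hypothetical stand-in for transformers' CausalLMOutputWithPast: it carries
# a .loss field but has no .mean() method, which is exactly why calling
# output.mean() in an evaluation loop raises AttributeError.
@dataclass
class FakeCausalLMOutput:
    loss: float
    logits: object = None

def compute_loss(model, inputs, return_outputs=False):
    # Sketch of a Trainer.compute_loss-style override: unwrap the output
    # object to its loss value before anything downstream can touch it.
    outputs = model(**inputs)
    loss = outputs.loss
    return (loss, outputs) if return_outputs else loss

def dummy_model(**inputs):
    return FakeCausalLMOutput(loss=0.42)

print(compute_loss(dummy_model, {}))  # prints 0.42
```

Whether this matches the pyreft trainer's internals I can't confirm; it only shows why the attribute error appears during evaluation and not during plain training.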

Labels: enhancement (New feature or request)