Description
Hi,
I am fine-tuning "meta-llama/Llama-2-7b-hf" on my dataset.
Without an eval strategy, training works fine.
But as soon as I enable the eval strategy, I run into the issue mentioned in the title.
Could anyone help me fix this?
This is the code I am using as a reference:
https://github.com/stanfordnlp/pyreft/blob/main/examples/alpaca/train.py
Below is the error trace:

Below are the training arguments I am passing:

```python
training_args = TrainingArguments(
    # output_dir=f"./checkpoints/rank{rank}",
    output_dir="./trained_model_wt_eval",
    learning_rate=3e-5,
    num_train_epochs=2,
    # evaluation_strategy="epoch",
    eval_strategy="steps",
    # do_eval=False,
    lr_scheduler_type="linear",
    # warmup_steps=warmup_steps,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    # gradient_accumulation_steps=grad_acc,
    weight_decay=0.01,
    logging_dir=f"./logs/reft_rank{rank}",
    logging_strategy="steps",
    logging_steps=2,
    save_strategy="steps",
    save_steps=500,
    save_total_limit=3,
    load_best_model_at_end=False,
    # fp16=True,
    bf16=True,
    remove_unused_columns=False,
)
```
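One detail about these arguments: since I do not set `eval_steps`, my understanding is that Transformers defaults it to `logging_steps` (2 here), so evaluation runs every 2 optimizer steps. Below is a minimal sketch of just the eval-related fields with an explicit interval; `eval_steps=100` is only an illustrative placeholder, not what I actually ran:

```python
from transformers import TrainingArguments

# Sketch of only the eval-related fields; all other arguments omitted.
# eval_steps=100 is an illustrative placeholder, not my real setting.
sketch_args = TrainingArguments(
    output_dir="./trained_model_wt_eval",
    eval_strategy="steps",   # run evaluation every `eval_steps` optimizer steps
    eval_steps=100,          # if omitted, this falls back to logging_steps (2 in my case)
    logging_strategy="steps",
    logging_steps=2,
    per_device_eval_batch_size=2,
)
```

I mention this only so the evaluation frequency in my run is clear.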
This is the data module I build for training and evaluation:
```python
from typing import Dict

import transformers
from pyreft import ReftDataCollator, ReftSupervisedDataset


def make_supervised_data_module(tokenizer: transformers.PreTrainedTokenizer, model,
                                layers, training_args, data_args) -> Dict:
    train_dataset = ReftSupervisedDataset(
        "alpaca", data_args.data_path, tokenizer, data_split="train", seed=training_args.seed,
        max_n_example=training_args.max_n_train_example,
        input_field="input", instruction_field="instruction", output_field="output",
        **{"num_interventions": len(layers), "position": training_args.position,
           "share_weights": training_args.share_weights}
    )
    eval_dataset = ReftSupervisedDataset(
        "alpaca", data_args.eval_path, tokenizer, data_split="test", seed=training_args.seed,
        max_n_example=training_args.max_n_train_example,
        input_field="input", instruction_field="instruction", output_field="output",
        **{"num_interventions": len(layers), "position": training_args.position,
           "share_weights": training_args.share_weights}
    )
    print(train_dataset)
    print(eval_dataset)
    data_collator_fn = transformers.DataCollatorForSeq2Seq(
        tokenizer=tokenizer,
        model=model,
        label_pad_token_id=-100,
        padding="longest"
    )
    data_collator = ReftDataCollator(data_collator=data_collator_fn)
    return dict(train_dataset=train_dataset, eval_dataset=eval_dataset, data_collator=data_collator)


data_module = make_supervised_data_module(
    tokenizer=tokenizer, model=model, layers=layers,
    training_args=training_args, data_args=data_args)
```
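For completeness, the trainer itself is wired up essentially as in the referenced train.py. This is only a rough sketch of that wiring; `reft_model` is the ReFT-wrapped model produced earlier in that script and is not shown in my snippets above:

```python
from pyreft import ReftTrainerForCausalLM

# Rough sketch following the referenced pyreft alpaca example:
# the data module built above is unpacked straight into the trainer.
trainer = ReftTrainerForCausalLM(
    model=reft_model,    # ReFT-wrapped model from earlier in the script
    tokenizer=tokenizer,
    args=training_args,
    **data_module,       # train_dataset, eval_dataset, data_collator
)
trainer.train()
```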