Navigate to the datasets directory:
cd datasets/data_splits/data
The dataset is organized as follows:
├── AO3
├── narrativemagazine
├── newyorker
├── Reddit
└── Storium
Each source directory contains two subdirectories, `profile` and `test`, corresponding to the profiling and generation sets, respectively.
Each directory contains a list of files, with each file corresponding to an author. Each author file consists of:
- A list of `writing_prompt` entries.
- The corresponding `url` for each entry, which can be used to download the story.
For Storium, you need to request access to the dataset from the original authors and collect the story corresponding to each `game_pid`. You can find more details on the Storium Dataset page.
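To get a quick look at an author file, a minimal sketch such as the following can be run from within `datasets/data_splits/data`. It assumes the files are JSON with the `writing_prompt` and `url` fields described above; the exact file format and layout in the release may differ.

```python
import json
from pathlib import Path

# Hypothetical example: the first author file in the Reddit profiling set.
# The JSON structure is an assumption about the release format.
author_file = next(Path("Reddit/profile").glob("*"))

with open(author_file) as f:
    records = json.load(f)  # assumed: a list of {"writing_prompt": ..., "url": ...} entries

for record in records:
    print(record["writing_prompt"][:80], "->", record["url"])
```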
To regenerate writing prompts, follow the instructions in the `generate_prompts` directory:
cd generate_prompts
Navigate to the experiments directory:
cd experiments
Run the following command to see details of available arguments:
python methods.py --help
Use the `--source` argument to specify a data source (e.g., `Reddit`).
Model selection:
- Default: GPT-4o
- `--llama` for LLaMA 3.1 8B
- `--llama70` for LLaMA 3.1 70B
Output directories:
- Personalized Stories: `results` (GPT-4o), `results_llama` (LLaMA 3.1 8B), `results_llama70` (LLaMA 3.1 70B)
- Author Sheet/Summary: `user_profile`
- Story Rules: `story_rules`
- Persona Descriptions: `persona`
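As a usage example, the model and source flags compose with a method choice in a single invocation. The sketch below (run from the `experiments` directory, driving the script via `subprocess`) assumes the flags combine exactly as documented above; the individual method commands follow.

```python
import subprocess

# Generate stories for the Reddit source with LLaMA 3.1 8B.
# Per the flag/output mapping above, results should land in results_llama/.
cmd = [
    "python", "methods.py",
    "--choice", "1",        # method choice (see the commands below)
    "--source", "Reddit",   # data source
    "--llama",              # use LLaMA 3.1 8B instead of the default GPT-4o
]
subprocess.run(cmd, check=True)
```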
- Generate stories using the Vanilla method:
  `python methods.py --choice 1`
  Output directory: `vanilla`
- Generate stories using the Vanilla method with few-shot examples:
  `python methods.py --choice 1 --few_shot`
  Output directory: `vanilla_few_shot`
- Generate Average Author Stories for the profiling set:
  `python methods.py --choice 1 --is_profile`
- Generate rules for the profiling set:
  `python methods.py --extract_rules --is_profile`
- Generate personalized stories using Delta:
  `python methods.py --choice 4`
  Output directory: `delta`
- Generate personalized stories using Summary:
  `python methods.py --choice 3 --persona`
  Output directory: `schema_persona`
- Generate personalized stories using Sheet:
  `python methods.py --choice 5 --persona`
  Output directory: `delta_schema_persona`
For the nP variants of Summary and Sheet (i.e., the ablation without persona descriptions), omit the `--persona` flag.
- Summary nP output directory: `schema`
- Sheet nP output directory: `delta_schema`
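The method-to-directory mapping above can be collected into a small helper, for instance to check which outputs have already been generated. This is only a convenience sketch built from the directory names listed above; where each directory is created, and its internal layout, are assumptions.

```python
from pathlib import Path

# Documented method -> output directory mapping (GPT-4o runs; LLaMA runs use the
# results_llama*/ directories noted earlier).
OUTPUT_DIRS = {
    "Vanilla": "vanilla",
    "Vanilla few-shot": "vanilla_few_shot",
    "Delta": "delta",
    "Summary (with persona)": "schema_persona",
    "Sheet (with persona)": "delta_schema_persona",
    "Summary nP": "schema",
    "Sheet nP": "delta_schema",
}

# Assumption: output directories are created under the current working directory.
for method, dirname in OUTPUT_DIRS.items():
    status = "found" if Path(dirname).exists() else "missing"
    print(f"{method:25s} -> {dirname:22s} [{status}]")
```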
Navigate to the evaluation directory:
cd evaluation
Author Sheet evaluation:
- Prompt the LLM for evaluation:
  `python author_sheet_score_evaluation.py --model_choice 5 --source <source_name> --choice <choice_number>`
- Compute win-rates:
  `python pool_author_sheet_score_evaluation.py --model_choice 5 --source <source_name> --choice <choice_number>`

Use `--model_choice 5` to use OpenAI o4-mini as the evaluator.
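For intuition, a win-rate over pairwise judgments is the fraction of comparisons a method wins, with ties typically split evenly. The pooling scripts perform the repository's actual computation; the function below is only a conceptual illustration with made-up labels.

```python
def win_rate(judgments):
    """Compute a simple win-rate from pairwise judgments.

    `judgments` is a list of "win" / "loss" / "tie" labels for one method
    against a baseline. Ties count as half a win (one common convention;
    the repository's pooling scripts may use a different one).
    """
    if not judgments:
        return 0.0
    score = sum(1.0 if j == "win" else 0.5 if j == "tie" else 0.0 for j in judgments)
    return score / len(judgments)

# Example: 6 wins, 2 ties, 2 losses -> 0.7
print(win_rate(["win"] * 6 + ["tie"] * 2 + ["loss"] * 2))
```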
LLM story evaluation:
- Prompt the LLM for evaluation:
  `python llm_evaluation_shuffle.py --model_choice 5 --source <source_name> --choice <choice_number>`
- Compute win-rates:
  `python pool_llm_evaluation_score.py --model_choice 5 --source <source_name> --choice <choice_number>`

Notes:
- `<source_name>` refers to the dataset source (e.g., `Reddit`).
- `<choice_number>` corresponds to the method choice (see the Experiments section above).
- Use the `--persona` argument for Sheet and Summary evaluations.
- Use `--llama` and `--llama70` for evaluating generations from LLaMA 3.1 8B and LLaMA 3.1 70B, respectively.
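To run this evaluation over every source in one go, a batching sketch like the following can be used, assuming the flags are accepted exactly as documented above and the source names match the dataset directories.

```python
import subprocess

SOURCES = ["AO3", "narrativemagazine", "newyorker", "Reddit", "Storium"]
CHOICE = "4"  # e.g., Delta; for Sheet/Summary evaluations also append "--persona" as noted above

for source in SOURCES:
    common = ["--model_choice", "5", "--source", source, "--choice", CHOICE]
    # Prompt the LLM judge, then pool the judgments into win-rates.
    subprocess.run(["python", "llm_evaluation_shuffle.py", *common], check=True)
    subprocess.run(["python", "pool_llm_evaluation_score.py", *common], check=True)
```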
Consolidate the evaluation results:
python consolidate_results.py
- Use `--faith` for Faithfulness to Writing History.
- Use `--llama` and `--llama70` for LLaMA 3.1 8B and LLaMA 3.1 70B.
Navigate to the traditional evaluation directory:
cd traditional_evaluation
- Compute evaluation scores:
  `python get_scores.py --source <source_choice>`
- Consolidate results:
  `python consolidate_results.py`
- Use the same arguments as in the Experiments section for selecting a specific method.
- Add `--compute_gt` to compute scores for the ground-truth author story.
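To score all sources in one pass, including the optional ground-truth comparison, a sketch along these lines can be run from the `traditional_evaluation` directory. It assumes `--source` accepts the source name; the `<source_choice>` placeholder above may instead expect a different value (e.g., a numeric index), so check the script's arguments first.

```python
import subprocess

SOURCES = ["AO3", "narrativemagazine", "newyorker", "Reddit", "Storium"]

for source in SOURCES:
    # Scores for the generated stories.
    subprocess.run(["python", "get_scores.py", "--source", source], check=True)
    # Optional second pass: scores for the ground-truth author stories.
    subprocess.run(["python", "get_scores.py", "--source", source, "--compute_gt"], check=True)

# Consolidate the computed scores.
subprocess.run(["python", "consolidate_results.py"], check=True)
```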
@article{kumar2025whose,
title={Whose story is it? Personalizing story generation by inferring author styles},
author={Kumar, Nischal Ashok and Pham, Chau Minh and Iyyer, Mohit and Lan, Andrew},
journal={arXiv preprint arXiv:2502.13028},
year={2025}
}