Add GEM/mlsum prompts for ES,DE #738

Open
wants to merge 4 commits into
base: eval-hackathon
from

Conversation

Shashi456

No description provided.

@jzf2101 jzf2101 self-requested a review April 26, 2022 22:17
@jzf2101 jzf2101 self-assigned this Apr 26, 2022
@jzf2101 jzf2101 changed the base branch from main to eval-hackathon April 26, 2022 23:35
@jzf2101 jzf2101 added Non-English Dataset help wanted Extra attention is needed labels Apr 27, 2022
@Shashi456
Author

@jzf2101 Waiting for a non-English reviewer would probably take too much time. The only prompt in the foreign language was taken from Sebastian (written and verified by the GEM eval group).

Could I ask someone working on summarization to review this?

@jzf2101
Collaborator

jzf2101 commented Apr 28, 2022

@sebastianGehrmann I will follow your formatting and try to include in Spanish for consistency.

@Shashi456
Author

@jzf2101 could you take a look at this?

@Shashi456
Author

@stephenbach could you take a look at this PR once?

@jzf2101 jzf2101 removed the request for review from sebastianGehrmann June 26, 2022 23:11
Collaborator

@jzf2101 jzf2101 left a comment

Overall, I think these are mostly fine, with minor comments on tweaks. My main concern is that we don't have 5 prompts in English and 5 prompts in Spanish/German for each task. We should try to ensure that we're not mixing languages in evaluation.

a4f6a6c1-ce10-463b-932e-41a9336c3ecf: !Template
answer_choices: null
id: a4f6a6c1-ce10-463b-932e-41a9336c3ecf
jinja: "{{document}}\n ===\nWrite a summary of the text above : ||| {{summary}}"

Collaborator

{{summary}} should be {{target}}

Author

I don't think {{target}} will produce prompts; I ran the scripts to generate outputs.
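
One way to settle which placeholder name resolves is to compare the variables used in a template with the fields of a loaded example. The sketch below is a minimal, standard-library-only illustration. The `missing_fields` helper and the example record are hypothetical, and the real field names depend on which dataset loader is used.

```python
import re

# Matches simple {{ name }} placeholders; Jinja filters and expressions are not handled.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def missing_fields(template, example):
    """Return placeholder names used in the template but absent from the example."""
    return {name for name in PLACEHOLDER.findall(template) if name not in example}


# Hypothetical record; substitute one loaded from the actual dataset.
example = {"text": "Ein kurzer Artikel.", "target": "Eine Zusammenfassung."}

template = "{{document}}\n ===\nWrite a summary of the text above : ||| {{summary}}"
print(sorted(missing_fields(template, example)))  # ['document', 'summary']
```

An empty result for a given record means every placeholder in the template has a matching field.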

e9c85265-fabb-412d-9082-99750212df27: !Template
answer_choices: null
id: e9c85265-fabb-412d-9082-99750212df27
jinja: 'I will first show a news article and then provide a summary of it in German:

Collaborator

This wording IMHO seems ambiguous: it could be read as "I will first show a news article and then I will provide a summary of it in German" as opposed to "I will first show a news article and then you will provide a summary of it in German". From what @awebson has suggested, the prompt should be understandable to a person. Thus, I think this ambiguity could be resolved with something like "I will show a news article. Please provide a summary of it in German."

Author

cc: @sebastianGehrmann. This was just meant to give a different flavor of prompt. I'm not sure whether how we address the model makes any meaningful difference to the prompt's meaning or output quality.

I am not sure I understand the comment. We are not treating this as a dialog system. Since BLOOM is a language model, it simulates generating both the article AND the summary. The implicit turn-taking of our evaluation is not of any consequence, and therefore, "I will show an article and I will show its summary" is correct.
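
The language-model view described here can be sketched in a few lines: the text before `|||` is the prefix the model is conditioned on, and the text after it is the continuation used as the reference. This is an illustration only. `split_rendered` is a hypothetical helper, the field values are placeholders, and stripping whitespace around each part is an assumption of the sketch.

```python
SEPARATOR = "|||"


def split_rendered(rendered):
    """Split a rendered prompt into (input_text, target_text) at the separator."""
    input_text, sep, target_text = rendered.partition(SEPARATOR)
    if not sep:
        raise ValueError("template has no '|||' separator")
    # Stripping surrounding whitespace is an assumption of this sketch.
    return input_text.strip(), target_text.strip()


template = (
    "I will first show a news article and then provide a summary of it in Spanish:\n"
    "Resume el siguiente artículo: {{text}}\n ===\nResumen: ||| {{target}}"
)
# Hypothetical field values, filled in with plain string replacement.
rendered = template.replace("{{text}}", "<article>").replace("{{target}}", "<summary>")

input_text, target_text = split_rendered(rendered)
print(repr(input_text))   # the prefix the model continues
print(repr(target_text))  # the reference continuation
```

Seen this way, the first-person framing in the prefix is simply text that precedes the summary, with no second speaker involved.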

answer_choices: null
id: 104b2645-9b2a-4d08-8885-3de18af246fc
jinja: "My college roommate asked me what this German article meant:\n {{text}}\n\
So I recapped it in layman''s terms in German: ||| {{target}}"

Collaborator

There are double apostrophes here; there should be one apostrophe. Furthermore, I'm not sure about the use of "I" in the prompt. @awebson?

Author

I did not add the \ explicitly; I think the formatter added it while creating the prompts.

Collaborator

Is there a way to fix it using promptsource?

104b2645-9b2a-4d08-8885-3de18af246fc: !Template
answer_choices: null
id: 104b2645-9b2a-4d08-8885-3de18af246fc
jinja: "My college roommate asked me what this German article meant:\n {{text}}\n\

Collaborator

@awebson the use of the college roommate seems a bit odd, TBH?

Author

jinja: 'My college roommate asked me what this article means:
This prompt was taken from xsum prompts that already existed in the repository.

Maybe we should change both. I agree that this is a bit strange.

Collaborator

I asked @awebson for his opinion and he says this is fine.

e071fb9c-b000-417d-a405-4f6032532f87: !Template
answer_choices: null
id: e071fb9c-b000-417d-a405-4f6032532f87
jinja: "{{text}}\n ===\nGiven the above document, write few sentences in\

Collaborator

=== is rendered strangely. Moreover, I think German should be capitalized.

+1
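
One possible cause of the odd rendering is the space after the first `\n` in `\n ===\n`, which leaves the separator line indented by one character. Whether that is what the reviewers saw is an assumption. The sketch below only shows how the escape sequence lays out once rendered, using a hypothetical `indented_lines` helper and placeholder text.

```python
def indented_lines(rendered):
    """Return the lines of a rendered prompt that begin with whitespace."""
    return [line for line in rendered.splitlines() if line[:1].isspace()]


# The separator as written in the templates: newline, space, '===', newline.
as_written = "Article body." + "\n ===\n" + "Instruction line"
# A hypothetical alternative without the space after the first newline.
without_space = "Article body." + "\n===\n" + "Instruction line"

print(indented_lines(as_written))     # [' ===']
print(indented_lines(without_space))  # []
```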

answer_choices: null
id: 6fc70031-95ab-40fa-9cc7-e6eda42a4833
jinja: "My college roommate asked me what this Spanish article meant:\n {{text}}\n\
So I recapped it in layman''s terms in Spanish: ||| {{target}}"

Collaborator

See above comments on this prompt in German

Author

This prompt was added from the XSUM task which already existed in the repository and has been a template for summarization tasks.

Let's change it.

id: e9c85265-fabb-412d-9082-99750212df27
jinja: 'I will first show a news article and then provide a summary of it in German:

Fasse den folgenden Artikel zusammen: {{text}}

Collaborator

This seems like it should be a separate prompt as opposed to restating the prompt in German

Author

This prompt was developed by Sebastian and the GEM group for this task, so it has been included as is.

For GEM these are separate prompts. Let's do the same here and separate them.

answer_choices: null
id: e3c60771-5e99-49b1-b477-c2b69f645d59
jinja: "I will first show a news article and then provide a summary of it in Spanish:\n\
Resume el siguiente art\xEDculo: {{text}}\n ===\nResumen: ||| {{target}}"

Collaborator

This should be a separate prompt in Spanish as opposed to part of a prompt in English. Mixing languages doesn't seem to make sense or to be within the current scope of multilingual prompting.

Author

Added by Sebastian and the GEM group for this task.

Collaborator

@sebastianGehrmann could you explain this prompt?

Same comment as above: these were separate prompts at some point, so we should separate them here (we can just replace the roommate prompt with one of them).
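
The split suggested in this thread could look roughly like the sketch below: one English-only template and one Spanish-only template in place of the mixed one. The Spanish half is taken from the template under review. The `Summary:` cue in the English variant, the `render` helper, and the example record are hypothetical and only there to make the sketch self-contained.

```python
TEMPLATES = {
    # English-only variant; the "Summary:" cue is a hypothetical addition.
    "english": (
        "I will first show a news article and then provide a summary of it in Spanish:\n"
        "{{text}}\n ===\nSummary: ||| {{target}}"
    ),
    # Spanish-only variant, taken from the second half of the original template.
    "spanish": "Resume el siguiente artículo: {{text}}\n ===\nResumen: ||| {{target}}",
}


def render(template, example):
    """Fill {{field}} placeholders by plain string replacement (no Jinja features)."""
    for field, value in example.items():
        template = template.replace("{{" + field + "}}", value)
    return template


example = {"text": "<artículo>", "target": "<resumen>"}  # hypothetical record
for name, template in TEMPLATES.items():
    print(name, "->", repr(render(template, example)))
```

Keeping the instruction text of each template in a single language also makes it easier to report results per prompt language.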

e3c60771-5e99-49b1-b477-c2b69f645d59: !Template
answer_choices: null
id: e3c60771-5e99-49b1-b477-c2b69f645d59
jinja: "I will first show a news article and then provide a summary of it in Spanish:\n\

Collaborator

See also comments about this prompt being ambiguous.

Same response as above - besides, this is a prompt that worked for PaLM :)

id: 5e644239-d989-4531-b2ff-44b0e4310df6
jinja: '{{text}}

===

Collaborator

This is rendered strangely, IMHO.

6ffadb8a-c670-4d6c-97fd-9eea35945452: !Template
answer_choices: null
id: 6ffadb8a-c670-4d6c-97fd-9eea35945452
jinja: "{{text}}\n TL;DR in Deutsch: ||| {{target}}"

Agreed, let's keep TL;DR. It's not super common in German but the model should have learned the association.

===

Write a summary of the text above : ||| {{target}}'

+1
