Fine-tuning with label mask #410
Conversation
epwalsh commented Jan 17, 2024
- Add support for fine-tuning with a label mask.
- Add a script for preparing Tulu V2 for fine-tuning.
- Add fine-tuning instructions to README.
scripts/prepare_tulu_data.py
@@ -0,0 +1,111 @@
"""
@hamishivi could you review this script?
scripts/prepare_tulu_data.py
Outdated
def preprocess(example, tokenizer: Tokenizer, max_seq_len: int):
    parts = []
    for msg in example["messages"]:
        parts.append(f"<|{msg['role']}|>")
        parts.append(msg["content"])

    prompt = "\n".join(parts[:-1]) + "\n"
    completion = parts[-1]

    prompt_ids = tokenizer.encode(prompt, add_special_tokens=False)
    completion_ids = tokenizer.encode(completion, add_special_tokens=True)

    input_ids = (prompt_ids + completion_ids)[:max_seq_len]
    label_mask = ([False] * len(prompt_ids) + [True] * len(completion_ids))[:max_seq_len]

    if len(input_ids) < max_seq_len:
        pad_len = max_seq_len - len(input_ids)
        input_ids += [tokenizer.pad_token_id] * pad_len
        label_mask += [False] * pad_len

    assert len(input_ids) == len(label_mask)

    return {"input_ids": input_ids, "label_mask": label_mask}
@hamishivi in particular this function for preprocessing/tokenizing each example.
This isn't quite right. Actually, the content of every message from the assistant role should be trained on, not just the final message. This matters because we have some multi-turn dialogues in our dataset. It's a bit tricky to do, but a reference is here: https://github.com/allenai/open-instruct/blob/main/open_instruct/finetune.py#L292
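For illustration only, a rough sketch of the approach described above: build input_ids turn by turn and mask everything except assistant content. The `<|role|>` template, the per-message tokenization (which ignores boundary effects the open-instruct reference handles more carefully), and the EOS handling are assumptions, not the PR's final implementation:

```python
def preprocess_multi_turn(example, tokenizer, max_seq_len: int):
    """Tokenize a multi-turn example, training only on assistant content."""
    input_ids, label_mask = [], []
    for msg in example["messages"]:
        role_ids = tokenizer.encode(f"<|{msg['role']}|>\n", add_special_tokens=False)
        content_ids = tokenizer.encode(msg["content"].strip() + "\n", add_special_tokens=False)
        if msg["role"] == "assistant":
            # Assistant content (plus a trailing EOS) contributes to the loss.
            content_ids = content_ids + [tokenizer.eos_token_id]
            label_mask += [False] * len(role_ids) + [True] * len(content_ids)
        else:
            # System/user turns are context only and never contribute to the loss.
            label_mask += [False] * (len(role_ids) + len(content_ids))
        input_ids += role_ids + content_ids

    # Naive truncation, then right-padding, as in the original function.
    input_ids = input_ids[:max_seq_len]
    label_mask = label_mask[:max_seq_len]
    pad_len = max_seq_len - len(input_ids)
    input_ids += [tokenizer.pad_token_id] * pad_len
    label_mask += [False] * pad_len

    assert len(input_ids) == len(label_mask) == max_seq_len
    return {"input_ids": input_ids, "label_mask": label_mask}
```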
updated to match your script
@@ -10,3 +10,19 @@
```
pip install ai2-olmo
```

## Fine-tuning
@AkshitaB, fine-tuning instructions added here.
scripts/prepare_tulu_data.py
Outdated
completion = parts[-1]

prompt_ids = tokenizer.encode(prompt, add_special_tokens=False)
completion_ids = tokenizer.encode(completion, add_special_tokens=True)
What special tokens does the OLMo tokenizer add? There should be an EOS token after every assistant message.
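One quick way to check, given any tokenizer with an HF-style encode API like the one used in the script (this snippet is a sketch, not part of the PR):

```python
# Compare encodings with and without special tokens to see what gets added.
plain = tokenizer.encode("hello", add_special_tokens=False)
with_special = tokenizer.encode("hello", add_special_tokens=True)
print(plain, with_special)  # e.g. with_special may end with tokenizer.eos_token_id
```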
Done
scripts/prepare_tulu_data.py
Outdated
prompt = "\n".join(parts[:-1]) + "\n" | ||
completion = parts[-1] | ||
|
||
prompt_ids = tokenizer.encode(prompt, add_special_tokens=False) |
I found it useful to add a BOS token in training (or rather, to use the EOS token as a BOS marker), but I don't think it's essential.
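A minimal sketch of that suggestion, reusing the EOS id as a BOS-style marker at the front of the sequence (assuming the tokenizer has no dedicated BOS token); the helper name is hypothetical:

```python
def prepend_bos(prompt_ids: list[int], completion_ids: list[int], eos_token_id: int):
    # Reuse the EOS id as a start-of-sequence marker; it stays masked out of the loss.
    input_ids = [eos_token_id] + prompt_ids + completion_ids
    label_mask = [False] * (1 + len(prompt_ids)) + [True] * len(completion_ids)
    return input_ids, label_mask
```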
Done
input_ids = (prompt_ids + completion_ids)[:max_seq_len]
label_mask = ([False] * len(prompt_ids) + [True] * len(completion_ids))[:max_seq_len]

if len(input_ids) < max_seq_len:
Random q: what happens when the sequence length is over your training max_seq_len? Just naive truncation? (This is fine, just curious.)
Yeah, just naive truncation.
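A toy illustration of that truncation behavior, with made-up numbers: the tail of the completion is dropped, but input_ids and label_mask stay aligned.

```python
# Toy numbers: 6-token prompt, 5-token completion, max_seq_len of 8.
prompt_ids, completion_ids, max_seq_len = [1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11], 8
input_ids = (prompt_ids + completion_ids)[:max_seq_len]
label_mask = ([False] * len(prompt_ids) + [True] * len(completion_ids))[:max_seq_len]
assert input_ids == [1, 2, 3, 4, 5, 6, 7, 8]    # last 3 completion tokens dropped
assert label_mask == [False] * 6 + [True] * 2   # mask stays aligned with input_ids
```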
Can't see anything obviously wrong.