Whether in generate or finetune, once I set `load_in_8bit=True`, generation no longer works. The model outputs a bunch of question marks, as in the picture below:
I printed out its output vector, as shown in the picture.
It looks like nothing was generated properly at all, but when I set `load_in_8bit=False`, both generation and fine-tuning work normally.
I have installed bitsandbytes and accelerate correctly, and no errors are reported during testing. I've been stuck on this problem for a week, so I wanted to ask for help. Thank you!
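For reference, the per-step scores that `model.generate(..., output_scores=True)` returns can be scanned for NaN/Inf values like this. This is a minimal sketch with dummy tensors standing in for the real generation output, since the check itself does not need a GPU:

```python
import torch

def has_invalid_values(scores):
    """Return True if any step's scores contain NaN or Inf.

    `scores` is the tuple of per-step logit tensors produced by
    model.generate(..., output_scores=True).
    """
    return any(
        bool(torch.isnan(step).any()) or bool(torch.isinf(step).any())
        for step in scores
    )

# Dummy tensors standing in for generation_output.scores:
clean = (torch.zeros(1, 32000),)
broken = (torch.full((1, 32000), float("nan")),)
print(has_invalid_values(clean))   # False
print(has_invalid_values(broken))  # True
```

If the 8-bit path is producing NaN logits, the decoded text tends to come out as garbage like the question marks above.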
Below is my generate.py code:
from peft import PeftModel
from transformers import LlamaTokenizer, LlamaForCausalLM, GenerationConfig

# Load the base LLaMA tokenizer and model in 8-bit
tokenizer = LlamaTokenizer.from_pretrained("llama1")
model = LlamaForCausalLM.from_pretrained(
    "llama1",
    load_in_8bit=True,
    device_map="auto",
)
# Apply the Alpaca-LoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(model, "tloen/alpaca-lora")

def alpaca_talk(text):
    # Tokenize the prompt and move it to the GPU
    inputs = tokenizer(text, return_tensors="pt")
    input_ids = inputs["input_ids"].cuda()
    generation_config = GenerationConfig(
        temperature=0.9,
        top_p=0.75,
    )
    print("Generating...")
    generation_output = model.generate(
        input_ids=input_ids,
        generation_config=generation_config,
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=256,
    )
    for s in generation_output.sequences:
        print(tokenizer.decode(s))

for input_text in [
    """Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
What steps should I ....?
### Response:
"""
]:
    alpaca_talk(input_text)