How to load a fine-tuned peft/lora model based on llama with Huggingface transformers?

Question

I've followed this tutorial (colab notebook) in order to finetune my model.

Trying to load my locally saved model

    model = AutoModelForCausalLM.from_pretrained("finetuned_model")

yields Killed.


Trying to load the model from the hub:

    import torch
    from peft import PeftModel, PeftConfig
    from transformers import AutoModelForCausalLM, AutoTokenizer

    peft_model_id = "lucas0/empath-llama-7b"
    config = PeftConfig.from_pretrained(peft_model_id)
    model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, return_dict=True, load_in_8bit=True, device_map='auto')
    tokenizer = AutoTokenizer.from_pretrained(cwd+"/tokenizer.model")

    # Load the Lora model
    model = PeftModel.from_pretrained(model, peft_model_id)

yields

    AttributeError: /home/ubuntu/empath/lora/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cget_col_row_stats

full stacktrace

Model Creation:

I have fine-tuned a model using PEFT and LoRA:

    model = AutoModelForCausalLM.from_pretrained(
        "decapoda-research/llama-7b-hf",
        torch_dtype=torch.float16,
        device_map='auto',
    )

I had to download and manually specify the llama tokenizer.

    tokenizer = LlamaTokenizer(cwd+"/tokenizer.model")
    tokenizer.pad_token = tokenizer.eos_token

As for the training:

    import pandas as pd
    import transformers
    from datasets import Dataset
    from peft import LoraConfig, get_peft_model

    config = LoraConfig(
        r=8,
        lora_alpha=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        lora_dropout=0.05,
        bias="none",
        task_type="CAUSAL_LM"
    )
    model = get_peft_model(model, config)

    data = pd.read_csv("my_csv.csv")
    dataset = Dataset.from_pandas(data)
    tokenized_dataset = dataset.map(lambda samples: tokenizer(samples["text"]))

    trainer = transformers.Trainer(
        model=model,
        train_dataset=tokenized_dataset,
        args=transformers.TrainingArguments(
            per_device_train_batch_size=4,
            gradient_accumulation_steps=4,
            warmup_steps=100,
            max_steps=100,
            learning_rate=1e-3,
            fp16=True,
            logging_steps=1,
            output_dir='outputs',
        ),
        data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False)
    )
    model.config.use_cache = True  # silence the warnings. Please re-enable for inference!
    trainer.train()

and saved it locally with:

    trainer.save_model(cwd+"/finetuned_model")
    print("saved trainer locally")

as well as to the hub:

    model.push_to_hub("lucas0/empath-llama-7b", create_pr=1)

How can I load my finetuned model?

Answer 1

Score: 3

To load a fine-tuned peft/lora model, take a look at the Guanaco example: https://stackoverflow.com/a/76372390/610569

    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, LlamaTokenizer, StoppingCriteria, StoppingCriteriaList, TextIteratorStreamer

    model_name = "decapoda-research/llama-7b-hf"
    adapters_name = "lucas0/empath-llama-7b"

    print(f"Starting to load the model {model_name} into memory")

    m = AutoModelForCausalLM.from_pretrained(
        model_name,
        #load_in_4bit=True,
        torch_dtype=torch.bfloat16,
        device_map={"": 0}
    )
    m = PeftModel.from_pretrained(m, adapters_name)
    m = m.merge_and_unload()
    tok = LlamaTokenizer.from_pretrained(model_name)
    tok.bos_token_id = 1

    stop_token_ids = [0]

    print(f"Successfully loaded the model {model_name} into memory")

You will need at least an A10G GPU runtime to load the model properly.
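
Once the adapter has been merged, the model behaves like any plain causal LM. Below is a minimal generation sketch, assuming `m` and `tok` were loaded as in the snippet above; the prompt text and generation settings are placeholders, not part of the original answer.

    # Minimal generation sketch (assumes `m` and `tok` from the snippet above).
    prompt = "How do I stay motivated?"  # placeholder prompt
    inputs = tok(prompt, return_tensors="pt").to(m.device)

    with torch.no_grad():
        output_ids = m.generate(
            **inputs,
            max_new_tokens=100,   # illustrative generation settings
            do_sample=True,
            temperature=0.7,
        )

    print(tok.decode(output_ids[0], skip_special_tokens=True))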


For more details see

Answer 2

Score: 0

You can load it like this after pushing. I did this successfully using the following snippet:

    # pip install peft transformers
    import torch
    from peft import PeftModel, PeftConfig
    from transformers import LlamaTokenizer, LlamaForCausalLM
    from accelerate import infer_auto_device_map, init_empty_weights

    peft_model_id = "--path--"  # path to your fine-tuned adapter (local folder or hub repo id)
    config = PeftConfig.from_pretrained(peft_model_id)
    model1 = LlamaForCausalLM.from_pretrained(
        config.base_model_name_or_path,
        torch_dtype='auto',
        device_map='auto',
        offload_folder="offload",
        offload_state_dict=True
    )
    tokenizer = LlamaTokenizer.from_pretrained(config.base_model_name_or_path)

    # Load the Lora model
    model1 = PeftModel.from_pretrained(model1, peft_model_id)
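
If you also want the original question's plain `AutoModelForCausalLM.from_pretrained("finetuned_model")` style of loading to work, one option is to merge the adapter into the base weights and save the result as a standalone checkpoint. This is only a sketch, assuming `model1` and `tokenizer` were loaded as above; the "merged_model" directory name is a placeholder.

    # Sketch: merge the LoRA adapter into the base weights and save a standalone model
    # (assumes `model1` and `tokenizer` from the snippet above; "merged_model" is a placeholder path).
    merged = model1.merge_and_unload()       # fold the adapter weights into the base model
    merged.save_pretrained("merged_model")
    tokenizer.save_pretrained("merged_model")

    # The merged folder can then be loaded without peft, e.g.:
    # from transformers import AutoModelForCausalLM
    # model = AutoModelForCausalLM.from_pretrained("merged_model", torch_dtype='auto', device_map='auto')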
