How to load a fine-tuned peft/lora model based on llama with Huggingface transformers?
Question
I've followed this tutorial (colab notebook) in order to finetune my model.
Trying to load my locally saved model
model = AutoModelForCausalLM.from_pretrained("finetuned_model")
yields Killed.
Trying to load the model from the hub:
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
peft_model_id = "lucas0/empath-llama-7b"
config = PeftConfig.from_pretrained(peft_model_id)
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, return_dict=True, load_in_8bit=True, device_map='auto')
tokenizer = AutoTokenizer.from_pretrained(cwd+"/tokenizer.model")
# Load the Lora model
model = PeftModel.from_pretrained(model, peft_model_id)
yields
AttributeError: /home/ubuntu/empath/lora/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cget_col_row_stats
Model Creation:
I have finetuned a model using PEFT and LoRA:
model = AutoModelForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    torch_dtype=torch.float16,
    device_map='auto',
)
I had to download and manually specify the llama tokenizer.
tokenizer = LlamaTokenizer(cwd+"/tokenizer.model")
tokenizer.pad_token = tokenizer.eos_token
As for the training:
from peft import LoraConfig, get_peft_model
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)
model = get_peft_model(model, config)
data = pd.read_csv("my_csv.csv")
dataset = Dataset.from_pandas(data)
tokenized_dataset = dataset.map(lambda samples: tokenizer(samples["text"]))
trainer = transformers.Trainer(
    model=model,
    train_dataset=tokenized_dataset,
    args=transformers.TrainingArguments(
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        warmup_steps=100,
        max_steps=100,
        learning_rate=1e-3,
        fp16=True,
        logging_steps=1,
        output_dir='outputs',
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False)
)
model.config.use_cache = True # silence the warnings. Please re-enable for inference!
trainer.train()
and saved it locally with:
trainer.save_model(cwd+"/finetuned_model")
print("saved trainer locally")
as well as to the hub:
model.push_to_hub("lucas0/empath-llama-7b", create_pr=1)
How can I load my finetuned model?
Answer 1
Score: 3
To load a fine-tuned peft/lora model, take a look at the guanaco example, https://stackoverflow.com/a/76372390/610569
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, LlamaTokenizer, StoppingCriteria, StoppingCriteriaList, TextIteratorStreamer
model_name = "decapoda-research/llama-7b-hf"
adapters_name = "lucas0/empath-llama-7b"
print(f"Starting to load the model {model_name} into memory")
m = AutoModelForCausalLM.from_pretrained(
    model_name,
    #load_in_4bit=True,
    torch_dtype=torch.bfloat16,
    device_map={"": 0}
)
m = PeftModel.from_pretrained(m, adapters_name)
m = m.merge_and_unload()
tok = LlamaTokenizer.from_pretrained(model_name)
tok.bos_token_id = 1
stop_token_ids = [0]
print(f"Successfully loaded the model {model_name} into memory")
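As a quick sanity check, a minimal generation call with the merged model might look like this (the prompt text and generation settings below are illustrative, not part of the original answer):

prompt = "How are you feeling today?"
inputs = tok(prompt, return_tensors="pt").to(m.device)  # tokenize and move to the model's device
with torch.no_grad():
    output_ids = m.generate(**inputs, max_new_tokens=50)
print(tok.decode(output_ids[0], skip_special_tokens=True))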
You will need at least an A10G GPU runtime to load the model properly.
For more details, see:
- https://github.com/artidoro/qlora#tutorials-and-demonstrations
- Inference notebook: https://colab.research.google.com/drive/1ge2F1QSK8Q7h0hn3YKuBCOAS0bK8E0wf?usp=sharing
- Training notebook: https://colab.research.google.com/drive/1VoYNfYDKcKRQRor98Zbf2-9VQTtGJ24k?usp=sharing
Answer 2
Score: 0
You can load the model like this after pushing it to the hub. I did it successfully using the following snippet:
# pip install peft transformers
import torch
from peft import PeftModel, PeftConfig
from transformers import LlamaTokenizer, LlamaForCausalLM
from accelerate import infer_auto_device_map, init_empty_weights
peft_model_id = "--path--"
# Load the PEFT config to find the base model the adapter was trained on
config = PeftConfig.from_pretrained(peft_model_id)

# Load the base model with LlamaForCausalLM (weights can be offloaded to disk if they don't fit in memory)
model1 = LlamaForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    torch_dtype='auto',
    device_map='auto',
    offload_folder="offload",
    offload_state_dict=True
)

# Load the tokenizer of the base model
tokenizer = LlamaTokenizer.from_pretrained(config.base_model_name_or_path)
# Load the Lora model
model1 = PeftModel.from_pretrained(model1, peft_model_id)
Note: replace peft_model_id with the actual path or hub id of your fine-tuned adapter.
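If you would rather end up with a single standalone checkpoint that can later be reloaded like any regular transformers model (no PEFT wrapper needed at inference time), one option is to merge the adapter into the base weights and save the result. A minimal sketch, assuming the full merged model fits in memory and with "merged_model" as a purely illustrative output directory:

merged = model1.merge_and_unload()         # fold the LoRA weights into the base model
merged.save_pretrained("merged_model")     # writes a regular transformers checkpoint
tokenizer.save_pretrained("merged_model")  # keep the tokenizer alongside it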