Stuck at "downloading shards" when loading an LLM model from Hugging Face

Question
I am just following the Hugging Face example for using their LLM model, but it gets stuck at:

downloading shards: 0%| | 0/5 [00:00<?, ?it/s]

(I am using a Jupyter notebook, `python 3.11`, and all requirements are installed.)
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import transformers
import torch

model = "tiiuae/falcon-40b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
sequences = pipeline(
    "Girafatron is obsessed with giraffes, the most glorious animal on the face of this Earth. Giraftron believes all other animals are irrelevant when compared to the glorious majesty of the giraffe.\nDaniel: Hello, Girafatron!\nGirafatron:",
    max_length=200,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```
How can I fix it?
Answer 1
Score: 1
I don't think it's stuck. These are just very large models that take a while to download, and tqdm only starts estimating progress after the first iteration completes, so it looks as if nothing is happening. I was downloading the smallest version of Llama 2 (7B parameters), which comes as two shards; the first one took over 17 minutes to finish, and I have a reasonably fast internet connection.
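As a rough sanity check on the "17 minutes" figure (my addition, not part of the answer), you can estimate how long a download should take from the total model size and your bandwidth. The 80 GB / 100 Mbit/s numbers below are placeholders, not the actual shard sizes of falcon-40b-instruct:

```python
def estimate_download_minutes(total_gb: float, mbits_per_sec: float) -> float:
    """Rough download time in minutes for a model of total_gb gigabytes
    over a link of mbits_per_sec megabits per second."""
    total_megabits = total_gb * 1000 * 8  # GB -> megabits (decimal units)
    return total_megabits / mbits_per_sec / 60

# e.g. a hypothetical 80 GB model over a 100 Mbit/s connection:
print(f"{estimate_download_minutes(80, 100):.0f} min")  # -> 107 min
```

If the estimate comes out in hours, a progress bar that sits at 0% for many minutes is entirely normal.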
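Another way to confirm the download is actually progressing (again my addition, not from the answer) is to watch the Hugging Face cache grow on disk; by default it lives under `~/.cache/huggingface/hub`, and in-flight shards are stored as `*.incomplete` files. A minimal sketch:

```python
from pathlib import Path

def cache_size_bytes(cache_dir: Path) -> int:
    """Total size of all regular files under cache_dir, including any
    partially downloaded *.incomplete files."""
    return sum(p.stat().st_size for p in cache_dir.rglob("*") if p.is_file())

# Run this a minute apart from another terminal or notebook cell;
# if the number grows, the shards are still coming down:
# print(cache_size_bytes(Path.home() / ".cache/huggingface/hub"))
```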