Stuck at "downloading shards" when loading an LLM model from Hugging Face

Question

I am just following the Hugging Face example to use their LLM model, but it gets stuck at:

downloading shards: 0%| | 0/5 [00:00<?, ?it/s]

(I am using a Jupyter notebook, `python 3.11`, and all requirements are installed.)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import transformers
import torch

model = "tiiuae/falcon-40b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
sequences = pipeline(
    "Girafatron is obsessed with giraffes, the most glorious animal on the face of this Earth. Giraftron believes all other animals are irrelevant when compared to the glorious majesty of the giraffe.\nDaniel: Hello, Girafatron!\nGirafatron:",
    max_length=200,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```

How can I fix it?
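One way to tell whether the download is genuinely stalled, rather than just slow, is to watch the Hugging Face cache directory grow from a second terminal or notebook cell. Below is a minimal sketch; it assumes the library's default cache location (`~/.cache/huggingface/hub`), so adjust the path if you have set `HF_HOME` or `HF_HUB_CACHE`.

```python
import os

def dir_size_bytes(path: str) -> int:
    """Total size in bytes of all files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = os.path.join(root, name)
            if os.path.exists(fp):  # skip dangling symlinks mid-download
                total += os.path.getsize(fp)
    return total

# Default Hugging Face hub cache; override if HF_HOME / HF_HUB_CACHE is set.
cache = os.path.expanduser("~/.cache/huggingface/hub")
if os.path.isdir(cache):
    print(f"cache size: {dir_size_bytes(cache) / 1e9:.2f} GB")
```

If the reported size keeps increasing between runs, the shards are still downloading and the progress bar is simply waiting for the first shard to finish.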



Answer 1

Score: 1


I don't think it's stuck.
These are just very large models that take a while to download. tqdm only starts estimating progress after the first iteration completes, so until the first shard finishes it looks like nothing is happening. I'm currently downloading the smallest version of Llama 2 (7B parameters), which comes in two shards; the first took over 17 minutes to complete, and I have a reasonably fast internet connection.
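For intuition on how long the wait can be, here is a back-of-envelope estimate. The assumptions are mine, not figures from the answer: bfloat16 weights at roughly 2 bytes per parameter, and a nominal link speed in MB/s.

```python
def estimated_download_minutes(num_params: float, mb_per_s: float) -> float:
    """Rough download time for a model stored in bfloat16 (~2 bytes/param)."""
    total_mb = num_params * 2 / 1e6  # bytes -> MB
    return total_mb / mb_per_s / 60

# falcon-40b-instruct has ~40e9 parameters; at an assumed 50 MB/s:
print(f"~{estimated_download_minutes(40e9, 50):.0f} min")  # ≈ 27 minutes
```

At that assumed speed, even an uninterrupted download takes around half an hour, so a progress bar sitting at 0/5 for many minutes is expected behavior, not a hang.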

huangapple
  • Published 2023-07-18 03:57:20
  • Please keep this link when reposting: https://go.coder-hub.com/76707715.html