transformer-model | Developer Exchange Platform

Switching from `nn.Linear(…)` to `nn.Parameter(torch.tensor(…))` causes a drop in performance.

English: Drop in performance from using nn.Linear(...) to nn.Parameter(torch.tensor(...)) Question: I am doing s...

July 14, 2023 | 77 comments

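English: Use of Params in pyspark Question: In this example, I am trying to use overrides as a Params object and want it to be treated as a list of strings. However, I cannot assign its value with the code below. ...

July 11, 2023 | 68 comments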

English: How to skip weights init when loading pretrained transformers model? Question: I need to find out how, without initializing the weights at the start, to load a pre...

May 29, 2023 | 70 comments

English: Transformers from scratch - shape '[1, 40, 64]' is invalid for input of size when passin...

May 22, 2023 | 71 comments

English: Store intermediate values of pytorch module Question: I try to plot attention maps for ViT. I know that...

May 10, 2023 | 56 comments

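English: Informer: loss always Nan Question: I tried using the Informer model to make predictions on my dataset. When I switch the training dataset to my own, the program runs, but my loss is always NaN, and...

March 10, 2023 | 121 comments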

English: Failure to install old versions of transformers in colab Question: I recently ran into a problem installing Transformers version 2.9.0 in Colab...

March 9, 2023 | 74 comments

English: How to use Huggingface GenerationMixin (or its beam search) with my own model? Question: The use of Huggingface...

March 9, 2023 | 107 comments

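English: Copy Stage on IBM Data Stage Question: I found a strange problem when using "Copy Data" to insert data into a table. All columns are processed in a single transformer, and two of the columns in that transformer are special. Column A uses...

March 1, 2023 | 48 comments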

English: What is the training data input to the transformers (attention is all you need)?

January 6, 2020 | 59 comments