Unable to run a model using HuggingFace Inference Endpoints

Question

I am able to make successful requests using the free endpoint, but when using Inference Endpoints, I get a 404 response. Here is the relevant piece of code:

import requests

mode = 'paid'                                              # works if 'free'
model_id = "sentence-transformers/all-MiniLM-L6-v2"
headers = {"Authorization": f"Bearer {HUGGINGFACE_TOKEN}"}

if mode == 'free':
    # This works
    api_url = f"https://api-inference.huggingface.co/pipeline/feature-extraction/{model_id}"
else:
    api_url = f"https://xxxxxxxxxxxxxxxxx.us-east-1.aws.endpoints.huggingface.cloud/{model_id}"

def get_embeddings(texts):
    response = requests.post(api_url, headers=headers, json={"inputs": texts, "options": {"wait_for_model": True}})
    return response.json()

In the web UI, the endpoint is shown as running, and I can test it there without any problem.

What am I missing?

Answer 1

Score: 2

As mentioned in the comments:

  1. The URL doesn't have a /{model_id} endpoint.
  2. The task section should be filled correctly according to your needs.

After removing the /{model_id}, we faced a 400 error with the message "list indices must be integers or slices, not str", which was caused by the faulty task: instead of getting the embeddings, it was trying to get the similarities between strings in a list. After changing the task to embeddings, the model successfully generated embeddings from a single string. For a detailed tutorial that covers the deployment process, please see Getting Started with Hugging Face Inference Endpoints.
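As a sketch of the fix above: the free serverless API routes requests by task and model id, while a dedicated Inference Endpoint serves exactly one model at its root URL, so the /{model_id} suffix must not be appended in 'paid' mode. The endpoint URL below is a placeholder, and the helper names are illustrative, not part of any Hugging Face client library.

```python
import os
import requests

# Assumes the token is provided via an environment variable.
HUGGINGFACE_TOKEN = os.environ.get("HUGGINGFACE_TOKEN", "")
headers = {"Authorization": f"Bearer {HUGGINGFACE_TOKEN}"}

def build_url(mode, model_id, endpoint_base):
    """Return the URL to POST to.

    The free serverless API needs the task and model id in the path,
    but a dedicated Inference Endpoint serves one model at its root URL,
    so no /{model_id} suffix is appended in 'paid' mode.
    """
    if mode == 'free':
        return f"https://api-inference.huggingface.co/pipeline/feature-extraction/{model_id}"
    return endpoint_base  # dedicated endpoint: root URL only

def get_embeddings(texts, api_url):
    # A dedicated endpoint is already running, so a plain {"inputs": ...}
    # payload suffices; "wait_for_model" matters only on the free API.
    response = requests.post(api_url, headers=headers, json={"inputs": texts})
    response.raise_for_status()
    return response.json()
```

With this, `get_embeddings(["hello world"], build_url('paid', model_id, endpoint_base))` posts to the endpoint root and returns the embedding for a single string.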

huangapple
  • Published on 2023-08-10 20:28:32
  • Please retain this link when republishing: https://go.coder-hub.com/76875743.html