2023-06-26 13:54:15, tagged: go

Extract JSON from a long text (GPT answer) according to a JSON Schema?

Question


There is some text that contains a JSON result like this (the wording of this line may change):

{
  "foo": { "bar": ["1"] }
}

When I get the text, I need to extract the JSON from it and make sure the JSON has exactly the format I instructed GPT to return via the examples in my prompt.

Currently my solution is to match the JSON with a greedy regular expression, then parse it and check whether it is valid against a JSON Schema.

Is there a better way?

Answer 1

Score: 1

Your code must always be prepared for the event that ChatGPT does not comply with your request. However, if you prompt properly, this will rarely happen.
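As a fallback for replies that still mix prose and JSON, here is a minimal Go sketch that avoids the greedy regular expression. It tries each opening brace as a candidate start and lets encoding/json find where the value ends, so braces inside string values do not confuse it. The function name extractJSONObject and the sample reply are hypothetical and only serve the illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// extractJSONObject returns the first substring of text that decodes as a
// complete JSON object. Each '{' is tried as a candidate start; the decoder
// stops at the end of the first complete value, so trailing prose is ignored.
func extractJSONObject(text string) (json.RawMessage, error) {
	for i := 0; i < len(text); i++ {
		if text[i] != '{' {
			continue
		}
		dec := json.NewDecoder(strings.NewReader(text[i:]))
		var raw json.RawMessage
		if err := dec.Decode(&raw); err == nil {
			return raw, nil
		}
	}
	return nil, errors.New("no JSON object found in text")
}

func main() {
	// Hypothetical model reply that wraps the JSON in extra prose.
	reply := "Sure! Here is the result:\n{\"foo\": {\"bar\": [\"1\"]}}\nLet me know if you need more."
	raw, err := extractJSONObject(reply)
	if err != nil {
		fmt.Println("fallback needed:", err)
		return
	}
	fmt.Println(string(raw))
}
```

The extracted bytes still need to be checked against the expected structure, for example with the JSON Schema validation the question already uses.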

You should consider abandoning the mix of text and JSON output. Have ChatGPT respond with JSON only and ask it to write the text to a string field in the JSON.

To have the model produce JSON, you should use one of the current models (gpt-4-0613 or gpt-3.5-turbo-0613) via the API. These models take the system message more seriously, so you can ask the model to respond only in JSON. See this example:

curl https://api.openai.com/v1/chat/completions -u :$OPENAI_API_KEY -H 'Content-Type: application/json' -d '{
   "model": "gpt-3.5-turbo-0613",
   "messages": [
      {"role": "system", "content": "You are a friendly assistant. Your answers are JSON only."},
      {"role": "assistant", "content": "{\"message\": \"Understood. I will output my answers in JSON format.\" }" },
      {"role": "user", "content": "List three attractions in London." }
   ]
}'

ChatGPT will reply with well-formed JSON:

{
  "attractions": [
    {
      "name": "The British Museum",
      "description": "A world-famous museum containing a vast collection of art and artifacts from around the globe."
    },
    {
      "name": "The Tower of London",
      "description": "A historic castle that has served various purposes over the centuries, including a royal palace, prison, and treasury."
    },
    {
      "name": "The London Eye",
      "description": "A gigantic ferris wheel offering panoramic views of the city skyline."
    }
  ]
}
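A reply like the one above can be checked in Go by decoding it strictly into a struct. The sketch below is a lightweight stand-in for full JSON Schema validation, not a replacement for it. The type names and the rule that every attraction needs a non-empty name are assumptions made for this example.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// Attraction and AttractionsReply mirror the shape of the sample reply.
// Both type names are hypothetical.
type Attraction struct {
	Name        string `json:"name"`
	Description string `json:"description"`
}

type AttractionsReply struct {
	Attractions []Attraction `json:"attractions"`
}

// parseAttractions decodes data strictly: unknown fields are rejected, and
// the fields this example treats as required must be present and non-empty.
func parseAttractions(data []byte) (*AttractionsReply, error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	var reply AttractionsReply
	if err := dec.Decode(&reply); err != nil {
		return nil, fmt.Errorf("decode reply: %w", err)
	}
	if len(reply.Attractions) == 0 {
		return nil, errors.New("reply contains no attractions")
	}
	for i, a := range reply.Attractions {
		if a.Name == "" {
			return nil, fmt.Errorf("attraction %d has no name", i)
		}
	}
	return &reply, nil
}

func main() {
	data := []byte(`{"attractions": [{"name": "The London Eye", "description": "A gigantic ferris wheel."}]}`)
	reply, err := parseAttractions(data)
	if err != nil {
		fmt.Println("invalid reply:", err)
		return
	}
	fmt.Println(reply.Attractions[0].Name)
}
```

If the model picks different key names than expected, the strict decoder reports an error instead of silently producing an empty struct.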

However, the better approach is to use ChatGPT's newly introduced function call feature. This allows you to define functions and their parameters, and ChatGPT will then provide JSON with the desired structure to call those functions. Please note that ChatGPT may deviate from your request even with this method, but this will be extremely rare.

curl https://api.openai.com/v1/chat/completions -u :$OPENAI_API_KEY -H 'Content-Type: application/json' -d '{
  "model": "gpt-3.5-turbo-0613",
  "messages": [ 
    { "role": "system", "content": "You are a friendly assistant and you will always call one of the provided functions." },
    { "role": "user", "content": "List three attractions in London." }
  ],
  "functions": [{
      "name":"presentAttractions",
      "description": "Presents the attractions to the user.",
      "parameters": {
        "type": "object",
        "properties": {
          "message": {
            "type": "string",
            "description": "A message to display to the user."
          },
          "attractions": {
            "type": "array",
            "description": "A list of attractions.",
            "items": {
              "type": "string"
            }
          }
        },
        "required": [ "message","attractions" ]
      }
    }]
}'

This results in this response:

"message": {
  "role": "assistant",
  "content": null,
  "function_call": {
    "name": "presentAttractions",
    "arguments": "{\n  \"message\": \"Here are three attractions in London:\",\n  \"attractions\": [\"Big Ben\", \"Buckingham Palace\", \"Tower Bridge\"]\n}"
  }
},
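Note that the arguments field in this response is a string that itself contains JSON, so it has to be decoded in a second step. The following Go sketch shows one way to do that for the message object above. The struct and function names are hypothetical, and the code assumes the message object has already been pulled out of the full API response.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// FunctionCall and Message cover only the fields shown in the response above.
type FunctionCall struct {
	Name      string `json:"name"`
	Arguments string `json:"arguments"` // a JSON document encoded as a string
}

type Message struct {
	Role         string        `json:"role"`
	Content      *string       `json:"content"`
	FunctionCall *FunctionCall `json:"function_call"`
}

// PresentAttractionsArgs mirrors the parameters declared for presentAttractions.
type PresentAttractionsArgs struct {
	Message     string   `json:"message"`
	Attractions []string `json:"attractions"`
}

// decodeArgs unmarshals the message first, then the nested arguments string.
func decodeArgs(msgJSON []byte) (*PresentAttractionsArgs, error) {
	var msg Message
	if err := json.Unmarshal(msgJSON, &msg); err != nil {
		return nil, fmt.Errorf("decode message: %w", err)
	}
	if msg.FunctionCall == nil {
		return nil, errors.New("model did not call a function")
	}
	if msg.FunctionCall.Name != "presentAttractions" {
		return nil, fmt.Errorf("unexpected function %q", msg.FunctionCall.Name)
	}
	var args PresentAttractionsArgs
	if err := json.Unmarshal([]byte(msg.FunctionCall.Arguments), &args); err != nil {
		return nil, fmt.Errorf("decode arguments: %w", err)
	}
	return &args, nil
}

func main() {
	msgJSON := []byte(`{
  "role": "assistant",
  "content": null,
  "function_call": {
    "name": "presentAttractions",
    "arguments": "{\n  \"message\": \"Here are three attractions in London:\",\n  \"attractions\": [\"Big Ben\", \"Buckingham Palace\", \"Tower Bridge\"]\n}"
  }
}`)
	args, err := decodeArgs(msgJSON)
	if err != nil {
		fmt.Println("could not use reply:", err)
		return
	}
	fmt.Println(args.Message, args.Attractions)
}
```

The nil check on FunctionCall covers the rare case mentioned above, where the model answers with plain content instead of calling a function.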

This way you get the JSON nicely prepared for further processing. Of course, you can also specify several different functions and let ChatGPT decide which one to call. You can also restrict which function may be called with the function_call property.
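As an illustration of the function_call property, the Go sketch below builds the same request body as the curl example and additionally sets function_call to an object naming presentAttractions, which asks the API to call that specific function. The Go type names and the newRequest helper are hypothetical. Sending the request over HTTP is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type ChatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type FunctionDef struct {
	Name        string          `json:"name"`
	Description string          `json:"description"`
	Parameters  json.RawMessage `json:"parameters"` // JSON Schema of the arguments
}

type ChatRequest struct {
	Model        string        `json:"model"`
	Messages     []ChatMessage `json:"messages"`
	Functions    []FunctionDef `json:"functions"`
	FunctionCall interface{}   `json:"function_call,omitempty"`
}

// attractionsSchema is the parameter schema from the curl example.
const attractionsSchema = `{
  "type": "object",
  "properties": {
    "message": {"type": "string", "description": "A message to display to the user."},
    "attractions": {"type": "array", "description": "A list of attractions.", "items": {"type": "string"}}
  },
  "required": ["message", "attractions"]
}`

// newRequest builds a request that names presentAttractions in function_call.
func newRequest(question string) ChatRequest {
	return ChatRequest{
		Model: "gpt-3.5-turbo-0613",
		Messages: []ChatMessage{
			{Role: "system", Content: "You are a friendly assistant and you will always call one of the provided functions."},
			{Role: "user", Content: question},
		},
		Functions: []FunctionDef{{
			Name:        "presentAttractions",
			Description: "Presents the attractions to the user.",
			Parameters:  json.RawMessage(attractionsSchema),
		}},
		FunctionCall: map[string]string{"name": "presentAttractions"},
	}
}

func main() {
	body, err := json.MarshalIndent(newRequest("List three attractions in London."), "", "  ")
	if err != nil {
		fmt.Println("marshal request:", err)
		return
	}
	fmt.Println(string(body))
}
```

Leaving FunctionCall unset omits the property from the body, so the model decides on its own whether to call a function.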

For a full description of the API, see the OpenAI API documentation. To learn more about function calls, read the OpenAI blog and OpenAI Cookbook advanced examples.

