Extract JSON from a long text (GPT answer) according to a JSON Schema?

Question

I have some text that contains a JSON result, like this (this line may change):

{
  "foo":{ "bar" : ["1"] }
}

When I get the text, I need to extract the JSON from it and make sure the JSON has exactly the format I instructed GPT to return via examples in the prompt.

Currently, my solution is to match the JSON with a greedy regular expression, then parse it and check that it is valid against a JSON Schema.

Is there a better way?

Answer 1

Score: 1

Your code must always be prepared for the event that ChatGPT does not comply with your request. However, if you prompt properly, this will rarely happen.
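
As a hedged sketch of this defensive stance, the following Python uses only the standard library. Instead of a greedy regular expression, it scans for each `{` and lets `json.JSONDecoder.raw_decode` determine where the object ends. The helper names `extract_first_json` and `matches_expected_shape` are illustrative assumptions, and the hand-written shape check merely stands in for validation against a real JSON Schema.

```python
import json


def extract_first_json(text):
    """Return the first JSON object embedded in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever text follows it.
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = text.find("{", start + 1)
    return None


def matches_expected_shape(value):
    """Check the shape from the question: {"foo": {"bar": [<string>, ...]}}."""
    if not isinstance(value, dict):
        return False
    foo = value.get("foo")
    if not isinstance(foo, dict):
        return False
    bar = foo.get("bar")
    return isinstance(bar, list) and all(isinstance(item, str) for item in bar)


# Hypothetical model reply that mixes prose and JSON.
reply = (
    "Sure, here is the result:\n"
    '{\n  "foo": { "bar": ["1"] }\n}\n'
    "Let me know if you need anything else."
)
data = extract_first_json(reply)
if data is None or not matches_expected_shape(data):
    raise ValueError("model reply did not contain the expected JSON")
```

If either step fails, the caller can retry the request or fall back to a default, rather than passing malformed data downstream.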

You should consider abandoning the mix of text and JSON output. Have ChatGPT respond with JSON only and ask it to write the text to a string field in the JSON.

To have the model produce JSON, you should use one of the models that are current at the time of writing (gpt-4-0613 or gpt-3.5-turbo-0613) via the API. These models take the system message more seriously, so you can ask the model to respond only in JSON. See this example:

curl https://api.openai.com/v1/chat/completions -u :$OPENAI_API_KEY -H 'Content-Type: application/json' -d '{
   "model": "gpt-3.5-turbo-0613",
   "messages": [
      {"role": "system", "content": "You are a friendly assistant. Your answers are JSON only."},
      {"role": "assistant", "content": "{\"message\": \"Understood. I will output my answers in JSON format.\" }" },
      {"role": "user", "content": "List three attractions in London." }
   ]
}'

ChatGPT will reply with well-formed JSON, for example:

{
  "attractions": [
    {
      "name": "The British Museum",
      "description": "A world-famous museum containing a vast collection of art and artifacts from around the globe."
    },
    {
      "name": "The Tower of London",
      "description": "A historic castle that has served various purposes over the centuries, including a royal palace, prison, and treasury."
    },
    {
      "name": "The London Eye",
      "description": "A gigantic ferris wheel offering panoramic views of the city skyline."
    }
  ]
}
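
When the model is instructed to answer in JSON only, the whole reply can be handed to a strict parser, and any failure can be treated as a signal to retry. Below is a minimal Python sketch, assuming the reply text has already been read from `choices[0].message.content` of the API response; the function names `parse_json_only_reply` and `attraction_names` are hypothetical.

```python
import json


def parse_json_only_reply(content):
    """Parse a reply that is supposed to be JSON only; raise ValueError otherwise."""
    try:
        data = json.loads(content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not pure JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply is JSON but not an object")
    return data


def attraction_names(data):
    """Return the attraction names, checking the structure shown above."""
    attractions = data.get("attractions")
    if not isinstance(attractions, list):
        raise ValueError("'attractions' is missing or not a list")
    names = []
    for entry in attractions:
        if not isinstance(entry, dict) or not isinstance(entry.get("name"), str):
            raise ValueError("each attraction needs a string 'name'")
        names.append(entry["name"])
    return names


# Shortened version of the reply shown above.
content = '{"attractions": [{"name": "The British Museum", "description": "A museum."}]}'
names = attraction_names(parse_json_only_reply(content))
```

Because the parser rejects anything that is not a single JSON object, a reply that wraps the JSON in prose fails loudly instead of being silently mis-parsed.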

However, the better approach is to use the newly introduced function calling feature. This allows you to define functions and their parameters, and ChatGPT will then provide JSON with the desired structure to call those functions. Please note that ChatGPT may deviate from your request even with this method, but this will be extremely rare.

curl https://api.openai.com/v1/chat/completions -u :$OPENAI_API_KEY -H 'Content-Type: application/json' -d '{
  "model": "gpt-3.5-turbo-0613",
  "messages": [ 
    { "role": "system", "content": "You are a friendly assistant and you will always call one of the provided functions." },
    { "role": "user", "content": "List three attractions in London." }
  ],
  "functions": [{
      "name":"presentAttractions",
      "description": "Presents the attractions to the user.",
      "parameters": {
        "type": "object",
        "properties": {
          "message": {
            "type": "string",
            "description": "A message to display to the user."
          },
          "attractions": {
            "type": "array",
            "description": "A list of attractions.",
            "items": {
              "type": "string"
            }
          }
        },
        "required": [ "message","attractions" ]
      }
    }]
}'

This results in a response that contains the following message:

"message": {
        "role": "assistant",
        "content": null,
        "function_call": {
          "name": "presentAttractions",
          "arguments": "{\n  \"message\": \"Here are three attractions in London:\",\n  \"attractions\": [\"Big Ben\", \"Buckingham Palace\", \"Tower Bridge\"]\n}"
        }
      },
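
Note that `arguments` is itself a string containing JSON, so it has to be parsed a second time and, since the model may still deviate, checked against the schema. The Python sketch below does this with the standard library only. The `validate` helper is a hypothetical, deliberately tiny validator that understands just the keywords used in this request (`type`, `properties`, `required`, `items`); a full JSON Schema library would normally take its place.

```python
import json

# The same "parameters" schema that was sent in the request.
SCHEMA = {
    "type": "object",
    "properties": {
        "message": {"type": "string"},
        "attractions": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["message", "attractions"],
}


def validate(value, schema, path="$"):
    """Return a list of error strings (empty if valid). Minimal subset only."""
    errors = []
    expected = schema.get("type")
    if expected == "object":
        if not isinstance(value, dict):
            return [f"{path}: expected object"]
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing required property")
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], sub, f"{path}.{key}"))
    elif expected == "array":
        if not isinstance(value, list):
            return [f"{path}: expected array"]
        for i, item in enumerate(value):
            errors.extend(validate(item, schema.get("items", {}), f"{path}[{i}]"))
    elif expected == "string":
        if not isinstance(value, str):
            errors.append(f"{path}: expected string")
    return errors


def parse_function_call(message, schema):
    """Return (function name, parsed arguments) or raise ValueError."""
    call = message.get("function_call")
    if call is None:
        raise ValueError("model answered with text instead of a function call")
    try:
        args = json.loads(call["arguments"])  # second parse: string -> object
    except json.JSONDecodeError as exc:
        raise ValueError(f"arguments are not valid JSON: {exc}") from exc
    errors = validate(args, schema)
    if errors:
        raise ValueError("; ".join(errors))
    return call["name"], args


# The message from the response above, as a Python dict.
message = {
    "role": "assistant",
    "content": None,
    "function_call": {
        "name": "presentAttractions",
        "arguments": '{\n  "message": "Here are three attractions in London:",\n'
                     '  "attractions": ["Big Ben", "Buckingham Palace", "Tower Bridge"]\n}',
    },
}
name, args = parse_function_call(message, SCHEMA)
```

Raising `ValueError` in every failure case gives the caller a single place to decide whether to retry the request.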

This way you get the JSON nicely prepared for further processing. Of course, you can also specify several different functions and let ChatGPT decide which one to call. And you can restrict the allowed functions with the function_call property.
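
A minimal sketch of such a restriction, assuming the request format shown above: setting `function_call` to an object with a `name` asks the model to call that specific function, while the string values `"auto"` and `"none"` leave the choice to the model or disable function calls. The helper `build_request` is hypothetical and only builds the JSON body; it does not send anything.

```python
import json


def build_request(user_text, functions, force_function=None):
    """Build the JSON body for a chat completion with function calling."""
    body = {
        "model": "gpt-3.5-turbo-0613",
        "messages": [{"role": "user", "content": user_text}],
        "functions": functions,
        # Force one specific function if requested, otherwise let the model decide.
        "function_call": {"name": force_function} if force_function else "auto",
    }
    return json.dumps(body, indent=2)


# Shortened version of the function definition used above.
functions = [{
    "name": "presentAttractions",
    "description": "Presents the attractions to the user.",
    "parameters": {
        "type": "object",
        "properties": {
            "attractions": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["attractions"],
    },
}]

payload = build_request(
    "List three attractions in London.",
    functions,
    force_function="presentAttractions",
)
```

The resulting string can be passed as the `-d` argument of the curl command shown earlier.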

For a full description of the API, see the OpenAI API documentation. To learn more about function calls, read the OpenAI blog and OpenAI Cookbook advanced examples.

huangapple
  • Posted on June 26, 2023 at 13:54:15
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76553851.html