Extract JSON from a long text (a GPT answer) according to a JSON schema?
Question
I have some text that contains a JSON result like this (the exact content may change):
{
"foo":{ "bar" : ["1"] }
}
Once I have the text, I need to extract the JSON from it and make sure it has exactly the format I instructed GPT to return via the examples in my prompt.
My current solution is to match the JSON with a greedy regular expression, then parse it and validate it against a JSON schema.
Is there a better way?
Answer 1
Score: 1
Your code must always be prepared for the event that ChatGPT does not comply with your request. However, if you prompt properly, this will rarely happen.
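As one way to be prepared for a reply that still mixes prose and JSON, here is a minimal Python sketch that avoids the greedy regular expression from the question. It uses the standard library's json.JSONDecoder.raw_decode to try each opening brace in turn. The function name extract_first_json_object and the sample reply are illustrative assumptions, not part of the original question.

```python
import json


def extract_first_json_object(text):
    """Return the first JSON object embedded in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at index `start`
            # and ignores whatever trailing text follows it.
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            # This brace did not open valid JSON; try the next one.
            start = text.find("{", start + 1)
    return None


# Hypothetical model reply that wraps the JSON in chatty prose.
reply = 'Sure, here is the result:\n{\n"foo":{ "bar" : ["1"] }\n}\nLet me know if {that} helps.'
print(extract_first_json_object(reply))  # {'foo': {'bar': ['1']}}
```

Unlike a greedy pattern, this stops at the end of the first complete object, so a stray brace later in the text does not corrupt the match. The result still needs to be checked against your schema.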
You should consider abandoning the mix of text and JSON output. Have ChatGPT respond with JSON only and ask it to write the text to a string field in the JSON.
To have the model produce JSON, you should use one of the models that are current at the time of writing (gpt-4-0613 or gpt-3.5-turbo-0613) via the API. These models follow the system message more closely, so you can ask the model to respond only in JSON. See this example:
curl https://api.openai.com/v1/chat/completions -u :$OPENAI_API_KEY -H 'Content-Type: application/json' -d '{
"model": "gpt-3.5-turbo-0613",
"messages": [
{"role": "system", "content": "You are a friendly assistant. Your answers are JSON only."},
{"role": "assistant", "content": "{\"message\": \"Understood. I will output my answers in JSON format.\" }" },
{"role": "user", "content": "List three attractions in London." }
]
}'
ChatGPT will reply with well-formed JSON:
{
"attractions": [
{
"name": "The British Museum",
"description": "A world-famous museum containing a vast collection of art and artifacts from around the globe."
},
{
"name": "The Tower of London",
"description": "A historic castle that has served various purposes over the centuries, including a royal palace, prison, and treasury."
},
{
"name": "The London Eye",
"description": "A gigantic ferris wheel offering panoramic views of the city skyline."
}
]
}
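Because the prompt above does not fix a structure, the key names ("attractions", "name", "description") are the model's own choice and may differ between runs. A minimal sketch of checking such a reply before using it, with only the Python standard library, might look like this. The function name parse_attractions and the expected keys are assumptions taken from the sample reply, not a guaranteed format.

```python
import json


def parse_attractions(reply_text):
    """Parse a JSON-only reply and return its attractions, or raise ValueError."""
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc

    attractions = data.get("attractions") if isinstance(data, dict) else None
    if not isinstance(attractions, list):
        raise ValueError("reply has no 'attractions' list")

    for item in attractions:
        # Each entry is assumed to be an object with two string fields.
        if not (isinstance(item, dict)
                and isinstance(item.get("name"), str)
                and isinstance(item.get("description"), str)):
            raise ValueError(f"malformed attraction entry: {item!r}")
    return attractions


# Shortened stand-in for the reply shown above.
sample_reply = '{"attractions": [{"name": "The London Eye", "description": "A ferris wheel."}]}'
print([a["name"] for a in parse_attractions(sample_reply)])  # ['The London Eye']
```

A caller can catch the ValueError and retry the request or fall back to another path.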
However, the better approach is to use ChatGPT's newly introduced function call feature. This allows you to define functions and their parameters, and ChatGPT will then provide JSON with the desired structure to call those functions. Please note that ChatGPT may deviate from your request even with this method, but this will be extremely rare.
curl https://api.openai.com/v1/chat/completions -u :$OPENAI_API_KEY -H 'Content-Type: application/json' -d '{
"model": "gpt-3.5-turbo-0613",
"messages": [
{ "role": "system", "content": "You are a friendly assistant and you will always call one of the provided functions." },
{ "role": "user", "content": "List three attractions in London." }
],
"functions": [{
"name":"presentAttractions",
"description": "Presents the attractions to the user.",
"parameters": {
"type": "object",
"properties": {
"message": {
"type": "string",
"description": "A message to display to the user."
},
"attractions": {
"type": "array",
"description": "A list of attractions.",
"items": {
"type": "string"
}
}
},
"required": [ "message","attractions" ]
}
}]
}'
This produces the following response (excerpt):
"message": {
"role": "assistant",
"content": null,
"function_call": {
"name": "presentAttractions",
"arguments": "{\n \"message\": \"Here are three attractions in London:\",\n \"attractions\": [\"Big Ben\", \"Buckingham Palace\", \"Tower Bridge\"]\n}"
}
},
This way you get the JSON nicely prepared for further processing. Note that arguments is itself a JSON-encoded string, so it has to be parsed separately. You can also specify several different functions and let ChatGPT decide which one to call, or use the function_call request parameter to force a specific function (or none at all).
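Since the model may still deviate, the function call is worth checking before use. The following is a minimal Python sketch, assuming the "message" object from the response above has already been loaded into a dict. The helper name parse_function_call and the hand-written REQUIRED_FIELDS table are illustrative stand-ins for the "parameters" schema in the request.

```python
import json

# Hand-written mirror of the required properties declared in the request.
REQUIRED_FIELDS = {"message": str, "attractions": list}


def parse_function_call(message, expected_name="presentAttractions"):
    """Return the decoded arguments of a function call, or raise ValueError."""
    call = message.get("function_call")
    if not call:
        raise ValueError("model replied with plain text instead of a function call")
    if call.get("name") != expected_name:
        raise ValueError(f"unexpected function: {call.get('name')!r}")

    try:
        # "arguments" is a JSON document encoded as a string.
        args = json.loads(call.get("arguments", ""))
    except json.JSONDecodeError as exc:
        raise ValueError(f"arguments are not valid JSON: {exc}") from exc
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")

    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(args.get(field), expected_type):
            raise ValueError(f"missing or mistyped field: {field!r}")
    if not all(isinstance(name, str) for name in args["attractions"]):
        raise ValueError("every attraction must be a string")
    return args


message = {
    "role": "assistant",
    "content": None,
    "function_call": {
        "name": "presentAttractions",
        "arguments": '{\n  "message": "Here are three attractions in London:",\n'
                     '  "attractions": ["Big Ben", "Buckingham Palace", "Tower Bridge"]\n}',
    },
}
print(parse_function_call(message)["attractions"])
```

If the third-party jsonschema package is installed, jsonschema.validate(args, schema) with the same "parameters" object could replace the hand-written checks, which keeps the request and the validation in sync.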
For a full description of the API, see the OpenAI API documentation. To learn more about function calls, read the OpenAI blog and OpenAI Cookbook advanced examples.