英文:
Csv reader misinterprets quotes
问题
当我尝试使用CSV阅读器读取我的字符串时,我得到的输出将JSON字符串转换为:
'{filter":"freeinternet"', 'region:"178307"}"'
它需要保持为:
"{\"filter\":\"freeinternet\",\"region\":\"178307\"}"
这是我尝试过的内容。我甚至尝试添加了quotechar、escapechar,并尝试不同的版本,但结果是不正确的。
import csv
from io import StringIO
s = u"""url,forgeresponsetype,identifiers,metadata,partitionkey,sortkey,expirationdate,lastmodifieddate,redirectkey,siteid,locale,type,update_date
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{\"filter\":\"freeinternet\",\"region\":\"178307\"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{\"filter\":\"freeinternet\",\"region\":\"178307\"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{\"filter\":\"freeinternet\",\"region\":\"178307\"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23"""
f = StringIO(s)
reader = csv.reader(f, delimiter=',', escapechar = '\\')
xd = [row for row in reader]
希望这有所帮助。
英文:
When I try reading my string with csv reader the output I get converts the json string to
'{filter":"freeinternet"', 'region:"178307"}"'
which needs to stay
"{\"filter\":\"freeinternet\",\"region\":\"178307\"}"
This is what I've tried. I've even tried adding quotechar, escapechar, and trying different versions, but it results in incorrect results
import csv
from io import StringIO
s = u"""url,forgeresponsetype,identifiers,metadata,partitionkey,sortkey,expirationdate,lastmodifieddate,redirectkey,siteid,locale,type,update_date
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{\"filter\":\"freeinternet\",\"region\":\"178307\"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{\"filter\":\"freeinternet\",\"region\":\"178307\"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{\"filter\":\"freeinternet\",\"region\":\"178307\"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23"""
f = StringIO(s)
reader = csv.reader(f, delimiter=',', escapechar = '\\')
xd = [row for row in reader]
Any help is appreciated
答案1
得分: 1
是的,这是因为Python中的 csv.reader
将反斜杠字符 \
视为转义字符,而您的CSV数据中的JSON字符串包含需要保留的反斜杠。
请查看以下代码:
import csv
import json
from io import StringIO
s = u"""url,forgeresponsetype,identifiers,metadata,partitionkey,sortkey,expirationdate,lastmodifieddate,redirectkey,siteid,locale,type,update_date
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{"filter":"freeinternet","region":"178307"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{"filter":"freeinternet","region":"178307"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{"filter":"freeinternet","region":"178307"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23""""
f = StringIO(s)
reader = csv.DictReader(f)
rows = [row for row in reader]
print(json.dumps(rows, indent=2))
英文:
yes,it is because the csv.reader
in Python treats the backslash character \
as an escape character, and the JSON string in your CSV data has backslashes that need to be preserved
check this out :
import csv
import json
from io import StringIO
s = u"""url,forgeresponsetype,identifiers,metadata,partitionkey,sortkey,expirationdate,lastmodifieddate,redirectkey,siteid,locale,type,update_date
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{\"filter\":\"freeinternet\",\"region\":\"178307\"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{\"filter\":\"freeinternet\",\"region\":\"178307\"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{\"filter\":\"freeinternet\",\"region\":\"178307\"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23"""
f = StringIO(s)
reader = csv.DictReader(f)
rows = [row for row in reader]
print(json.dumps(rows, indent=2))
答案2
得分: 1
以下是您要翻译的内容:
您可以做的是,操纵输入字符串以在文件和字典数据中使用不同的引号字符。
例如,将'
用作输入的引号字符,并保留"
作为JSON/字典值的引号字符。
import csv
from pprint import pprint
from io import StringIO
s = """url,forgeresponsetype,identifiers,metadata,partitionkey,sortkey,expirationdate,lastmodifieddate,redirectkey,siteid,locale,type,update_date
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{\"filter\":\"freeinternet\",\"region\":\"178307\"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{\"filter\":\"freeinternet\",\"region\":\"178307\"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{\"filter\":\"freeinternet\",\"region\":\"178307\"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23"""
f = StringIO(s.replace(',"{', ',\'{').replace('}",', '}\','))
reader = csv.reader(f, delimiter=',', quotechar = '\'')
pprint([row for row in reader])
这一行s.replace(',"{', ',\'{').replace('}",', '}\',')
将"
引号字符替换为'
,然后我们在csv.reader
的参数中使用它。
注意:只要没有嵌套字典,它就会起作用。
英文:
What you can do is, manipulate the input string to have different quote characters for the file and for dictionary data.
For example, use '
as a quote character for input and keep "
as a quote character for your JSON/dictionary values.
import csv
from pprint import pprint
from io import StringIO
s = """url,forgeresponsetype,identifiers,metadata,partitionkey,sortkey,expirationdate,lastmodifieddate,redirectkey,siteid,locale,type,update_date
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{\"filter\":\"freeinternet\",\"region\":\"178307\"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{\"filter\":\"freeinternet\",\"region\":\"178307\"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{\"filter\":\"freeinternet\",\"region\":\"178307\"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23"""
f = StringIO(s.replace(',"{', ',\'{').replace('}",', '}\','))
reader = csv.reader(f, delimiter=',', quotechar = '\'')
pprint([row for row in reader])
The line s.replace(',"{', ',\'{').replace('}",', '}\',')
replaces "
quote character with '
and then we are using that in csv.reader
's argument.
Note: It'll work as long as there are no nested dictionaries.
答案3
得分: 0
以下是代码的翻译部分:
import csv
from io import StringIO
s = u""""""url,forgeresponsetype,identifiers,metadata,partitionkey,sortkey,expirationdate,lastmodifieddate,redirectkey,siteid,locale,type,update_date
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{"filter":"freeinternet","region":"178307"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{"filter":"freeinternet","region":"178307"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{"filter":"freeinternet","region":"178307"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23""""
f = StringIO(s)
reader = csv.reader(f, delimiter=',', quotechar='"', escapechar='\\')
xd = [row for row in reader]
print(xd)
英文:
Try this:
import csv
from io import StringIO
s = u"""url,forgeresponsetype,identifiers,metadata,partitionkey,sortkey,expirationdate,lastmodifieddate,redirectkey,siteid,locale,type,update_date
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{\"filter\":\"freeinternet\",\"region\":\"178307\"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{\"filter\":\"freeinternet\",\"region\":\"178307\"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23
https://www.expedia.com/Seattle-Hotels.d178307.Travel-Guide-Hotels,MOVED_PERMANENTLY,"{\"filter\":\"freeinternet\",\"region\":\"178307\"}",,HOTEL_DESTINATION_THEME.494647,1.en_US.filter:freeinternet.region:178307,1696746399,2023-06-18T06:26:40.521Z,,1,en_US,HOTEL_DESTINATION_THEME,21/06/23"""
f = StringIO(s)
reader = csv.reader(f, delimiter=',', quotechar='\'', escapechar='\'')
xd = [row for row in reader]
print(xd)
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论