Python: json.dumps with ensure_ascii=False and encoding('utf-8') seems to convert string to bytes
Question
I am generating a Python dictionary as follows:
placedict = {
    "id": geonames.geonames_id,
    "info": json.dumps(jsoninfo),
}
where id is a string and info is a valid, readable JSON string:
'{"geonamesurl": "http://geonames.org/310859/kahramanmara\\u015f.html", "searchstring": "Kahramanmara\\u015f", "place": "Kahramanmara\\u015f", "confidence": 1, "typecode": "PPLA", "toponym": "Kahramanmara\\u015f", "geoid": 310859, "continent": "AS", "country": "Turkey", "state": "Kahramanmara\\u015f", "region": "Kahramanmara\\u015f", "lat": "37.5847", "long": "36.92641", "population": 376045, "bbox": {"northeast": [37.66426194452945, 37.02690583904019], "southwest": [37.50514805547055, 36.825904160959816]}, "timezone": "Europe/Istanbul", "wikipedia": "en.wikipedia.org/wiki/Kahramanmara%C5%9F", "hyerlist": ["part-of: Earth GeoID: 6295630 GeoCode: AREA", "part-of: Asia GeoID: 6255147 GeoCode: CONT", "part-of: Turkey GeoID: 298795 GeoCode: PCLI", "part-of: Kahramanmara\\u015f GeoID: 310858 GeoCode: ADM1", "part-of: Kahramanmara\\u015f GeoID: 310859 GeoCode: PPLA"], "childlist": ["Aksu", "Barbaros", "Egemenlik"]}'
but as you can see, while the jsoninfo variable holds valid UTF-8 characters, the characters in placedict['info'] are not UTF-8 encoded but escaped as \uXXXX sequences.
I therefore tried to change the json.dumps line to:
placedict = {
    "id": geonames.geonames_id,
    "info": json.dumps(jsoninfo).encode("utf-8"),
}
or even
placedict = {
    "id": geonames.geonames_id,
    "info": json.dumps(jsoninfo, ensure_ascii=False).encode("utf-8"),
}
hoping this would encode the JSON as desired, but after either of these modifications the 'info' member of the dictionary comes back as b'.........', so I end up with a binary string in MongoDB.
I want to store the dictionary with a UTF-8 encoded, readable JSON string in MongoDB.
Where am I making a mistake?
Answer 1
Score: 2
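You can use json.dumps with ensure_ascii=False on its own and drop the .encode("utf-8") call. In Python 3, json.dumps returns a str, and calling .encode() on it is what turns it into bytes.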
import json
jsoninfo = {"El": "Niño"}
info = json.dumps(jsoninfo, ensure_ascii=False)
print(info) # {"El": "Niño"}
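To make the difference between the three variants from the question visible, here is a minimal sketch. The jsoninfo dictionary below is a hypothetical, shortened stand-in for the real data, and the variable names escaped, readable, and as_bytes are only illustrative.

```python
import json

# Hypothetical, shortened stand-in for the question's jsoninfo dictionary.
jsoninfo = {"place": "Kahramanmaraş", "geoid": 310859}

# Default (ensure_ascii=True): non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(jsoninfo)

# ensure_ascii=False: the result is still a str, with "ş" left as-is.
readable = json.dumps(jsoninfo, ensure_ascii=False)

# .encode() converts the str into bytes, which is where b'...' comes from.
as_bytes = readable.encode("utf-8")

print(type(escaped).__name__, escaped)
print(type(readable).__name__, readable)
print(type(as_bytes).__name__)

# Both string forms are valid JSON and parse back to the same data.
assert json.loads(escaped) == json.loads(readable) == jsoninfo
```

The escaped form is not wrong JSON, just harder to read. If the goal is a human-readable string in the database, the readable variant (a str, not bytes) is probably the one to store.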