How to select separate lines in a json file for json.loads()?

Question

Does anyone know an effective way to select just one line at a time from this JSON file in Python?

I want to write each line into a relational database, but json.loads() raises an 'Extra data' error unless I parse each line separately.

Thanks

{"flight_date":"2023-04-14","flight_status":"scheduled","departure.airport":"Seoul (Incheon)","departure.timezone":"Asia/Seoul","departure.iata":"ICN","departure.icao":"RKSI","departure.terminal":"1","departure.gate":"38","departure.scheduled":"2023-04-14T12:00:00+00:00","departure.estimated":"2023-04-14T12:00:00+00:00","arrival.airport":"Fukuoka","arrival.timezone":"Asia/Tokyo","arrival.iata":"FUK","arrival.icao":"RJFF","arrival.terminal":"I","arrival.scheduled":"2023-04-14T13:20:00+00:00","arrival.estimated":"2023-04-14T13:20:00+00:00","airline":"Korean Air","flight.number":"5077","flight.iata":"KE5077","flight.icao":"KAL5077","flight.codeshared.airline_name":"jin air","flight.codeshared.airline_iata":"lj","flight.codeshared.airline_icao":"jna","flight.codeshared.flight_number":"223","flight.codeshared.flight_iata":"lj223","flight.codeshared.flight_icao":"jna223","destination":"Tokyo","country":"Japan","arrival_airport":"Fukuoka","schedule_arrive":"2023-04-14T13:20:00+00:00","temperature":16,"description":1,"wind_speed":24,"wind_degree":240,"humidity":72,"feelslike":16,"visibility":10,"cloud_cover":25}
{"flight_date":"2023-04-14","flight_status":"scheduled","departure.airport":"Seoul (Incheon)","departure.timezone":"Asia/Seoul","departure.iata":"ICN","departure.icao":"RKSI","departure.terminal":"1","departure.gate":"E01","departure.scheduled":"2023-04-14T12:00:00+00:00","departure.estimated":"2023-04-14T12:00:00+00:00","arrival.airport":"Taiwan Taoyuan International (Chiang Kai Shek International)","arrival.timezone":"Asia/Taipei","arrival.iata":"TPE","arrival.icao":"RCTP","arrival.terminal":"2","arrival.gate":"C8","arrival.baggage":"7B","arrival.scheduled":"2023-04-14T13:35:00+00:00","arrival.estimated":"2023-04-14T13:35:00+00:00","airline":"Thai Airways International","flight.number":"6397","flight.iata":"TG6397","flight.icao":"THA6397","flight.codeshared.airline_name":"eva air","flight.codeshared.airline_iata":"br","flight.codeshared.airline_icao":"eva","flight.codeshared.flight_number":"169","flight.codeshared.flight_iata":"br169","flight.codeshared.flight_icao":"eva169","destination":"Taipei","country":"Taiwan","arrival_airport":"Taiwan Taoyuan International (Chiang Kai Shek International)","schedule_arrive":"2023-04-14T13:35:00+00:00","temperature":22,"description":2,"wind_speed":6,"wind_degree":310,"humidity":88,"feelslike":25,"visibility":6,"cloud_cover":50}
{"flight_date":"2023-04-14","flight_status":"scheduled","departure.airport":"Cologne/bonn","departure.timezone":"Europe/Berlin","departure.iata":"CGN","departure.icao":"EDDK","departure.delay":22,"departure.scheduled":"2023-04-14T03:55:00+00:00","departure.estimated":"2023-04-14T03:55:00+00:00","arrival.airport":"Vienna International","arrival.timezone":"Europe/Vienna","arrival.iata":"VIE","arrival.icao":"LOWW","arrival.scheduled":"2023-04-14T05:23:00+00:00","arrival.estimated":"2023-04-14T05:23:00+00:00","airline":"UPS Airlines","flight.number":"274","flight.iata":"5X274","flight.icao":"UPS274","destination":"Vienna","country":"Austria","arrival_airport":"Vienna International","schedule_arrive":"2023-04-14T05:23:00+00:00","temperature":7,"description":3,"wind_speed":17,"wind_degree":330,"humidity":87,"feelslike":5,"visibility":10,"cloud_cover":75}
{"flight_date":"2023-04-14","flight_status":"scheduled","departure.airport":"Seoul (Incheon)","departure.timezone":"Asia/Seoul","departure.iata":"ICN","departure.icao":"RKSI","departure.terminal":"1","departure.gate":"E01","departure.scheduled":"2023-04-14T12:00:00+00:00","departure.estimated":"2023-04-14T12:00:00+00:00","arrival.airport":"Taiwan Taoyuan International (Chiang Kai Shek International)","arrival.timezone":"Asia/Taipei","arrival.iata":"TPE","arrival.icao":"RCTP","arrival.terminal":"2","arrival.gate":"C8","arrival.baggage":"7B","arrival.scheduled":"2023-04-14T13:35:00+00:00","arrival.estimated":"2023-04-14T13:35:00+00:00","airline":"Thai Airways International","flight.number":"6397","flight.iata":"TG6397","flight.icao":"THA6397","flight.codeshared.airline_name":"eva air","flight.codeshared.airline_iata":"br","flight.codeshared.airline_icao":"eva","flight.codeshared.flight_number":"169","flight.codeshared.flight_iata":"br169","flight.codeshared.flight_icao":"eva169","destination":"Taipei","country":"Taiwan","arrival_airport":"Taiwan Taoyuan International (Chiang Kai Shek International)","schedule_arrive":"2023-04-14T13:35:00+00:00","temperature":22,"description":4,"wind_speed":6,"wind_degree":310,"humidity":88,"feelslike":25,"visibility":6,"cloud_cover":50}
{"flight_date":"2023-04-14","flight_status":"scheduled","departure.airport":"Guangzhou Baiyun International","departure.timezone":"Asia/Shanghai","departure.iata":"CAN","departure.icao":"ZGGG","departure.terminal":"2","departure.scheduled":"2023-04-14T10:10:00+00:00","departure.estimated":"2023-04-14T10:10:00+00:00","arrival.airport":"Xiamen","arrival.timezone":"Asia/Shanghai","arrival.iata":"XMN","arrival.icao":"ZSAM","arrival.terminal":"3","arrival.scheduled":"2023-04-14T11:40:00+00:00","arrival.estimated":"2023-04-14T11:40:00+00:00","airline":"Hebei Airlines","flight.number":"8312","flight.iata":"NS8312","flight.icao":"HBH8312","flight.codeshared.airline_name":"xiamen airlines","flight.codeshared.airline_iata":"mf","flight.codeshared.airline_icao":"cxa","flight.codeshared.flight_number":"8306","flight.codeshared.flight_iata":"mf8306","flight.codeshared.flight_icao":"cxa8306","destination":"Shanghai","country":"China","arrival_airport":"Xiamen","schedule_arrive":"2023-04-14T11:40:00+00:00","temperature":16,"description":5,"wind_speed":4,"wind_degree":170,"humidity":94,"feelslike":16,"visibility":10,"cloud_cover":75}
{"flight_date":"2023-04-14","flight_status":"scheduled","departure.airport":"Hangzhou","departure.timezone":"Asia/Shanghai","departure.iata":"HGH","departure.icao":"ZSHC","departure.terminal":"3","departure.scheduled":"2023-04-14T09:20:00+00:00","departure.estimated":"2023-04-14T09:20:00+00:00","arrival.airport":"Nanning","arrival.timezone":"Asia/Shanghai","arrival.iata":"NNG","arrival.icao":"ZGNN","arrival.terminal":"T2","arrival.scheduled":"2023-04-14T12:05:00+00:00","arrival.estimated":"2023-04-14T12:05:00+00:00","airline":"Loong Air","flight.number":"3479","flight.iata":"GJ3479","flight.icao":"CDC3479","flight.codeshared.airline_name":"xiamen airlines","flight.codeshared.airline_iata":"mf","flight.codeshared.airline_icao":"cxa","flight.codeshared.flight_number":"8351","flight.codeshared.flight_iata":"mf8351","flight.codeshared.flight_icao":"cxa8351","destination":"Shanghai","country":"China","arrival_airport":"Nanning","schedule_arrive":"2023-04-14T12:05:00+00:00","temperature":16,"description":6,"wind_speed":7,"wind_degree":210,"humidity":94,"feelslike":16,"visibility":10,"cloud_cover":75}
{"flight_date":"2023-04-14","flight_status":"scheduled","departure.airport":"Hangzhou","departure.timezone":"Asia/Shanghai","departure.iata":"HGH","departure.icao":"ZSHC","departure.terminal":"3","departure.scheduled":"2023-04-14T09:20:00+00:00","departure.estimated":"2023-04-14T09:20:00+00:00","arrival.airport":"Nanning","arrival.timezone":"Asia/Shanghai","arrival.iata":"NNG","arrival.icao":"ZGNN","arrival.terminal":"T2","arrival.scheduled":"2023-04-14T12:05:00+00:00","arrival.estimated":"2023-04-14T12:05:00+00:00","airline":"Hebei Airlines","flight.number":"8353","flight.iata":"NS8353","flight.icao":"HBH8353","flight.codeshared.airline_name":"xiamen airlines","flight.codeshared.airline_iata":"mf","flight.codeshared.airline_icao":"cxa","flight.codeshared.flight_number":"8351","flight.codeshared.flight_iata":"mf8351","flight.codeshared.flight_icao":"cxa8351","destination":"Shanghai","country":"China","arrival_airport":"Nanning","schedule_arrive":"2023-04-14T12:05:00+00:00","temperature":16,"description":7,"wind_speed":7,"wind_degree":210,"humidity":94,"feelslike":16,"visibility":10,"cloud_cover":75}
{"flight_date":"2023-04-14","flight_status":"scheduled","departure.airport":"Yancheng","departure.timezone":"Asia/Shanghai","departure.iata":"YNZ","departure.icao":"ZSYN","departure.scheduled":"2023-04-14T11:55:00+00:00","departure.estimated":"2023-04-14T11:55:00+00:00","arrival.airport":"Changsha","arrival.timezone":"Asia/Shanghai","arrival.iata":"CSX","arrival.icao":"ZGHA","arrival.terminal":"2","arrival.scheduled":"2023-04-14T14:00:00+00:00","arrival.estimated":"2023-04-14T14:00:00+00:00","airline":"Chongqing Airlines","flight.number":"2005","flight.iata":"OQ2005","flight.icao":"CQN2005","destination":"Shanghai","country":"China","arrival_airport":"Changsha","schedule_arrive":"2023-04-14T14:00:00+00:00","temperature":16,"description":8,"wind_speed":7,"wind_degree":210,"humidity":94,"feelslike":16,"visibility":10,"cloud_cover":75}

Answer 1

Score: 2

You could open() the file and read it back line by line:

import json

data = []
with open(path) as fh:  # where path is your JSON file
    for line in fh:     # file objects are iterable, one line at a time
        data.append(json.loads(line))
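
Since the goal in the question is to write each line into a relational database, the loop above could be extended along these lines. This is only a sketch under assumptions: it uses the standard-library sqlite3 module as a stand-in for whatever database is actually in use, and the helper names (`iter_json_lines`, `load_into_sqlite`), the table name `flights`, and the choice of columns are hypothetical. Keys missing from a record (the sample lines do not all carry the same keys) are stored as NULL.

```python
import json
import sqlite3


def iter_json_lines(lines):
    """Yield one parsed object per non-empty line (hypothetical helper)."""
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"invalid JSON on line {lineno}: {exc}") from exc


def load_into_sqlite(lines, conn, table="flights"):
    """Insert each record as one row; table and columns are assumptions."""
    columns = ["flight_date", "flight.iata", "airline", "destination", "temperature"]
    quoted = ", ".join(f'"{c}"' for c in columns)  # quote names containing dots
    marks = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({quoted})')
    # rec.get() returns None for absent keys, which sqlite3 stores as NULL
    rows = [tuple(rec.get(c) for c in columns) for rec in iter_json_lines(lines)]
    conn.executemany(f'INSERT INTO "{table}" ({quoted}) VALUES ({marks})', rows)
    conn.commit()
    return len(rows)


sample = [
    '{"flight_date":"2023-04-14","airline":"Korean Air","flight.iata":"KE5077","destination":"Tokyo","temperature":16}\n',
    "\n",
    '{"flight_date":"2023-04-14","airline":"UPS Airlines","flight.iata":"5X274","destination":"Vienna","temperature":7}\n',
]
conn = sqlite3.connect(":memory:")
print(load_into_sqlite(sample, conn))  # number of rows inserted
```

With a real file, the open file handle can be passed in place of `sample`, since it yields lines the same way.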

Alternatively, you might have the data source emit valid JSON by wrapping the lines in [] and adding commas between them, which makes the whole file a single list (or pick a friendlier format directly if you control the data source):

>>> json.loads("""{"foo":1}\n{"bar":2}""")      # current
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.9/json/decoder.py", line 340, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 10)
>>> json.loads("""[{"foo":1},\n{"bar":2}]""")   # JSON list
[{'foo': 1}, {'bar': 2}]
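
If the data source cannot be changed, the same wrapping can be done on the reading side. The following is a rough sketch, assuming every non-empty line holds exactly one complete JSON value; the function name `parse_concatenated_lines` is made up for illustration.

```python
import json


def parse_concatenated_lines(text):
    """Parse newline-separated JSON values by wrapping them in a JSON array."""
    # Split on "\n" only and drop blank lines; each remaining line is
    # assumed to be one complete JSON value.
    lines = [line for line in text.split("\n") if line.strip()]
    return json.loads("[" + ",".join(lines) + "]")


print(parse_concatenated_lines('{"foo":1}\n{"bar":2}\n'))
# prints [{'foo': 1}, {'bar': 2}]
```

This reads the whole input into memory at once, so for large files the line-by-line loop is likely the better fit.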

huangapple
  • Posted on 2023-04-17 04:40:05
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76030225.html