How to extract specific values from an imported CSV and write them?
Question
I have a CSV file in the following format:
number,values
686790635,{'2019-05-24T13:46:35': 'CSCvp87661'}
686235835,{'2019-02-27T14:13:53': 'CSCvj48931'}
689324672,{'2020-06-19T08:50:53': 'CSCvs42803'}
689995407,{'2020-09-27T05:51:39': 'CSCvg55782'}
688751767,{'2020-03-26T11:28:44': 'CSCvc81396'}
689868626,"{'2020-09-10T01:29:51': 'CSCux76799', '2020-09-10T01:29:53': 'CSCux76799'}"
689206940,{'2020-06-02T20:40:44': 'CSCvo65492'}
686259208,"{'2019-03-11T02:55:43': 'CSCvi66732', '2019-03-11T02:55:52': 'CSCvg81628'}"
689030956,{'2020-05-07T10:05:09': 'CSCvh19223'}
I am trying to extract the values into a list like this:
values = ['CSCvp87661', 'CSCvj48931', 'CSCvs42803', 'CSCvg55782', 'CSCvc81396', 'CSCux76799', 'CSCux76799', 'CSCvo65492', 'CSCvi66732', 'CSCvg81628', 'CSCvh19223']
I tried looping over the rows and iterating over the values, but I could not get the result in exactly this list format.
Any help would be appreciated.
Answer 1
Score: 1
First of all, this is an odd CSV format. There are better ways to structure this data, e.g. JSON.
import csv
from ast import literal_eval
from itertools import chain

with open('data.csv') as f:
    rdr = csv.DictReader(f)
    # Parse each cell as a Python dict literal, then flatten all dict values into one list
    data = list(chain(*(literal_eval(line['values']).values() for line in rdr)))
print(data)
Output:
['CSCvp87661', 'CSCvj48931', 'CSCvs42803', 'CSCvg55782', 'CSCvc81396', 'CSCux76799', 'CSCux76799', 'CSCvo65492', 'CSCvi66732', 'CSCvg81628', 'CSCvh19223']
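The approach above can also be tried without a file on disk. The following is a minimal sketch, assuming three of the sample rows from the question; SAMPLE and extract_values are hypothetical names introduced here for illustration and are not part of the original answer.

```python
import csv
import io
from ast import literal_eval

# Three rows copied from the question; SAMPLE is a hypothetical stand-in for data.csv.
SAMPLE = """number,values
686790635,{'2019-05-24T13:46:35': 'CSCvp87661'}
689868626,"{'2020-09-10T01:29:51': 'CSCux76799', '2020-09-10T01:29:53': 'CSCux76799'}"
686259208,"{'2019-03-11T02:55:43': 'CSCvi66732', '2019-03-11T02:55:52': 'CSCvg81628'}"
"""


def extract_values(fileobj):
    """Collect every dict value from the 'values' column, in file order."""
    result = []
    for row in csv.DictReader(fileobj):
        # literal_eval safely parses the Python-style dict literal in the cell.
        cell = literal_eval(row["values"])
        result.extend(cell.values())
    return result


values = extract_values(io.StringIO(SAMPLE))
print(values)
# ['CSCvp87661', 'CSCux76799', 'CSCux76799', 'CSCvi66732', 'CSCvg81628']
```

An explicit loop with extend does the same flattening as chain, and it leaves room for per-row error handling if some cells turn out to be malformed.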
Answer 2
Score: 0
The values column holds a dictionary serialized as a string. If it is valid JSON, you can use the built-in json module to convert it to a Python dictionary and call .values() to get the values. Here is some simple (untested) code to get you started; you may need .strip('"') to remove the extra quotes on some lines:
import csv
import json

values = []
# Replace 'your_csv_file.csv' with the path to your actual CSV file
with open('your_csv_file.csv', 'r') as file:
    csv_reader = csv.DictReader(file)
    for row in csv_reader:
        value = row['values']
        # Parse the JSON string
        value_dict = json.loads(value)
        value_list = list(value_dict.values())
        values.extend(value_list)
print(values)
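One caveat with the code above: json.loads accepts only double-quoted strings, so single-quoted cells like those in the sample data would raise json.JSONDecodeError. A minimal sketch of one possible workaround is to try JSON first and fall back to ast.literal_eval; parse_cell is a hypothetical helper introduced here, not something from the original answer.

```python
import json
from ast import literal_eval


def parse_cell(text):
    """Parse one 'values' cell into a dict (hypothetical helper)."""
    try:
        # Works when the cell is valid JSON, i.e. uses double quotes.
        return json.loads(text)
    except json.JSONDecodeError:
        # Falls back to a Python literal for single-quoted cells.
        return literal_eval(text)


json_style = '{"2019-05-24T13:46:35": "CSCvp87661"}'
python_style = "{'2019-02-27T14:13:53': 'CSCvj48931'}"

values = []
for cell in (json_style, python_style):
    values.extend(parse_cell(cell).values())
print(values)
# ['CSCvp87661', 'CSCvj48931']
```

In the loop above, this would amount to replacing the json.loads(value) call with parse_cell(value).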