Convert a list of multiple JSON files into a pandas DataFrame
Question


This is how I parsed multiple JSON files into a single list:

    import os
    import pandas as pd

    base_dir = 'jsons_final_folder/'
    data_list = []
    for file in os.listdir(base_dir):
        if 'json' in file:
            json_path = os.path.join(base_dir, file)
            # lines=True treats every line of the file as a separate JSON record
            json_data = pd.read_json(json_path, lines=True)
            data_list.append(json_data)

And I got a list that looks like this:

    print(data_list)

    output:
    [ 0
    0 {"general":{"key":"value","q":"..., 0
    0 {"general":{"key":"value","q":"..., 0
    0 {"general":{"key":"value","q":"..., 0
    0 {"general":{"key":"value","q":"..., 0
    0 {"general":{"key":"value","q":"..., 0
    0 {"general":{"key":"value","q":"..., 0
    0 {"general":{"key":"value","q":"...,] 0

So this is my code to convert it to a df:

    import csv

    with open("f.csv", "w") as f:
        wr = csv.writer(f)
        wr.writerow(data_list)

But I get a df of type pandas.core.frame.DataFrame like this:

{"general":{"key":"value","q":"..., {"general":{"key":"value","q":"..., {"general":{"key":"value","q":"..., {"general":{"key":"value","q":"...,

with a shape of n columns and 0 rows.

What I am trying to do here is make a df out of this list, which contains only JSONs with specific queries, but I don't know what the problem is.

I also tried adding a delimiter, but I still could not get the result I wanted.

I want the final shape to look like this:

json
{"general":{"key":"value","q":"...,
{"general":{"key":"value","q":"...,

Thank you
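
A note on the "n columns and 0 rows" shape: csv.writer.writerow writes its whole argument as one row, with one cell per list element. If that file is later loaded with a reader that treats the first line as the header (an assumption here, since the question does not show how the df was read back), the single line becomes the column names and no data rows remain. The sketch below uses only the standard library and hypothetical stand-in strings to contrast writerow with writerows:

```python
import csv
import io

# Hypothetical stand-ins for the JSON strings held in the list.
items = ['{"a": 1}', '{"a": 2}', '{"a": 3}']

# writerow: the whole list becomes ONE row, one cell per item.
one_row = io.StringIO()
csv.writer(one_row).writerow(items)

# A header row plus writerows with one-element rows: one item per row.
one_col = io.StringIO()
wr = csv.writer(one_col)
wr.writerow(["json"])
wr.writerows([item] for item in items)

# Read both results back to compare their shapes.
rows_wide = list(csv.reader(io.StringIO(one_row.getvalue())))
rows_tall = list(csv.reader(io.StringIO(one_col.getvalue())))
print(len(rows_wide), len(rows_wide[0]))
print(len(rows_tall), len(rows_tall[0]))
```

The second layout (a "json" header followed by one value per line) matches the final shape described in the question.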

Answer 1

Score: 0

Have you tried df = pd.DataFrame({'json': data_list})?
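
A minimal sketch of that idea, assuming the goal is one row per file with the raw JSON text in a single column named json. The helper name json_files_to_frame and the sample files are hypothetical and only there to make the example self-contained; it reads the file text directly instead of going through pd.read_json:

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def json_files_to_frame(folder):
    """Return a one-column DataFrame holding the raw text of each *.json file."""
    paths = sorted(Path(folder).glob("*.json"))
    texts = [p.read_text(encoding="utf-8") for p in paths]
    return pd.DataFrame({"json": texts})


# Demo on a throwaway folder with two hypothetical files.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(2):
        Path(tmp, f"sample_{i}.json").write_text(
            json.dumps({"general": {"key": "value", "q": i}}), encoding="utf-8"
        )
    df = json_files_to_frame(tmp)

print(df.shape)
```

From there, df.to_csv('f.csv', index=False) would write the json header followed by one file per row.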

Answer 2

Score: 0

Guys, I found the solution in this video.

A function to return the files:

    import glob
    import os

    def get_files(filepath):
        all_files = []
        # walk every sub-folder and collect the absolute path of each *.json file
        for root, dirs, files in os.walk(filepath):
            files = glob.glob(os.path.join(root, '*.json'))
            for f in files:
                all_files.append(os.path.abspath(f))
        return all_files

    j_files = get_files("../path goes here/")

Here we read each file and load it into a list:

    import json

    j_files_list = []
    for j_file in j_files:
        with open(j_file) as doc:
            exp = json.load(doc)  # parse the whole file as one JSON document
            j_files_list.append(exp)

Here we convert it into a df:

    df = pd.DataFrame(j_files_list)  # pd is pandas (import pandas as pd)

And then save the df to CSV:

    df.to_csv('json_files_in_df.csv')

Thanks for helping out
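
One thing to keep in mind with this approach: when the loaded objects are nested dicts, pd.DataFrame keeps each nested dict inside a single cell. If flat columns are preferred, pd.json_normalize is one option. The records below are hypothetical and only shaped like the truncated sample in the question:

```python
import pandas as pd

# Hypothetical records shaped like the truncated sample in the question.
records = [
    {"general": {"key": "value", "q": "first"}},
    {"general": {"key": "value", "q": "second"}},
]

nested = pd.DataFrame(records)     # one column, "general", whose cells are dicts
flat = pd.json_normalize(records)  # nested keys become dotted column names

print(list(nested.columns))
print(list(flat.columns))
```

Which form is better depends on whether the CSV should hold the nested structure in one cell or spread it across columns.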

huangapple
  • Posted on 2023-02-27 09:46:08
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75576164.html