Convert a list of multiple JSON files into a pandas DataFrame

Question

This is how I parsed multiple JSON files into a single list:

import os
import pandas as pd

base_dir = 'jsons_final_folder/'
data_list = []
for file in os.listdir(base_dir):
    if 'json' in file:
        json_path = os.path.join(base_dir, file)
        json_data = pd.read_json(json_path, lines=True)
        data_list.append(json_data)

I got a list that looks like this:

print(data_list)

Output:

[                                     0
0  {"general":{"key":"value","q":"...,
                                      0
0  {"general":{"key":"value","q":"...,
                                      0
0  {"general":{"key":"value","q":"...,
                                      0
0  {"general":{"key":"value","q":"...,
                                      0
0  {"general":{"key":"value","q":"...,
                                      0
0  {"general":{"key":"value","q":"...,
                                      0
0  {"general":{"key":"value","q":"...,]

This is my code to convert the list into a DataFrame:

with open("f.csv","w") as f:
    wr = csv.writer(f)
    wr.writerow(data_list)

But I get a DataFrame (type pandas.core.frame.DataFrame) that looks like this:

{"general":{"key":"value","q":"..., {"general":{"key":"value","q":"..., {"general":{"key":"value","q":"..., {"general":{"key":"value","q":"...,

with a shape of n columns and 0 rows.

What I am trying to do is build a DataFrame from this list, which contains only the JSON files with specific queries, but I don't know what the problem is.

I also tried adding a delimiter, but I still could not get the result I wanted.

I want the final shape to look like this:

json
{"general":{"key":"value","q":"...,
{"general":{"key":"value","q":"...,

Thank you

Answer 1

Score: 0

Have you tried df = pd.DataFrame({'json': data_list})?
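One caveat, offered as a sketch rather than a definitive fix: in the question, each element of data_list is itself a DataFrame (one per file, as returned by pd.read_json), so the call above is unlikely to yield one JSON string per row. If the goal is a single column with one row per file, stacking the frames with pd.concat may be closer. The two sample frames below are made-up stand-ins for the real files.

```python
import pandas as pd

# Hypothetical stand-ins for the per-file results in the question:
# one-row DataFrames with a single column named 0.
data_list = [
    pd.DataFrame({0: ['{"general": {"key": "value"}}']}),
    pd.DataFrame({0: ['{"general": {"key": "other"}}']}),
]

# Stack the frames vertically, renumber the rows, and name the column "json".
df = pd.concat(data_list, ignore_index=True)
df.columns = ["json"]

print(df.shape)
```

From there, df.to_csv('f.csv', index=False) would write one JSON string per line under a json header.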

Answer 2

Score: 0

I found the solution in this video.

A function that returns the files:

import glob
import os

def get_files(filepath):
    # Walk the folder tree and collect the absolute path of every .json file.
    all_files = []
    for root, dirs, files in os.walk(filepath):
        for f in glob.glob(os.path.join(root, '*.json')):
            all_files.append(os.path.abspath(f))
    return all_files

j_files = get_files("../path goes here/")

Here we read each file and append its parsed contents to a list:

import json

j_files_list = []
for j_file in j_files:
    with open(j_file) as doc:
        exp = json.load(doc)
        j_files_list.append(exp)

Here we convert it into a DataFrame:

df = pd.DataFrame(j_files_list)

And then save the DataFrame to CSV:

df.to_csv('json_files_in_df.csv')
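Note that pd.DataFrame on a list of parsed dicts creates one column per top-level key, not the single json column the question asks for. A minimal sketch of that single-column shape, assuming each file holds one valid JSON document, could look like the following. The helper name load_json_column and the two sample files are hypothetical and only there for illustration.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def load_json_column(folder):
    """Return a one-column DataFrame with one JSON string per *.json file."""
    rows = []
    # rglob also searches subfolders, similar to the os.walk loop above.
    for path in sorted(Path(folder).rglob("*.json")):
        with open(path, encoding="utf-8") as doc:
            parsed = json.load(doc)  # raises if the file is not valid JSON
        rows.append(json.dumps(parsed))  # serialize back to a string
    return pd.DataFrame({"json": rows})


# Demo on a throwaway folder with two made-up files.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.json").write_text('{"general": {"key": "value"}}', encoding="utf-8")
    Path(tmp, "b.json").write_text('{"general": {"key": "other"}}', encoding="utf-8")
    df = load_json_column(tmp)

print(df.shape)
```

Keeping each document as a string preserves the nested structure in one cell; json.loads on a cell recovers the original dict when needed.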

Thanks for helping out

Posted on February 27, 2023, 09:46:08
When reposting, please keep the link to this article: https://go.coder-hub.com/75576164.html