Unable to get 2 dataframe outputs while using the Parallel and delayed functions in Python

Question

I want to use multiprocessing or parallel processing in Python. I have written the following code:

```python
import numpy as np
import pandas as pd
import multiprocessing
from joblib import Parallel, delayed
from tqdm import tqdm

num_cores = multiprocessing.cpu_count()

df = pd.DataFrame([["A", 5], ["B", 4], ["C", 7]], columns=["item", "val"])
inputs = ["A", "B"]

def my_function(inputs):
    for unique_id in inputs:
        df3 = ...  # df3 code (elided in the original post)
        df4 = ...  # df4 code (elided in the original post)
    return (df3, df4)

if __name__ == "__main__":
    df3, df4 = Parallel(n_jobs=num_cores)(delayed(my_function)(i) for i in inputs)
```

I can get the df3 and df4 output if I save them to CSV files, but when returning the two variables I get the following error:

***ValueError: not enough values to unpack (expected 2, got 1)***

What could be the reason, and how can I resolve it?


# Answer 1
**Score**: 1

It is not clear what your code does. However, you can try using `zip` and `pd.concat`:

```python
df3, df4 = zip(*Parallel(n_jobs=num_cores)(delayed(my_function)(i) for i in inputs))
df3 = pd.concat(df3)
df4 = pd.concat(df4)
```
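
To make the suggestion above concrete, here is a minimal sketch. Since the original `df3 code` and `df4 code` were not posted, the bodies below (a `doubled` and a `squared` column) are hypothetical stand-ins, and `run` is just an illustrative helper name. It also assumes `my_function` is meant to handle one id per call. The key point is that `Parallel(...)` returns a list with one result per call, so here it is a list of `(df3, df4)` tuples rather than a single pair. Unpacking that list directly into two names only succeeds when there happen to be exactly two tasks, and even then each name ends up holding a tuple. An error saying "got 1" would be consistent with the list containing a single result.

```python
import pandas as pd
from joblib import Parallel, delayed

df = pd.DataFrame([["A", 5], ["B", 4], ["C", 7]], columns=["item", "val"])


def my_function(unique_id):
    # Hypothetical stand-ins for the elided "df3 code" / "df4 code".
    rows = df[df["item"] == unique_id]
    df3 = rows.assign(doubled=rows["val"] * 2)
    df4 = rows.assign(squared=rows["val"] ** 2)
    return df3, df4


def run(inputs, n_jobs=2):
    # One (df3, df4) tuple per input, in the same order as `inputs`.
    results = Parallel(n_jobs=n_jobs)(delayed(my_function)(i) for i in inputs)
    # zip(*results) regroups [(a3, a4), (b3, b4)] into (a3, b3) and (a4, b4).
    df3_parts, df4_parts = zip(*results)
    return (
        pd.concat(df3_parts, ignore_index=True),
        pd.concat(df4_parts, ignore_index=True),
    )


if __name__ == "__main__":
    df3, df4 = run(["A", "B"])
    print(df3)
    print(df4)
```

Note that in the question's version, `my_function` receives a single string such as `"A"` and then loops over it, so the loop iterates over the characters of that string. Dropping the inner loop, as in the sketch, is one way to avoid that if each call should process one id.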

  • Posted by huangapple on June 15, 2023, 16:52:15
  • Please keep this link when reposting: https://go.coder-hub.com/76480760.html