Plotting multiple datasets in a single graph
Question
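I have many datasets, taken from multiple Excel files, that I would like to plot on the same graph, each in a different color.
I have created 4 spreadsheets with random data for testing.
The first column identifies the measurement. The code should select one measurement, which consists of 5 rows of (X, Y) data, and add those rows to a dataframe. The result should be one dataset per file, with all datasets plotted together on the same graph, each in a different color.
[Spreadsheets](https://i.stack.imgur.com/yplW0.png)
I have been using modified code snippets taken from people here who were trying to do the same thing. The problem is that I cannot color each plot differently: because `pd.concat()` merges the datasets into a single dataframe, the program treats them as one dataset and draws them as one line. Do you know how I could overcome this?
Other questions about plotting multiple datasets in a single graph are almost all about a small number of datasets, whereas I have about 50, so I cannot create a subplot for each one unless there is a way to do it automatically.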
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import glob
import os
from os import path
import sys
import openpyxl

# create a list of all Excel files in the directory
xlsx_files = glob.glob(r'C:\Users\exx762\Desktop\*.xlsx')
files = []
n = len(xlsx_files)
index = 0

# select the chunk of data needed from each file and add it to the dataframe
for file in xlsx_files:
    index += 1
    files.append(pd.read_excel(file))
df_files = pd.concat(files)
ph_loops = df_files[df_files['Measurement'] == 2]
x = ph_loops['X']
y = ph_loops['Y']

# plot the elements in the dataframe
ax = plt.subplot()
colors = plt.cm.jet(np.linspace(0, 1, n))
ax.set_prop_cycle('color', list(colors))
ax.plot(x, y, marker='.', c=colors[index - 1], linewidth=0.5, markersize=2)
print(colors[index - 1])
ax.tick_params(axis='y', color='k')
ax.set_xlabel('X', fontsize=12, weight='bold')
ax.set_ylabel('Y', fontsize=12, weight='bold')
ax.set_title(file + '\n')
ax.tick_params(width=2)
plt.plot()
plt.show()
```
[Actual result](https://i.stack.imgur.com/9QVQQ.png)
# Answer 1
**Score**: 0
You can add an id field (I used `name` below) to the dataframes as you concatenate them, then plot in a loop over the groups. Example:
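```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Create example dataframes, each tagged with its own 'name' id
dfs = []
for i in range(1, 4):
    df = pd.DataFrame(np.random.randn(10, 2), columns=['x', 'y'])
    df.insert(0, 'name', i)
    dfs.append(df)

result = pd.concat(dfs, ignore_index=True)

# Plot one line per 'name' group on the same axes
fig, ax = plt.subplots()
for name, group in result.groupby('name'):
    group.plot(x='x', y='y', ax=ax, label=name)
plt.show()
```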
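As a rough sketch of how the same idea could be applied to the spreadsheets in the question, assuming each file has `Measurement`, `X` and `Y` columns as described there. The helper names `combine_with_labels` and `plot_by_dataset` are hypothetical, and the `jet` colormap is simply carried over from the question's code:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def combine_with_labels(frames, measurement=2):
    """Concatenate DataFrames, keeping one measurement and tagging each row.

    `frames` maps a label (for example a file name) to a DataFrame with
    'Measurement', 'X' and 'Y' columns (column names assumed from the question).
    """
    tagged = []
    for label, df in frames.items():
        subset = df.loc[df['Measurement'] == measurement, ['X', 'Y']].copy()
        subset.insert(0, 'name', label)  # id column used later by groupby
        tagged.append(subset)
    return pd.concat(tagged, ignore_index=True)


def plot_by_dataset(result, ax=None):
    """Draw one line per distinct 'name', each in its own colormap color."""
    if ax is None:
        _, ax = plt.subplots()
    groups = list(result.groupby('name', sort=False))
    # One evenly spaced color per dataset, so this scales to ~50 files
    colors = plt.cm.jet(np.linspace(0, 1, len(groups)))
    for color, (name, group) in zip(colors, groups):
        ax.plot(group['X'], group['Y'], marker='.', linewidth=0.5,
                markersize=2, color=color, label=name)
    ax.set_xlabel('X')
    ax.set_ylabel('Y')
    return ax
```

With the question's file list, `frames` could be built as `{os.path.basename(f): pd.read_excel(f) for f in xlsx_files}`. Calling `plot_by_dataset(combine_with_labels(frames))` followed by `plt.show()` should then draw one colored line per file. Adding `ax.legend()` is optional and may get crowded with around 50 datasets.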