Question

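I have many datasets, taken from multiple Excel files, that I would like to plot on the same graph, each in a different color.

I have created 4 spreadsheets with random data for testing.

[Spreadsheets](https://i.stack.imgur.com/yplW0.png)

The first column identifies the measurement. The code should select one measurement, which contains 5 rows of (X, Y) data, from each file and add it to a dataframe. The result should be one dataset per file, all plotted together on the same graph, each in a different color.

I have been using modified code snippets taken from people here who were trying to do the same thing. The problem is that I cannot color each plot differently, because the program treats them as one dataset: `pd.concat()` merges them into a single line. Do you know how I could overcome this?

Other questions about plotting multiple datasets in a single graph almost all deal with a small number of datasets. In my case I have about 50, so I cannot create a subplot for each of them unless there is a way to do it automatically.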

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import glob
import os
from os import path
import sys
import openpyxl 

# Create a list of all Excel files in the directory
xlsx_files = glob.glob(r'C:\Users\exx762\Desktop\*.xlsx')
files = []
n = len(xlsx_files)
index = 0

# Select the chunk of data needed from each file and add it to the dataframe
for file in xlsx_files:
    index += 1
    files.append(pd.read_excel(file))
df_files = pd.concat(files)
ph_loops = df_files[df_files['Measurement'] == 2]
x = ph_loops['X']
y = ph_loops['Y']

# Plot the elements in the dataframe
ax = plt.subplot()
colors = plt.cm.jet(np.linspace(0, 1, n))
ax.set_prop_cycle('color', list(colors))
ax.plot(x, y, marker='.', c=colors[index - 1], linewidth=0.5, markersize=2)
print(colors[index - 1])
ax.tick_params(axis='y', color='k')
ax.set_xlabel('X', fontsize=12, weight='bold')
ax.set_ylabel('Y', fontsize=12, weight='bold')
ax.set_title(file + '\n')
ax.tick_params(width=2)
plt.plot()
plt.show()

[Actual result](https://i.stack.imgur.com/9QVQQ.png)




# Answer 1
**Score**: 0

You can add an id field (I used `name` below) to the dataframes as you concatenate them; then you can plot each group in a loop. Example:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Create example dataframes, each tagged with an id in the 'name' column
dfs = []
for i in range(1, 4):
    df = pd.DataFrame(np.random.randn(10, 2), columns=['x', 'y'])
    df.insert(0, 'name', i)
    dfs.append(df)

result = pd.concat(dfs, ignore_index=True)

# Plot one line per id on the same axes
fig, ax = plt.subplots()
for name, group in result.groupby('name'):
    group.plot(x='x', y='y', ax=ax, label=name)

plt.show()
```
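The same idea can be adapted to the setup in the question. The following is only a sketch under stated assumptions: it assumes each spreadsheet has the `Measurement`, `X`, and `Y` columns used in the question's code. The helper names `tag_and_concat` and `plot_by_name` are hypothetical, and synthetic dataframes stand in for the real Excel files so that the example runs on its own. Each file's rows are tagged with a `name` column before concatenation, and each group then gets its own color from the `jet` colormap.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def tag_and_concat(frames_by_name, measurement=2):
    """Keep one measurement per frame, tag its rows with the source name, concatenate."""
    tagged = []
    for name, df in frames_by_name.items():
        block = df[df["Measurement"] == measurement].copy()
        block.insert(0, "name", name)  # id column, as in the answer above
        tagged.append(block)
    return pd.concat(tagged, ignore_index=True)


def plot_by_name(result, ax=None):
    """Draw one line per 'name' group, each with its own colormap color."""
    if ax is None:
        _, ax = plt.subplots()
    groups = list(result.groupby("name", sort=False))
    colors = plt.cm.jet(np.linspace(0, 1, len(groups)))
    for color, (name, group) in zip(colors, groups):
        ax.plot(group["X"], group["Y"], marker=".", linewidth=0.5,
                markersize=2, color=color, label=str(name))
    ax.set_xlabel("X")
    ax.set_ylabel("Y")
    return ax


# Synthetic stand-ins for the spreadsheets (assumption): 3 measurements x 5 rows each.
rng = np.random.default_rng(0)
frames = {
    f"file_{i}.xlsx": pd.DataFrame({
        "Measurement": np.repeat([1, 2, 3], 5),
        "X": np.tile(np.arange(5), 3),
        "Y": rng.normal(size=15),
    })
    for i in range(4)
}

result = tag_and_concat(frames)
ax = plot_by_name(result)
ax.legend(fontsize=6)
```

With real files, `frames` could instead be built from the paths returned by `glob.glob`, for example `{os.path.basename(p): pd.read_excel(p) for p in xlsx_files}`, followed by `plt.show()` to display the figure. With around 50 datasets the legend becomes crowded, so it may be preferable to omit it.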
