How to refine heatmap?

Question

I have a pandas DataFrame that looks like this:

             SPX   RYH  RSP   RCD   RYE  ...   RTM   RHS   RYT   RYU  EWRE
Date                                     ...
2022-02-25   NaN   NaN  NaN   NaN   NaN  ...   NaN   NaN   NaN   NaN   NaN
2022-03-04   9.0   5.0  8.0  12.0   1.0  ...   6.0   4.0  11.0   2.0   3.0
2022-03-11   8.0  12.0  6.0  11.0   1.0  ...   3.0  13.0   9.0   2.0   4.0
2022-03-18   5.0   6.0  8.0   1.0  13.0  ...   9.0  10.0   2.0  12.0  11.0
2022-03-25   5.0  12.0  9.0  13.0   1.0  ...   2.0   4.0  10.0   3.0   7.0

Here is the info on it:

>>> a.ranks.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 52 entries, 2022-02-25 to 2023-02-17
Data columns (total 13 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   SPX     51 non-null     float64
 1   RYH     51 non-null     float64
 2   RSP     51 non-null     float64
 3   RCD     51 non-null     float64
 4   RYE     51 non-null     float64
 5   RYF     51 non-null     float64
 6   RGI     51 non-null     float64
 7   EWCO    51 non-null     float64
 8   RTM     51 non-null     float64
 9   RHS     51 non-null     float64
 10  RYT     51 non-null     float64
 11  RYU     51 non-null     float64
 12  EWRE    51 non-null     float64
dtypes: float64(13)
memory usage: 5.7 KB

I plot a heatmap of it like so:

    cmap = sns.diverging_palette(133, 10, as_cmap=True)
    sns.heatmap(self.ranks, cmap=cmap, annot=True, cbar=False)
    plt.show()

This is the result:

[Image: the resulting heatmap]

What I would like is the image flipped, with the symbols on the y-axis and the dates on the x-axis. I have tried .imshow() and the various pivot methods, to no avail.

I suspect I have two questions:

  1. Is seaborn or imshow the right way to go about this?
  2. How do I pivot a pandas DataFrame whose index is a datetime?

Answer 1

Score: 1

You can flip the x and y by transposing the dataframe (df.T, interchanging index and columns).
As the default datetime conversion also adds the time, the dates need to be converted manually to strings. sns.heatmap has parameters to explicitly set or change the tick labels.
Optionally, you can drop the all-NaN rows.
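
To see the transposition step on its own, here is a minimal sketch without any plotting. The toy frame, the three symbols, and the names `ranks`, `flipped`, and `date_labels` are made up for illustration; they only mimic the shape of the data in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 4 weekly dates x 3 symbols, with an all-NaN first row
# like the frame in the question.
dates = pd.date_range("2022-02-25", periods=4, freq="W-FRI")
ranks = pd.DataFrame(
    [[np.nan, np.nan, np.nan],
     [3.0, 1.0, 2.0],
     [2.0, 3.0, 1.0],
     [1.0, 2.0, 3.0]],
    index=dates, columns=["SPX", "RYH", "RSP"])

# Drop the rows that are entirely NaN, then swap index and columns.
flipped = ranks.dropna(how="all").T

# The dates are now the column labels; strftime gives date strings without a time part.
date_labels = list(flipped.columns.strftime("%Y-%m-%d"))

print(flipped.shape)   # (3, 3)
print(date_labels)     # ['2022-03-04', '2022-03-11', '2022-03-18']
```

No pivot is needed here: the frame is already in wide form, so `.T` is enough to put the symbols on the rows and the dates on the columns.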

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

# first create some test data similar to the given data
rank_df = pd.DataFrame(
    np.random.randint(1, 15, size=(52, 13)).astype(float),
    columns=['SPX', 'RYH', 'RSP', 'RCD', 'RYE', 'RYF', 'RGI', 'EWCO', 'RTM', 'RHS', 'RYT', 'RYU', 'EWRE'],
    index=pd.date_range('2022-02-25', '2023-02-17', freq='W-FRI'))
rank_df.iloc[0, :] = np.nan

# drop the all-NaN rows, then transpose so that symbols become rows and dates become columns
rank_df_transposed = rank_df.dropna(how='all').T
# convert the dates to strings by hand, so that no time part shows up in the tick labels
xticklabels = [t.strftime('%Y-%m-%d') for t in rank_df_transposed.columns]
# optionally remove repeating months: show the day, and add year-month only when the month changes
xticklabels = [t1[8:] + ('\n' + t1[:7] if t1[:7] != t0[:7] else '')
               for t0, t1 in zip([' ' * 10] + xticklabels[:-1], xticklabels)]

fig, ax = plt.subplots(figsize=(15, 7))
sns.heatmap(data=rank_df_transposed,
            xticklabels=xticklabels, yticklabels=True,
            annot=True, cbar=False, ax=ax)
ax.tick_params(axis='x', rotation=0)
ax.tick_params(axis='y', rotation=0)
plt.tight_layout()  # fit all the labels nicely into the surrounding figure
plt.show()

[Image: the resulting heatmap]
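
On the question of seaborn versus imshow: a similar picture can probably be drawn with plain matplotlib, at the cost of setting the ticks and annotations by hand. The following is only a rough sketch under assumed toy data (four symbols, six weekly dates, the `RdYlGn_r` colormap as a stand-in for the diverging palette); it is not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch also runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical toy data: 6 weekly dates x 4 symbols.
rng = np.random.default_rng(0)
dates = pd.date_range("2022-03-04", periods=6, freq="W-FRI")
symbols = ["SPX", "RYH", "RSP", "RCD"]
ranks = pd.DataFrame(rng.integers(1, 5, size=(6, 4)).astype(float),
                     index=dates, columns=symbols)

flipped = ranks.T  # symbols on the rows, dates on the columns
values = flipped.to_numpy()

fig, ax = plt.subplots(figsize=(8, 3))
im = ax.imshow(values, cmap="RdYlGn_r", aspect="auto")

# imshow knows nothing about the labels, so every tick is set explicitly.
ax.set_xticks(range(flipped.shape[1]))
ax.set_xticklabels(flipped.columns.strftime("%Y-%m-%d"), rotation=45, ha="right")
ax.set_yticks(range(flipped.shape[0]))
ax.set_yticklabels(flipped.index)

# Write each value into its cell (what annot=True does in sns.heatmap).
for (row, col), value in np.ndenumerate(values):
    ax.text(col, row, f"{value:.0f}", ha="center", va="center")

fig.tight_layout()
fig.savefig("ranks_imshow.png")
```

Since `sns.heatmap` handles the tick labels and annotations itself, it is likely the more convenient choice for this kind of table; imshow mainly helps when finer control over the image is needed.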

huangapple
  • Posted on 2023-02-24 07:55:56
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75551453.html