Display both axes in sorted order for non numerical data


Question

How can I get the correct order on both axes:

  • a-b-c instead of c-a-b
  • x-y-z instead of y-z-x
import matplotlib.pyplot as plt


categories_x = ["c", "a", "b", "c", "b"]
categories_y = ["y", "z", "y", "x", "z"]

plt.scatter(categories_x, categories_y)
plt.show()

There are a lot of solutions on SO that rely on one of two properties:

  1. The data can be cast to numerical values, e.g. ("1", "0", ...) -> cast to numbers
  2. Only one axis has the wrong order -> sort the two arrays by this axis (this works because the axis ticks are ordered by first occurrence)

But neither of these solutions works for my example.

I'm looking for a way to get this to work in matplotlib. I am aware that there are other, probably even better, ways to convey the same message, and maybe other libraries that don't have this issue.
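
The c-a-b and y-z-x orders in the question follow from the first-occurrence rule mentioned in point 2. As a minimal sketch in plain Python (it does not call matplotlib, and it assumes the rule holds exactly as the question describes it), de-duplicating each list while keeping insertion order reproduces the tick order:

```python
categories_x = ["c", "a", "b", "c", "b"]
categories_y = ["y", "z", "y", "x", "z"]

# dict.fromkeys keeps only the first occurrence of each key, in insertion order
first_seen_x = list(dict.fromkeys(categories_x))
first_seen_y = list(dict.fromkeys(categories_y))

print(first_seen_x)  # ['c', 'a', 'b']
print(first_seen_y)  # ['y', 'z', 'x']
```

Sorting the data by one axis can fix that axis, but the other axis then still follows whatever first-occurrence order results.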

Answer 1

Score: 1

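What about using pandas and an ordered Categorical?

import matplotlib.pyplot as plt
import pandas as pd

ax = plt.subplot()

# with no explicit categories, pandas sorts the unique values (a, b, c and x, y, z)
X = pd.Categorical(categories_x, ordered=True)
Y = pd.Categorical(categories_y, ordered=True)

# plot the integer codes, then label each tick with its category
ax.scatter(X.codes, Y.codes)
ax.set_xticks(range(len(X.categories)), X.categories)
ax.set_yticks(range(len(Y.categories)), Y.categories)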

Answer 2

Score: 1

How about simply converting everything to numerical values and setting the x and y tick labels accordingly?

import matplotlib.pyplot as plt

categories_x = ["c", "a", "b", "c", "b"]
categories_y = ["y", "z", "y", "x", "z"]

def axis_to_number(values):
    # this mapping function can be customized 
    return {j:i for i, j in enumerate(sorted(set(values)))}

map_x = axis_to_number(categories_x)
map_y = axis_to_number(categories_y)

# now convert the original arrays to the 
# mapped values to keep the order
cx = [map_x[i] for i in categories_x]
cy = [map_y[i] for i in categories_y]

# tick positions are the mapped integers, tick labels are the original categories
xticks, xticklabels = list(map_x.values()), list(map_x.keys())
yticks, yticklabels = list(map_y.values()), list(map_y.keys())

# plot
fig, ax = plt.subplots()
ax.plot(cx, cy, 'o')
ax.set_xticks(xticks)
ax.set_xticklabels(xticklabels)
ax.set_yticks(yticks)
ax.set_yticklabels(yticklabels)
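
The comment in axis_to_number notes that the mapping can be customized. As a minimal sketch of one way to do that, the hypothetical helper axis_to_number_with_order below (not part of the original answer) accepts an explicit category order instead of always sorting alphabetically. The reversed y order is only an illustrative assumption:

```python
def axis_to_number_with_order(values, order=None):
    """Map each category to an integer axis position.

    `order` is a hypothetical optional argument: an explicit list of
    categories in the desired axis order. Without it, the unique
    categories are sorted, as in the answer's axis_to_number.
    """
    if order is None:
        order = sorted(set(values))
    missing = set(values) - set(order)
    if missing:
        raise ValueError(f"categories missing from order: {sorted(missing)}")
    return {category: position for position, category in enumerate(order)}


categories_y = ["y", "z", "y", "x", "z"]

default_map = axis_to_number_with_order(categories_y)
custom_map = axis_to_number_with_order(categories_y, order=["z", "y", "x"])

print(default_map)  # {'x': 0, 'y': 1, 'z': 2}
print(custom_map)   # {'z': 0, 'y': 1, 'x': 2}
print([custom_map[v] for v in categories_y])  # [1, 0, 1, 2, 0]
```

The returned dictionary can be used in the same way as map_x and map_y in the answer's code.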

Answer 3

Score: 0

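We can use the sorted function to arrange them in sequence. Note that this sorts each list independently, so it also changes which x value is paired with which y value.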

import matplotlib.pyplot as plt

categories_x = ["c", "a", "b", "c", "b"]
categories_y = ["y", "z", "y", "x", "z"]

plt.scatter(sorted(categories_x), sorted(categories_y))
plt.show()
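
One caveat with this approach, sketched in plain Python: sorting the two lists independently changes the (x, y) pairs, so the plotted points are no longer those of the original data. The check below only compares the pairs and does not call matplotlib:

```python
categories_x = ["c", "a", "b", "c", "b"]
categories_y = ["y", "z", "y", "x", "z"]

# pairs as given in the question vs. pairs after sorting each list on its own
original_points = set(zip(categories_x, categories_y))
sorted_points = set(zip(sorted(categories_x), sorted(categories_y)))

print(sorted(original_points))
# [('a', 'z'), ('b', 'y'), ('b', 'z'), ('c', 'x'), ('c', 'y')]
print(sorted(sorted_points))
# [('a', 'x'), ('b', 'y'), ('c', 'z')]
print(original_points == sorted_points)  # False
```

The axes come out in sorted order, but only because the data itself has been rearranged.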

Answer 4

Score: 0

You could first draw a scatter with the sorted strings to get the ticks set up properly, then hide it (or use a transparent color) and draw the actual diagram:

import matplotlib.pyplot as plt

categories_x = ["c", "a", "b", "c", "b"]
categories_y = ["y", "z", "y", "x", "z"]

# fully transparent helper scatter: it only fixes the tick order on both axes
p1 = plt.scatter(sorted(categories_x), sorted(categories_y), c='#00000000')
# p1.set_visible(False)

# the actual data, drawn on axes whose categories are already in sorted order
plt.scatter(categories_x, categories_y)

plt.show()

You may want to use sorted(set(categ...)) if you have a large number of points and performance becomes a concern.
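
If the helper scatter is built from sorted(set(...)), the two lists can end up with different lengths, and plt.scatter expects x and y of the same size. A minimal sketch of one way around this, using a hypothetical helper seed_points (not part of the answer) that pads the shorter list by repeating its last category:

```python
def seed_points(xs, ys):
    """Return two equal-length lists covering every unique category in sorted order.

    Hypothetical helper: the shorter list is padded by repeating its last
    element, which does not change the order of first occurrence.
    """
    unique_x = sorted(set(xs))
    unique_y = sorted(set(ys))
    length = max(len(unique_x), len(unique_y))
    padded_x = unique_x + unique_x[-1:] * (length - len(unique_x))
    padded_y = unique_y + unique_y[-1:] * (length - len(unique_y))
    return padded_x, padded_y


# illustrative data with three x categories and four y categories
seed_x, seed_y = seed_points(["c", "a", "b", "c"], ["y", "z", "w", "x"])
print(seed_x)  # ['a', 'b', 'c', 'c']
print(seed_y)  # ['w', 'x', 'y', 'z']
```

The returned lists could then be passed to the transparent plt.scatter call in place of sorted(categories_x) and sorted(categories_y).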

Posted by huangapple on June 5, 2023, 19:36:03
Please keep this link when reposting: https://go.coder-hub.com/76406024.html