英文:
Altair bar chart with bars of variable width?
问题
I'm trying to use Altair in Python to make a bar chart where the bars have varying width depending on the data in a column of the source dataframe. The ultimate goal is to get a chart like this one.
The height of the bars corresponds to a marginal-cost of each energy-technology (given as a column in the source dataframe). The bar width corresponds to the capacity of each energy-technology (also given as columns in the source dataframe). Colors are ordinal data also from the source dataframe. The bars are sorted in increasing order of marginal cost. (A plot like this is called a "generation stack" in the energy industry). This is easy to achieve in matplotlib like shown in the code below:
import matplotlib.pyplot as plt
# Make fake dataset
height = [3, 12, 5, 18, 45]
bars = ('A', 'B', 'C', 'D', 'E')
# Choose the width of each bar and their positions
width = [0.1, 0.2, 3, 1.5, 0.3]
y_pos = [0, 0.3, 2, 4.5, 5.5]
# Make the plot
plt.bar(y_pos, height, width=width)
plt.xticks(y_pos, bars)
plt.show()
But is there a way to do this with Altair? I would want to do this with Altair so I can still get the other great features of Altair like a tooltip, selectors/bindings as I have lots of other data I want to show alongside the bar-chart.
First 20 rows of my source data looks like this.
英文:
I'm trying to use Altair in Python to make a bar chart where the bars have varying width depending on the data in a column of the source dataframe. The ultimate goal is to get a chart like this one:
The height of the bars corresponds to a marginal-cost of each energy-technology (given as a column in the source dataframe). The bar width corresponds to the capacity of each energy-technology (also given as a columns in the source dataframe). Colors are ordinal data also from the source dataframe. The bars are sorted in increasing order of marginal cost. (A plot like this is called a "generation stack" in the energy industry). This is easy to achieve in matplotlib like shown in the code below:
import matplotlib.pyplot as plt
# Make fake dataset
height = [3, 12, 5, 18, 45]
bars = ('A', 'B', 'C', 'D', 'E')
# Choose the width of each bar and their positions
width = [0.1,0.2,3,1.5,0.3]
y_pos = [0,0.3,2,4.5,5.5]
# Make the plot
plt.bar(y_pos, height, width=width)
plt.xticks(y_pos, bars)
plt.show()
(code from https://python-graph-gallery.com/5-control-width-and-space-in-barplots/)
But is there a way to do this with Altair? I would want to do this with Altair so I can still get the other great features of Altair like a tooltip, selectors/bindings as I have lots of other data I want to show alongside the bar-chart.
First 20 rows of my source data looks like this:
(does not match exactly the chart shown above).
答案1
得分: 8
以下是您要翻译的代码部分:
在Altair中,要实现这一点的方法是使用rect
标记并明确构建您的条形图。以下是一个模仿您的数据的示例:
import altair as alt
import pandas as pd
import numpy as np
np.random.seed(0)
df = pd.DataFrame({
'MarginalCost': 100 * np.random.rand(30),
'Capacity': 10 * np.random.rand(30),
'Technology': np.random.choice(['SOLAR', 'THERMAL', 'WIND', 'GAS'], 30)
})
df = df.sort_values('MarginalCost')
df['x1'] = df['Capacity'].cumsum()
df['x0'] = df['x1'].shift(fill_value=0)
alt.Chart(df).mark_rect().encode(
x=alt.X('x0:Q', title='Capacity'),
x2='x1',
y=alt.Y('MarginalCost:Q', title='Marginal Cost'),
color='Technology:N',
tooltip=["Technology", "Capacity", "MarginalCost"]
)
要在不对数据进行预处理的情况下获得相同的结果,您可以使用Altair的转换语法:
df = pd.DataFrame({
'MarginalCost': 100 * np.random.rand(30),
'Capacity': 10 * np.random.rand(30),
'Technology': np.random.choice(['SOLAR', 'THERMAL', 'WIND', 'GAS'], 30)
})
alt.Chart(df).transform_window(
x1='sum(Capacity)',
sort=[alt.SortField('MarginalCost')]
).transform_calculate(
x0='datum.x1 - datum.Capacity'
).mark_rect().encode(
x=alt.X('x0:Q', title='Capacity'),
x2='x1',
y=alt.Y('MarginalCost:Q', title='Marginal Cost'),
color='Technology:N',
tooltip=["Technology", "Capacity", "MarginalCost"]
)
英文:
In Altair, the way to do this would be to use the rect
mark and construct your bars explicitly. Here is an example that mimics your data:
import altair as alt
import pandas as pd
import numpy as np
np.random.seed(0)
df = pd.DataFrame({
'MarginalCost': 100 * np.random.rand(30),
'Capacity': 10 * np.random.rand(30),
'Technology': np.random.choice(['SOLAR', 'THERMAL', 'WIND', 'GAS'], 30)
})
df = df.sort_values('MarginalCost')
df['x1'] = df['Capacity'].cumsum()
df['x0'] = df['x1'].shift(fill_value=0)
alt.Chart(df).mark_rect().encode(
x=alt.X('x0:Q', title='Capacity'),
x2='x1',
y=alt.Y('MarginalCost:Q', title='Marginal Cost'),
color='Technology:N',
tooltip=["Technology", "Capacity", "MarginalCost"]
)
To get the same result without preprocessing of the data, you can use Altair's transform syntax:
df = pd.DataFrame({
'MarginalCost': 100 * np.random.rand(30),
'Capacity': 10 * np.random.rand(30),
'Technology': np.random.choice(['SOLAR', 'THERMAL', 'WIND', 'GAS'], 30)
})
alt.Chart(df).transform_window(
x1='sum(Capacity)',
sort=[alt.SortField('MarginalCost')]
).transform_calculate(
x0='datum.x1 - datum.Capacity'
).mark_rect().encode(
x=alt.X('x0:Q', title='Capacity'),
x2='x1',
y=alt.Y('MarginalCost:Q', title='Marginal Cost'),
color='Technology:N',
tooltip=["Technology", "Capacity", "MarginalCost"]
)
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论