Visualize spectral data in Altair
Question
Plot spectrograms/spectral data in Altair.
I am attempting to plot a df that has frequency bins as the column titles and time/frames as the index; each cell value is the magnitude of that frequency at that time. How can I encode the y-axis with the column labels and have Altair recognize them as an encoding channel? Perhaps this is a trivial task and I'm just not conceptualizing a solution, or perhaps Altair is not ideally suited to it. In either case, any recommendations are appreciated.
The issue is that my frequency bins are the columns of the df (so the column labels would be the y-axis), the index is the time axis (x), and I would use the color channel to encode the value of each cell.
How can I pass all the column titles as an encoding for the y-axis? Or is there a method/strategy for reorienting the df? (Image: spectrogram)
I have a matplotlib version (as shown in the image), but Altair offers better interactivity and a couple of additional features that make it a more desirable option. I have a lot of data to process, and occasionally I need to pass along a visualization of specific objects.
As per Joel's request, here's a small sample of the dataset:
Thanks
import pandas as pd

col_freq = [172.265625, 344.53125, 516.796875, 689.0625]
row_1 = [1610974057651.0325, 1261973870532.6091,
         234137860730.91223, 42549716015.37]
row_2 = [4741489056189.282, 3278778293422.225,
         160494114891.44345, 57040784835.97968]
row_3 = [198776867252.5261, 661886049124.3528,
         188309227047.4264, 124549622810.97015]
data = [row_1, row_2, row_3]
df = pd.DataFrame(data, columns=col_freq)
df
Here's the analysis code, using scipy.signal.spectrogram:
import numpy as np
import pandas as pd
import altair as alt
from scipy import signal
from scipy.io import wavfile
from matplotlib import pyplot as plt

# This code produces the full df; for x, use any audio .wav file.
sample_rate, audio_file = wavfile.read('path/my_audio.wav')
size = 2048
window_func = signal.windows.hann(size)

def wave_to_spect(x, label):
    freq, time, Sxx = signal.spectrogram(x, sample_rate,
                                         window=window_func,
                                         nperseg=len(window_func))
    col_name = [str(f) for f in freq]
    df = pd.DataFrame(Sxx, index=col_name)
    df = df.T  # rows are time frames, columns are frequency bins
    # visualize in matplotlib
    plt.pcolormesh(time, freq, np.log(Sxx), shading='gouraud')
    plt.title('Spectrogram: ' + label)
    plt.ylabel('Frequency [Hz]')
    plt.xlabel('Time [sec]')
    plt.show()
    return df

df = wave_to_spect(audio_file, "audio_car")
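For readers without a .wav file at hand, a wide DataFrame of the same layout (time frames as the index, frequency bins as the columns) can be reproduced from a synthetic signal. This is only a sketch: the 44100 Hz sampling rate, the 440 Hz test tone, and the name wide_df are assumptions made for illustration and are not part of the original post.

```python
import numpy as np
import pandas as pd
from scipy import signal

sample_rate = 44100                        # assumed sampling rate in Hz
t = np.arange(sample_rate) / sample_rate   # one second of sample times
x = np.sin(2 * np.pi * 440.0 * t)          # hypothetical 440 Hz test tone

window_func = signal.windows.hann(2048)
freq, time, Sxx = signal.spectrogram(x, sample_rate,
                                     window=window_func,
                                     nperseg=len(window_func))

# Sxx has shape (len(freq), len(time)); transpose so that rows are
# time frames and columns are frequency bins, as in the question.
wide_df = pd.DataFrame(Sxx.T, index=time, columns=freq)
```

Keeping the frequencies as numeric column labels (rather than strings) makes it easier to sort them later on the y-axis.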
Answer 1
Score: 0
With the sample data posted, here is how it can be done:
- First, melt your DataFrame (convert from wide to long format):

df['time'] = df.index
df = pd.melt(df, id_vars=['time'])
df.columns = ['time', 'freq', 'value']
df.shape
# (12, 3)
df.head()
#    time        freq         value
# 0     0  172.265625  1.610974e+12
# 1     1  172.265625  4.741489e+12
# 2     2  172.265625  1.987769e+11
# 3     0   344.53125  1.261974e+12
# 4     1   344.53125  3.278778e+12
- Now visualize with Altair using mark_rect() (use the color channel to encode the values):

alt.Chart(df).mark_rect().encode(
    x='time:O',
    y='freq:O',
    color='value:Q'
)

This produces a heatmap with time on the x-axis, frequency on the y-axis, and the value encoded as color.