Processing my dataframes with conditions - python jupyter notebook
Question
This is my first post, and I'm not very experienced with programming or Python. I hope I can describe this well; please be patient with me.
For my studies I use Jupyter Notebook (NumPy, pandas, etc.) to process and plot my data.
The data come from X-ray measurements with a Geiger-Müller counter.
My code looks like this:
# load the tab-separated text file (decimal comma)
data1_1_KBr = pd.read_csv("1_1_SpektrumCuKBr_Daten.txt", skiprows=2, usecols=[0, 1],
                          names=["Winkel1_1_KBr", "Rate1_1_KBr"], delimiter="\t", decimal=",")
# put it into a dataframe
df1_1_KBr = pd.DataFrame(data1_1_KBr, columns=["Winkel1_1_KBr", "Rate1_1_KBr"])
# this is the correction of my values
N_1_1_KBr = df1_1_KBr.Rate1_1_KBr / (1 - T * df1_1_KBr.Rate1_1_KBr)
The correction should only be applied to the higher values of "Rate1_1_KBr".
The processed data should be a pandas dataframe or array-like object that looks the same as my original data, but with the corrected entries.
I tried to write a loop, but I failed.
It should work something like this (no code, just thoughts):
If a value of "Rate1_1_KBr" is smaller than 200, put it into my new dataframe unchanged; if it is bigger than 200, apply my correction first and then put it into my new dataframe.
It would be awesome if somebody had a nice explanation for a beginner like me.
Answer 1
Score: 2
I'm not sure if you're familiar with NumPy, but you can use boolean indexing with dataframes the same way. Also, the second statement in your code (the pd.DataFrame call) is redundant, because pd.read_csv already returns a dataframe. So the code you want looks like this:
import pandas as pd

# T must already be defined (assumed here to be a scalar constant)
df1_1_KBr = pd.read_csv("1_1_SpektrumCuKBr_Daten.txt", skiprows=2, usecols=[0, 1],
                        names=["Winkel1_1_KBr", "Rate1_1_KBr"], delimiter="\t", decimal=",")
# rows whose rate exceeds the threshold
correction_df = df1_1_KBr[df1_1_KBr.Rate1_1_KBr > 200]
# work on a copy so the original data stays untouched
N_1_1_KBr = df1_1_KBr.copy()
# .loc is needed for label-based (row, column) assignment
N_1_1_KBr.loc[correction_df.index, "Rate1_1_KBr"] = correction_df.Rate1_1_KBr / (1 - T * correction_df.Rate1_1_KBr)
Note that there is no need to use apply, as it is typically much slower than indexing this way.
I'm also not sure whether T is a constant value or part of a dataframe. If the former, this code should work.
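For a beginner it may help to see the same idea run on a few made-up rows. The sketch below is only an illustration: the sample values and `T = 1e-4` are assumptions (the real value of T is not given in the question), and it uses `numpy.where` as an alternative to the index-based assignment above.

```python
import numpy as np
import pandas as pd

# Hypothetical constant; the real value of T is not given in the question.
T = 1e-4

# Made-up stand-in for the measured data, with the same column names.
df1_1_KBr = pd.DataFrame({
    "Winkel1_1_KBr": [3.0, 3.1, 3.2, 3.3],
    "Rate1_1_KBr": [50.0, 150.0, 250.0, 1000.0],
})

rate = df1_1_KBr["Rate1_1_KBr"]
# The correction is computed for every row (cheap and vectorized) ...
corrected = rate / (1 - T * rate)

# ... but only used where the rate is above the threshold.
N_1_1_KBr = df1_1_KBr.copy()
N_1_1_KBr["Rate1_1_KBr"] = np.where(rate > 200, corrected, rate)

print(N_1_1_KBr)
```

The result has the same shape and columns as the input; only the rows with a rate above 200 change. Note that the formula blows up as `T * rate` approaches 1, so it is worth checking the largest rate against the chosen T.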
Answer 2
Score: 1
Welcome Dennis,
You can define a function that applies the correction conditionally and pass it to apply:
def proc_rate(rate, threshold, T):
    # leave rates below the threshold unchanged, correct the rest
    if rate < threshold:
        return rate
    else:
        return rate / (1 - T * rate)

ans = df1_1_KBr.Rate1_1_KBr.apply(proc_rate, threshold=200, T=T)
If you want to write the series back to the original data frame:
df1_1_KBr.loc[:, 'Rate1_1_KBr'] = df1_1_KBr.Rate1_1_KBr.apply(proc_rate, threshold=200, T=T)
You can also filter the rates first and then apply the correction only to the values at or above the threshold:
sub_frame = df1_1_KBr.Rate1_1_KBr[df1_1_KBr.Rate1_1_KBr >= 200].apply(proc_rate, threshold=200, T=T)
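If you want to convince yourself that the row-wise function does what you expect, you can compare it against a vectorized version on a handful of test values. This is just a sketch: the rates and `T = 1e-4` are made up for illustration, and `Series.where` is used here as one possible vectorized counterpart, not something the question requires.

```python
import numpy as np
import pandas as pd

T = 1e-4          # hypothetical constant, not taken from the question
threshold = 200

def proc_rate(rate, threshold, T):
    # Leave rates below the threshold unchanged, correct the rest.
    if rate < threshold:
        return rate
    return rate / (1 - T * rate)

# Made-up rates, including one exactly at the threshold.
rates = pd.Series([50.0, 199.0, 200.0, 1000.0], name="Rate1_1_KBr")

# Row-wise version: proc_rate is called once per value.
via_apply = rates.apply(proc_rate, threshold=threshold, T=T)

# Vectorized version: keep values where the condition holds,
# take the corrected value everywhere else.
via_where = rates.where(rates < threshold, rates / (1 - T * rates))

print(np.allclose(via_apply, via_where))
```

Both versions treat a rate of exactly 200 as "to be corrected", because the function only passes values strictly below the threshold through unchanged.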