Processing my dataframes with conditions - Python Jupyter notebook

Question

This is my first post and I'm not very experienced with programming or Python. I hope I can describe this well; please be patient with me.

For my studies I use Jupyter Notebook (numpy, pandas, etc.) to process and plot my data.
Here it has to do with X-ray beams and a Geiger-Müller counter.
It looks like this:

# Load the tab-separated text file (first two lines skipped, decimal commas)
data1_1_KBr = pd.read_csv("1_1_SpektrumCuKBr_Daten.txt", skiprows=2, usecols=[0, 1],
                          names=["Winkel1_1_KBr", "Rate1_1_KBr"], delimiter="\t", decimal=",")

# Put the data into a dataframe
df1_1_KBr = pd.DataFrame(data1_1_KBr, columns=["Winkel1_1_KBr", "Rate1_1_KBr"])

# This is the correction of my values (the definition of T is not shown here)
N_1_1_KBr = df1_1_KBr.Rate1_1_KBr / (1 - T * df1_1_KBr.Rate1_1_KBr)

The correction should only apply to higher values of "Rate1_1_KBr".

The processed data should be a pandas dataframe or array-like data that looks the same as my original data, except that the affected entries are corrected.

I tried to write a loop and failed.

It should be something like this (no code, just thoughts):

If the values of "Rate1_1_KBr" are smaller than 200, put them into my new dataframe unchanged; if the values of "Rate1_1_KBr" are bigger than 200, process them with my correction and then put them into my new dataframe.

It would be awesome if somebody had a nice explanation for a beginner like me.

Answer 1

Score: 2

I'm not sure if you're familiar with NumPy, but you can use boolean indexing with dataframes the same way. Also, the pd.DataFrame call in your code is redundant, as pd.read_csv already returns a dataframe. So the code you want looks like this:

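import pandas as pd

# T must already be defined (assumed here to be a constant)
df1_1_KBr = pd.read_csv("1_1_SpektrumCuKBr_Daten.txt", skiprows=2, usecols=[0, 1],
                        names=["Winkel1_1_KBr", "Rate1_1_KBr"], delimiter="\t", decimal=",")
# Rows whose rate is above the threshold
correction_df = df1_1_KBr[df1_1_KBr.Rate1_1_KBr > 200]
# Copy the original data, then overwrite only the selected rows via .loc
N_1_1_KBr = df1_1_KBr.copy()
N_1_1_KBr.loc[correction_df.index, 'Rate1_1_KBr'] = correction_df.Rate1_1_KBr / (1 - T * correction_df.Rate1_1_KBr)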

Note that there is no need to use apply as it is much slower than indexing this way.

Also, I'm not sure whether T is a constant value or part of a dataframe. If it is the former, this code should work.
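
To see the masking step in isolation, here is a minimal sketch on made-up numbers. It assumes T is a single constant; the value 1e-4 and the sample rates are hypothetical and only chosen so the arithmetic is easy to follow. The rates are stored as floats so the corrected values fit into the column without a dtype change.

```python
import pandas as pd

# Hypothetical constant; replace it with the value from your experiment.
T = 1e-4

# Made-up sample data using the column names from the question.
df1_1_KBr = pd.DataFrame({
    "Winkel1_1_KBr": [3.0, 3.1, 3.2, 3.3],
    "Rate1_1_KBr": [50.0, 200.0, 400.0, 1000.0],
})

# Boolean mask: True for every row whose rate is above the threshold.
needs_correction = df1_1_KBr["Rate1_1_KBr"] > 200

# Work on a copy so the original data stays untouched.
N_1_1_KBr = df1_1_KBr.copy()
high_rates = df1_1_KBr.loc[needs_correction, "Rate1_1_KBr"]
N_1_1_KBr.loc[needs_correction, "Rate1_1_KBr"] = high_rates / (1 - T * high_rates)

print(N_1_1_KBr)
```

Rows with a rate of 200 or less are copied over unchanged, and only the rows selected by the mask are overwritten.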

Answer 2

Score: 1

Welcome Dennis,

def proc_rate(rate, threshold, T):
    # Return small rates unchanged; correct rates at or above the threshold
    if rate < threshold:
        return rate
    else:
        return rate / (1 - T * rate)

ans = df1_1_KBr.Rate1_1_KBr.apply(proc_rate, threshold=200, T=T)

If you want to write the series back to the original dataframe:

df1_1_KBr.loc[:, 'Rate1_1_KBr'] = df1_1_KBr.Rate1_1_KBr.apply(proc_rate, threshold=200, T=T)

You can also filter the rates first and apply the correction only to those at or above the threshold:

sub_frame = df1_1_KBr.Rate1_1_KBr[df1_1_KBr.Rate1_1_KBr >= 200].apply(proc_rate, threshold=200, T=T)
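
As a cross-check, the row-by-row apply version can be compared with a vectorized one. The sketch below uses Series.mask, which replaces values wherever a condition is True. The sample rates and T = 1e-4 are made up for illustration. Since proc_rate also corrects a rate exactly equal to the threshold, the mask uses >= to match it.

```python
import pandas as pd

# Hypothetical constant; replace it with the value from your experiment.
T = 1e-4

def proc_rate(rate, threshold, T):
    # Return small rates unchanged; correct rates at or above the threshold.
    if rate < threshold:
        return rate
    else:
        return rate / (1 - T * rate)

# Made-up sample rates.
rates = pd.Series([50.0, 200.0, 400.0, 1000.0], name="Rate1_1_KBr")

# Row-by-row version: calls proc_rate once per value.
via_apply = rates.apply(proc_rate, threshold=200, T=T)

# Vectorized version: compute the correction for all rows at once,
# then keep it only where the condition holds.
via_mask = rates.mask(rates >= 200, rates / (1 - T * rates))

# Both approaches should agree up to floating-point rounding.
same = ((via_apply - via_mask).abs() < 1e-9).all()
print(same)
```

Both versions give the same numbers; the vectorized one avoids a Python function call per row, which matters more as the dataframe grows.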

huangapple
  • Posted on January 6, 2020, 21:06:13
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59612674.html