January 6, 2020 21:06:13

Processing my dataframes with conditions - python jupyter notebook

Question

This is my first post, and I'm not very experienced with programming or Python. I hope I can describe this well; please be patient with me.

For my studies I use Jupyter Notebook (numpy, pandas, etc.) to process and plot my data.
In this case the data comes from X-ray beams measured with a Geiger-Müller counter.
My code looks like this:

# loading my txt
data1_1_KBr = pd.read_csv("1_1_SpektrumCuKBr_Daten.txt", skiprows=2, usecols=[0, 1],
                          names=["Winkel1_1_KBr", "Rate1_1_KBr"], delimiter="\t", decimal=",")
# putting into dataframe
df1_1_KBr = pd.DataFrame(data1_1_KBr, columns=["Winkel1_1_KBr", "Rate1_1_KBr"])
# this is the correction of my values
N_1_1_KBr = df1_1_KBr.Rate1_1_KBr / (1 - T * df1_1_KBr.Rate1_1_KBr)

The correction should apply only to the higher values of "Rate1_1_KBr".

The processed data should be a pandas DataFrame or array-like data that looks the same as my original data, except that the affected entries are corrected.

I tried to write a loop, but I failed.

It should work something like this (no code, just thoughts):

If a value of "Rate1_1_KBr" is smaller than 200, put it into my new DataFrame unchanged. If a value of "Rate1_1_KBr" is bigger than 200, process it with my correction and then put it into my new DataFrame.

It would be awesome if somebody had a nice explanation for a beginner like me.

Answer 1

Score: 2

I'm not sure if you're familiar with numpy, but you can index DataFrames the same way you index numpy arrays, using a boolean condition. Also, the pd.DataFrame(...) call in your code is redundant, because pd.read_csv already returns a DataFrame. So the code you want looks like this:

df1_1_KBr = pd.read_csv("1_1_SpektrumCuKBr_Daten.txt", skiprows=2, usecols=[0, 1],
                        names=["Winkel1_1_KBr", "Rate1_1_KBr"], delimiter="\t", decimal=",")
# rows whose rate exceeds the threshold and therefore need the correction
correction_df = df1_1_KBr[df1_1_KBr.Rate1_1_KBr > 200]
N_1_1_KBr = df1_1_KBr.copy()
# .loc is needed for label-based (row index, column) assignment
N_1_1_KBr.loc[correction_df.index, 'Rate1_1_KBr'] = correction_df.Rate1_1_KBr / (1 - T * correction_df.Rate1_1_KBr)

Note that there is no need to use apply, which is much slower than indexing this way.

I'm also not sure whether T is a constant value or part of a DataFrame. If it is a constant, this code should work.
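
For anyone who wants to try the idea without the original text file, here is a minimal sketch on made-up numbers. The sample values and the value of T are assumptions for illustration only (the question does not give T), and the shorter names df, mask and corrected are hypothetical stand-ins for the longer ones above.

```python
import pandas as pd

# Hypothetical sample data standing in for the measured spectrum.
df = pd.DataFrame({
    "Winkel1_1_KBr": [3.0, 3.1, 3.2, 3.3],
    "Rate1_1_KBr": [50.0, 150.0, 400.0, 1000.0],
})

T = 1e-4  # assumed constant; the real value is not stated in the question

# Boolean mask that is True only for the rows needing the correction.
mask = df["Rate1_1_KBr"] > 200

# Work on a copy so the original measurements stay untouched.
corrected = df.copy()
corrected.loc[mask, "Rate1_1_KBr"] = (
    df.loc[mask, "Rate1_1_KBr"] / (1 - T * df.loc[mask, "Rate1_1_KBr"])
)

print(corrected)
```

Rows at or below 200 keep their original rate, and only the masked rows are overwritten, so the result has the same shape and column names as the input.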

Answer 2

Score: 1

Welcome Dennis,

def proc_rate(rate, threshold, T):
    if rate < threshold:
        return rate
    else:
        return rate / (1 - T * rate)
ans = df1_1_KBr.Rate1_1_KBr.apply(proc_rate, threshold=200, T=T)

If you want to write the series back to the original DataFrame:

df1_1_KBr.loc[:, 'Rate1_1_KBr'] = df1_1_KBr.Rate1_1_KBr.apply(proc_rate, threshold=200, T=T)

You can also filter the rates first and then apply the correction only to the values at or above the threshold:

sub_frame = df1_1_KBr.Rate1_1_KBr[df1_1_KBr.Rate1_1_KBr >= 200].apply(proc_rate, threshold=200, T=T)
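
As a possible alternative to apply, the same piecewise rule can probably be written in one vectorized step with numpy.where. This is only a sketch on made-up numbers: the sample rates and the value of T are assumptions, and the names rates, via_apply and via_where are hypothetical.

```python
import numpy as np
import pandas as pd


def proc_rate(rate, threshold, T):
    # Leave small rates untouched, correct the rest.
    if rate < threshold:
        return rate
    else:
        return rate / (1 - T * rate)


# Hypothetical sample rates; T is an assumed constant.
rates = pd.Series([50.0, 150.0, 400.0, 1000.0], name="Rate1_1_KBr")
T = 1e-4

# Element-by-element version, as in the answer above.
via_apply = rates.apply(proc_rate, threshold=200, T=T)

# Vectorized version: np.where picks the original value where the
# condition holds and the corrected value everywhere else.
via_where = pd.Series(
    np.where(rates < 200, rates, rates / (1 - T * rates)),
    index=rates.index,
    name=rates.name,
)

print(np.allclose(via_apply, via_where))
```

One thing to keep in mind is that np.where evaluates the corrected expression for every row before choosing, so a rate for which 1 - T * rate is exactly zero would trigger a division warning even if that row is not selected.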
