June 13, 2023, 16:15:18

How to pass a variable to a filter condition on a pandas DataFrame

Question

I have files (10.230.30.146_480.txt, 10.20.24.16_480.txt, 10.55.30.2_383.txt), and I need to use the first part of each file name as variable1 and the second part as variable2.

I used this code to do that:

for txt in na_sheets:
    x = txt.replace('.txt', '')
    y = x.split("_", 1)
    variable1 = y[0]
    variable2 = y[1]
    df1 = df[(df['MSAN_IP'] == 'variable1') & (df['OUTER_VLAN'] == variable2)]

I wrote the for loop to iterate over variable1 and variable2, filter the DataFrame df, and pass these variables to the filter condition.

The output is an empty DataFrame that has only the headers.

Answer 1

Score: 1

IIUC, you can use:

na_sheets = [
    "10.230.30.146_480.txt",
    "10.20.24.16_480.txt",
    "10.55.30.2_383.txt"
]

dfs = {
    # Compare against the values of v1 and v2; OUTER_VLAN is an integer column, hence int(v2).
    fn: df.loc[(df["MSAN_IP"] == v1) & (df["OUTER_VLAN"] == int(v2))]
    for fn in na_sheets
    # removesuffix (Python 3.9+) drops only the ".txt" ending, whereas
    # rstrip(".txt") strips any trailing '.', 't' or 'x' characters.
    for v1, v2 in [fn.removesuffix(".txt").split("_", 1)]
}

NB: This will make a dictionary of DataFrames where the keys are the file names.

Output:

for k, v in dfs.items():
    print(k, v, sep="\n", end="\n\n")

10.230.30.146_480.txt
         MSAN_IP  OUTER_VLAN
2  10.230.30.146         480

10.20.24.16_480.txt
       MSAN_IP  OUTER_VLAN
0  10.20.24.16         480

10.55.30.2_383.txt
      MSAN_IP  OUTER_VLAN
1  10.55.30.2         383

Input used:

import pandas as pd

df = pd.DataFrame({
    "MSAN_IP": ["10.20.24.16", "10.55.30.2", "10.230.30.146"],
    "OUTER_VLAN": [480, 383, 480],
})
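
For readers who prefer to keep the explicit loop from the question, the same filter could be sketched as below. This is only an illustration under one assumption: that OUTER_VLAN holds integers, as in the sample input above. The helper name filter_by_filename and the results dictionary are hypothetical and not part of the original answer. The two differences from the question's code are that the IP is compared by value rather than against the literal string 'variable1', and that the VLAN part of the file name is converted with int() before the comparison.

```python
import pandas as pd

# Sample data copied from the answer above.
df = pd.DataFrame({
    "MSAN_IP": ["10.20.24.16", "10.55.30.2", "10.230.30.146"],
    "OUTER_VLAN": [480, 383, 480],
})

na_sheets = [
    "10.230.30.146_480.txt",
    "10.20.24.16_480.txt",
    "10.55.30.2_383.txt",
]


def filter_by_filename(frame, filename):
    """Hypothetical helper: rows of `frame` matching the IP and VLAN in `filename`."""
    stem = filename.replace(".txt", "")
    ip, vlan = stem.split("_", 1)
    # Assumption: OUTER_VLAN is an integer column, so the string is cast to int.
    return frame[(frame["MSAN_IP"] == ip) & (frame["OUTER_VLAN"] == int(vlan))]


# Store one result per file instead of overwriting a single df1 variable.
results = {}
for txt in na_sheets:
    results[txt] = filter_by_filename(df, txt)
```

If OUTER_VLAN were stored as strings instead, the int() call would have to be dropped; checking df.dtypes first shows which case applies.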

Answer 2

Score: 1

You are not getting the result because your for loop overwrites the values on each iteration, so you only get the last one. You can do:

na_sheets = (
    "10.230.30.146_480.txt",
    "10.20.24.16_480.txt",
    "10.55.30.2_383.txt"
)

k = [txt.replace('.txt', '').split("_", 1) for txt in na_sheets]
#[['10.230.30.146', '480'], ['10.20.24.16', '480'], ['10.55.30.2', '383']]

variable1 = [x[0] for x in k]
#['10.230.30.146', '10.20.24.16', '10.55.30.2']

variable2 = [x[1] for x in k]
#['480', '480', '383']

Now you can use them in a DataFrame.

df = pd.DataFrame({
    "MSAN_IP": variable1,
    "OUTER_VLAN": variable2,
})
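
One way such a DataFrame of keys could then be used for filtering is an inner merge against the data. This is a sketch under assumptions the answer does not state: the data to filter lives in a separate frame (called data here, using the sample values from Answer 1 plus one extra row added purely for illustration), and OUTER_VLAN is stored there as integers, so the string values taken from the file names are cast to int first. The names data, keys and matched are hypothetical.

```python
import pandas as pd

# Assumed data to filter; the last row is an illustrative non-matching row.
data = pd.DataFrame({
    "MSAN_IP": ["10.20.24.16", "10.55.30.2", "10.230.30.146", "10.1.1.1"],
    "OUTER_VLAN": [480, 383, 480, 100],
})

na_sheets = (
    "10.230.30.146_480.txt",
    "10.20.24.16_480.txt",
    "10.55.30.2_383.txt",
)

# Split each file name into [ip, vlan], as in the answer above.
k = [txt.replace(".txt", "").split("_", 1) for txt in na_sheets]

# Build the key frame and align the VLAN dtype with the data (str -> int).
keys = pd.DataFrame(k, columns=["MSAN_IP", "OUTER_VLAN"])
keys["OUTER_VLAN"] = keys["OUTER_VLAN"].astype(int)

# An inner merge keeps only rows whose (MSAN_IP, OUTER_VLAN) pair appears in keys.
matched = data.merge(keys, on=["MSAN_IP", "OUTER_VLAN"], how="inner")
```

Compared with the dictionary in Answer 1, this returns all matching rows in a single DataFrame rather than one DataFrame per file name.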
