2023年7月3日 18:41:28go评论159阅读模式

英文:

How to remove duplicates from dynamic column through KQL

问题

我们正在尝试在Azure云中通过kQL查询设置应用程序（容器）日志的警报。我们正在运行以下查询以跟踪错误并删除重复项。

let ContainerIdList = KubePodInventory
| where ContainerName contains "custom-app-logger"
| where Namespace has "devns"
| where ClusterId =~ '/subscriptions/12345-xyzl-4e12-bc3a-5c7859365342636/resourcegroups/rg-aks-dev-dev/providers/Microsoft.ContainerService/managedClusters/aksdevcluster'
| distinct ContainerID;
ContainerLog 
| where ContainerID in (ContainerIdList)
| where LogEntry !has "SRV1174"
| where LogEntry has "| E |" or LogEntry has "| F |"
| where LogEntry !contains "the I/O interface definition of project"
| where LogEntry !contains "the I/O interface definition of cuc"
| project LogEntrySource, LogEntry, TimeGenerated 
| order by TimeGenerated desc
| top 1000 by LogEntry
| render table
| distinct LogEntry

然而，它没有删除重复项。如果您查看下面的日志输出，除了第一个和最后一个之外，其他所有日志都是相同的，只是其中的时间戳不同。请查看下面的日志输出截图。

[![enter image description here][1]][1]

您能帮忙告诉我们如何删除实际错误的重复项吗？如果是相同错误但时间戳不同，需要选择最新的错误。

英文:

we are trying to setup alerts for application(container) logs through kQL Queries in Azure cloud. we are running following query to track error and remove duplicates.

    let ContainerIdList = KubePodInventory
    | where ContainerName contains &quot;custom-app-logger&quot;
    | where Namespace has &quot;devns&quot;
    | where ClusterId =~ &#39;/subscriptions/12345-xyzl-4e12-bc3a-5c7859365342636/resourcegroups/rg-aks-dev-dev/providers/Microsoft.ContainerService/managedClusters/aksdevcluster&#39;
    | distinct ContainerID;
     ContainerLog 
    | where ContainerID in (ContainerIdList)
    | where LogEntry !has &quot;SRV1174&quot;
    | where LogEntry has &quot;| E |&quot; or LogEntry has &quot;| F |&quot;
    | where LogEntry !contains &quot;the I/O interface definition of project&quot; 
    | where LogEntry !contains &quot;the I/O interface definition of cuc&quot; 
    | project LogEntrySource, LogEntry, TimeGenerated 
    | order by TimeGenerated desc
    | top 1000 by LogEntry
    | render table
    | distinct LogEntry

however its not removing the duplicates. if you see the log output below except first and last one, all others are same only with different timestamp in it. please find the log output screenshot below.

[![enter image description here][1]][1]

could you please help me how can we remove duplicates with actual error, if it is same error with different time stamp in it, need to pick the latest one.

答案1

得分: 0

我在我的环境中复现并获得了如下所示的预期结果：

首先，取一个简单的表格如下：

LogEntry

0|2023-07-23 11:40:50|b|abc
0|2023-07-23 11:42:50|b|abc
0|2023-07-23 12:40:50|c|xyz
0|2023-07-23 14:40:50|d|bnm

然后，我使用以下查询来去除重复项：

let x = datatable(LogEntry: string)
[
    "0|2023-07-23 11:40:50|b|abc",
    "0|2023-07-23 11:42:50|b|abc",
    "0|2023-07-23 12:40:50|c|xyz",
    "0|2023-07-23 14:40:50|d|bnm"
]
| extend SplitLog = split(LogEntry, "|")
| project C1 = SplitLog[0], cc=SplitLog[1],C2 = todatetime(SplitLog[1]), C3 = SplitLog[2], UV = SplitLog[3];
x
| summarize arg_max(C2,*) by tostring(UV)
| project-away UV
| project Log = strcat(tostring(C1), "|", cc, "|", C3, "|", UV1)

输出：

如何通过KQL从动态列中删除重复项

Fiddle。

将LogEntry列以管道字符（|）分割并创建新列。然后，通过UniqueValues(UV)对其进行汇总，并取Timestamp(C2)的最大值，然后移除Unique Value列（这将得到一个额外的列），然后使用管道符号（|）将其拼接回去，就像我所做的那样。请尝试按照上面的示例来获取所需的结果。

编辑：

ContainerIdList
| extend SplitLog = split(LogEntry, "|")
| project C1 = SplitLog[0], cc=SplitLog[1],C2 = todatetime(SplitLog[1]), C3 = SplitLog[2], UV = SplitLog[3]
| summarize arg_max(C2,*) by tostring(UV)
| project-away UV
| project Log = strcat(tostring(C1), "|", cc, "|", C3, "|", UV1)

英文:

I have reproduced in my environment and got expected results as below:

Firstly, taken a simple table like below:

LogEntry
0|2023-07-23 11:40:50|b|abc
0|2023-07-23 11:42:50|b|abc
0|2023-07-23 12:40:50|c|xyz
0|2023-07-23 14:40:50|d|bnm

如何通过KQL从动态列中删除重复项

Then I have used below query to remove duplicates:

let x = datatable(LogEntry: string)
[
    &quot;0|2023-07-23 11:40:50|b|abc&quot;,
    &quot;0|2023-07-23 11:42:50|b|abc&quot;,
    &quot;0|2023-07-23 12:40:50|c|xyz&quot;,
    &quot;0|2023-07-23 14:40:50|d|bnm&quot;
]
| extend SplitLog = split(LogEntry, &quot;|&quot;)
| project C1 = SplitLog[0], cc=SplitLog[1],C2 = todatetime(SplitLog[1]), C3 = SplitLog[2], UV = SplitLog[3];
x
| summarize arg_max(C2,*) by tostring(UV)
| project-away UV
| project Log = strcat(tostring(C1), &quot;|&quot;, cc, &quot;|&quot;, C3, &quot;|&quot;, UV1)

如何通过KQL从动态列中删除重复项

Output:

如何通过KQL从动态列中删除重复项

Fiddle.

Split the LogEntry column by pipes(|) and create new columns. Then summarize it by UniqueValues(UV) and take the maximum of Timestamp(C2) and remove Unique Value column(which you will get extra column) then concat it back with Pipe symbol like I have done. Try to follow above example to get desired result as I have got.

EDIT:

ContainerIdList
    | extend SplitLog = split(LogEntry, &quot;|&quot;)
    | project C1 = SplitLog[0], cc=SplitLog[1],C2 = todatetime(SplitLog[1]), C3 = SplitLog[2], UV = SplitLog[3]
    | summarize arg_max(C2,*) by tostring(UV)
    | project-away UV
    | project Log = strcat(tostring(C1), &quot;|&quot;, cc, &quot;|&quot;, C3, &quot;|&quot;, UV1

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

如何通过KQL从动态列中删除重复项

问题

答案1

数据存储选项用于 Power Apps

AzureAD身份验证: 受众验证失败。受众不匹配

如何在连续查询中访问多个Kusto let语句的结果？

在Web聊天中进行的测试与应用服务不起作用。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

发表评论