2023-02-19 14:01:24 · go

How to separate a line with " " delimiter but, excluding string encapsulated in the single quotes?

Question

This is my first post ever so please forgive me if I missed any details.

PROBLEM STATEMENT:
I have a bunch of lines like this in a file. The fields are separated by spaces.

'Temp.200.200B.Y2K & K-102 & P-503B.SP' (tp9012ga-bt102-734b-pqm4-kjk94kj10), PASSED, 2023-02-12T06:39:48Z, 2023-02-12T07:25:48.044Z, 1440] took 99ms including network delay.

I would like to keep what's inside the single quotes as one field and break the rest into fields using " " as the delimiter. The desired output is below.

'Temp.200.200B.Y2K & K-102 & P-503B.SP' (tp9012ga-bt102-734b-pqm4-kjk94kj10), 2023-02-12T06:39:48Z, 2023-02-12T07:25:48.044Z, 99

Keep in mind that the characters inside the single quotes vary widely, but they are always enclosed in single quotes.

I have tried cut with a space delimiter, but it also splits on the spaces in the string inside the single quotes.
cut -d\' -f1-6

Also, as you can see from my desired output, I want to remove some fields and some characters, such as 'ms' from 99ms.

Answer 1

Score: 1

> How to separate a line with " " delimiter but, excluding string
> encapsulated in the single quotes?

I would use GNU AWK for this task in the following way. Consider this simple example, where file.txt contains

fields without quotes
'quoted field' 'another quoted field' 'yet another field'
mixed 'quoted field' unquoted

then

awk 'BEGIN{FPAT="\047[^\047]*\047|[^ ]*"}{print "1st field is",$1; print "2nd field is",$2; print "3rd field is",$3}' file.txt

gives output

1st field is fields
2nd field is without
3rd field is quotes
1st field is 'quoted field'
2nd field is 'another quoted field'
3rd field is 'yet another field'
1st field is mixed
2nd field is 'quoted field'
3rd field is unquoted

Explanation: I use FPAT to tell GNU AWK what constitutes a field, namely a single quote (as ' terminates the shell-quoted awk program, I use \047, which is the ASCII code of that character in octal) followed by zero or more non-quote characters followed by a single quote, OR (|) zero or more non-space characters. Disclaimer: this solution assumes the ' characters are perfectly balanced and that a quoted field never contains a ' that does not terminate it.

(tested in GNU Awk 5.0.1)
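
FPAT is specific to GNU AWK. As a rough sketch of the same idea applied to the line from the question, the tokenizing can be emulated with a match() loop, which should also run in other POSIX awks. The helper name extract_fields is made up for illustration, and the field positions 1, 2, 4, 5 and 8 are an assumption based on the single sample line shown in the question; the same disclaimer about balanced quotes applies.

```shell
# Sketch: emulate the FPAT-style splitting with a match() loop.
# extract_fields is a hypothetical helper name; positions 1,2,4,5,8 are
# assumed from the sample line in the question.
extract_fields() {
  awk '{
    split("", f); n = 0; rest = $0          # reset the token array per line
    # A token is either a quoted string or a run of non-space characters.
    while (match(rest, /\047[^\047]*\047|[^ ]+/)) {
      f[++n] = substr(rest, RSTART, RLENGTH)
      rest = substr(rest, RSTART + RLENGTH)
    }
    sub(/ms$/, "", f[8])                    # 99ms -> 99
    print f[1], f[2], f[4], f[5], f[8]
  }' "$@"
}

# Usage with the sample line from the question:
printf '%s\n' "'Temp.200.200B.Y2K & K-102 & P-503B.SP' (tp9012ga-bt102-734b-pqm4-kjk94kj10), PASSED, 2023-02-12T06:39:48Z, 2023-02-12T07:25:48.044Z, 1440] took 99ms including network delay." | extract_fields
```

With GNU AWK itself, setting FPAT as in the answer and printing $1, $2, $4, $5 and $8 should be equivalent; the loop is only there for awks that lack FPAT.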

Answer 2

Score: 1

This might work for you (GNU sed):

sed -E 's/'\''[^'\'']*'\''|\S+/&\n/g
        s/.*/echo "&"|sed -n "1,2p;4,5p;8s#ms##p"/e
        s/\n//g' file

Append a newline to each field, so that every space delimiter is preceded by a newline.

Using the evaluation within the substitution command, run a second invocation of sed and treat each field as a line.

Remove or amend the lines (fields).

Remove the inserted newlines.
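
To see what the first step does on its own, here is a small sketch that runs only the field-splitting substitution. It assumes GNU sed (for \S and for \n in the replacement); split_fields is a hypothetical helper name, and the expression is written in double quotes purely to avoid the '\'' escaping.

```shell
# Sketch: only the first substitution, so each field ends up on its own line.
# Assumes GNU sed; split_fields is a made-up helper name.
split_fields() {
  # A field is either a single-quoted string or a run of non-whitespace.
  sed -E "s/'[^']*'|\S+/&\n/g" "$@"
}

# Usage: every field after the first keeps its leading space delimiter.
printf '%s\n' "mixed 'quoted field' unquoted" | split_fields
```

Because each field now sits on its own line, the second sed invocation in the answer can select and edit fields by line number.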

Answer 3

Score: 0

Looking at the problem statement and the desired output, you could use , as the delimiter with a combination of awk and sed.

I will simply echo your PROBLEM STATEMENT string here to show how it can be done.
I am assuming the line format is the same throughout your file (the characters inside the quotes can vary widely without causing issues, as long as they do not include ,).

echo "'Temp.200.200B.Y2K & K-102 & P-503B.SP' (tp9012ga-bt102-734b-pqm4-kjk94kj10), PASSED, 2023-02-12T06:39:48Z, 2023-02-12T07:25:48.044Z, 1440] took 99ms including network delay." | awk -F "," '{print $1,","$3","$4","$5}' | sed -e 's/ms .*//g' -e 's/[0-9]*] took //g'

The Output:

'Temp.200.200B.Y2K & K-102 & P-503B.SP' (tp9012ga-bt102-734b-pqm4-kjk94kj10) , 2023-02-12T06:39:48Z, 2023-02-12T07:25:48.044Z, 99

EDIT:
@Ed Morton - I tried your approach and you are right. It can be done using awk only as well. The command is given below.

echo "'Temp.200.200B.Y2K & K-102 & P-503B.SP' (tp9012ga-bt102-734b-pqm4-kjk94kj10), PASSED, 2023-02-12T06:39:48Z, 2023-02-12T07:25:48.044Z, 1440] took 99ms including network delay." | awk -F "," '{ gsub("[0-9]*] took ","",$5); gsub("ms .*","",$5); print $1,","$3","$4","$5}'
