How to remove everything after the last . from the contents of a text file

Question

I have a file called filename.txt that contains a long list of names with extensions. I want to remove the extension, meaning the last . and everything after it. The file looks like this:

MOR22A2.S4000011.h23v22.061.2023063111017.hdf
MOR22A2.S4000011.h23v44.061.2023061111033.hdf
AST006_S003.zip
AST003_S066.zip

I tried the command awk '{print substr($0, 1, index($0, ".") - 1)}' filename.txt, but it removed everything from the first . onward:

MOR22A2
MOR22A2
AST006_S003
AST003_S066

I am looking to get something like below:

MOR22A2.S4000011.h23v22.061.2023063111017
MOR22A2.S4000011.h23v44.061.2023061111033
AST006_S003
AST003_S066

I am using a bash script. How do I add the awk command to my bash script so that it modifies the filename.txt file?

Answer 1

Score: 4

With awk:

awk '{sub(/\.[^.]*$/, "")} 1' file
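
The question also asks how to use this from a bash script to modify the file itself. A minimal sketch, assuming it is acceptable to write to a temporary file first and then copy the result back; `strip_last_ext` is a hypothetical helper name, not something from the original post:

```shell
#!/usr/bin/env bash
# Sketch only: strip the last "." and everything after it from every line
# of a file, updating the file in place via a temporary file.
# strip_last_ext is a hypothetical helper name chosen for this example.
strip_last_ext() {
    local file=$1 tmp status
    tmp=$(mktemp) || return 1
    # Write the cleaned lines to the temp file; copy them back over the
    # original only if awk succeeded.
    awk '{ sub(/\.[^.]*$/, "") } 1' "$file" > "$tmp" && cat -- "$tmp" > "$file"
    status=$?
    rm -f -- "$tmp"
    return "$status"
}

# Example call (assumes filename.txt exists in the current directory):
# strip_last_ext filename.txt
```

Redirecting awk's output straight back onto its own input file would truncate the file before awk reads it, which is why the sketch goes through a temporary file.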

</details>



# Answer 2
**Score**: 3

I would say you can use a for loop to iterate over the fields, build a new string by concatenating every field except the last, and then print it.

    awk 'BEGIN{FS=OFS="."} { s=$1; for (i=2; i<NF; i++){ s=s"."$i }; print s }' test.csv

input:

    cat test.csv
    MOR22A2.S4000011.h23v22.061.2023063111017.hdf
    MOR22A2.S4000011.h23v44.061.2023061111033.hdf

output:

    awk 'BEGIN{FS=OFS="."} { s=$1; for (i=2; i<NF; i++){ s=s"."$i }; print s }' test.csv
    MOR22A2.S4000011.h23v22.061.2023063111017
    MOR22A2.S4000011.h23v44.061.2023061111033

There are other options as well, for example https://unix.stackexchange.com/questions/234432/how-to-delete-the-last-column-of-a-file-in-linux/234436#234436

Based on that link, you can try:

    awk 'BEGIN{ FS=OFS="."} NF{NF-=1};1' test.csv

output:

    awk 'BEGIN{ FS=OFS="."} NF{NF-=1};1' test.csv
    MOR22A2.S4000011.h23v22.061.2023063111017
    MOR22A2.S4000011.h23v44.061.2023061111033
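
One caveat with the `NF-=1` form: a non-empty line that contains no `.` has a single field, so decrementing `NF` empties it. A hedged variant that leaves such lines alone is sketched below; `drop_last_field` is a hypothetical helper name and `README` is a made-up sample line, neither comes from the question. It also assumes an awk that rebuilds the record when `NF` is decreased, which gawk and mawk do.

```shell
# Sketch only: drop the last "."-separated field, but only when there is
# more than one field. drop_last_field is a hypothetical helper name.
drop_last_field() {
    # NF > 1 guards the decrement, so a line with no "." is printed
    # unchanged instead of being emptied.
    awk 'BEGIN { FS = OFS = "." } NF > 1 { NF-- } 1' "$@"
}

# README is a made-up sample line with no extension.
printf '%s\n' 'AST006_S003.zip' 'README' | drop_last_field
```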




# Answer 3
**Score**: 2

If you anchor the regular expression with `$`, it matches only at the end of the string, so the complete regex is `/[.][^.]*$/`, meaning "a full stop, then any number of non-full-stop characters, then end of string".

With `sed`, this would be:

```bash
sed 's/[.][^.]*$//' < file.txt
```
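
To see what the `[^.]*$` part buys you, here is a small comparison on one line from the question. It is only an illustration; the variable name `line` is an assumption made for this example.

```shell
# Sample line taken from the question; the variable name is illustrative.
line='MOR22A2.S4000011.h23v22.061.2023063111017.hdf'

# Greedy and unanchored: ".*" swallows everything from the FIRST dot.
printf '%s\n' "$line" | sed 's/[.].*//'

# "[^.]*$" cannot cross another dot, so only the LAST dot and what
# follows it are removed.
printf '%s\n' "$line" | sed 's/[.][^.]*$//'
```

The first command leaves only `MOR22A2`, while the second keeps every field except the trailing `.hdf`.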

Answer 4

Score: 2

With your shown samples, please try the following awk code, which uses the match function.

awk 'match($0,/^.*\./){print substr($0,RSTART,RLENGTH-1)}'  Input_file

Answer 5

Score: 2

This can be done by combining rev and cut. Suppose file.txt contains:

MOR22A2.S4000011.h23v22.061.2023063111017.hdf
MOR22A2.S4000011.h23v44.061.2023061111033.hdf
AST006_S003.zip
AST003_S066.zip

then

rev file.txt | cut --delimiter='.' --fields=2- | rev

gives output

MOR22A2.S4000011.h23v22.061.2023063111017
MOR22A2.S4000011.h23v44.061.2023061111033
AST006_S003
AST003_S066

Explanation: rev reverses each line, cut then treats the reversed line as dot-separated and keeps field 2 onward (note the - after 2), which drops what was originally the extension, and a second rev restores the original character order.

huangapple
  • Posted on June 8, 2023 at 09:35:45
  • Please keep this link when reposting: https://go.coder-hub.com/76428068.html