How to remove everything after the last . from a text file's content

Question

I have a file called filename.txt that contains a long list of names with extensions, shown below.
I want to remove the extension, meaning everything after the last . character:

    MOR22A2.S4000011.h23v22.061.2023063111017.hdf
    MOR22A2.S4000011.h23v44.061.2023061111033.hdf
    AST006_S003.zip
    AST003_S066.zip

I tried the command awk '{print substr($0, 1, index($0, ".") - 1)}' filename.txt, but it removed everything after the first . character:

    MOR22A2
    MOR22A2
    AST006_S003
    AST003_S066

I am looking to get something like below:

    MOR22A2.S4000011.h23v22.061.2023063111017
    MOR22A2.S4000011.h23v44.061.2023061111033
    AST006_S003
    AST003_S066

I am using a bash script. How do I add the awk command to my bash script so that it modifies the filename.txt file?

Answer 1

Score: 4

With awk:

    awk '{sub(/\.[^.]*$/, "")} 1' file
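
The command above prints its result to standard output and leaves the input file untouched. A minimal sketch of one way to call it from a bash script so that filename.txt itself is modified, assuming it is acceptable to write to a temporary file and then replace the original. The function name strip_extensions_in_place is hypothetical, and mktemp is assumed to be available:

```bash
#!/usr/bin/env bash
# Sketch: rewrite a file so that each line loses its last "." and everything after it.
# strip_extensions_in_place is a hypothetical helper name, not part of the original answer.
strip_extensions_in_place() {
    local file=$1 tmp
    tmp=$(mktemp) || return 1
    # sub() deletes the last "." plus whatever follows it; the bare 1 prints every line
    if awk '{ sub(/\.[^.]*$/, "") } 1' "$file" > "$tmp"; then
        mv -- "$tmp" "$file"      # replace the original only if awk succeeded
    else
        rm -f -- "$tmp"
        return 1
    fi
}

# Usage inside the script:
# strip_extensions_in_place filename.txt
```

Because the original is replaced by the temporary file, it ends up with the temporary file's permissions; writing back with cat "$tmp" > "$file" is an alternative if that matters.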

Answer 2

Score: 3

I would suggest using a for loop to iterate over the fields, build a new string by concatenating every field except the last, and then print it:

    awk 'BEGIN{FS=OFS="."} { s=$1; for (i=2; i<NF; i++){ s=s"."$i }; print s }' test.csv

Input:

    cat test.csv
    MOR22A2.S4000011.h23v22.061.2023063111017.hdf
    MOR22A2.S4000011.h23v44.061.2023061111033.hdf

Output:

    awk 'BEGIN{FS=OFS="."} { s=$1; for (i=2; i<NF; i++){ s=s"."$i }; print s }' test.csv
    MOR22A2.S4000011.h23v22.061.2023063111017
    MOR22A2.S4000011.h23v44.061.2023061111033

There are other options as well, such as https://unix.stackexchange.com/questions/234432/how-to-delete-the-last-column-of-a-file-in-linux/234436#234436

Based on that link, you can try:

    awk 'BEGIN{ FS=OFS="."} NF{NF-=1};1' test.csv

Output:

    awk 'BEGIN{ FS=OFS="."} NF{NF-=1};1' test.csv
    MOR22A2.S4000011.h23v22.061.2023063111017
    MOR22A2.S4000011.h23v44.061.2023061111033
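
The two commands in this answer can differ on a line that contains no dot at all, which is worth checking against your own data. The loop starts from $1, so such a line is printed unchanged. With NF-=1, GNU awk drops the only field and prints an empty line, and the GNU awk manual cautions that some awk versions do not rebuild $0 when NF is decremented. A sketch of the loop variant wrapped in a function that reads standard input (strip_last_field is a hypothetical name):

```bash
# strip_last_field is a hypothetical wrapper around the loop-based command.
strip_last_field() {
    awk 'BEGIN { FS = OFS = "." }
    {
        s = $1                      # a line without dots has only $1
        for (i = 2; i < NF; i++)    # stop before the last field
            s = s OFS $i
        print s
    }'
}

printf '%s\n' 'MOR22A2.S4000011.h23v22.061.2023063111017.hdf' 'README' | strip_last_field
# MOR22A2.S4000011.h23v22.061.2023063111017
# README
```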

Answer 3

Score: 2

If you anchor the regular expression with '$', it can only match at the end of the line, so the complete regex is /[.][^.]*$/, which means "a full stop, then any number of characters that are not full stops, then end of line".

With sed, this would be:

    sed 's/[.][^.]*$//' < file.txt
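
To make the three parts of that regex concrete, here is a small sketch that wraps the sed command in a function reading standard input (strip_ext_sed is a hypothetical name) and runs it on a few illustrative lines, including one with two dots and one with none:

```bash
# strip_ext_sed is a hypothetical wrapper; it filters lines from standard input.
strip_ext_sed() {
    # [.]     a literal dot
    # [^.]*   any run of characters that are not dots
    # $       end of line, so only the last dot can match
    sed 's/[.][^.]*$//'
}

printf '%s\n' 'AST006_S003.zip' 'archive.tar.gz' 'README' | strip_ext_sed
# AST006_S003
# archive.tar
# README
```

A line with no dot does not match the pattern, so sed prints it unchanged.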

Answer 4

Score: 2

With your shown samples, please try the following awk code, which uses the match function:

    awk 'match($0,/^.*\./){print substr($0,RSTART,RLENGTH-1)}' Input_file
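
The command above only prints lines on which match() succeeds, so a line without any dot would be dropped from the output. If the file may contain such lines, a possible variant (a sketch, with strip_ext_match as a hypothetical function name) passes them through unchanged:

```bash
# strip_ext_match is a hypothetical wrapper; it filters lines from standard input.
strip_ext_match() {
    awk 'match($0, /^.*\./) { print substr($0, 1, RLENGTH - 1); next }
         { print }   # no dot on this line: pass it through unchanged
    '
}

printf '%s\n' 'MOR22A2.S4000011.h23v22.061.2023063111017.hdf' 'README' | strip_ext_match
# MOR22A2.S4000011.h23v22.061.2023063111017
# README
```

Because the regex starts with ^, RSTART is always 1 when it matches, and the greedy .* makes the match end at the last dot, so RLENGTH - 1 is the length of the text before that dot.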

Answer 5

Score: 2

This can be done with a combination of rev and cut. Let the content of file.txt be

    MOR22A2.S4000011.h23v22.061.2023063111017.hdf
    MOR22A2.S4000011.h23v44.061.2023061111033.hdf
    AST006_S003.zip
    AST003_S066.zip

then

    rev file.txt | cut --delimiter='.' --fields=2- | rev

gives the output

    MOR22A2.S4000011.h23v22.061.2023063111017
    MOR22A2.S4000011.h23v44.061.2023061111033
    AST006_S003
    AST003_S066

Explanation: rev mirrors each line; cut is told that the line is dot-separated and selects the 2nd field and everything after it (note the - after 2); the second rev mirrors the lines again to restore the original order.
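
The intermediate result of each stage can be seen by running the pipeline one step at a time. A small sketch using one of the sample lines, with the short options -d and -f in place of --delimiter and --fields (strip_ext_revcut is a hypothetical function name):

```bash
# strip_ext_revcut is a hypothetical wrapper; it filters lines from standard input.
strip_ext_revcut() {
    # -d and -f are the short forms of --delimiter and --fields
    rev | cut -d. -f2- | rev
}

printf '%s\n' 'AST006_S003.zip' | rev                 # piz.300S_600TSA
printf '%s\n' 'AST006_S003.zip' | rev | cut -d. -f2-  # 300S_600TSA
printf '%s\n' 'AST006_S003.zip' | strip_ext_revcut    # AST006_S003
```

cut prints lines that contain no delimiter unchanged unless -s is given, so a line without a dot comes out of the pipeline as it went in.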

huangapple
  • Posted on June 8, 2023, 09:35:45
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76428068.html