How to remove everything after the last . from a text file content
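Short answer
You can add the following awk command to your bash script to modify filename.txt:
awk 'BEGIN{FS=OFS="."} NF{NF--} 1' filename.txt > modified_filename.txt
This creates a new file named modified_filename.txt containing the names with their extensions removed.
Question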
I have a file called filename.txt that contains a long list of names with extensions.
I want to remove the extension, meaning everything after the last . character:
MOR22A2.S4000011.h23v22.061.2023063111017.hdf
MOR22A2.S4000011.h23v44.061.2023061111033.hdf
AST006_S003.zip
AST003_S066.zip
I tried this command:
awk '{print substr($0, 1, index($0, ".") - 1)}' filename.txt
but it removed everything from the first . character onward:
MOR22A2
MOR22A2
AST006_S003
AST003_S066
I am looking to get something like below:
MOR22A2.S4000011.h23v22.061.2023063111017
MOR22A2.S4000011.h23v44.061.2023061111033
AST006_S003
AST003_S066
I am using a bash script. How do I add the awk command to my script so that it modifies filename.txt?
Answer 1
Score: 4
With awk:
awk '{sub(/\.[^.]*$/, "")} 1' file
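To tie this back to the question about modifying filename.txt from a bash script, here is a minimal sketch that wraps the awk command above in a function and writes the result back to the same file. The function name strip_extensions_in_place is hypothetical, and the sketch assumes that overwriting the input file is acceptable.

```bash
#!/usr/bin/env bash

# Hypothetical helper: remove the last "." and everything after it
# on every line of the given file, then overwrite the file.
strip_extensions_in_place() {
    local file=$1 tmp
    tmp=$(mktemp) || return 1
    # awk cannot safely read and write the same file at once,
    # so write to a temporary file first.
    if awk '{sub(/\.[^.]*$/, "")} 1' "$file" > "$tmp"; then
        # Copy the contents back so the original file keeps its permissions.
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Example call inside a script (uncomment to use):
# strip_extensions_in_place filename.txt
```

Keeping the original and redirecting to a second file instead, as in `awk ... filename.txt > modified_filename.txt`, is the safer choice when the input must be preserved.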
# Answer 2
**Score**: 3
I would use a for loop to rebuild the string by concatenating every field except the last, and then print it:
awk 'BEGIN{FS=OFS="."} { s=$1; for (i=2; i<NF; i++){ s=s"."$i }; print s }' test.csv
Input:
cat test.csv
MOR22A2.S4000011.h23v22.061.2023063111017.hdf
MOR22A2.S4000011.h23v44.061.2023061111033.hdf
Output:
MOR22A2.S4000011.h23v22.061.2023063111017
MOR22A2.S4000011.h23v44.061.2023061111033
Another option is to drop the last field, as described at https://unix.stackexchange.com/questions/234432/how-to-delete-the-last-column-of-a-file-in-linux/234436#234436
Based on that link, you can try:
awk 'BEGIN{ FS=OFS="."} NF{NF-=1};1' test.csv
Output:
MOR22A2.S4000011.h23v22.061.2023063111017
MOR22A2.S4000011.h23v44.061.2023061111033
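One edge case worth checking with the NF-based approach is a line that contains no dot at all: decrementing NF there would empty the line. The sketch below guards against that with NF > 1. The function name strip_last_field is hypothetical, and the sketch assumes an awk that rebuilds the record when NF is decremented (gawk and mawk do; some older awk versions may not).

```bash
# Hypothetical helper: drop the last "."-separated field of each line on stdin.
strip_last_field() {
    # NF > 1 leaves lines without any "." (and empty lines) untouched.
    awk 'BEGIN { FS = OFS = "." } NF > 1 { NF-- } 1'
}

# A name with an extension is shortened; a name without a dot is passed through.
printf '%s\n' 'AST006_S003.zip' 'README' | strip_last_field
```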
# Answer 3
**Score**: 2
If you use a regular expression that ends in `$`, it only matches at the end of the line, so the complete regex is `/[.][^.]*$/`, which means "a full stop, then any number of characters that are not full stops, then end of line".
With `sed`, this would be:
```bash
sed 's/[.][^.]*$//' < file.txt
```
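As a small illustration of how the end-of-line anchor behaves, the sketch below wraps the same sed expression in a function and feeds it a name with two dots and a name with none. The function name strip_ext_sed is hypothetical and the sample names are made up for the example.

```bash
# Hypothetical helper: remove the last "." and what follows it on each line of stdin.
strip_ext_sed() {
    sed 's/[.][^.]*$//'
}

# Only the final ".gz" is removed from the first name;
# the second name has no "." and is left unchanged.
# Note: a name that starts with its only dot (such as ".bashrc") would become empty.
printf '%s\n' 'archive.tar.gz' 'Makefile' | strip_ext_sed
```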
Answer 4
Score: 2
With your shown samples, please try the following awk code, which uses the match function. The greedy regex ^.*\. matches everything up to and including the last dot, and RLENGTH-1 drops that dot. Lines that contain no dot do not match and are not printed.
awk 'match($0,/^.*\./){print substr($0,RSTART,RLENGTH-1)}' Input_file
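If the input may also contain names without any dot, a variant of the match approach can pass those lines through instead of dropping them. This is a sketch under that assumption, and the function name strip_ext_match is hypothetical.

```bash
# Hypothetical helper: strip the last "." and what follows, keeping dot-less lines.
strip_ext_match() {
    # When the greedy regex matches, print the part before the last "."
    # and skip to the next line; otherwise the trailing 1 prints the line as is.
    awk 'match($0, /^.*\./) { print substr($0, RSTART, RLENGTH - 1); next } 1'
}

printf '%s\n' 'AST006_S003.zip' 'README' | strip_ext_match
```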
Answer 5
Score: 2
This can be done by combining rev and cut. Let file.txt contain:
MOR22A2.S4000011.h23v22.061.2023063111017.hdf
MOR22A2.S4000011.h23v44.061.2023061111033.hdf
AST006_S003.zip
AST003_S066.zip
Then
rev file.txt | cut --delimiter='.' --fields=2- | rev
gives the output
MOR22A2.S4000011.h23v22.061.2023063111017
MOR22A2.S4000011.h23v44.061.2023061111033
AST006_S003
AST003_S066
Explanation: rev reverses each line, cut is told that the line is dot-separated and selects the second field onward (note the - after 2), and the second rev reverses each line back to its original order. The long options are GNU cut syntax; the short form is cut -d. -f2-.