Renaming multiple files in bash by removing prefix and suffix containing special characters %?=
Question
I recently made requests against the Google Cloud Service API endpoint and used wget to download a lot of files into one single folder. Because every sub-directory separator / is replaced by %2F and ?alt=media is appended, all the downloaded file names are contaminated with these strings, e.g.
hg38%2Fv0%2FHomo_sapiens_assembly38.dict?alt=media
hg19%2Fv0%2FHomo_sapiens_assembly19.fasta.alt?alt=media
I tested the following in bash and it returned the result I wanted:
echo "$hg19%2Fv0%2FHomo_sapiens_assembly19.fasta.alt?alt=media" | sed -e "s/^$hg19%2Fv0%2F//" -e "s/\?.*//g"
i.e. Homo_sapiens_assembly19.fasta.alt. Unfortunately, when I scaled it up using
for file in *; do
mv "$file" '$(echo "$file" | sed -e "s/^$hg19%2Fv0%2F//" -e "s/\?.*//g")' ;
done
all the files turned into one file named "$file". I couldn't figure out why.
Can anyone provide a solution to my problem? And if some of the files contain a different number of "%2F" repeats, how can I elegantly keep only the string after the last "%2F" and strip the "?alt=media" from the end in the same line?
Thank you in advance.
Answer 1
Score: 1
Actually, to keep only the part after the last %2F and drop the trailing ?alt=media, you can do this:
echo "hg38%2Fv0%2FHomo_sapiens_assembly38.dict?alt=media" | sed -e 's/.*%2F\([^%?]*\)?alt.*/\1/'
- ".*%2F" matches everything up to and including the last "%2F", because .* is greedy.
- "\([^%?]*\)" captures the run of characters that are neither "%" nor "?", i.e. the file name.
- "?alt.*" matches the literal string "?alt" followed by any characters.
- The replacement \1 puts back only the captured file name.
The result is:
Homo_sapiens_assembly38.dict
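To check that the greedy .* copes with any number of %2F separators, here is a small sketch. It uses two simple substitutions rather than a capture group. The helper name clean_name and the third sample name are made up for illustration, and it assumes every name ends in a ?alt=... query string.

```shell
#!/usr/bin/env bash
# clean_name is a hypothetical helper, not part of the original answer.
# It keeps only the text after the last %2F and drops a trailing ?alt=... part.
clean_name() {
  # In a basic regular expression an unescaped ? is a literal question mark.
  printf '%s\n' "$1" | sed -e 's/.*%2F//' -e 's/?alt=.*$//'
}

clean_name 'hg38%2Fv0%2FHomo_sapiens_assembly38.dict?alt=media'
clean_name 'hg19%2Fv0%2FHomo_sapiens_assembly19.fasta.alt?alt=media'
clean_name 'a%2Fb%2Fc%2Fd%2Fexample.txt?alt=media'   # made-up deeper path
```

Using printf instead of echo avoids surprises if a file name ever starts with a dash or contains backslashes.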
As for the for loop, something like this:
for file in *
do
  mv "$file" "$(echo "$file" | sed -e "s/^$hg19%2Fv0%2F//" -e "s/\?.*//g")"
done
Note that you need to replace $hg19 with the correct variable or string.
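A somewhat more defensive version of the loop might look like the sketch below. It is only an illustration under a few assumptions: it runs in a throwaway directory created with mktemp, the sample file names (including already_clean.txt) are invented for the demo, and it uses a prefix-agnostic sed expression instead of the hg19-specific one.

```shell
#!/usr/bin/env bash
# Work in a temporary directory so no real files are renamed by the demo.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 'hg38%2Fv0%2FHomo_sapiens_assembly38.dict?alt=media' \
      'hg19%2Fv0%2FHomo_sapiens_assembly19.fasta.alt?alt=media' \
      'already_clean.txt'

for file in *; do
  # Strip everything up to the last %2F, then the trailing ?alt=... part.
  new=$(printf '%s\n' "$file" | sed -e 's/.*%2F//' -e 's/?alt=.*$//')
  # Skip names that the sed expressions did not change.
  [ "$new" = "$file" ] && continue
  # Do not overwrite an existing file that already has the target name.
  if [ -e "$new" ]; then
    echo "skipping $file: $new already exists" >&2
    continue
  fi
  mv -- "$file" "$new"
done

ls
```

The existence check matters here because files from different sub-directories can collapse to the same base name once the prefixes are removed.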
Answer 2
Score: 1
Use .* to match everything up to the last %2F.
Put the command substitution inside double quotes, not single quotes. See https://stackoverflow.com/questions/6697753/difference-between-single-and-double-quotes-in-bash
Don't put $ before hg at the beginning.
It's not a requirement, but sed commands are usually put in single quotes, unless you're using variables in the substitution.
for file in *; do
mv "$file" "$(echo "$file" | sed -e 's/^hg.*%2F//' -e 's/\?.*//g')" ;
done
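The quoting point can be seen in isolation with the short sketch below, which also shows a sed-free variant based on bash parameter expansion. The parameter expansion is a swapped-in technique, not something either answer proposes, and the variable names single, double and name are chosen only for this illustration.

```shell
#!/usr/bin/env bash
file='hg19%2Fv0%2FHomo_sapiens_assembly19.fasta.alt?alt=media'

# Single quotes: the command substitution is NOT run; the text stays literal.
single='$(echo "$file")'
# Double quotes: the command substitution runs and its output is used.
double="$(echo "$file")"
printf 'single: %s\ndouble: %s\n' "$single" "$double"

# sed-free variant using parameter expansion:
name=${file##*%2F}    # remove the longest prefix that ends in %2F
name=${name%%\?*}     # remove the longest suffix that starts with a literal ?
printf 'name:   %s\n' "$name"
```

With single quotes every iteration passes the same literal string to mv as the target, which is consistent with all files ending up under one name in the original attempt.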