Shell script to split a url and extract the variables
Question
I have a string like "git@github.com:myOrg/my-repo.git". I am trying to split this URL and extract the substrings "myOrg" and "my-repo".
I tried the script below, but it fails:
REPO=$(echo $SSH_URL | sed -n 's#.*:\(.*\)/.*#\1#p')
GIT_ORG=$(echo $SSH_URL | sed -n 's/^.*:\\([^\\/]*\\)\\/\\(.*\\)\\.git$/\/p')
but I get the following error:
sed: -e expression #1, char 39: unknown option to s
Can someone please help?
Answer 1
Score: 3
You could use parameter expansion to cut off the unneeded parts:
url="${SSH_URL}" # make a copy
url="${url#*:}" # drop until colon
url="${url%.git}" # drop extension
owner="${url%/*}" # extract "myOrg"
repo="${url#*/}" # extract "my-repo"
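As a minimal sketch, the same expansions can be wrapped in a small function. The function name parse_ssh_url is hypothetical (not from the answer), bash is assumed, and the URL is assumed to have exactly the form user@host:owner/repo.git.

```shell
#!/bin/bash
# Hypothetical helper: split an SSH-style git URL using only parameter expansion.
# Sets the global variables "owner" and "repo".
parse_ssh_url() {
    local url="$1"
    url="${url#*:}"      # strip shortest prefix ending in ':'    -> myOrg/my-repo.git
    url="${url%.git}"    # strip a trailing ".git", if present    -> myOrg/my-repo
    owner="${url%/*}"    # strip shortest suffix starting at '/'  -> myOrg
    repo="${url#*/}"     # strip shortest prefix ending in '/'    -> my-repo
}

parse_ssh_url "git@github.com:myOrg/my-repo.git"
echo "owner=$owner repo=$repo"
```

No external process is started here, which is why this approach avoids the subshell overhead discussed in the sed-based answer.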
Answer 2
Score: 3
You could use the read builtin (edit: added the stripping of .git with a parameter expansion):
#!/bin/bash
SSH_URL=git@github.com:myOrg/my-repo.git
IFS=':/' read -r _ git_org git_repo <<< "${SSH_URL%.git}"
$ echo "$git_org"
myOrg
$ echo "$git_repo"
my-repo
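To make the field assignment explicit, here is a small sketch of the same read call with the exit status checked. The lower-case variable name ssh_url is an assumption for illustration, and bash is assumed because of the here-string.

```shell
#!/bin/bash
# Sample URL taken from the question.
ssh_url="git@github.com:myOrg/my-repo.git"

# IFS=':/' applies only to this read: the line is split on ':' and '/'.
# Field 1 (git@github.com) lands in the throwaway variable _,
# field 2 in git_org, and whatever remains in git_repo.
if IFS=':/' read -r _ git_org git_repo <<< "${ssh_url%.git}"; then
    echo "org=$git_org repo=$git_repo"
fi
```

Because the last variable receives the rest of the line, a URL with extra path segments would leave them, slashes included, in git_repo.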
Answer 3
Score: 3
Use a regular expression.
[[ $SSH_URL =~ git@github.com:(.*)/(.*)\.git ]]
REPO=${BASH_REMATCH[2]}
GIT_ORG=${BASH_REMATCH[1]}
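The first element (0) of the BASH_REMATCH array is the entire matched string; subsequent elements are the contents of any capture groups in the regular expression, from left to right.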
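A slightly more defensive sketch of the same idea follows. The anchored pattern, the pattern variable, and the lower-case names are assumptions added for illustration; they are not part of the answer.

```shell
#!/bin/bash
ssh_url="git@github.com:myOrg/my-repo.git"

# Keeping the pattern in a variable avoids quoting surprises inside [[ ]].
# Assumptions: the owner contains no '/', and the URL ends in ".git".
pattern='^git@github\.com:([^/]+)/(.+)\.git$'

if [[ $ssh_url =~ $pattern ]]; then
    git_org=${BASH_REMATCH[1]}   # first capture group
    repo=${BASH_REMATCH[2]}      # second capture group
    echo "org=$git_org repo=$repo"
else
    echo "unrecognized URL: $ssh_url" >&2
fi
```

Testing the result of [[ ... ]] matters because BASH_REMATCH is only meaningful after a successful match.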
Answer 4
Score: 1
While the other answers are going to be more efficient (i.e., they don't incur the overhead of spawning 4x subshells), some considerations re: OP's current sed solution:
- in the 1st sed script the # is used as the script delimiter
- in the 2nd sed script the / is used as the script delimiter
- the error is being generated by the 2nd sed script because the / also shows up in the data (i.e., sed can't distinguish between a / used as a delimiter vs. a / as part of the data)
- try using # as the script delimiter in the 2nd sed command to eliminate the error message
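The delimiter point can be illustrated with a short sketch. The variable names org_a and org_b are hypothetical, and the two commands are illustrative rewrites rather than OP's exact script.

```shell
#!/bin/bash
ssh_url="git@github.com:myOrg/my-repo.git"

# '/' as delimiter: every literal '/' in the pattern has to be written as '\/'.
org_a=$(sed -n 's/^.*:\([^\/]*\)\/.*$/\1/p' <<< "$ssh_url")

# '#' as delimiter: '/' is just an ordinary character in the pattern.
org_b=$(sed -n 's#^.*:\([^/]*\)/.*$#\1#p' <<< "$ssh_url")

echo "$org_a $org_b"
```

Both commands extract the same field; the second is simply easier to read and harder to get wrong.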
As for the current regexes, this may be easier to address if we enable extended regex support (-E or -r), e.g.:
$ echo "$SSH_URL" | sed -nE 's#^.*:([^/]*)/.*$#\1#p'
myOrg
$ echo "$SSH_URL" | sed -nE 's#^.*/([^\.]*)\..*$#\1#p'
my-repo
Eliminating the pipe/subshell with a here-string (<<< "$var"):
$ sed -nE 's#^.*:([^/]*)/.*$#\1#p' <<< "$SSH_URL"
myOrg
$ sed -nE 's#^.*/([^\.]*)\..*$#\1#p' <<< "$SSH_URL"
my-repo
Pulling all of this into OP's current code:
$ REPO=$(sed -nE 's#^.*:([^/]*)/.*$#\1#p' <<< "$SSH_URL")
$ GIT_ORG=$(sed -nE 's#^.*/([^\.]*)\..*$#\1#p' <<< "$SSH_URL")
$ typeset -p REPO GIT_ORG
declare -- REPO="myOrg"
declare -- GIT_ORG="my-repo"
NOTES:
- the $( ... ) construct will still require a subshell to be spawned (2 total in this case)
- consider getting into the habit of using lower-cased variable names (e.g., ssh_url, repo and git_org) to minimize the (future) chance of overwriting system variables (e.g., PWD, HOME, PATH)
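The naming note can be demonstrated with a small sketch. The variable result and the choice of ls as the probe command are assumptions for illustration only; bash is assumed.

```shell
#!/bin/bash
# Upper-case names can collide with variables the shell itself relies on.
# The assignment runs inside $( ... ), so the damage stays in that subshell.
result=$(
    PATH="myOrg/my-repo"    # meant as "just a variable", but it is the command search path
    command -v ls >/dev/null 2>&1 || echo "ls is no longer found"
)
echo "$result"

# A lower-case name has no such side effect.
path="myOrg/my-repo"
command -v ls >/dev/null && echo "ls is still found"
```

The same kind of accident is much less likely with names such as ssh_url, repo and git_org.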