May 25, 2023, 19:27:46

How to fill values from a TSV file into a shell script template and create a new file named after the first column value

Question

I have a data.tsv file with 100 entries, like this:

ColumnA	ColumnB	ColumnC	ColumnD	ColumnE
Cell 31	Cell 4	Cell 3	Cell 5	Cell 8
Cell 21	Cell 2	Cell 5	Cell 6	Cell 9

I also have the template.in script below, whose placeholders need to be filled in with the values from the TSV file:

for_example = $ColumnA
testmyhpothesis = $ColumnB $ColumnC
cleandir = $ColumnD $ColumnE
testOutdir = /path/todir/$ColumnD

For each row, I want to create a new script file named after its ColumnA value, e.g.:
File 1. Cell 31.in

for_example = Cell 31
testmyhpothesis = Cell 4 Cell 3
cleandir = Cell 5 Cell 8
testOutdir = /path/todir/Cell 5

File 2. Cell 21.in

for_example = Cell 21
testmyhpothesis = Cell 2 Cell 5
cleandir = Cell 6 Cell 9
testOutdir = /path/todir/Cell 6

awk -F '\t' '{for (i = 2; i <= NF; i++) {print $i >> new_file}}' data.tsv

Answer 1

Score: 2

Here's a solution that makes use of envsubst for replacing the $ColumnXXX in template.in:

#!/bin/bash
{
    # get the names of the variables from the TSV header
    IFS=$'\t' read -r -a varnames
    export "${varnames[@]}" || exit 1
    # read each field into its corresponding variable
    while IFS=$'\t' read -r "${varnames[@]}"
    do
        # replace the expansions in "template.in"
        envsubst "${varnames[*]/#/$}" < template.in > "${!varnames[0]}.in"
    done
} < data.tsv
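
To see what the two less obvious expansions in that script evaluate to, here is a small stand-alone sketch. It needs bash (arrays and indirect expansion are not POSIX sh) and does not call envsubst at all. The array contents and the value given to ColumnA are made up for the demo, and shell_format and first_value are names introduced here, not part of the answer.

```shell
#!/bin/bash
# Hypothetical header names, standing in for what `read -r -a varnames` would produce.
varnames=(ColumnA ColumnB ColumnC)
# Hypothetical value, standing in for what the while-loop `read` would assign.
ColumnA='Cell 31'

# Prefix every element with "$" and join with spaces: the list of
# variable references handed to envsubst as its first argument.
shell_format="${varnames[*]/#/$}"

# Indirect expansion: the value of the variable whose name is stored in varnames[0].
first_value="${!varnames[0]}"

printf '%s\n' "$shell_format"      # $ColumnA $ColumnB $ColumnC
printf '%s\n' "$first_value.in"    # Cell 31.in
```

Passing that list to envsubst restricts substitution to the named variables, so any other $-prefixed text in template.in is left alone.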

Answer 2

Score: 1

Using any awk:

$ cat tst.awk
BEGIN { FS="\t" }
NR == FNR {
    tmplt[++numLines] = $0
    next
}
FNR == 1 {
    for ( fldNr=1; fldNr<=NF; fldNr++ ) {
        tag = "$" $fldNr
        tags2fldNrs[tag] = fldNr
    }
    next
}
{
    out = $(tags2fldNrs["$ColumnA"]) ".in"
    for ( lineNr=1; lineNr<=numLines; lineNr++ ) {
        line = tmplt[lineNr]
        for ( tag in tags2fldNrs ) {
            if ( s = index(line,tag) ) {
                fldNr = tags2fldNrs[tag]
                val = $fldNr
                line = substr(line,1,s-1) val substr(line,s+length(tag))
            }
        }
        print line > out
    }
    close(out)
}

$ awk -f tst.awk template.in data.tsv

$ head Cell*
==> Cell 21.in <==
#!/bin/bash
for_example = Cell 21
testmyhpothesis = Cell 2 Cell 5
cleandir = Cell 6 Cell 9
testOutdir = /path/todir/Cell 6
==> Cell 31.in <==
#!/bin/bash
for_example = Cell 31
testmyhpothesis = Cell 4 Cell 3
cleandir = Cell 5 Cell 8
testOutdir = /path/todir/Cell 5

That would fail if you could have column tags (names) that are substrings of others, e.g. ColumnA and ColumnAB, and it assumes the values under ColumnA are always unique (a repeated value would overwrite the output file written for the earlier row). That's consistent with the example you provided, so post a new question if your example is wrong and you can't figure out how to adapt this to suit your real data.
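
If tags can be substrings of each other, one possible workaround is to consume whole $name tokens instead of searching for each tag with index(). The following is only a sketch under some assumptions: tag names consist of letters, digits and underscores, the sample template.in and data.tsv are made up for the demo, and name2fld, token and rest are names introduced here. A $name with no matching column is deliberately left untouched.

```shell
#!/bin/sh
# Sketch only: the sample files below are made up for this demo.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Template with one tag that is a prefix of another, plus a tag with no column.
printf '%s\n' 'a = $ColumnA' 'ab = $ColumnAB' 'other = $Unknown' > template.in
printf 'ColumnA\tColumnAB\nCell 31\tCell 4\nCell 21\tCell 2\n' > data.tsv

awk '
BEGIN { FS = "\t" }
NR == FNR { tmplt[++numLines] = $0; next }    # first file: store the template
FNR == 1 {                                     # TSV header: column name -> field number
    for (i = 1; i <= NF; i++) name2fld[$i] = i
    next
}
{
    out = $(name2fld["ColumnA"]) ".in"
    for (n = 1; n <= numLines; n++) {
        rest = tmplt[n]
        line = ""
        # Consume one whole "$name" token at a time (leftmost, longest match).
        while (match(rest, /\$[A-Za-z_][A-Za-z_0-9]*/)) {
            token = substr(rest, RSTART, RLENGTH)
            name = substr(token, 2)
            # Known column: use the field value; unknown name: keep the token.
            val = (name in name2fld) ? $(name2fld[name]) : token
            line = line substr(rest, 1, RSTART - 1) val
            rest = substr(rest, RSTART + RLENGTH)
        }
        print line rest > out
    }
    close(out)
}
' template.in data.tsv

cat "Cell 31.in"
```

Because the regex match is leftmost-longest, $ColumnAB is read as a single token and looked up under its full name, so it can never be mistaken for $ColumnA followed by a B. The uniqueness assumption about ColumnA still applies.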
