January 9, 2023, 19:08:34


Using a nested for loop, write a bash script to validate whether the characters and data types in any two field records match

Question


Sample data:

Header | <Transaction_ID> | <Item_ Name> |<Item_Type> | <Customer_ID> | <Type_of_Transaction> | <Payment_Method>|Amount
Data |1001 |Samsung |Handset |R2R003 |Online |Credit Card |100|
Data | 1004|LG |TV | R2R042| Online | Debit card|150.24|
Trailer | 2

Here the number of fields in the header is 7. We need to check whether the characters in any two field records match, and also whether the data type of each field matches its records.

Requirement:

A nested for loop must be used to validate the records of any two or three fields.

I tried the code below, but it only works for the records of one field.

!# /bin/bash
now=`date +"%d_%m_%y"`
file=transaction${now}.dat.gz
#header_fieldc is a parameter which has each header fields in new line
header_fieldc=`zcat $file | head -1 | tr "|" "\n"`
a=( $header_fieldc )
for (( i=0; i<=${#a[@]}; i++ )); do

  echo "${a[i]}"

  if [ $i == 0 ]; then
    i=`expr $i + 1`
    rec_fieldc=`zcat $file | sed '1d;$d' | cut -d\| -f $i`
  fi
  #rec_fieldc parameter contains records of ith header field.

  b=( $rec_fieldc )

  for (( j=0; j<=${#b[@]}; j++ )); do

    echo "${b[j]}"

    var=`echo "${b[j]}"`

    if [[ "${var}" =~ ^[0-9]+$ ]]; then
      echo " ${b[j]} valid"
    else
      echo "invalid character precent in ${a[j]} field" >exception.txt
      exit 0
    fi
  done

done

Output:

<TransactionID>
1001 is a valid record 
1004 is a valid record

Answer 1

Score: 0

**NOTE: do not forget that if statements use `==` to compare, not `=`; I think your code might work once you correct this.**
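
As a hedged aside on the note above: in bash, `=` and `==` both perform a string comparison inside `[ ]` and `[[ ]]`, while `-eq` compares integers (inside `(( ))`, a single `=` is an assignment). The sketch below only illustrates that difference; `compare_demo` is a made-up helper name and is not part of the original answer.

```shell
#!/bin/bash
# Hypothetical helper (an assumption for illustration): reports how two
# integer-looking values compare as strings and as integers.
compare_demo() {
  local a=$1 b=$2
  # String comparison: "01" and "1" are different strings.
  if [[ $a == "$b" ]]; then
    echo "string-equal"
  else
    echo "string-different"
  fi
  # Integer comparison: both values must be integers; 01 and 1 are equal.
  if [ "$a" -eq "$b" ]; then
    echo "integer-equal"
  else
    echo "integer-different"
  fi
}

compare_demo 01 1
```

For a loop counter such as `i`, either `[ "$i" -eq 0 ]` or `(( i == 0 ))` expresses a numeric test directly.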
I replicated the setup like so. I added a few lines that have more/fewer fields for the demo. Contents of sample_data.txt:
```plaintext
Header | <Transaction ID> | <Item Name> |<Item Type> | <Customer ID> | <Type of Transaction> | <Payment Method>| Amount
Data |1001 |Samsung |Handset |R2R003 |Online |Credit Card |100|
Data |1001 |Samsung |Handset |R2R003 |Online |extra |Credit Card |100|
Data |1001 |Samsung |Online |Credit Card |100|
Data | 1004|LG |TV | R2R042| Online | Debit card|150.24|
Data |1001 |Samsung |Handset |R2R003 |Online |extra |Credit Card |100|
Trailer | 2
```

Here is the script test.sh:

#!/bin/bash
header_field_count=$(cat sample_data.txt | awk -F '|' '{print NF}' |head -1)
echo 'header_field_count:' $header_field_count
number_of_lines=$(wc -l < sample_data.txt)
echo 'number of lines to process in file:' $number_of_lines
let current_line=2 # skip the 1st line because that is the header
data_field_count_array=($(cat sample_data.txt | awk -F '|' '{print NF -1}')) # note NF -1 because these lines have an extra separator at the end
while [ $current_line -lt $number_of_lines ]; do
  echo 'line:' $current_line 'has' ${data_field_count_array[$current_line-1]} 'fields' # bash arrays are zero-indexed, hence the -1 (line 1 is index 0)
  if [[ ${data_field_count_array[$current_line-1]} == $header_field_count ]]; then
    echo 'EQUAL to the Header'
  else
    echo 'NOT EQUAL to the Header'
  fi
  let current_line+=1
done
# this will print any duplicate lines in the sample data
echo -n 'duplicate lines: '
echo $(sort sample_data.txt  | uniq -d)

When run, this is the output:

header_field_count: 8
number of lines to process in file: 6
line: 2 has 8 fields
EQUAL to the Header
line: 3 has 9 fields
NOT EQUAL to the Header
line: 4 has 6 fields
NOT EQUAL to the Header
line: 5 has 8 fields
EQUAL to the Header
duplicate lines: Data |1001 |Samsung |Handset |R2R003 |Online |extra |Credit Card |100|

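It will work with any number of data lines.

You can do more research and add more validation examples to check whether fields are valid numbers. See: https://stackoverflow.com/questions/806906/how-do-i-test-if-a-variable-is-a-number-in-bash

Example validation of character types:
You can play with this script by calling it like this:

`./script.sh Test this string for me` will pass the string check

`./script.sh -187.8` will pass the number check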

#!/bin/bash
# regex example; each pattern is anchored at both ends so the whole input is checked
echo 'testing string:"'"$*"'"'
re='^-?[0-9]+([.][0-9]+)?$' # digits, with an optional leading minus sign and decimal part
if ! [[ $* =~ $re ]]; then # compare the input with the allowed-character regex
  echo 'not a decimal or negative number'
fi
re='^[a-z ,]+$' # lowercase letters, SPACE and COMMA allowed
if ! [[ $* =~ $re ]]; then
  echo 'not a lowercase string with SPACE or COMMA'
fi
re='^[a-zA-Z ,.;]+$' # lowercase and uppercase letters, SPACE, COMMA, DOT and SEMICOLON allowed
if ! [[ $* =~ $re ]]; then
  echo 'not a text string'
fi

You can add whatever characters you need, such as `@`, into the brackets if you want to check for e-mails or something similar.
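
A minimal sketch of that idea, assuming the goal is only to restrict which characters may appear (it does not validate that the value is a well-formed e-mail address). The helper name `only_allowed_chars` and the chosen character set are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only when the whole argument consists of
# letters, digits, '@', '.', '_' or '-' (the '-' is last so it is literal).
only_allowed_chars() {
  local re='^[a-zA-Z0-9@._-]+$'
  [[ $1 =~ $re ]]
}

for candidate in "user@example.com" "bad address"; do
  if only_allowed_chars "$candidate"; then
    echo "$candidate: ok"
  else
    echo "$candidate: invalid character"
  fi
done
```

Keeping the pattern in a variable and leaving it unquoted on the right-hand side of `=~` makes bash treat it as a regular expression rather than a literal string.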

<details>
<summary>English:</summary>
This is the last edit (the question is too far into details)

#!/bin/bash
header_field_count=$(cat sample_data.txt | awk -F '|' '{print NF}' |head -1)
echo 'header_field_count:' $header_field_count
number_of_lines=$(wc -l < sample_data.txt)
echo 'number of lines to process in file:' $number_of_lines
let current_line=2 # skip the 1st line because that is the header
data_field_count_array=($(cat sample_data.txt | awk -F '|' '{print NF -1}')) # note NF -1 because these lines have an extra separator at the end
while [ $current_line -lt $number_of_lines ]; do
  echo 'line:' $current_line 'has' ${data_field_count_array[$current_line-1]} 'fields' # bash arrays are zero-indexed, hence the -1 (line 1 is index 0)
  if [[ ${data_field_count_array[$current_line-1]} == $header_field_count ]]; then
    echo 'EQUAL to the Header'
  else
    echo 'NOT EQUAL to the Header'
  fi
  let current_line+=1
done
fieldstocheck=("Amount" "<Item Name>") # names of the fields to check; the array may be expanded
fieldtypecheck=("num" "string")        # the check required for each field (customer ID needs to be a number and so on)
# find which field index in each row corresponds to fieldstocheck
for i in $(seq 1 1 $header_field_count); do
  field=$(head -1 sample_data.txt | awk -F '|' -v col="$i" '{print $col}')
  let finalindex=${#fieldstocheck[@]}-1
  for j in $(seq 0 1 $finalindex); do
    if [[ "$field" =~ "${fieldstocheck[j]}" ]]; then
      echo $field '==' ${fieldstocheck[j]} 'at index:' $i
      fieldstocheck[j]=$i # replace the field name with its column index
    fi
  done
done

# check whether the column entries are valid
let finalindex=${#fieldstocheck[@]}-1
for i in $(seq 0 1 $finalindex); do
  echo $i
  echo 'column' "${fieldstocheck[$i]}" 'needs to be a' "${fieldtypecheck[$i]}"
  # every line is read here, including the header and trailer lines
  fieldlist=("$(awk -F '|' -v col="${fieldstocheck[$i]}" '{print $col}' sample_data.txt)")
  for j in ${fieldlist[@]}; do # unquoted on purpose: entries are split on whitespace
    case "${fieldtypecheck[$i]}" in
      num)
        re='^-?[0-9]+([.][0-9]+)?$' # digits, with an optional leading minus sign and decimal part
        if [[ "$j" =~ $re ]]; then
          echo 'OK number' $j
        else
          echo 'ERROR not a decimal or negative number' $j
        fi
        ;;
      string)
        re='^[a-zA-Z ,.;]+$' # lowercase and uppercase letters, SPACE, COMMA, DOT and SEMICOLON allowed
        if [[ "$j" =~ $re ]]; then
          echo 'OK string' $j
        else
          echo 'ERROR not a string' $j
        fi
        ;;
      *)
        echo 'not valid variable type'
        ;;
    esac
  done
done


</details>
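
To tie this back to the question's requirement of nested for loops, here is a minimal sketch rather than a definitive implementation. It assumes an uncompressed pipe-delimited file whose first line is the header and whose last line is the trailer, that column 2 (the transaction ID) must be an integer and column 8 (the amount) a decimal, and bash 4 or newer for `mapfile`. The names `columns`, `types`, `matches_type` and `validate_file` are hypothetical.

```shell
#!/bin/bash
# Assumptions for the sample data: 1-based columns, counting the "Data" tag as column 1.
columns=(2 8)        # Transaction ID, Amount
types=(int decimal)  # expected type of each column above

# Succeeds when value $2 matches the (hypothetical) type name $1.
matches_type() {
  local re
  case $1 in
    int)     re='^[0-9]+$' ;;
    decimal) re='^-?[0-9]+([.][0-9]+)?$' ;;
    *)       return 2 ;;
  esac
  [[ $2 =~ $re ]]
}

# Prints one OK/ERROR line per checked value; returns 1 if any value failed.
validate_file() {
  local file=$1 status=0 i value
  local -a values
  for i in "${!columns[@]}"; do          # outer loop: the fields to check
    # Drop the header and trailer lines, keep one column, trim surrounding blanks.
    mapfile -t values < <(sed '1d;$d' "$file" | cut -d'|' -f"${columns[i]}" | sed 's/^ *//;s/ *$//')
    for value in "${values[@]}"; do      # inner loop: the records of that field
      if matches_type "${types[i]}" "$value"; then
        echo "OK column ${columns[i]} (${types[i]}): $value"
      else
        echo "ERROR column ${columns[i]} (${types[i]}): $value"
        status=1
      fi
    done
  done
  return "$status"
}

# Usage example with the two data rows from the question.
sample=$(mktemp)
printf '%s\n' \
  'Header | <Transaction_ID> | <Item_ Name> |<Item_Type> | <Customer_ID> | <Type_of_Transaction> | <Payment_Method>|Amount' \
  'Data |1001 |Samsung |Handset |R2R003 |Online |Credit Card |100|' \
  'Data | 1004|LG |TV | R2R042| Online | Debit card|150.24|' \
  'Trailer | 2' > "$sample"
validate_file "$sample"
rm -f "$sample"
```

Adding a third field to validate only requires appending its column number to `columns` and its type to `types`; for the gzipped file in the question, the `sed '1d;$d' "$file"` step could presumably be fed from `zcat` instead.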
