Read CSV and find fill rate of each column

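Question

I need to read a CSV file and calculate the fill rate of each column. The file looks like this:

    Roll No, Name   , Department
    1      , Person1, CS
    2      , Person2, CS
    3      , Person3, CS
           , Person4, CS
    null   ,        , null

As we can see, the **Roll No** column contains no data after the 3rd row, so its fill rate would be 80%.
For the **Name** column the fill rate would be 90%, because there is no data in the 5th row, and so on for the remaining columns.

I want the output to look something like this:

    {
      "fillRate": [
        {
          "columnName": "Roll No",
          "fillRate": "80%"
        },
        {
          "columnName": "Name",
          "fillRate": "90%"
        },
        {
          "columnName": "Department",
          "fillRate": "90%"
        }
      ]
    }

Below is my code so far: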

```java
// fileDir, sftpConnection and calculatePercentage are defined elsewhere in the class.
List<Map<String, String>> readFile(String fileName) {
    String fullPath = fileDir + "/" + fileName;
    int totalNumOfRows = 0;
    String[] headers = null;
    Map<String, Integer> headerAndValidRecord = new LinkedHashMap<String, Integer>();
    Map<String, Integer> headerAndPercentageOfValidRecord = new LinkedHashMap<String, Integer>();
    List<Map<String, String>> responseList = new ArrayList<Map<String, String>>();

    // try-with-resources closes the reader (and the underlying stream) on exit
    try (BufferedReader br = new BufferedReader(
            new InputStreamReader(sftpConnection.get(fullPath), "UTF-8"))) { // getting file from server
        String line;

        while ((line = br.readLine()) != null) {
            totalNumOfRows++;

            if (totalNumOfRows == 1) {
                // A limit of -1 keeps trailing empty fields instead of dropping them
                headers = line.split(",", -1);
                for (int h = 0; h < headers.length; h++) {
                    headers[h] = headers[h].trim();
                    headerAndValidRecord.put(headers[h], 0);
                }
            } else {
                String[] columnsValue = line.split(",", -1);

                // Ignore any cells beyond the number of headers
                for (int cV = 0; cV < columnsValue.length && cV < headers.length; cV++) {
                    if (!columnsValue[cV].trim().isEmpty()) {
                        headerAndValidRecord.put(headers[cV], headerAndValidRecord.get(headers[cV]) + 1);
                    }
                }
            }
        }
        int totalNumOfRowsExcludeHeader = totalNumOfRows - 1;
        headerAndPercentageOfValidRecord = calculatePercentage(headerAndValidRecord, totalNumOfRowsExcludeHeader);

    } catch (Exception e) {
        e.printStackTrace();
    }
    return responseList; // not populated yet
}
```

Answer 1

Score: 1

Why don't you try (total_filled / total_column) * 100?
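
The formula suggested in this answer could be sketched in Java roughly as follows. This is only a sketch under stated assumptions, not the asker's or the answerer's code: the class `FillRateSketch` and the methods `isFilled` and `fillRates` are hypothetical names, the input is assumed to be a simple CSV without quoted commas or duplicate header names, and blank cells as well as the literal text `null` are assumed to count as missing. One detail worth noting is that with Java `int` values, `filled / total * 100` truncates to 0 for any partially filled column, so the sketch multiplies before dividing.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

class FillRateSketch {

    // Assumption: blank cells and the literal text "null" both count as missing.
    static boolean isFilled(String cell) {
        String value = cell.trim();
        return !value.isEmpty() && !value.equalsIgnoreCase("null");
    }

    // Returns header -> fill rate in whole percent, in column order.
    static Map<String, Integer> fillRates(Reader source) throws IOException {
        Map<String, Integer> filled = new LinkedHashMap<>();
        int dataRows = 0;

        try (BufferedReader reader = new BufferedReader(source)) {
            String line = reader.readLine();
            if (line == null) {
                return filled; // empty input: no columns at all
            }
            // A limit of -1 keeps trailing empty fields.
            String[] headers = line.split(",", -1);
            for (int i = 0; i < headers.length; i++) {
                headers[i] = headers[i].trim();
                filled.put(headers[i], 0);
            }
            while ((line = reader.readLine()) != null) {
                dataRows++;
                String[] cells = line.split(",", -1);
                // Cells beyond the header count are ignored; missing cells stay unfilled.
                int limit = Math.min(cells.length, headers.length);
                for (int i = 0; i < limit; i++) {
                    if (isFilled(cells[i])) {
                        filled.merge(headers[i], 1, Integer::sum);
                    }
                }
            }
        }

        Map<String, Integer> rates = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : filled.entrySet()) {
            // Multiply first so integer division does not truncate to zero.
            int rate = dataRows == 0 ? 0 : entry.getValue() * 100 / dataRows;
            rates.put(entry.getKey(), rate);
        }
        return rates;
    }

    public static void main(String[] args) throws IOException {
        String csv = String.join("\n",
                "Roll No, Name, Department",
                "1, Person1, CS",
                "2, Person2, CS",
                "3, Person3, CS",
                ", Person4, CS",
                "null, , null");
        System.out.println(fillRates(new StringReader(csv)));
    }
}
```

For the five data rows shown in the question, this prints `{Roll No=60, Name=80, Department=80}` under the assumptions above, which differs from the illustrative percentages in the question. Each map entry can then be turned into a `columnName`/`fillRate` pair for the desired JSON output.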

huangapple
  • Posted on August 26, 2020 at 16:41:47
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63593832.html