Java: How to sum all the values of one column based on the criteria in a second column using HashMaps
Question
I have a CSV file that contains roughly 500,000 rows and 22 columns of flight data. The 5th column contains the tail number of the plane for each flight. The 22nd column contains the distance traveled for each flight. I'm attempting to sum the total distance traveled (column 22) for each tail number (column 5).
I created a HashMap named map1 containing all the data. I created a second HashMap named planeMileages to hold each tail number and its total distance traveled. I'm using an if/else statement inside a loop to go through each entry of map1 and see whether the tail number is already contained in planeMileages. If it is, I want to add to the accumulatedMileage for that key. If it is not, I'd like to insert the key along with its first distance value.
The code I've written seems sound to me, but it produces the wrong result and outputs the incorrect tail number. Can you please take a look and let me know what I am overlooking in my main method? Thanks!
public class FlightData {
    HashMap<String,String[]> dataMap;

    public static void main(String[] args) {
        FlightData map1 = new FlightData();
        map1.dataMap = map1.createHashMap();
        HashMap<String, Integer> planeMileages = new HashMap();
        //Filling the Array with all tail numbers
        for (String[] value : map1.dataMap.values()) {
            if(planeMileages.containsKey(value[4])) {
                int accumulatedMileage = planeMileages.get(value[4]) + Integer.parseInt(value[21]);
                planeMileages.remove(value[4]);
                planeMileages.put(value[4], accumulatedMileage);
            }
            else {
                planeMileages.put(value[4],Integer.parseInt(value[21]));
            }
        }
        String maxKey = Collections.max(planeMileages.entrySet(), Map.Entry.comparingByValue()).getKey();
        System.out.println(maxKey);
    }

    public HashMap<String,String[]> createHashMap() {
        File flightFile = new File("flights.csv");
        HashMap<String,String[]> flightsMap = new HashMap<String,String[]>();
        try {
            Scanner s = new Scanner(flightFile);
            while (s.hasNextLine()) {
                String info = s.nextLine();
                String [] piecesOfInfo = info.split(",");
                String flightKey = piecesOfInfo[4] + "_" + piecesOfInfo[2] + "_" + piecesOfInfo[11]; //Setting the Key
                String[] values = Arrays.copyOfRange(piecesOfInfo, 0, piecesOfInfo.length);
                flightsMap.put(flightKey, values);
            }
            s.close();
        }
        catch (FileNotFoundException e)
        {
            System.out.println("Cannot open: " + flightFile);
        }
        return flightsMap;
    }
}
Please see a few lines of my CSV file below:
DayofMonth,DayOfWeek,FlightDate,UniqueCarrier,TailNum,OriginAirportID,Origin,OriginStateName,DestAirportID,Dest,DestStateName,DepTime,DepDelay,WheelsOff,WheelsOn,ArrTime,ArrDelay,Cancelled,CancellationCode,Diverted,AirTime,Distance
3,1,10/3/2016,AA,N786AA,10721,BOS,Massachusetts,12478,JFK,New York,556,-4,623,703,709,-6,0,,0,40,187
4,2,10/4/2016,AA,N794AA,10721,BOS,Massachusetts,12478,JFK,New York,554,-6,615,703,712,-3,0,,0,48,187
1,6,10/1/2016,AA,N783AA,12478,JFK,New York,12892,LAX,California,823,-7,844,1104,1111,-30,0,,0,320,2475
2,7,10/2/2016,AA,N798AA,12478,JFK,New York,12892,LAX,California,847,17,904,1131,1159,18,0,,0,327,2475
3,1,10/3/2016,AA,N786AA,12478,JFK,New York,12892,LAX,California,825,-5,838,1109,1131,-10,0,,0,331,2475
4,2,10/4/2016,AA,N794AA,12478,JFK,New York,12892,LAX,California,826,-4,848,1114,1132,-9,0,,0,326,2475
Answer 1
Score: 3
Here is a somewhat more object-oriented way of doing it.
You extend HashMap and add two new methods, one for adding flights and another for calculating the total distance.
This way you are not constantly removing modified values from the HashMap and adding them back.
You can expand on this to fit your needs.
import java.io.File;
import java.io.FileNotFoundException;
import java.util.*;

public class Main {
    public static void main(String[] args) {
        FlightData flightData = getFlightDataFromFile();
        // Replace "tail number" with an actual tail number from the file.
        System.out.println(flightData.getDistanceTraveled("tail number"));
    }

    public static FlightData getFlightDataFromFile() {
        File flightFile = new File("flights.csv");
        FlightData flightData = new FlightData();
        // try-with-resources closes the Scanner even if parsing fails.
        try (Scanner s = new Scanner(flightFile)) {
            while (s.hasNextLine()) {
                String info = s.nextLine();
                String[] piecesOfInfo = info.split(",");
                String tailNr = piecesOfInfo[4];
                // Columns 7 and 10 are Origin and Dest; column 22 is Distance.
                Flight flight = new Flight(piecesOfInfo[6], piecesOfInfo[9], Integer.parseInt(piecesOfInfo[21]));
                flightData.addFlight(tailNr, flight);
            }
        } catch (FileNotFoundException e) {
            System.out.println("Cannot open: " + flightFile);
        }
        return flightData;
    }
}

class FlightData extends HashMap<String, List<Flight>> {
    // Creates the list for a tail number on first use, then appends the flight.
    void addFlight(String tailNr, Flight flight) {
        computeIfAbsent(tailNr, key -> new ArrayList<>()).add(flight);
    }

    int getDistanceTraveled(String tailNr) {
        int distance = 0;
        // getOrDefault avoids a NullPointerException for an unknown tail number.
        for (Flight f : getOrDefault(tailNr, Collections.emptyList())) distance += f.distance;
        return distance;
    }
}

class Flight {
    String from;
    String to;
    int distance;

    public Flight(String from, String to, int distance) {
        this.from = from;
        this.to = to;
        this.distance = distance;
    }
}
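If only the per-plane totals are needed, the remove-and-put pattern can also be avoided without storing every flight. The following is a minimal sketch, not part of the original answer: it assumes the rows have already been split into 22 columns, and the names MileageSketch, totalsByTail, tailWithMaxTotal, and row are hypothetical. It uses Map.merge to accumulate the totals and the same Collections.max call as the question to pick the largest one.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MileageSketch {
    // Sums the distance (column 22, index 21) per tail number (column 5, index 4).
    static Map<String, Long> totalsByTail(List<String[]> rows) {
        Map<String, Long> totals = new HashMap<>();
        for (String[] row : rows) {
            // merge() stores the value for a new key, otherwise combines old and new with Long::sum.
            totals.merge(row[4], Long.parseLong(row[21]), Long::sum);
        }
        return totals;
    }

    // Returns the tail number with the largest accumulated distance.
    static String tailWithMaxTotal(Map<String, Long> totals) {
        return Collections.max(totals.entrySet(), Map.Entry.comparingByValue()).getKey();
    }

    // Hypothetical helper: builds a 22-column row with only tail number and distance filled in.
    static String[] row(String tail, String distance) {
        String[] r = new String[22];
        Arrays.fill(r, "");
        r[4] = tail;
        r[21] = distance;
        return r;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                row("N786AA", "187"), row("N794AA", "187"), row("N786AA", "2475"));
        Map<String, Long> totals = totalsByTail(rows);
        System.out.println(totals.get("N786AA"));      // prints 2662
        System.out.println(tailWithMaxTotal(totals));  // prints N786AA
    }
}
```

Keeping only a running total uses far less memory than a list of Flight objects per plane, at the cost of losing the per-flight details that the class-based design preserves.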
Answer 2
Score: 1
Hello, can you check this?
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.IntSummaryStatistics;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) throws IOException {
        Map<String, String[]> map = createMap();
        // Group the rows by tail number (index 4) and sum the distance (index 21) per group.
        Map<String, Long> planeMileages = map
                .entrySet()
                .stream()
                .collect(Collectors.groupingBy(o -> o.getValue()[4],
                        Collectors.collectingAndThen(
                                Collectors.summarizingInt(value ->
                                        Integer.parseInt(value.getValue()[21])), IntSummaryStatistics::getSum
                        )
                ));
        String maxKey = planeMileages.entrySet().stream().max(Comparator.comparing(Map.Entry::getValue)).get().getKey();
        System.out.println("max key: " + maxKey);
    }

    // Without a merge function, toMap throws IllegalStateException if two rows produce the same key.
    public static Map<String, String[]> createMap() throws IOException {
        try (BufferedReader a = Files.newBufferedReader(Paths.get("flights.csv"))) {
            return a.lines().map(s -> s.split(","))
                    .collect(Collectors.toMap(piecesOfInfo -> String.join("_", piecesOfInfo[4], piecesOfInfo[2], piecesOfInfo[11]), Function.identity()));
        }
    }

    // Same as createMap, but when two rows share a key the later row replaces the earlier one.
    public static Map<String, String[]> createMapLastDupWins() throws IOException {
        try (BufferedReader a = Files.newBufferedReader(Paths.get("flights.csv"))) {
            return a.lines().map(s -> s.split(","))
                    .collect(Collectors.toMap(piecesOfInfo -> String.join("_", piecesOfInfo[4], piecesOfInfo[2], piecesOfInfo[11]), Function.identity(), (strings, strings2) -> {
                        // If this version is needed, the data contains duplicate keys.
                        return strings2;
                    }));
        }
    }
}
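The point of the two map-building variants above is easier to see on a small input. The sketch below is an illustration under assumptions, not a diagnosis of the actual file: it uses simplified four-column rows (tail number, date, departure time, distance) and hypothetical names (DuplicateKeySketch, key, loadWithPut, toMapRejectsDuplicates). It shows that HashMap.put silently overwrites an entry when two rows produce the same composite key, while Collectors.toMap without a merge function throws IllegalStateException. If the real data contains such collisions, some rows never reach the mileage sum, which is one possible explanation for a wrong result.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class DuplicateKeySketch {
    // Composite key in the style of the question: TailNum_FlightDate_DepTime.
    // Assumption: rows here are simplified to {tail, date, depTime, distance}.
    static String key(String[] row) {
        return String.join("_", row[0], row[1], row[2]);
    }

    // Mirrors the question's loading loop: put() replaces any earlier row with the same key.
    static Map<String, String[]> loadWithPut(List<String[]> rows) {
        Map<String, String[]> map = new HashMap<>();
        for (String[] row : rows) {
            map.put(key(row), row);
        }
        return map;
    }

    // Returns true if toMap (without a merge function) rejects the rows because of a duplicate key.
    static boolean toMapRejectsDuplicates(List<String[]> rows) {
        try {
            rows.stream().collect(Collectors.toMap(DuplicateKeySketch::key, Function.identity()));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"N786AA", "10/3/2016", "556", "187"},
                new String[] {"N786AA", "10/3/2016", "556", "2475"},  // same key as the row above
                new String[] {"N794AA", "10/4/2016", "554", "187"});
        System.out.println(rows.size());                   // prints 3
        System.out.println(loadWithPut(rows).size());      // prints 2: one row was overwritten
        System.out.println(toMapRejectsDuplicates(rows));  // prints true
    }
}
```

Comparing the number of lines read with the size of the resulting map is a quick way to check whether any rows were lost to key collisions.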
Answer 3
Score: 1
Try this, and if the mileage totals are very large, change Integer to Long and check again:
HashMap<String, Integer> planeMileages = new HashMap<>();
for (String[] value : flightsMap.values()) {
    if (planeMileages.containsKey(value[4])) {
        planeMileages.put(value[4], planeMileages.get(value[4]) + Integer.valueOf(value[21]));
    } else {
        planeMileages.put(value[4], Integer.valueOf(value[21]));
    }
}
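The suggestion to switch from Integer to Long matters because int addition wraps around silently once a total exceeds Integer.MAX_VALUE. The sketch below illustrates that behavior with artificial values rather than real flight distances. The class and method names (OverflowSketch, sumAsInt, sumAsLong, sumChecked) are hypothetical.

```java
class OverflowSketch {
    // Accumulates in an int: the total wraps around silently on overflow.
    static int sumAsInt(int[] distances) {
        int total = 0;
        for (int d : distances) {
            total += d;
        }
        return total;
    }

    // Accumulates in a long: the total stays correct well beyond the int range.
    static long sumAsLong(int[] distances) {
        long total = 0L;
        for (int d : distances) {
            total += d;
        }
        return total;
    }

    // Accumulates in an int, but Math.addExact throws ArithmeticException instead of wrapping.
    static int sumChecked(int[] distances) {
        int total = 0;
        for (int d : distances) {
            total = Math.addExact(total, d);
        }
        return total;
    }

    public static void main(String[] args) {
        int[] distances = {Integer.MAX_VALUE, 1};
        System.out.println(sumAsInt(distances));   // prints -2147483648
        System.out.println(sumAsLong(distances));  // prints 2147483648
        try {
            sumChecked(distances);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```

Whether overflow can actually occur depends on the size of the per-plane totals in the file. A long accumulator removes the concern at negligible cost.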