How to read large data from a database using Java?

Question

I have more than 2 GB of data in my table, and I need to read more than 1 GB of it from that single table. I know there are various options on the database side to achieve this, but I need a better approach in Java code. Can anyone show one with example Java code, such as parallel processing with multithreading?

Example code

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class SelectRowsExample {

  public static void main(String[] args) {
    Connection connection = null;
    try {
      // Load the MySQL JDBC driver
      String driverName = "com.mysql.jdbc.Driver";
      Class.forName(driverName);
      String serverName = "localhost";
      String schema = "test";
      String url = "jdbc:mysql://" + serverName + "/" + schema;
      String username = "username";
      String password = "password";
      connection = DriverManager.getConnection(url, username, password);
      System.out.println("Successfully Connected to the database!");
    } catch (ClassNotFoundException e) {
      System.out.println("Could not find the database driver " + e.getMessage());
    } catch (SQLException e) {
      System.out.println("Could not connect to the database " + e.getMessage());
    }
    try {
      Statement statement = connection.createStatement();
      // Read every row of the employee table, sorted by department
      ResultSet results = statement.executeQuery("SELECT * FROM employee ORDER BY dept");
      while (results.next()) {
        String empname = results.getString("name");
        System.out.println("Fetching data by column index for row " + results.getRow() + " : " + empname);
        String department = results.getString("department");
        System.out.println("Fetching data by column name for row " + results.getRow() + " : " + department);
      }
    } catch (SQLException e) {
      System.out.println("Could not retrieve data from the database " + e.getMessage());
    }
  }
}

Here my query returns name and department details, and more than 1 GB of data comes back for each department. Reading it this way will surely slow down the application, which is why I thought of parallel processing with multithreading. Could anyone kindly suggest a way to read this huge amount of data quickly?

Answer 1

Score: 1

In your example you don't need a heavy weapon like parallelism. It also won't necessarily solve your problem, because the bottleneck may lie in the hardware, the network, and so on, as luk2302 mentioned.

There are two much easier tweaks:

  • Select only the data that you really need. Even if your employee record has just 3 columns, dropping one of them saves a third of the data, which means more speed and lower memory consumption. The saving grows if the table has many more columns.
ResultSet results = statement.executeQuery("SELECT name, department FROM employee ORDER BY dept");
  • The default fetchSize may not be enough. Its value depends on the driver. For example, by default the Oracle JDBC driver retrieves a result set 10 rows at a time from the database cursor. I know that you are using MySQL, where the default differs: Connector/J reads the whole result set into memory unless cursor fetching (useCursorFetch=true in the connection URL) or row streaming is enabled, so check the driver documentation. Once the fetch size takes effect, increasing it reduces the number of round trips to the database cursor, which are costly. I therefore recommend raising it to 500 or 1000, and you can experiment with even higher values. More info on fetchSize: https://stackoverflow.com/questions/1318354/what-does-statement-setfetchsizensize-method-really-do-in-sql-server-jdbc-driv
Statement statement = connection.createStatement();
statement.setFetchSize(1000);
  • +1: System.out.println also slows your code down. You can read about it here: https://stackoverflow.com/questions/4437715/why-is-system-out-println-so-slow It is best to replace it with a logging library, or at least, for testing purposes, print only every 1000th row with something like this:
if (results.getRow() % 1000 == 0) {
    System.out.println("Fetching data by column index for row " + results.getRow() + " : " + empname);
}
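
The tweaks above can be combined into one read loop. The following is a minimal sketch, not a definitive implementation. It assumes MySQL Connector/J, where a forward-only, read-only statement with a fetch size of Integer.MIN_VALUE streams rows one at a time instead of buffering the whole result set. The class StreamingEmployeeReader, the RowHandler callback, the shouldLog helper, and the 100,000-row progress interval are hypothetical names and values chosen for illustration; the table and column names come from the question.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class StreamingEmployeeReader {

    /** Callback invoked once per row (hypothetical helper type for this sketch). */
    interface RowHandler {
        void handle(String name, String department);
    }

    /** Returns true when a progress line should be printed for this row count. */
    static boolean shouldLog(long rowCount, int every) {
        return every > 0 && rowCount % every == 0;
    }

    /**
     * Streams name/department pairs to the handler and returns the number of rows read.
     * Assumption: the driver is MySQL Connector/J, which streams row by row when the
     * statement is forward-only and read-only with fetch size Integer.MIN_VALUE.
     */
    static long streamEmployees(Connection connection, RowHandler handler) throws SQLException {
        long rows = 0;
        try (Statement statement = connection.createStatement(
                 ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            statement.setFetchSize(Integer.MIN_VALUE);
            // Select only the two columns that are actually used.
            try (ResultSet results = statement.executeQuery(
                     "SELECT name, department FROM employee ORDER BY dept")) {
                while (results.next()) {
                    handler.handle(results.getString("name"), results.getString("department"));
                    rows++;
                    // Print an occasional progress line instead of one line per row.
                    if (shouldLog(rows, 100_000)) {
                        System.out.println("Read " + rows + " rows so far");
                    }
                }
            }
        }
        return rows;
    }
}
```

A caller would pass an open Connection and a handler, for example `StreamingEmployeeReader.streamEmployees(connection, (name, dept) -> process(name, dept))`, where `process` stands for whatever the application does with a row. While a streaming result set is open, Connector/J does not allow other statements on the same connection, so the rows should be read to the end (or the result set closed) before the connection is reused.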

Br,
Nandor
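
If parallel reads are still wanted after trying the simpler tweaks, one common pattern is to split the table by key range and let each thread read its own slice over its own connection. The sketch below only illustrates that pattern and rests on assumptions the question does not confirm: that the employee table has a numeric primary key column named id, and that the caller already knows its minimum and maximum. ParallelRangeReader, ConnectionFactory, splitRange, readRange, and readInParallel are hypothetical names. Whether this is actually faster depends on where the bottleneck is, as noted in the answer.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelRangeReader {

    /** Opens a fresh connection for each worker (hypothetical hook for this sketch). */
    interface ConnectionFactory {
        Connection open() throws Exception;
    }

    /** Splits the inclusive range [min, max] into at most `parts` inclusive sub-ranges. */
    static List<long[]> splitRange(long min, long max, int parts) {
        List<long[]> ranges = new ArrayList<>();
        if (parts <= 0 || max < min) {
            return ranges;
        }
        long total = max - min + 1;
        long base = total / parts;
        long remainder = total % parts;
        long start = min;
        for (int i = 0; i < parts && start <= max; i++) {
            // The first `remainder` slices get one extra id so the sizes differ by at most 1.
            long size = base + (i < remainder ? 1 : 0);
            long end = start + size - 1;
            ranges.add(new long[] {start, end});
            start = end + 1;
        }
        return ranges;
    }

    /** Reads one slice on its own connection and returns the number of rows seen. */
    static long readRange(ConnectionFactory factory, long fromId, long toId) throws Exception {
        // Assumption: `id` is a numeric primary key of the employee table.
        String sql = "SELECT name, department FROM employee WHERE id BETWEEN ? AND ?";
        long rows = 0;
        try (Connection connection = factory.open();
             PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setLong(1, fromId);
            statement.setLong(2, toId);
            try (ResultSet results = statement.executeQuery()) {
                while (results.next()) {
                    // Application-specific row processing would go here.
                    rows++;
                }
            }
        }
        return rows;
    }

    /** Runs one readRange task per slice on a fixed thread pool and sums the row counts. */
    static long readInParallel(ConnectionFactory factory, long minId, long maxId, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (long[] range : splitRange(minId, maxId, threads)) {
                Callable<Long> task = () -> readRange(factory, range[0], range[1]);
                futures.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> future : futures) {
                total += future.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Each worker opens its own connection because a JDBC Connection should not be shared by threads that are reading result sets at the same time. Splitting by id range gives even slices only when the ids are spread fairly evenly; large gaps in the key space would leave some threads with little work.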

huangapple
  • Published on September 2, 2020, 18:45:01
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63703887.html