How to read large data from a database using Java?
Question
I have more than 2 GB of data in my table, and I need to read more than 1 GB of it from that single table. I know there are various options on the database side to achieve this, but I need a better approach in Java code. Can anyone show example Java code, such as parallel processing with multithreading?
Example code:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class SelectRowsExample {
    public static void main(String[] args) {
        Connection connection = null;
        try {
            // Load the MySQL JDBC driver
            String driverName = "com.mysql.jdbc.Driver";
            Class.forName(driverName);
            String serverName = "localhost";
            String schema = "test";
            String url = "jdbc:mysql://" + serverName + "/" + schema;
            String username = "username";
            String password = "password";
            connection = DriverManager.getConnection(url, username, password);
            System.out.println("Successfully connected to the database!");
        } catch (ClassNotFoundException e) {
            System.out.println("Could not find the database driver: " + e.getMessage());
            return; // without a connection the query below cannot run
        } catch (SQLException e) {
            System.out.println("Could not connect to the database: " + e.getMessage());
            return; // without a connection the query below cannot run
        }
        try {
            Statement statement = connection.createStatement();
            ResultSet results = statement.executeQuery("SELECT * FROM employee ORDER BY dept");
            while (results.next()) {
                String empname = results.getString("name");
                System.out.println("Fetching data by column index for row " + results.getRow() + " : " + empname);
                String department = results.getString("department");
                System.out.println("Fetching data by column name for row " + results.getRow() + " : " + department);
            }
        } catch (SQLException e) {
            System.out.println("Could not retrieve data from the database: " + e.getMessage());
        }
    }
}
Here my query returns name and department details, and more than 1 GB of data will come back for each department. Doing it this way will surely slow down the application, which is why I thought of parallel processing with multithreading. Could anyone suggest a way to read this huge amount of data quickly?
Answer 1
Score: 1
In your example you don't need a heavy tool like parallelism. It also won't necessarily solve your problem, because the bottleneck may be in the hardware, the network, etc., as luk2302 mentioned.
There are two much easier tweaks:
- Select only the data that you really need. Even if your employee record has only 3 columns, you skip a third of the data, which speeds things up and lowers memory consumption. The savings grow if the table has many more columns.
ResultSet results = statement.executeQuery("SELECT name, department FROM employee ORDER BY dept");
- The default fetchSize may not be enough. Its value depends on the driver; for example, by default the Oracle JDBC driver retrieves 10 rows at a time from the database cursor. Increasing it reduces the number of round trips to the database, which are costly. I therefore recommend increasing it to 500 or 1000, and you can experiment with even higher values. Note that you are using MySQL, where this works differently: by default Connector/J reads the whole result set into memory, and it honors a positive fetch size only when useCursorFetch=true is set on the connection URL (alternatively, setFetchSize(Integer.MIN_VALUE) on a forward-only, read-only statement streams rows one at a time). More info on fetchSize: https://stackoverflow.com/questions/1318354/what-does-statement-setfetchsizensize-method-really-do-in-sql-server-jdbc-driv
Statement statement = connection.createStatement();
statement.setFetchSize(1000);
- Bonus: System.out.println also slows down your code. You can read about it here: https://stackoverflow.com/questions/4437715/why-is-system-out-println-so-slow It's better to replace it with a logging library, or at least, for testing purposes, print only every 1000th row:
if (results.getRow() % 1000 == 0) {
    System.out.println("Fetching data by column index for row " + results.getRow() + " : " + empname);
}
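Putting the tweaks above together, a minimal sketch might look like the following. It is only an illustration under assumptions: the class name StreamingReadSketch and the helpers shouldReport and readEmployees are hypothetical, the table and column names are taken from the question, and whether the driver actually honors the fetch size hint depends on the driver and the connection settings.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class StreamingReadSketch {

    // Hypothetical helper: true only for every interval-th row, so console output stays cheap.
    static boolean shouldReport(long rowNumber, int interval) {
        return interval > 0 && rowNumber % interval == 0;
    }

    // Reads only the two needed columns and returns the number of rows seen.
    // fetchSize is a hint; the driver may ignore it depending on its configuration.
    static long readEmployees(Connection connection, int fetchSize) throws SQLException {
        long rows = 0;
        try (Statement statement = connection.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            statement.setFetchSize(fetchSize);
            try (ResultSet results = statement.executeQuery(
                    "SELECT name, department FROM employee ORDER BY dept")) {
                while (results.next()) {
                    String name = results.getString("name");
                    String department = results.getString("department");
                    rows++;
                    // Print a progress line for every 1000th row only.
                    if (shouldReport(rows, 1000)) {
                        System.out.println("Row " + rows + ": " + name + " / " + department);
                    }
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 3) {
            System.out.println("usage: StreamingReadSketch <jdbcUrl> <user> <password>");
            return;
        }
        // try-with-resources closes the connection even if the read fails.
        try (Connection connection = DriverManager.getConnection(args[0], args[1], args[2])) {
            System.out.println("Rows read: " + readEmployees(connection, 1000));
        }
    }
}
```

Counting rows in a local variable instead of calling results.getRow() keeps the progress check independent of the driver, and try-with-resources closes the statement and result set that the original example leaves open.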
Br,
Nandor