在VB.NET和SQL中筛选大数据并限制结果

huangapple go评论63阅读模式
英文:

Filter Big data with limit result in VB.NET and SQL

问题

我在实现一个基于多个字段的大数据搜索表单时遇到了困难,最多可包含 500,000 条记录,这些字段以文本框的形式出现,并在文本更改事件中进行搜索。

我尝试了两种解决方案,但都没有令人满意的结果。
第一种方法,在文本更改事件中选择前 1000 条记录进行筛选,然后将数据表绑定到数据源。

结果非常慢。

第二种方法是检索所有数据,然后在数据表上应用筛选条件:

dt.DefaultView.RowFilter = filter

筛选非常快,但在加载搜索表单时检索数据需要一段时间,而且如果用户清除文本,表格将包含 500,000 条记录。

我需要将数据限制在 1000 条以内。我尝试了使用 'take',但由于丢失引用导致了其他不良点,因为使用了 CopyToDataTable。我应该如何继续?

英文:

I struggle with implementing a big data search form of up to 500,000 records based on many fields, as text boxes, on the text change event.

I've tried two solutions with no satisfactory result.
First one, on text change: select top 1000 with filtering then, bind datatable to datasource.

The result was very slow.

Second one, retrieve all data then apply filtering on datatable:

dt.DefaultView.RowFilter = filter

The filtering is very fast, but retrieving data on load of the search form take a period of time in addition to if the user clear text the grid will contains 500,000 records.

I need to limit data to 1000. I tried 'take' that led to another undesirable points because of the loss of reference due to
CopyToDataTable. How can I proceed?

答案1

得分: 1

首先,现在的情况是,50万行数据在当今来说是一个小表格了。

使用VB或甚至MS Access,我经常在SQL服务器中处理包含超过1000万行的表,响应时间只需要大约1/10秒。

因此,不清楚为什么你的数据查询会很慢。

但是,毫无疑问,你绝不会拉取50万行数据,然后再尝试筛选数据。

我是说,当你使用网络进行搜索时,你会把整个互联网下载到浏览器中,然后让用户按Ctrl-F来搜索结果吗?当然不会!

当你使用银行机器时,机器会下载每个帐户然后再向你询问帐户号码吗?当然不会!

当你启动并使用会计软件时,它会下载每个公司,然后你会使用Ctrl-F来查找一个公司吗?当然不会!

所以,不管是2000行还是1000万行,你的搜索不会导致那么多行数据,因此你可以根据用户输入的条件在查询提取数据时进行筛选。

所以,请想象一下过去50多年来你使用过的任何软件。在每种情况下,都会使用某种类型的搜索提示,然后进行数据库提取。你不会从系统中提取10万份发票然后再进行搜索。

你会提示用户输入发票号码,然后显示该发票表单。因此,无论你有10行还是1000万行,该发票表单都将立即显示和启动。

想象一下,如果你的系统上有10个用户。如果每个用户提取50万行数据(并且没有做任何有用的工作),那么你已经创建了500万行的数据库服务器负载!

数据网格适用于最多几百行。如果超过这个数量,你就完全错了。

所以,构建一个搜索表单,允许用户输入一些条件,然后再将数据提取到数据网格中。

你没有提到你要搜索什么类型的数据。

假设我想要搜索一个数据库来“选择”一个城市。世界上的任何城市。因此,该数据库将有大约150,000行数据(地球上的每个城市都有一个)。然而,我甚至不会遇到超过1/1000秒的延迟!

所以,这个例子:

在VB.NET和SQL中筛选大数据并限制结果

所以上面是一些你需要考虑的概念。

因此,你可能例如想要筛选酒店。

因此,你可能会这样说:

在VB.NET和SQL中筛选大数据并限制结果

所以,无论你如何处理这个问题,你都需要在执行查询之前询问并具有某种过滤和条件。

你还希望确保使用的查询是:

  • 参数安全(防止SQL注入)
  • 是我们所谓的“可搜索的”(意味着可以使用索引)。
  • 你已经在被搜索的列上创建了索引。

所以,你可能希望展示你正在使用的查询,并提供一些关于你想要筛选的想法,然后我可以在这里发布一些额外的代码。

编辑:文本框在键入搜索时

如上所示,点击4次非常容易,我们缩小了城市范围,甚至没有触碰键盘!

但是,让我们搜索150,000行数据,但在键入时搜索。

所以,在这个示例中,为了不使数据库服务器负载过重。

所以,我们只有在用户至少输入3个字符时才会执行搜索。这样,我们就可以大大减少记录的数量。

一个简单的文本框,数据网格视图。

这是KeyUp事件的代码。

Private Sub TextBox1_KeyUp(sender As Object, e As KeyEventArgs) Handles TextBox1.KeyUp

    If TextBox1.Text.Length >= 3 Then
        ' 搜索数据库
        Dim strSQL As String =
            "SELECT City, state_name, country_name 
            FROM vCities WHERE City like @City + '%'"
        Dim cmdSQL As New SqlCommand(strSQL)
        cmdSQL.Parameters.Add("@City", SqlDbType.NVarChar).Value = TextBox1.Text

        DataGridView1.DataSource = MyRstP(cmdSQL, My.Settings.Countries)

        lblCount.Text = "Row count = " & DataGridView1.RowCount
    End If

End Sub

结果如下:

在VB.NET和SQL中筛选大数据并限制结果

请注意,我们似乎从来没有超过大约300条记录。

所以,它是快速的,瞬间的,我们仍然没有使数据服务器不堪重负。确保在要搜索的列上有数据库索引(在这个示例中是City列)。

英文:

Well, first of all, 500,000 rows is a small table these days.

With VB or even MS Access, I often hit a table with a little over 10 million rows in SQL server, and the response time is like 1/10th of a second.

So, it not clear why your query of data would be slow.

However, without question you would never pull 500,000 rows and then try to filter the data.

I mean, when you use the web to search, do you download the WHOLE internet into a browser, and THEN have the user hit Ctrl-F to search the results? Of course not!

When you use a bank machine, does the machine download every account and then ask you for the account number? Of course not!

When you launch and use an accounting package, does it download every company and then you use say Ctrl-F to find the one company? Of course not!

So, be it 2000 rows, or 10 million, your search is not going to result in that many rows, and thus you filter at the query pull time based in the criteria entered by the user.

So, think of any software you have ever used in the last 50+ years. In every case, a simple prompt of some type of search is used, and then you do a database pull. You don't pull 100,000 invoices from a system and then search.

You prompt the user for the invoice number, and then display that invoice form. So, speed of the database will not matter if you have 10 rows, or 10 million rows. That invoice form is going to display and launch instant.

Think of what would occur if you have 10 users on the system. If each user pulls 500,000 rows (and not done one bit of useful work), you already created a database server load of 5 million rows!

A datagrid is good for a few 100 rows tops. Anything more, and you doing this all wrong.

So, build a search form, allow the user to enter some criteria and THEN pull that data into the data grid.

You don't mention what kind of data you looking to search for.

Say I want to search a database to "select" a city. Any city in the world. So, that database will have about 150,000 rows. (one for each city on planet earth).

Yet, I don't encounter a delay of even 1/1000th of a second!

So, this example:

在VB.NET和SQL中筛选大数据并限制结果

So, above is a example of some of the concepts you need to consider here.

So, you might for example want to filter hotels.

So, you might say have this:

在VB.NET和SQL中筛选大数据并限制结果

So, no matter how you approach this, you need to ask and have some kind of filter and criteria before you execute that query, and only pull the data based on that query.

You also want to ensure that the query used is:

  • Parameter safe (SQL injection safe)
  • Is what we call "sargable" (means indexing can be used).
  • That you have index(s) on the columns being searched.

So, you might want to show a query that you working with, and with some idea(s) on what you want to filter, then I can post some additional code here.

Edit: Text box as you type search

As above shows, 4 super easy clicks and we narrowed down the City, and did not even touch the keyboard!

But, let's hit the 150,000 rows, but search as we type.

So, in this example, to not kill the database server.

So, so we will simple NOT search unless the user types in at least 3 characters. That way, we get a fantastic reduction in the number of records.

So, a simple text box, datagrid view.

This code for the KeyUp event.

Private Sub TextBox1_KeyUp(sender As Object, e As KeyEventArgs) Handles TextBox1.KeyUp

    If TextBox1.Text.Length >= 3 Then
        ' search database
        Dim strSQL As String =
            "SELECT City, state_name, country_name 
            FROM vCities WHERE City like @City + '%'"

        Dim cmdSQL As New SqlCommand(strSQL)
        cmdSQL.Parameters.Add("@City", SqlDbType.NVarChar).Value = TextBox1.Text

        DataGridView1.DataSource = MyRstP(cmdSQL, My.Settings.Countries)

        lblCount.Text = "Row count = " & DataGridView1.RowCount

    End If

End Sub

And the result is this:

在VB.NET和SQL中筛选大数据并限制结果

Note how we really never seem to hit/get more then about 300 records.

So, it is fast, instant, and we still not beating the data server to death. Make sure you have a database index on the column to search (in this example the City column).

huangapple
  • 本文由 发表于 2023年6月19日 23:24:11
  • 转载请务必保留本文链接:https://go.coder-hub.com/76508030.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定