SQL Server - indexing a large table with multiple columns? or one index for every report?

Question

I have this table with millions of rows: Recording (Id, CampaignId, CallDateTime, Phone, etc)

And I have 2 different reports on that table:

--By Date:
declare @CampaignId int = 1, @Date date = '2023-06-27'
select * from Recording
where CampaignId = @CampaignId and cast(CallDateTime as date) = @Date

--By Phone:
declare @CampaignId int = 1, @Phone varchar(9) = '987654321'
select * from Recording
where CampaignId = @CampaignId and Phone = @Phone

It should have an index, but how? A single index on multiple columns (CampaignId, CallDateTime, Phone)? Or one index per report: (CampaignId, CallDateTime) and (CampaignId, Phone)?

Thanks.

Answer 1

Score: 1

First, you should fix the first query, or no index will be useful. It should be:

--By Date:
declare @CampaignId int = 1, @Date date = '2023-06-27'
select * from Recording
where CampaignId = @CampaignId
and CallDateTime >= @Date
and CallDateTime < dateadd(day,1,@Date)

The general rule is to never wrap the column in an expression; always write the query so that any expression is applied to the parameter instead. Also check the data type precedence rules to ensure that the column won't have to be converted to the parameter's type. Here datetime has higher precedence than date, so the date parameter is implicitly converted rather than the column, and no explicit conversion on the parameter is needed.
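As a hedged illustration of the precedence point, the sketch below assumes the Phone column is declared varchar(9); the question only shows the parameter type, so the column type is an assumption. Because nvarchar has higher precedence than varchar, an nvarchar parameter makes SQL Server convert the column rather than the parameter, which, depending on the collation, can prevent an efficient seek.

```sql
-- Assumption: Recording.Phone is varchar(9); the column type is not shown in the question.

-- Parameter type matches the column: no conversion on the column side.
declare @CampaignId int = 1, @Phone varchar(9) = '987654321'
select * from Recording
where CampaignId = @CampaignId and Phone = @Phone

-- nvarchar outranks varchar, so the Phone column is implicitly converted
-- (look for CONVERT_IMPLICIT on Phone in the execution plan).
declare @PhoneN nvarchar(9) = N'987654321'
select * from Recording
where CampaignId = @CampaignId and Phone = @PhoneN
```

Comparing the two execution plans is the simplest way to check whether the conversion lands on the column or on the parameter.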

Next, start with a single useful clustered index, perhaps (CampaignId, CallDateTime). This will give you an optimal plan for the first query and a reasonable plan for the second query.
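A minimal sketch of that clustered index might look like the following. The dbo schema and the index name are hypothetical, and the statement assumes the table does not already have a clustered index (for example, that a primary key on Id, if there is one, is nonclustered).

```sql
-- Assumptions: the table lives in the dbo schema and has no clustered index yet.
-- The index name is illustrative only.
create clustered index CIX_Recording_CampaignId_CallDateTime
    on dbo.Recording (CampaignId, CallDateTime)
```

With this key order, the by-date report can seek on CampaignId and then read one contiguous CallDateTime range, while the by-phone report can still seek on CampaignId and filter on Phone within that campaign.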

If you have millions of rows and lots of different reporting queries that would require many different indexes for optimal performance, consider adding a non-clustered or clustered columnstore index. This switches to columnar storage with the highest compression and fastest scan rate. With a columnstore you don't need the perfect index for each reporting query because scanning a few million rows in a columnstore is really fast.
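For illustration only, a nonclustered columnstore index over the columns the reports filter on could be sketched as below. The dbo schema, the index name, and the column list are assumptions; since the reports use select *, the real column list may need to be wider. Updatable nonclustered columnstore indexes require SQL Server 2016 or later.

```sql
-- Assumptions: dbo schema, illustrative index name, and a column list
-- limited to the columns named in the question.
create nonclustered columnstore index NCCI_Recording
    on dbo.Recording (CampaignId, CallDateTime, Phone)
```

A nonclustered columnstore index sits alongside the existing rowstore indexes, so it can be added and dropped without changing how the table itself is stored.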

Answer 2

Score: 0

To create a good index for one query:

  • Start with column(s) tested with =

  • After that, you can add another column

    where CampaignId = @CampaignId
    and CallDateTime >= @Date
    and CallDateTime < dateadd(day,1,@Date)

As David pointed out, making the tests sargable is important. Then use

(CampaignId, CallDateTime)   -- In this order

For this:

where CampaignId = @CampaignId and Phone = @Phone

have

(CampaignId, Phone)  -- The order is not important

Neither index is very useful for the other query. Make the optimal index for each query, then see if some indexes are the same or similar enough.

For example: INDEX(a,b,c) will work fine for a query that needs INDEX(a,b), because (a,b) is a leading prefix of (a,b,c).
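The two per-query indexes described in this answer could be sketched in T-SQL as follows. The dbo schema and the index names are hypothetical. Both are written as nonclustered indexes here, which is an assumption; because the reports use select *, each matching row would then also need a lookup into the base table.

```sql
-- Assumptions: dbo schema and illustrative index names.

-- For the by-date report: equality column first, then the range column.
create nonclustered index IX_Recording_CampaignId_CallDateTime
    on dbo.Recording (CampaignId, CallDateTime)

-- For the by-phone report: both columns are tested with =.
create nonclustered index IX_Recording_CampaignId_Phone
    on dbo.Recording (CampaignId, Phone)
```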

huangapple
  • Posted on 2023-06-27 17:54:58
  • Please keep this link when reposting: https://go.coder-hub.com/76563674.html