Is it efficient to use relational operators on date or integer values stored in a CHAR column in MS SQL Server?

Question

I have been in development for 5 years, but this is completely new to me. I'm currently working on a legacy application where date and integer values are stored in CHAR columns in MS SQL Server. The surprising thing (at least for me, because I had never thought in this direction before) is that relational operators work on date values stored in a CHAR column.

I'm afraid this will cause performance issues. Is that correct, or am I wrong? Does MS SQL Server perform any implicit casting or conversion before applying relational operators? Or is this normal practice in MS SQL Server?

We are planning to migrate this application and want to check whether we can keep this setup, or whether dates need to be stored in a DATE column and integers in an INT column.

I also noticed that all the dates are stored as strings in "YYYY/MM/DD" format in the table.

Here is the test table that I created and tested with.

CREATE TABLE [dbo].[TestTable](
	[Date] [char](10) NULL,
	[Integer] [char](10) NULL
) ON [PRIMARY]
GO

SELECT * FROM TESTTABLE

SELECT * FROM TESTTABLE where [Date] > '1993/02/10'

Please advise.

Answer 1

Score: 7

Don't store dates as strings. Use the relevant date-like datatype instead.

While a string format like YYYY/MM/DD allows you to do equality and inequality comparisons and sorting, other limitations will pop up, for example:

  • date manipulations, e.g. adding or subtracting intervals, are tedious with strings
  • data integrity cannot be enforced (how do you ensure that your string is a valid date?)

To work around these, you will soon find yourself converting strings to dates in your queries, which is inefficient: a conversion applied to the column has to run for every row and generally prevents an index on that column from being used for seeks.
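
As a rough illustration of both points, the sketch below first looks for values that would not survive a move to a real date type, then shows interval arithmetic on a proper DATE column. It uses the dbo.TestTable from the question; the DateValue column name is an assumption made up for this example, and style 111 (yyyy/mm/dd) is chosen to match the format described in the question. TRY_CONVERT requires SQL Server 2012 or later.

```sql
-- Find stored strings that are not valid yyyy/mm/dd dates (style 111).
-- TRY_CONVERT returns NULL instead of raising an error on bad input.
SELECT [Date]
FROM dbo.TestTable
WHERE [Date] IS NOT NULL
  AND TRY_CONVERT(date, [Date], 111) IS NULL;
GO

-- Hypothetical migration step: add a typed column next to the legacy one.
-- DateValue is an assumed name, not something from the original schema.
ALTER TABLE dbo.TestTable ADD DateValue date NULL;
GO

-- Populate it once; invalid strings become NULL rather than failing the update.
UPDATE dbo.TestTable
SET DateValue = TRY_CONVERT(date, [Date], 111);
GO

-- With a DATE column, interval arithmetic needs no string handling.
SELECT DateValue,
       DATEADD(day, 30, DateValue) AS ThirtyDaysLater
FROM dbo.TestTable
WHERE DateValue > '19930210';
```

Doing the conversion once at migration time, instead of in every query, is the design point: the per-row cost is paid a single time and the typed column can be indexed and compared directly.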

Answer 2

Score: 4

It is better to store values using the correct data types.

However, if dates are stored consistently in a year-first, zero-padded format such as YYYYMMDD (the YYYY/MM/DD format in the question behaves the same way), then comparison and ordering operations work fine. That is, the comparisons as strings give the same results as the comparisons as dates.
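
A small sketch of why consistency matters, using only string literals so it can be run anywhere. The labels in the CASE expressions are made up for illustration. The third query covers the integer half of the question: digits stored as unpadded text compare character by character, not numerically.

```sql
-- Zero-padded, year-first strings: string order and date order agree.
SELECT CASE WHEN '1993/02/10' < '1993/11/02'
            THEN 'string order matches date order'
            ELSE 'string order differs from date order'
       END AS PaddedDateCheck;

-- Same two dates, but February is written without its leading zero.
-- The strings first differ at '2' versus '1', so the February value
-- compares as the larger string even though it is the earlier date.
SELECT CASE WHEN '1993/2/10' < '1993/11/02'
            THEN 'string order matches date order'
            ELSE 'string order differs from date order'
       END AS UnpaddedDateCheck;

-- Integers stored as text: '10' starts with '1', which sorts before '9'.
SELECT CASE WHEN '10' < '9'
            THEN 'as text, 10 sorts before 9'
            ELSE 'as text, 10 sorts after 9'
       END AS IntegerAsTextCheck;
```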

If you have an opportunity to fix the data, you can try. Be warned that partial fixes might have a big impact on database performance. If such columns are used for joins, for instance, and only one side is changed, the resulting type conversion generally kills optimizations. Of course, it is fine if the values have the same type in both columns.

If this is a working legacy application, then just plan on fixing it when the application is replaced.

Answer 3

Score: 3

I think you're going to have more trouble with date conversion between different formats than with performance.

Let's say that your system (or one of your systems) decides to send something like '1993-02-10'. You'd have to "treat" this new string, because it is never going to match in your queries.
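
One possible way to "treat" such input is sketched below, assuming the dbo.TestTable from the question and an incoming value in yyyy-mm-dd form. The variable names are hypothetical. The idea is to parse the incoming text as a date (style 23 is yyyy-mm-dd) and then format it back with style 111 (yyyy/mm/dd) so it matches what the CHAR column stores.

```sql
-- A hyphenated literal is simply a different string from '1993/02/10',
-- so this equality test cannot match a row stored with slashes.
SELECT *
FROM dbo.TestTable
WHERE [Date] = '1993-02-10';

-- Normalize the incoming value before comparing.
-- @incoming and @normalized are names invented for this example.
DECLARE @incoming varchar(10) = '1993-02-10';
DECLARE @normalized char(10) =
    CONVERT(char(10), TRY_CONVERT(date, @incoming, 23), 111);

-- If @incoming is not a valid date, @normalized is NULL and no rows match.
SELECT *
FROM dbo.TestTable
WHERE [Date] = @normalized;
```

The conversion is applied to the parameter rather than to the column, so the comparison against [Date] stays a plain string comparison.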

Coming back to code and performance, I put together this quick example to see whether we can catch the cost of implicit conversion, but it seems to be negligible (see the CONVERT_IMPLICIT at the end of the query plan):

-- Create a temp table to test
CREATE TABLE #TestTable (
    [Date] [char](10) NULL,
    [Integer] [char](10) NULL
);

-- And an index to search for [Integer]
CREATE INDEX Idx_Integer ON #TestTable ([Integer]);
GO

-- Create some data (GO 100000 repeats only this batch, hence the GO above)
INSERT INTO #TestTable( [Date], [Integer] ) VALUES('2019/01/01', 1)
INSERT INTO #TestTable( [Date], [Integer] ) VALUES('2019/02/01', 2)
INSERT INTO #TestTable( [Date], [Integer] ) VALUES('2019/03/01', 3)
INSERT INTO #TestTable( [Date], [Integer] ) VALUES('2019/04/01', 4)
INSERT INTO #TestTable( [Date], [Integer] ) VALUES('2019/05/01', 5)
GO 100000

-- And a unique record so SQL uses the index
INSERT INTO #TestTable( [Date], [Integer] ) VALUES('2019/05/01', 7)

-- Select with query plan (Ctrl+M)
SELECT [Integer]
FROM #TestTable
WHERE [Integer] = 7;

DROP TABLE #TestTable;

[Screenshot: query plan]

VS:

[Screenshot: query plan]
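
To look at the implicit conversion more directly, a sketch like the following could be run against the #TestTable from the example above, before the DROP TABLE statement. It compares the same lookup written with an int literal and with a string literal. Because int has higher data type precedence than char, the first form converts the column value for each row, which typically shows up as CONVERT_IMPLICIT and an index scan in the plan; the second form compares char to char and can usually be answered with an index seek. Actual plans and timings depend on the server and data, so treat this as something to check rather than a guaranteed result.

```sql
-- Report logical reads for each statement so the two forms can be compared.
SET STATISTICS IO ON;

-- int literal: the char column is implicitly converted to int for the comparison.
-- This also raises a conversion error if any row holds non-numeric text.
SELECT [Integer]
FROM #TestTable
WHERE [Integer] = 7;

-- string literal: both sides are character data, so no conversion on the column.
SELECT [Integer]
FROM #TestTable
WHERE [Integer] = '7';

SET STATISTICS IO OFF;
```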

Happy coding!

huangapple
  • Posted on January 6, 2020 at 21:31:11
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59613029.html