SQL Server Local DB稳定性问题(2023年春季)

huangapple go评论59阅读模式
英文:

SQL Server Local DB Stability Issues (Spring 2023)

问题

We have a WPF .NET desktop application that uses SQL Server Local DB as its database. Our application is installed on hundreds of computers and has been released for about six years now.

我们有一个使用SQL Server Local DB作为数据库的WPF .NET桌面应用程序。我们的应用程序安装在数百台计算机上,已经发布了大约六年。

We started running into multiple issues this Spring with database stability. Everything seemed to be fine last year. Admittedly, 80% of our app's usage is in the Spring. So, we could have been running into issues in the Fall of 2022 and not been aware of it. The only thing that really changed in our software was that we moved the app away from ADO and over to explicitly concatenated SQL statements. We've been seeing these issues on both Windows 10 and Windows 11 machines. The app could have been running for hours or even days without issue before running into a brick wall of database instability. Alternatively, sometimes the app won't even start because it can't connect to the database without a computer restart. Although, some installs of the app seem to have more issues than others. Some installs of the app have never encountered a single issue. It's completely random.

今年春季,我们在数据库稳定性方面遇到了多个问题。去年一切似乎都很好。诚然,我们应用程序的80%使用量出现在春季。所以,我们可能在2022年秋季遇到了问题,但并不知情。在我们的软件中唯一真正改变的是,我们将应用程序从ADO迁移到了显式连接的SQL语句。我们在Windows 10和Windows 11机器上都遇到了这些问题。应用程序在出现数据库不稳定问题之前可能已经运行了数小时甚至数天。另外,有时应用程序甚至无法启动,因为它无法在不重新启动计算机的情况下连接到数据库。尽管如此,某些应用程序的安装似乎比其他的问题更多。某些应用程序的安装从未遇到过任何问题。这完全是随机的。

I've tried out various things to address the issues:

  1. The app was creating a new connection every time it needed to communicate. I switched this over to a static connection for reads and instance connections for transactions. This seemed to alleviate most of the issues, but not all.
  2. When a statement failed, I added logic to kill the SQL Server instance itself and manually recreate it. Then retry the statement. This 'second chance' execution fixed more issues, but the app logs can sometimes show an excessive amount of SQL Server instance bounces.
  3. We had been installing SQL Server 2016, but we're starting to upgrade users to SQL Server 2022. This seems to help too.
  4. I added a 'connection test' to every caller needing a database connection. I have the app run "SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;" to make sure that the connection is open. If not, I go into the second change logic from item 2.

我尝试了各种方法来解决这些问题:

  1. 应用程序每次需要通信时都会创建一个新的连接。我将其更改为读取使用静态连接,而事务使用实例连接。这似乎减轻了大部分问题,但并非所有问题都得到了解决。
  2. 当语句失败时,我添加了逻辑来终止SQL Server实例本身,然后手动重新创建它。然后重试该语句。这种“第二次机会”的执行解决了更多问题,但应用程序日志有时会显示过多的SQL Server实例弹跳。
  3. 我们之前安装的是SQL Server 2016,但我们开始升级用户到SQL Server 2022。这也似乎有所帮助。
  4. 我为每个需要数据库连接的调用者添加了“连接测试”。我让应用程序运行"SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;"来确保连接是打开的。如果没有,我会执行第2项中的第二次更改逻辑。

We seem to see two errors at startup. The first is significantly more common.

我们似乎在启动时看到了两个错误。第一个错误明显更常见。

Connection Timeout Expired. The timeout period elapsed while attempting to consume the pre-login handshake acknowledgement. This could be because the pre-login handshake failed or the server was unable to respond back in time. The duration spent while attempting to connect to this server was - [Pre-Login] initialization=12085; handshake=0;
at System.Data.ProviderBase.DbConnectionpool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultileObjectTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)

连接超时已过期。在尝试消耗预登录握手确认时,超时时间已过。这可能是因为预登录握手失败或服务器无法及时回应。尝试连接到该服务器时所花费的时间为 - [Pre-Login] initialization=12085; handshake=0;
在System.Data.ProviderBase.DbConnectionpool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultileObjectTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)处

A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error 50 - Local Database Runtime error occurred. Error occurred during LocalDB instance startup: SQL Server process failed or start.)
at System.Data.ProviderBase.DbConnectionpool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultileObjectTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)

在建立与SQL Server的连接时发生了与网络或特定实例相关的错误。找不到服务器或无法访问。请验证实例名称是否正确,以及SQL Server是否配置为允许远程连接。 (提供程序:SQL Network Interfaces,错误50 - 本地数据库运行时发生错误。在LocalDB实例启动期间发生错误:SQL Server进程失败或启动。)
在System.Data.ProviderBase.DbConnectionpool.TryGetConnection(DbConnection

英文:

We have a WPF .NET desktop application that uses SQL Server Local DB as its database. Our application is installed on hundreds of computers and has been released for about six years now.

We started running into multiple issues this Spring with database stability. Everything seemed to be fine last year. Admittedly, 80% of our app's usage is in the Spring. So, we could have been running into issues in the Fall of 2022 and not been aware of it. The only thing that really changed in our software was that we moved the app away from ADO and over to explicitly concatenated SQL statements. We've been seeing these issues on both Windows 10 and Windows 11 machines. The app could have been running for hours or even days without issue before running into a brick wall of database instability. Alternatively, sometimes the app won't even start because it can't connect to the database without a computer restart. Although, some installs of the app seem to have more issues than others. Some installs of the app have never encountered a single issue. It's completely random.

I've tried out various things to address the issues:

  1. The app was creating a new connection every time it needed to communicate. I switched this over to a static connection for reads and instance connections for transactions. This seemed to alleviate most of the issues, but not all.
  2. When a statement failed, I added logic to kill the SQL Server instance itself and manually recreate it. Then retry the statement. This 'second chance' execution fixed more issues, but the app logs can sometimes show an excessive amount of SQL Server instance bounces.
  3. We had been installing SQL Server 2016, but we're starting to upgrade users to SQL Server 2022. This seems to help to.
  4. I added a 'connection test' to every caller needing a database connection. I have the app run "SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;" to make sure that the connection is open. If not, I go into the second change logic from item 2.

We seem to see two errors at startup. The first is significantly more common.

Connection Timeout Expired. The timeout period elapsed while attempting to consume the pre-login handshake acknowledgement. This could be because the pre-login handshake failed or the server was unable to respond back in time. The duration spent while attempting to connect to this server was - [Pre-Login] initialization=12085; handshake=0;
  at System.Data.ProviderBase.DbConnectionpool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultileObjectTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)

A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error 50 - Local Database Runtime error occurred. Error occurred during LocalDB instance startup: SQL Server process failed or start.)
  at System.Data.ProviderBase.DbConnectionpool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultileObjectTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)

During runtime, we see these a lot. The first two are by far the most common, but the others pop up from time to time as well.

The wait operation timed out.

Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.

A transport-level error has occurred when receiving results from the server. (provider: Session Provider, error: 19 - Physical connection is not usable)
   at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)

ExecuteNonQuery requires an open and available Connection. The connection's current state is closed.
   at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)

ExecuteReader requires an open and available Connection. The connection's current state is open.
   at System.Data.SqlClient.SqlConnection.GetOpenConnection(String method)

The ConnectionString property has not been initialized.
   at System.Data.SqlClient.SqlConnection.PermissionDemand()

Thoughts?

答案1

得分: 1

你的更改只会让情况变得更糟,只是在问题上贴补。

  • 当需要时,你应该创建一个新连接,然后使用 using 进行处理,而不是使用静态连接对象。你应该依赖连接池来处理它。
  • 杀死实例可能是错误的做法,它可能只是卡在执行时间很长的查询上。你应该调查相关的查询计划并尝试改进它。
  • 版本本身通常不会产生如此大的差异,但你可能会得到一个不会卡住的不同查询计划。
  • READ UNCOMMITTED 是一个非常糟糕的主意,除非你了解不正确数据的影响,不要这样做。

你应该检查的事项:

  • 默认情况下,LocalDB 不会一直运行。尝试在应用程序启动时手动启动它,然后等待至少 5 秒,使其准备就绪。

  • 或者可以使用用户登录时的批处理文件设置它自动启动。

  • 设置数据库以不自动关闭

    ALTER DATABASE YourDb SET AUTO_CLOSE OFF;
    
  • 通常情况下,LocalDB 被设置为自动关闭。将实例设置为仅在很长一段时间后自动关闭,参见 https://stackoverflow.com/questions/14153509/how-to-prevent-sql-server-localdb-auto-shutdown

    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    GO
    
    EXEC sp_configure 'user instance timeout', 65535;
    RECONFIGURE;
    
  • 考虑使用完整安装的 SQL Server Express,以避免上述大部分问题。

  • 看起来你还需要设置最大服务器内存,请参阅文档

英文:

Your changes are just making things worse by papering over the cracks.

  • You are supposed to make a new connection when needed, and then dispose with a using, not have a static connection object. You should rely on connection pooling to handle it.
  • Killing the instance was probably the wrong thing, it's probably just stuck on a query taking a long time. You should investigate the relevant query plan and try improve it
  • The version by itself wouldn't usually make such a difference, but you may get a different query plan which doesn't get stuck going slow.
  • READ UNCOMMITTED is a really bad idea, don't do it unless you understand the implications of incorrect data.

Things you should check:

  • LocalDB does not run the whole time by default. Try starting it manually at application startup, then wait at least 5 seconds for it to be ready.

  • Alternatively set it to autostart using a batch file on user login.

  • Set the database so it doesn't AUTOCLOSE

    ALTER DATABASE YourDb SET AUTO_CLOSE OFF;
    
  • LocalDB is normally set to auto-shutdown. Set the instance to only auto-shutdown after a long time, see also https://stackoverflow.com/questions/14153509/how-to-prevent-sql-server-localdb-auto-shutdown

    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    GO
    
    EXEC sp_configure 'user instance timeout', 65535;
    RECONFIGURE;
    
  • Consider using a full installation of SQL Server Express instead, to avoid most of the above problems.

  • It looks like you also need to set the max server memory, see the documentation.

huangapple
  • 本文由 发表于 2023年5月17日 22:24:42
  • 转载请务必保留本文链接:https://go.coder-hub.com/76273178.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定