Separate DbContext instances performing different operations for the same record sequentially not working as expected

Question

The project is a .NET 7 Web API using Entity Framework Core and dependency injection.

I have two DbContext instances. One checks whether the entity (record) already exists; if it doesn't, it calls a method in another class that creates the record and returns its Id. It then looks the record up again by that Id in order to update it.

I noticed that if the DbContext instance that creates the record isn't disposed correctly, then when I update properties on the same record through the second instance, the update doesn't actually happen. It seems to be updating the other instance again (I think).

Because DbContext has concurrency constraints, I deliberately created two separate instances to get this separation of concerns.

First class, with a DbContext that is injected via dependency injection:

if (study is null)
{
    var logMessage = $"Unable to find {StudyInstanceUid} in the database. Creating a new study and marking it as an exception.";
    _logger.LogWarning(logMessage);

    var studyId = await _cs.CreateStudy(myStudy);
    study = await (from s in _cdb.Studies where s.Id == studyId select s).FirstAsync();
}

Service class that has the create methods:

public async Task<int> CreateStudy(studyNotification record)
{
    var _optionsBuilder = new DbContextOptionsBuilder<MyDbContext>();
    _optionsBuilder.UseSqlServer(_config.GetConnectionString("MyDb"));
    using var _cdb = new MyDbContext(_optionsBuilder.Options);

    var study = new Study()
    {
        // ...
    };

    _cdb.Studies.Add(study);
    await _cdb.SaveChangesAsync();
    return study.Id;
}

If I modify the code above to:

await _cdb.SaveChangesAsync();
int id = study.Id;
return id;

it works as expected. I do not understand the inner workings of Entity Framework Core, but I would have thought that two different instances would not interfere with each other. I would like to understand why this issue occurs.

Answer 1

Score: 1

There should be no difference between:

await _cdb.SaveChangesAsync();
return study.Id;

and

await _cdb.SaveChangesAsync();
int id = study.Id;
return id;

The differences you see between entities loaded by different DbContext instances, and in the timing of events relative to one another, usually come down to tracked instances.

If a DbContext instance happens to have loaded a Study (say Id = 5), either directly or indirectly as the result of another data read, then by default that entity instance is tracked (cached) by that DbContext instance. If a second, new DbContext instance loads Study #5, it reads the record from the database. Say that second DbContext modifies the record and saves it to the database. If the first DbContext then tries to read it again using something like:

var study = context.Studies.Single(x => x.StudyId == studyId);

you might expect to get the updated Study record from the database. With a profiler attached you will even see EF execute a query against the database, but what you get back is the cached, tracked instance that the first DbContext already holds, with the values it had before the second DbContext made its changes.

The simplest way to ensure that you get the current database state when fetching an entity is to tell EF explicitly not to track the entity. This means EF won't add it to the tracking cache, and more importantly, it won't read it from the tracking cache either.

var study = context.Studies.AsNoTracking().Single(x => x.StudyId == studyId);
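The stale read described above can be reproduced in a few lines. This is only a minimal sketch under stated assumptions: it uses the EF Core in-memory provider (the Microsoft.EntityFrameworkCore.InMemory package) instead of SQL Server, and the Study, DemoContext and StaleReadDemo types, along with the Status property, are hypothetical stand-ins rather than the asker's real model.

```csharp
using System;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity; StudyId is picked up as the key by convention.
public class Study
{
    public int StudyId { get; set; }
    public string Status { get; set; } = "";
}

// Hypothetical context used only for this demonstration.
public class DemoContext : DbContext
{
    public DemoContext(DbContextOptions<DemoContext> options) : base(options) { }
    public DbSet<Study> Studies => Set<Study>();
}

public static class StaleReadDemo
{
    public static (string Tracked, string Untracked) Run()
    {
        // Assumption: in-memory provider; contexts built from the same options share one store.
        var options = new DbContextOptionsBuilder<DemoContext>()
            .UseInMemoryDatabase(Guid.NewGuid().ToString())
            .Options;

        int studyId;
        using (var seed = new DemoContext(options))
        {
            var created = new Study { Status = "New" };
            seed.Studies.Add(created);
            seed.SaveChanges();
            studyId = created.StudyId;
        }

        // The first context loads the study, so it is now tracked by 'first'.
        using var first = new DemoContext(options);
        first.Studies.Single(x => x.StudyId == studyId);

        // A second context changes the same row and saves it.
        using (var second = new DemoContext(options))
        {
            var other = second.Studies.Single(x => x.StudyId == studyId);
            other.Status = "Updated";
            second.SaveChanges();
        }

        // Tracking query: returns the instance 'first' already tracks, old values included.
        var tracked = first.Studies.Single(x => x.StudyId == studyId);

        // No-tracking query: materializes a new instance from the current store state.
        var untracked = first.Studies.AsNoTracking().Single(x => x.StudyId == studyId);

        return (tracked.Status, untracked.Status);
    }
}
```

With this setup, the tracking query should still report the old status while the AsNoTracking() query reports the saved one. The same identity-resolution behaviour applies to relational providers, although that is not exercised here.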

Whenever you work with data that can be modified concurrently, whether by other threads in the application or by external processes or users, it is important either to use fresh DbContext instances for operations or to use non-tracking queries, so that each query treats the current database state as the source of truth. This can be done with AsNoTracking() as above, or by basing read operations on projections using Select or ProjectTo (AutoMapper). Tracking queries should be reserved solely for situations where you want to update data, and should be completed as quickly as possible.
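As an illustration of the projection option, the sketch below reads a study through Select into a small read model. The names Study, StudySummary, ProjectionContext and ProjectionDemo, and the Status property, are hypothetical and chosen only for this example. Because the projected type is not an entity, EF Core does not track it, so the values come from the query result rather than from a previously tracked instance.

```csharp
using System;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity and read model for illustration.
public class Study
{
    public int StudyId { get; set; }
    public string Status { get; set; } = "";
}

public record StudySummary(int StudyId, string Status);

public class ProjectionContext : DbContext
{
    public ProjectionContext(DbContextOptions<ProjectionContext> options) : base(options) { }
    public DbSet<Study> Studies => Set<Study>();
}

public static class ProjectionDemo
{
    // Projects straight into a non-entity type; nothing is added to or read from the change tracker.
    public static StudySummary ReadSummary(ProjectionContext context, int studyId) =>
        context.Studies
            .Where(x => x.StudyId == studyId)
            .Select(x => new StudySummary(x.StudyId, x.Status))
            .Single();
}
```

A read model like this suits display and decision logic. When the entity itself has to be modified, a tracking query is still needed.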

Edit: If you do want to update the entity and use change tracking, but ensure that the entity in that case is up to date, then you can check the local cache and reload the entity if found, otherwise ensure that the retrieved entity is coming from the database. If concurrent editing is a significant factor in your application(s) then I would also suggest implementing a concurrency marker in the relevant tables such as a Timestamp or RowVersion to inspect and help guard against stale updates.

// Check the local cache; if found, issue a reload to ensure it is up to date:
var study = _cdb.Studies.Local.FirstOrDefault(x => x.StudyId == studyId);
if (study != null)
    _cdb.Entry(study).Reload();
else
    study = _cdb.Studies.Single(x => x.StudyId == studyId);

The main issue with this approach is that it only works well as long as you want just the Study entity and do not rely on related data being present, because we don't know whether the tracked Study instance has all, or any, of its related data loaded. For instance, if a Study has a collection of references, we would normally do something like:

else
    study = _cdb.Studies.Include(x => x.References).Single(x => x.StudyId == studyId);

... to ensure the references are loaded. The problem is that if we do find a locally cached Study instance, we cannot safely assume that the code that loaded it also eager-loaded the references. There may be References in the collection, but only the ones that the DbContext also happened to be tracking. In this case it is safer to detach any tracked entity and reload it:

var study = _cdb.Studies.Local.FirstOrDefault(x => x.StudyId == studyId);
if (study != null)
    _cdb.Entry(study).State = EntityState.Detached;

study = _cdb.Studies
    .Include(x => x.References)
    .Single(x => x.StudyId == studyId);

This checks the cache and, if the study is found there, removes it from the cache. The next statement then loads the study and its related data from the database.
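The concurrency marker suggested in the edit above could be sketched roughly as follows. The names VersionedStudy, VersionedContext and ConcurrencySafeSave, and the Status property, are hypothetical. The sketch assumes a provider that maintains the token itself, such as SQL Server, where a [Timestamp] byte[] property maps to a rowversion column; the in-memory provider does not generate such values. Reloading the conflicting entries and reporting failure is only one possible policy.

```csharp
using System;
using System.ComponentModel.DataAnnotations;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity carrying a concurrency token.
public class VersionedStudy
{
    [Key]
    public int StudyId { get; set; }
    public string Status { get; set; } = "";

    // Marks the property as a concurrency token generated by the database on add and update.
    [Timestamp]
    public byte[] RowVersion { get; set; } = Array.Empty<byte>();
}

public class VersionedContext : DbContext
{
    public VersionedContext(DbContextOptions<VersionedContext> options) : base(options) { }
    public DbSet<VersionedStudy> Studies => Set<VersionedStudy>();
}

public static class ConcurrencySafeSave
{
    // Returns true when the save succeeded, false when a concurrency conflict was detected.
    public static async Task<bool> TrySaveAsync(DbContext context)
    {
        try
        {
            await context.SaveChangesAsync();
            return true;
        }
        catch (DbUpdateConcurrencyException ex)
        {
            // Refresh the conflicting entries from the database so the caller can re-apply or abandon its changes.
            foreach (var entry in ex.Entries)
                await entry.ReloadAsync();
            return false;
        }
    }
}
```

A caller would use TrySaveAsync in place of SaveChangesAsync and decide what to do when it returns false, for example re-applying the change to the reloaded entity and saving again.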

Posted by huangapple on 2023-04-11 06:41:19
Source: https://go.coder-hub.com/75981254.html