Eclipselink - detached entities memory leak

Question

Setup

We are currently using WildFly with EclipseLink as the JPA implementation in a Jakarta EE application. The application itself is a RESTful web server with REST, Service and DAO layers. The DAO is the only layer that uses the EntityManager. We always detach entities, for various reasons:

  • To prevent EclipseLink from automatically checking state and flushing changes to the database
  • To prevent EclipseLink from reusing the same object on multiple reads
    ...

However, with this approach we have noticed spikes in memory usage that in some cases lead to OutOfMemory errors.

Diagnostics

Using VisualVM we have pinpointed the problem: a great number of entity instances are held in memory.

Test code

This is a sample of the code we are experiencing problems with (a migration of some historic data):

    LinkedList<SomeEntity> entities; // the set of entities to process is loaded here
    while (!entities.isEmpty()) {
        // Iterate in queue fashion so that GC can remove already processed items from memory
        SomeEntity entity = entities.removeFirst();
        if (entity.getItems().isEmpty()) {
            // this call is transactional
            entityService.delete(entity.getId());
        } else if (entity.getItems().stream().anyMatch(item -> item.getQuantity() > 0.0)) {
            // DO SOME CHANGES ON ENTITY
            // this call is transactional
            entityService.update(operation);
        }
        entity = null;
    }
    entities = null;

Observations

  • While profiling memory usage we can see an ever-increasing count of instances of one entity class in memory. It is not the entity being worked with in the test code, but the entity that is most often referenced by other objects. Sometimes part of them are cleared, but the overall number increases over time
  • The number of instances greatly exceeds the number of records in the database
  • This means that every time the object is referenced in a relation, a new instance is created (this is OK)
  • When we created a heap dump and looked at where the objects are referenced from, only EclipseLink internal structures showed up, such as
    relationshipSourceObject in org.eclipse.persistence.internal.indirection.UnitOfWorkQueryValueHolder#90312
    owner in org.eclipse.persistence.internal.descriptors.changetracking.AttributeChangeListener#26713, ...

What we have tried

None of these helped:

  • Setting eclipselink.cache.type.default to WEAK, SOFT or even NONE
  • Manually calling EntityManager.clear at the end of the while loop

In my understanding, WEAK should be enough to keep EclipseLink from holding references for too long and blocking GC. But the instances are still referenced from somewhere, and since those references are reachable from GC roots they are never cleared. Can anyone explain this behavior or point me in a direction where to look?
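As a plain-Java illustration of the point above (not EclipseLink code): a weakly referenced cache entry becomes collectable only once nothing else holds the object strongly. The class and variable names below are hypothetical, and `otherHolder` merely stands in for whatever internal structure still points at the entity.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; it only uses java.lang.ref and collections.
final class WeakCacheDemo {

    // Returns true if the weakly cached object is still present after a GC
    // request while another structure keeps a strong reference to it.
    static boolean survivesGcWhileStronglyHeld() {
        Object entity = new Object();

        // Stand-in for a cache that holds its entries weakly.
        WeakReference<Object> cacheEntry = new WeakReference<>(entity);

        // Stand-in for any other structure that still references the entity.
        List<Object> otherHolder = new ArrayList<>();
        otherHolder.add(entity);

        entity = null; // drop the local reference
        System.gc();   // only a request, but a strongly reachable object is never cleared

        // otherHolder is used here, so it (and the entity) stayed reachable.
        return cacheEntry.get() != null && otherHolder.size() == 1;
    }
}
```

So switching the cache type to WEAK changes only how the cache itself holds entities; any other strong path from a GC root keeps them alive.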

EDITS

Addressing the comment and Chris's answer, here is more information about how we use the EM and transactions.

We detach using the EntityManager.detach method, and references (@OneToMany, @ManyToMany, etc.) have Cascade.DETACH applied. The necessary lazy-loaded references are loaded prior to the detach.

I agree with the part about re-fetching entities. I would not mind having multiple instances of the same entity in memory for some time. My question is why they are not garbage collected.

The list of entities in the sample code is loaded in one transaction; each subsequent database UPDATE or DELETE (which also fetches some bits into memory, creating more instances) runs in another transaction, one per entity. I would expect most of the heap to be used during the initial call, and then to be slowly cleared or to remain roughly the same.

About using EntityManager

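We are using WildFly as the Jakarta EE container. By default it ships with Hibernate as the JPA provider, but we have added EclipseLink as a module and configured the provider in persistence.xml.

According to the documentation (https://access.redhat.com/documentation/en-us/red_hat_jboss_enterprise_application_platform/7.1/html/development_guide/java_persistence_api_jpa#container_managed_entitymanager), a container-managed EntityManager creates instances as needed.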

Answer 1

Score: 1

Are you caching entities? Clear is not enough to let you cache effectively, and if that is what you are trying to do, it is likely related to your current issue. Everything loaded from an EntityManager has a reference to that EntityManager, so I would guess that you are reading in a large list of partially fetched entities and caching them, then using EntityManager.clear() to try to detach them.

Those entities are then no longer 'managed' but still reference the EntityManager. As soon as you fetch something, such as with the entity.getItems() call you've shown in the code, and assuming this is a standard OneToMany with a back pointer that is lazily loaded by default, all 'items' are fetched into memory. As they have a back reference and 'this' entity is no longer referenced by the EntityManager, the Item then has to refetch the entity. So you now have two instances of the same Entity in memory: Entity1' -> Item1 -> Entity1.

This can easily build up with a more complex object graph and repeated clear calls.
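The duplication described above can be imitated with a small toy model. This is only a sketch in plain Java: `ToyPersistenceContext`, `ToyEntity` and `ToyItem` are hypothetical classes that mimic an identity map and a lazy back pointer. They are not EclipseLink internals and do not claim to reproduce its exact behavior.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy entity: keeps a link to the context it was loaded from (hypothetical).
final class ToyEntity {
    final long id;
    private final ToyPersistenceContext context;
    private List<ToyItem> items; // loaded lazily on first access

    ToyEntity(long id, ToyPersistenceContext context) {
        this.id = id;
        this.context = context;
    }

    List<ToyItem> getItems() {
        if (items == null) {
            items = context.loadItems(id); // lazy load goes back to the context
        }
        return items;
    }
}

// Toy item with a back pointer to its owning entity.
final class ToyItem {
    final ToyEntity owner;

    ToyItem(ToyEntity owner) {
        this.owner = owner;
    }
}

// Toy context: one instance per id until clear() is called.
final class ToyPersistenceContext {
    private final Map<Long, ToyEntity> identityMap = new HashMap<>();

    ToyEntity find(long id) {
        return identityMap.computeIfAbsent(id, key -> new ToyEntity(key, this));
    }

    List<ToyItem> loadItems(long ownerId) {
        List<ToyItem> loaded = new ArrayList<>();
        // The back pointer is resolved through the identity map.
        loaded.add(new ToyItem(find(ownerId)));
        return loaded;
    }

    void clear() {
        identityMap.clear();
    }
}

final class DuplicateInstanceDemo {
    // After clear(), the lazy load creates a second instance for the same id.
    static boolean backPointerIsSecondInstance() {
        ToyPersistenceContext context = new ToyPersistenceContext();
        ToyEntity first = context.find(1L);
        context.clear();
        ToyItem item = first.getItems().get(0);
        return item.owner != first && item.owner.id == first.id;
    }

    // Without clear(), the back pointer resolves to the original instance.
    static boolean backPointerIsSameInstanceWithoutClear() {
        ToyPersistenceContext context = new ToyPersistenceContext();
        ToyEntity first = context.find(1L);
        ToyItem item = first.getItems().get(0);
        return item.owner == first;
    }
}
```

In this model the first instance plays the role of Entity1' and the refetched owner plays Entity1; every clear followed by a lazy load adds another copy.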

This cannot be fully solved, but the overhead can be reduced by narrowing the scope of what you do in a single EntityManager, so that it is reused for identity purposes only within that object graph and can be garbage collected once the objects it was used to read are garbage collected as well.
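One way to narrow the scope, sketched here under assumptions: process identifiers in small batches and let each batch be loaded by a call that uses its own short-lived EntityManager. `ScopedBatchProcessor`, `loadBatch` and `handle` are hypothetical names; `loadBatch` stands in for a DAO method that is not shown in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical helper; it contains no JPA calls itself.
final class ScopedBatchProcessor {

    // Splits ids into consecutive chunks of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    // Loads and handles one batch at a time; returns the number of handled entities.
    static <I, E> int processInBatches(List<I> ids, int batchSize,
                                       Function<List<I>, List<E>> loadBatch,
                                       Consumer<E> handle) {
        int processed = 0;
        for (List<I> batch : partition(ids, batchSize)) {
            // Assumption: loadBatch runs in its own transaction / EntityManager scope.
            List<E> entities = loadBatch.apply(batch);
            for (E entity : entities) {
                handle.accept(entity);
                processed++;
            }
            // 'entities' goes out of scope here, so nothing in this loop
            // keeps the batch reachable once the iteration ends.
        }
        return processed;
    }
}
```

With this shape only identifiers are held for the whole run, and the entities of a batch are referenced by the loop for a single iteration. Whether they are actually collected still depends on what the persistence provider retains.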
