英文:
Should I include distinct in this JPQL query?
问题
背景
我在Stack Overflow(SO)上看到了关于在JPQL JOIN FETCH
查询中使用distinct
关键字以及PASS_DISTINCT_THROUGH
查询提示的多个答案和问题,以及在许多流行博客中的讨论。
例如,看看这两个问题:
以及这些博客文章:
- The best way to use the JPQL DISTINCT keyword with JPA and Hibernate
- The DISTINCT pass-through Hibernate Query Hint
- Hibernate Tips: How To Apply DISTINCT to Your JPQL But Not Your SQL Query
我的疑问
我现在的问题是,我不能完全理解在JPQL查询中何时必须包含distinct
关键字。更具体地说,我想知道这是否取决于使用哪种方法来执行查询(getResultList
还是getSingleResult
)。
以下是一个示例,以阐明我的意思。
从现在开始,我所写的一切都在Ubuntu Linux 18.04上进行测试,使用Java 8、Hibernate 5.4.13和一个内存中的H2数据库(版本为1.4.200)。
假设我有一个Department
实体,它与一个DepartmentDirector
实体具有延迟的双向一对多关系:
// Department.java
@Entity
public class Department {
// ...
private Set<DepartmentDirector> directors;
@OneToMany(mappedBy = "department", fetch = FetchType.LAZY)
public Set<DepartmentDirector> getDirectors() {
return directors;
}
// ...
}
// DepartmentDirector.java
@Entity
public class DepartmentDirector {
// ...
private Department department;
@ManyToOne
@JoinColumn(name = "department_fk")
public Department getDepartment() {
return department;
}
// ...
}
假设我的数据库当前包含一个部门(department1
)和两名与之关联的主管。
现在,我想通过其UUID(主键)检索部门,以及所有的主管。可以使用以下带有JOIN FETCH
的JPQL查询来完成:
String query = "select department from Department department left join fetch "
+ "department.directors where department.uuid = :uuid";
由于上述查询在子集合上使用了join fetch
,我预期在发出查询时会返回两个重复的部门:然而,仅当使用getResultList
方法而不是使用getSingleResult
方法时才会发生这种情况。这在某种程度上是合理的,但我发现Hibernate的getSingleResult
实现在幕后使用了getResultList
,因此我期望会抛出NonUniqueResultException
异常。
我也简要地查阅了JPA 2.2规范,但没有在处理两种方法之间的重复项时提到区别,并且关于此问题的每个代码示例都使用了getResultList
方法。
结论
在我的示例中,我发现使用getSingleResult
执行的JOIN FETCH
查询不会出现在我在背景部分链接的资源中解释的重复实体问题。
如果上述说法是正确的,那么同样的JOIN FETCH
查询在使用getResultList
执行时需要distinct
,但在使用getSingleResult
执行时不需要。
我需要有人解释一下这是否是预期行为,或者我是否误解了什么。
附录
两个查询的结果:
-
使用
getResultList
方法运行的查询。我得到了两个预期的重复部门(这仅是为了测试查询的行为,应该使用getSingleResult
):List<Department> resultList = entityManager.createQuery(query, Department.class) .setParameter("uuid", department1.getUuid()) .getResultList(); assertThat(resultList).containsExactly(department1, department1); // 通过
-
使用
getSingleResult
方法运行的查询。我期望检索到相同的重复部门,从而抛出NonUniqueResultException
异常。然而,实际上检索到一个部门,并且一切工作正常:Department singleResult = entityManager.createQuery(query, Department.class) .setParameter("uuid", department1.getUuid()) .getSingleResult(); assertThat(singleResult).isEqualTo(department1); // 通过
英文:
Background
I have seen multiple answers and questions here on SO and in many popular blogs about the necessity of the distinct
keyword in JPQL JOIN FETCH
queries and about the PASS_DISTINCT_THROUGH
query hint.
For example, see these two questions
and these blog posts
- The best way to use the JPQL DISTINCT keyword with JPA and Hibernate
- The DISTINCT pass-through Hibernate Query Hint
- Hibernate Tips: How To Apply DISTINCT to Your JPQL But Not Your SQL Query
What I am missing
Now my problem is that I cannot fully understand when exactly the distinct
keyword must be included in the JPQL query. More specifically, if it depends on which method is used to execute the query (getResultList
or getSingleResult
).
The following is an example to clarify what I mean.
Everything I am writing from now on was tested on Ubuntu Linux 18.04, with Java 8, Hibernate 5.4.13 and an in-memory H2 database (version 1.4.200).
Suppose I have a Department
entity which has a lazy bidirectional one to many relationship with a DepartmentDirector
entity:
// Department.java
@Entity
public class Department {
// ...
private Set<DepartmentDirector> directors;
@OneToMany(mappedBy = "department", fetch = FetchType.LAZY)
public Set<DepartmentDirector> getDirectors() {
return directors;
}
// ...
}
// DepartmentDirector.java
@Entity
public class DepartmentDirector {
// ...
private Department department
@ManyToOne
@JoinColumn(name = "department_fk")
public Department getDepartment() {
return department;
}
// ...
}
Suppose that my database currently contains one department (department1
) and two directors associated with it.
Now I want to retrieve the department by its uuid (the primary key), along with all its directors. This can be done with the following JOIN FETCH
JPQL query:
String query = "select department from Department department left join fetch "
+ "department.directors where department.uuid = :uuid";
As the preceding query uses a join fetch
with a child collection, I expected it to return two duplicated departments when issued: however this only happens when using the query with the getResultList
method and not when using the getSingleResult
method. This is somehow reasonable, but I have found that the Hibernate implementation of getSingleResult
uses getResultList
behind the curtains so I expected a NonUniqueResultException
to be thrown.
I also briefly went through JPA 2.2 specification but no distinction in treating the duplicates between the two methods is mentioned, and every code sample concerning this issue uses the getResultList
method.
Conclusion
In my example I found out that JOIN FETCH
queries executed with getSingleResult
do not suffer the duplicated entities problem explained in the resources I linked in the section Background.
If the above claim would be correct, it would mean that the same JOIN FETCH
query would need distinct
if executed with getResultList
, but would not need it when executed with getSingleResult
.
I need someone to explain me if this is expected or if I misunderstood something.
Appendix
Results of the two queries:
-
Query ran with the
getResultList
method. I get two duplicated departments as expected (this was done just to test the behaviour of the query,getSingleResult
should be used instead for this):List<Department> resultList = entityManager.createQuery(query, Department.class) .setParameter("uuid", department1.getUuid()) .getResultList(); assertThat(resultList).containsExactly(department1, department1); // passes
-
Query ran with the
getSingleResult
method. I would expect the same duplicated departments to be retrieved, and thus aNonUniqueResultException
to be thrown. Instead, a single department is retrieved and everything works nice:Department singleResult = entityManager.createQuery(query, Department.class) .setParameter("uuid", department1.getUuid()) .getSingleResult(); assertThat(singleResult).isEqualTo(department1); // passes
答案1
得分: 3
有趣的问题。
首先,我要指出getSingleResult()
是针对那些由于其性质总是返回单个结果的查询而设计的(意思是:大多数聚合查询,如SELECT SUM(e.id) FROM Entity e
)。基于某些业务特定规则,您认为应该返回单个结果的查询实际上并不符合条件。
话虽如此,JPA规范规定,当查询返回多于一个结果时,getSingleResult()
应该抛出NonUniqueResultException
异常:
当调用
Query.getSingleResult
或TypedQuery.getSingleResult
并且查询返回多个结果时,持久性提供程序会抛出NonUniqueResultException
异常。此异常不会导致当前事务(如果存在)标记为回滚。
然而,看看Hibernate的实现:
@Override
public R getSingleResult() {
try {
final List<R> list = list();
if ( list.size() == 0 ) {
throw new NoResultException( "No entity found for query" );
}
return uniqueElement( list );
}
catch ( HibernateException e ) {
if ( getProducer().getFactory().getSessionFactoryOptions().isJpaBootstrap() ) {
throw getExceptionConverter().convert( e );
}
else {
throw e;
}
}
}
public static <R> R uniqueElement(List<R> list) throws NonUniqueResultException {
int size = list.size();
if ( size == 0 ) {
return null;
}
R first = list.get( 0 );
for ( int i = 1; i < size; i++ ) {
if ( list.get( i ) != first ) {
throw new NonUniqueResultException( list.size() );
}
}
return first;
}
事实证明,Hibernate对“多个结果”的解释似乎是“多个唯一结果”。
实际上,我对所有JPA提供程序进行了测试,结果表明:
- Hibernate确实从
getResultList()
返回重复项,但由于getSingleResult()
的实现方式特殊,不会抛出异常。 - EclipseLink是唯一一个在
getResultList()
中不会出现重复结果问题的提供程序,因此getSingleResult()
也不会抛出异常(对我来说,这种行为是合乎逻辑的,但事实证明,这完全是一种解释方式)。 - OpenJPA和DataNucleus都会从
getResultList()
返回重复结果,并且会从getSingleResult()
抛出异常。
总之,我需要有人向我解释这是否是预期的情况,或者我是否误解了某些内容。
这实际上归结为您如何解释规范。
英文:
Interesting question.
First of all let me point out that getSingleResult()
was meant for queries that due to their nature always return a single result (meaning: mostly aggregate queries like SELECT SUM(e.id) FROM Entity e
). A query that you think, based on some business domain-specific rule, should return a single result, does not really qualify.
That being said, the JPA Spec states that getSingleResult()
should throw NonUniqueResultException
when the query returns more than one result:
> The NonUniqueResultException
is thrown by the persistence provider when Query.getSingleResult
or TypedQuery.getSingleResult
is invoked and there is more than one result from the query. This exception will not cause the current transaction, if one is active, to be marked for rollback.
However, looking at the Hibernate implementation:
@Override
public R getSingleResult() {
try {
final List<R> list = list();
if ( list.size() == 0 ) {
throw new NoResultException( "No entity found for query" );
}
return uniqueElement( list );
}
catch ( HibernateException e ) {
if ( getProducer().getFactory().getSessionFactoryOptions().isJpaBootstrap() ) {
throw getExceptionConverter().convert( e );
}
else {
throw e;
}
}
}
public static <R> R uniqueElement(List<R> list) throws NonUniqueResultException {
int size = list.size();
if ( size == 0 ) {
return null;
}
R first = list.get( 0 );
for ( int i = 1; i < size; i++ ) {
if ( list.get( i ) != first ) {
throw new NonUniqueResultException( list.size() );
}
}
return first;
}
it turns out Hibernate's interpretation of 'more than one result' seems to be 'more than one unique result'.
In fact, I tested your scenario with all JPA providers, and it turns out that:
- Hibernate does indeed return duplicates from
getResultList()
, but does not throw the exception due to the peculiar waygetSingleResult()
is implemented - EclipseLink is the only one that does not suffer from the duplicate result bug in
getResultList()
and consequently,getSingleResult()
does not throw an exception, either (to me, this behaviour is only logical, but as it turns out, it is all a matter of interpretation) - OpenJPA and DataNucleus both return duplicate results from
getResultList()
and throw an exception fromgetSingleResult()
Tl;DR
> I need someone to explain me if this is expected or if I misunderstood something.
It really boils down to how you interpret the specification
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论