Spring Data loses some entries on inserting
Question
I am trying to save a large list of entities (77832 elements) to PostgreSQL, but after calling the "saveAll" method there are only 49207 rows in the table (the table was empty before the insert). According to the debugger, the list size does not change. There are no errors in the application or database logs during the save.
Here are the entity classes:
@Getter
@Setter
@Entity
@Table(name = "faction")
@NoArgsConstructor
@AllArgsConstructor
public class Faction {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    @Column(name = "name", unique = true, nullable = false)
    private String name;
    @ManyToOne(cascade = CascadeType.MERGE, optional = false)
    @JoinColumn(name = "allegiance_id", nullable = false)
    private Allegiance allegiance;
    @ManyToOne(cascade = CascadeType.MERGE, optional = false)
    @JoinColumn(name = "government_id", nullable = false)
    private Government government;
    @Column(name = "is_player_faction", nullable = false)
    private Boolean isPlayerFaction;
}

@Entity
@Table(name = "allegiance")
@Getter
@Setter
@ToString
@NoArgsConstructor
@AllArgsConstructor
public class Allegiance {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    @Column(name = "name", unique = true, nullable = false)
    private String name;
}
And the method that implements the save logic:
public List<FactionDto> saveFactions(List<FactionDto> factionDtos) {
    var factions = factionDtos.stream()
            .map(factionMapper::toEntity)
            .toList();
    var governments = factionDtos.stream()
            .map(FactionDto::getGovernment)
            .collect(Collectors.toSet())
            .stream()
            .map(item -> new Government(null, item.getName()))
            .collect(Collectors.toSet());
    Map<String, Government> governmentMap = governmentRepository
            .saveAll(governments)
            .stream()
            .collect(Collectors.toMap(Government::getName, item -> item));
    var allegiances = factionDtos.stream()
            .map(FactionDto::getAllegiance)
            .collect(Collectors.toSet())
            .stream()
            .map(item -> new Allegiance(null, item.getName()))
            .collect(Collectors.toSet());
    Map<String, Allegiance> allegianceMap = allegianceRepository
            .saveAll(allegiances)
            .stream()
            .collect(Collectors.toMap(Allegiance::getName, allegiance -> allegiance));
    factions = factions.stream()
            .peek(faction -> {
                var allegiance = allegianceMap.get(faction.getAllegiance().getName());
                faction.setAllegiance(allegiance);
                var government = governmentMap.get(faction.getGovernment().getName());
                faction.setGovernment(government);
            })
            .collect(Collectors.toList());
    return factionRepository.saveAll(factions).stream()
            .map(factionMapper::toDto)
            .toList();
}
The debugger shows exactly 77832 elements in the collection passed to saveAll, and there are no duplicates.
In my opinion, either the same number of rows should be created, or at least an error should be reported if there are conflicts.
Answer 1
Score: 2
The faction.name values contain duplicates. The name column is unique so only the unique names remain. Test it by doing the following immediately before the save.
// name is a private field, so go through the Lombok-generated getter
var uniqueFactionByName = factions.stream()
        .map(Faction::getName)
        .collect(Collectors.toSet());
System.out.println(uniqueFactionByName.size());
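If the set turns out smaller than the list, it can help to see which names repeat and how often. The following is a minimal sketch in plain Java, assuming the names have already been extracted into a List<String> (for example with factions.stream().map(Faction::getName).toList()). The class name DuplicateNameCheck and the sample data are hypothetical and not part of the original code.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

final class DuplicateNameCheck {

    // Returns every name that occurs more than once, mapped to its number of occurrences.
    // Assumes the list contains no null names (groupingBy rejects null keys).
    static Map<String, Long> duplicateNames(List<String> names) {
        return names.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                .entrySet().stream()
                .filter(entry -> entry.getValue() > 1)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the faction names.
        List<String> names = List.of("Alpha", "Beta", "Alpha", "Gamma", "Beta", "Alpha");
        System.out.println(duplicateNames(names));
    }
}
```

An empty result would mean the names are unique as plain strings, in which case the missing rows need another explanation.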
Answer 2
Score: 1
> changing bulk save to factions.forEach(factionRepository::save); made saving longer but there are at least all entries present. However I still have no idea why bulk save works this way
This works because your business method is not annotated with @Transactional and thus you now save each Faction object in a separate transaction. See the code of SimpleJpaRepository, which encapsulates the basic functionality of Spring Data's JpaRepository:
// one transaction for all items
@Transactional
public <S extends T> List<S> saveAll(Iterable<S> entities) {
    Assert.notNull(entities, "Entities must not be null!");
    List<S> result = new ArrayList<>();
    for (S entity : entities) {
        result.add(this.save(entity));
    }
    return result;
}

// when called directly, without an outer transaction: a separate transaction for each item
@Transactional
public <S extends T> S save(S entity) {
    Assert.notNull(entity, "Entity must not be null.");
    if (this.entityInformation.isNew(entity)) {
        this.em.persist(entity);
        return entity;
    } else {
        return this.em.merge(entity);
    }
}
As you can see, JpaRepository.saveAll() just calls JpaRepository.save() in a loop, so the logic is the same; the only difference is having 1 transaction instead of n.
Now, why are some of the items lost? I suspect the cause is @GeneratedValue(strategy = GenerationType.IDENTITY). PostgreSQL itself supports identity columns, but Hibernate cannot use JDBC batch inserts with this strategy, see https://vladmihalcea.com/hibernate-identity-sequence-and-table-sequence-generator/.
With GenerationType.IDENTITY you insert a row into your table without an ID and the DBMS assigns it automatically, so you have no control over it. My guess, which I have not verified, is that two rows treated as duplicates within one and the same transaction end up as a single row, and that this is what happens in this case.
I suggest using a sequence for primary key generation, which should solve your problem. If a sequence cannot be used in your case, try splitting the entire collection of Faction objects to be saved into smaller chunks, e.g. of size 500, something like:
Lists.partition(factions, 500)
        .stream()
        .map(factionRepository::saveAll)
        .flatMap(List::stream)
        .map(factionMapper::toDto)
        .toList();
You'll still have n/500 transactions instead of 1, but this could help. The correct solution, however, is to use a sequence.
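Lists.partition in the snippet above comes from Guava. If that dependency is not available, a plain-Java equivalent could be sketched as follows. The class name Chunks and the method name partition are hypothetical helpers introduced for illustration. Unlike Guava, which returns views of the original list, this sketch copies each chunk.

```java
import java.util.ArrayList;
import java.util.List;

final class Chunks {

    // Splits items into consecutive chunks of at most chunkSize elements.
    // The last chunk may be smaller; an empty input yields an empty result.
    static <T> List<List<T>> partition(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            // Copy the sublist so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }
}
```

For the 77832 factions in the question and a chunk size of 500, this gives 156 chunks, the last one holding 332 elements, so saveAll would be called 156 times.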
Answer 3
Score: 1
I guess you didn't implement FactionRepository.saveAll yourself but rely on the Spring Data default? Spring Data doesn't know whether save should map to persist or merge, so it tries to work out what to do by looking at the entity, e.g. if the entity has a PK set, it will assume merge. I don't know if the version you are using has other strategies to determine this, but I think that could be the source of your problem.
Are you certain that the objects do not contain a value for the id attribute, i.e. factions.stream().noneMatch(f -> f.getId() != null)?
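To go one step beyond a yes/no answer, the entities can be counted by whether an id is already set. This is a minimal sketch under the assumption described above, namely that a set id leads to merge and a null id to persist. FactionStub is a hypothetical stand-in for the real Faction entity, and IdCheck is a made-up class name.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class IdCheck {

    // Hypothetical stand-in for the Faction entity; only the fields needed for the check.
    record FactionStub(Long id, String name) {}

    // Counts entities by id presence: key true = id already set, key false = id is null.
    static Map<Boolean, Long> countByIdPresence(List<FactionStub> factions) {
        return factions.stream()
                .collect(Collectors.partitioningBy(f -> f.id() != null, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Hypothetical sample data.
        List<FactionStub> factions = List.of(
                new FactionStub(null, "Alpha"),
                new FactionStub(7L, "Beta"),
                new FactionStub(null, "Gamma"));
        Map<Boolean, Long> counts = countByIdPresence(factions);
        System.out.println("with id: " + counts.get(true) + ", without id: " + counts.get(false));
    }
}
```

A non-zero count for the "with id" group right before saveAll would support the merge theory; a count of zero would rule it out.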
With all that DTO transformation code, though, it's hard to see what's going on. I also guess that you posted just an excerpt and not the actual code? So there may be further mismatches, which makes it hard for us to help you, as the bug could lie somewhere in the code you are not showing.
Either way, I think this is a perfect use case for Blaze-Persistence Entity Views.
I created the library to allow easy mapping between JPA models and custom interface or abstract class defined models, something like Spring Data Projections on steroids. The idea is that you define your target structure (domain model) the way you like and map attributes (getters) via JPQL expressions to the entity model.
A DTO model for your use case could look like the following with Blaze-Persistence Entity-Views:
@EntityView(Faction.class)
@CreatableEntityView
public interface FactionDto {
    @IdMapping
    Long getId();
    String getName();
    void setName(String name);
    Boolean getIsPlayerFaction();
    void setIsPlayerFaction(Boolean isPlayerFaction);
    GovernmentDto getGovernment();
    void setGovernment(GovernmentDto government);
    AllegianceDto getAllegiance();
    void setAllegiance(AllegianceDto allegiance);

    @EntityView(Allegiance.class)
    @CreatableEntityView
    interface AllegianceDto {
        @IdMapping
        Long getId();
        String getName();
        void setName(String name);
    }

    @EntityView(Government.class)
    @CreatableEntityView
    interface GovernmentDto {
        @IdMapping
        Long getId();
        String getName();
        void setName(String name);
    }
}
Querying is a matter of applying the entity view to a query, the simplest being just a query by id.
FactionDto a = entityViewManager.find(entityManager, FactionDto.class, id);
The Spring Data integration allows you to use it almost like Spring Data Projections: https://persistence.blazebit.com/documentation/entity-view/manual/en_US/index.html#spring-data-features
Page<FactionDto> findAll(Pageable pageable);
The best part is, it will only fetch the state that is actually necessary!
Saving is also supported and as simple as defining and using this method in a repository:
void saveAll(List<FactionDto> dtos);