Kotlin and Realm: How to insert nested RealmObjects only when they do not exist?
Question
I work with Kotlin and the following dependencies:
id("io.realm.kotlin") version "1.7.0"
implementation("org.jetbrains.kotlinx:kotlinx-coroutines-core:1.6.4")
implementation("io.realm.kotlin:library-base:1.7.0")
General use case:
I download multiple CSV files, convert them to RealmObjects, and then try to save the list of RealmObjects. In this case, RealmObjects with the same PrimaryKey can be saved multiple times: e.g. QuantityRealmObject is used in several RealmObjects within the parent (or root) object. I thought UpdatePolicy.Modified, which does not seem to exist for Kotlin (?), would do exactly that.
As Jay described, upserting is probably the way to go in this case; however, I am not sure whether
- I must implement this manually due to the lack of the modified policy: for each RealmObject with a PrimaryKey inside another RealmObject, check whether it is already managed (Realm.query()); if yes, use the managed object, else insert it
- or this can be achieved in an elegant, generic way
- or my whole approach is faulty and I should go about it in another way
I try to save data to a Realm DB, which works, but I have nested RealmObjects with @PrimaryKey fields. Currently my code only works when I set the UpdatePolicy to ALL, which probably leads to a lot of unnecessary updates (and possibly a bigger file size?) but stores less actual data in the DB than working with EmbeddedRealmObjects would.
EDIT:
My problem is that my Example objects have references to other RealmObjects with PrimaryKeys (e.g. QuantityRealmObject), or references to other RealmObjects which in turn reference QuantityRealmObjects. With UpdatePolicy.ALL I have the luxury that I can just call copyToRealm(exampleObject) and all references are saved correctly. If there is a duplicate primary key for a "nested" quantity object reference, it just updates it with the same values, and the references are still OK. If I upsert manually as you suggest, which of course works, I would have to check lots of "nested" realm object references for each copyToRealm(exampleObject) call:
val exampleObject
//query if exampleObject.field1.quantityRealmObject already exists, if not create it, if yes set this field to the already managed instance
//query if exampleObject.quantityRealmObject already exists, if not create it, if yes set this field to the already managed instance
//query if exampleObject.field2.field3.quantityRealmObject already exists, if not create it, if yes set this field to the already managed instance
// ... do that for lots of references
realm.copyToRealm(exampleObject)
//vs.
realm.copyToRealm(exampleObject, UpdatePolicy.ALL)
I like the error-handling idea; however, I am not sure how I could set the correct references in the Example object when an error occurs.
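import io.realm.kotlin.Realm
import io.realm.kotlin.RealmConfiguration
import io.realm.kotlin.UpdatePolicy

fun createRealm(dbName: String, data: List<DataRealmObject>, schema: String) {
    val config = RealmConfiguration.Builder(
        schema = setOf(
            // a few RealmObject classes
        )
    )
        .compactOnLaunch()
        .build()
    val realm = Realm.open(config)
    realm.writeBlocking {
        // UpdatePolicy.ALL inserts or updates every object in the graph,
        // including referenced objects that have a primary key
        data.forEach { copyToRealm(it, UpdatePolicy.ALL) }
    }
    realm.close()
}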
When I do not set the UpdatePolicy to ALL, I of course get exceptions stating that an object with the PrimaryKey already exists. Is there a good way to deal with this without setting the UpdatePolicy to ALL? Ideally: if an object with the given PrimaryKey does not exist, insert it; otherwise use the already existing object.
I suspect that the massive updates on already existing objects have a negative effect on the file size of the Realm DB.
How could I solve this problem? I could query before each copy call whether each nested RealmObject already exists, but this would be very complex, since some basic types occur in a lot of different fields.
EDIT:
An example object could look like this:
import io.realm.kotlin.types.RealmObject
import io.realm.kotlin.types.annotations.PrimaryKey

class Example : RealmObject {
    var field1: String = ""
    var anotherRealmObjectRef: Quantity? = null
    var anotherRealmObjectRef2: Another? = null
    // other fields which can contain references to objects with PrimaryKeys
}

class Quantity : RealmObject {
    @PrimaryKey
    var id: String = ""
    var value: Double = 0.0
    var unit: String = ""
    // constructor sets id to e.g. value_unit
}

class Another : RealmObject {
    // other fields
    var price: Quantity? = null
}
So, as I said, I download CSV files with data and convert each row to, in this case, Example realm objects. For each of those objects I must also create Quantity objects in multiple fields. I added an id field as PrimaryKey to Quantity because in reality I create maybe 1 million Example objects but there will be only 10k unique Quantity objects. So I only want unique Quantity instances in my Realm DB, to save space and keep the file size small. Before I create each Quantity object, I could check whether another Example object already contains a Quantity object with this PrimaryKey, like you showed in your code example. Due to the somewhat complex class structure this would result in a lot of code, and I am not sure whether that is really feasible or good to do.
UpdatePolicy.ALL basically solves this for me, because the resulting Realm DB only contains unique Quantity objects. However, it probably performs a lot of unnecessary updates on those objects.
The only real problem for me currently is that the resulting Realm DB has an unexpected file size (currently around 400-500 MB). A comparable Realm DB created with the Swift SDK is around 200 MB. If this is due to the mass updates (resulting in a lot of object versions?), it would be worth solving the issue.
Answer 1
Score: 1
There are several questions within the question so let me try to tackle them all. I prefer including code in answers but perhaps some clarity about how Realm works would be more beneficial. Some of this answer is IMO so evaluate accordingly.
TL;DR - skip to the Edit
The code in the question doesn't work as is because it tries to brute-force add an object whose primary key duplicates that of an existing object; primary keys must be unique, so having two objects with the same primary key is not allowed.
The difference between .all and .modified is related to how the data is written (keep reading: Upsert, which may be an answer).
.all forces all properties of an object to be re-written, whether they have changed or not. That's a whole lot of data to push around, and I find use cases for this rather rare.
.modified only writes out fields that have been modified, so in general it's far less data and the preferred option. It also allows for an Upsert, which is what you're attempting to do.
.error: if you want to prevent updating an existing object, error will throw an error if an object with the same primary key already exists.
Upserting is the process where, if an object exists, it is updated; if it does not exist, it is inserted. To get this behavior, set the update flag to .modified when writing the object: it will be inserted if needed, otherwise just the modified fields will be updated on the existing object. Note that you can also partially update an object by passing the primary key and a subset of the values to update.
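To make the difference between these behaviours concrete, here is a minimal in-memory model. It is not Realm code: a MutableMap stands in for a table keyed by primary key, and Table, insertOrThrow, upsert and insertIfAbsent are names made up for this illustration.

```kotlin
data class Quantity(val id: String, val value: Double, val unit: String)

// Stand-in for a table keyed by primary key (an assumption for illustration, not a Realm API).
class Table {
    private val rows = mutableMapOf<String, Quantity>()

    // Counts how often a row is actually written.
    var writes = 0
        private set

    val size: Int get() = rows.size

    // Error-style policy: refuse an object whose primary key already exists.
    fun insertOrThrow(q: Quantity): Quantity {
        require(q.id !in rows) { "primary key ${q.id} already exists" }
        return store(q)
    }

    // Upsert: insert if absent, otherwise overwrite the stored row.
    fun upsert(q: Quantity): Quantity = store(q)

    // Insert if absent, otherwise hand back the stored row untouched.
    fun insertIfAbsent(q: Quantity): Quantity = rows[q.id] ?: store(q)

    private fun store(q: Quantity): Quantity {
        rows[q.id] = q
        writes++
        return q
    }
}

fun main() {
    val kg = Quantity("1.0_kg", 1.0, "kg")
    val upserted = Table().also { t -> repeat(3) { t.upsert(kg) } }
    val absent = Table().also { t -> repeat(3) { t.insertIfAbsent(kg) } }
    println("upsert: ${upserted.size} row(s), ${upserted.writes} write(s)")
    println("insertIfAbsent: ${absent.size} row(s), ${absent.writes} write(s)")
}
```

Both tables end up with a single row, but the upsert variant writes on every call while insertIfAbsent writes only once. That write amplification is what the question is worried about; whether it explains the file size in the real database is a separate matter this model cannot answer.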
The question mentions "nested objects", and that's a bit ambiguous (IMO) when it comes to Realm. Unfortunately the documentation kind of mixes 'nested' in, which can lead to confusion.
Nested: In a tree sits a bird's nest with eggs. The eggs are nested; they are part of the nest and exist within and as part of the nest; they do not exist in other nests, only that nest.
Objects that are managed (and have a primary key) and are added to another object are not really "nested" - they do not become part of the parent object, as they are stored by reference. The two objects are both managed and independent of each other, can exist without the other and, in the case of a referenced object, can be referenced from multiple other objects (so, not really nested).
Embedded objects, on the other hand, are more akin to a 'nested' object; they are not managed separately, cannot contain a primary key and are part of the parent object's graph.
Updating an embedded (nested) object is done through dot notation starting with the parent, parentObject.embeddedChildToUpdate.fieldToUpdate, and not with .modified or .all (in this context), since the field is written directly. (Embedded objects would never be upserted, since they cannot exist without the parent.)
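As a rough illustration of that dot-notation update in the Kotlin SDK, the sketch below changes one field of an embedded object through its parent. Parcel, Dimensions and widen are hypothetical names invented for this example, and the code assumes the query extension from io.realm.kotlin.ext.

```kotlin
import io.realm.kotlin.Realm
import io.realm.kotlin.ext.query
import io.realm.kotlin.types.EmbeddedRealmObject
import io.realm.kotlin.types.RealmObject
import io.realm.kotlin.types.annotations.PrimaryKey

// Hypothetical models: Dimensions is embedded, so it has no primary key of its own.
class Dimensions : EmbeddedRealmObject {
    var width: Double = 0.0
}

class Parcel : RealmObject {
    @PrimaryKey
    var id: String = ""
    var dimensions: Dimensions? = null
}

// Hypothetical helper: update one embedded field of the parcel with the given id.
fun widen(realm: Realm, parcelId: String, newWidth: Double) {
    realm.writeBlocking {
        // Look up the parent by primary key inside the write transaction.
        val parcel = query<Parcel>("id == $0", parcelId).first().find()
        // Write the embedded field directly through the parent; no UpdatePolicy is involved.
        parcel?.dimensions?.width = newWidth
    }
}
```

If no parcel with that id exists, or it has no dimensions, the safe calls make the write a no-op.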
It doesn't appear you're using embedded objects - everything seems to be by reference so a bit OT but I hope that helps.
Edit
If the goal is to persist objects that have a unique primary key and to ignore those that are duplicates, this should do it. Attempt to read an object with a given primary key; if it does not exist, persist a new object; if it does exist, ignore it and move on to the next one.
// needs: import io.realm.kotlin.ext.query
// widgetList holds the unmanaged candidates; realm.write is a suspend function
for (candidate in widgetList) {
    realm.write {
        // fetch the widget via its primary key
        val existing = query<WidgetClass>("_id == $0", candidate._id).first().find()
        // if there is no widget with that primary key, persist it
        if (existing == null) {
            copyToRealm(WidgetClass().apply {
                _id = candidate._id
                // populate the other properties if needed
            })
        }
        // otherwise a widget with that primary key exists,
        // so don't persist it (i.e. ignore it)
    }
}
This process does not need .all or .modified or even an upsert.
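Applied to the Example, Quantity and Another classes from the question, the same read-then-insert idea could look roughly like the sketch below. findOrCreate and importExamples are hypothetical names; the code assumes the query extension from io.realm.kotlin.ext and that copyToRealm accepts an unmanaged parent whose references point at objects already managed in the same write transaction. It has not been measured against the file-size concern.

```kotlin
import io.realm.kotlin.MutableRealm
import io.realm.kotlin.Realm
import io.realm.kotlin.ext.query
import io.realm.kotlin.types.RealmObject
import io.realm.kotlin.types.annotations.PrimaryKey

class Quantity : RealmObject {
    @PrimaryKey
    var id: String = ""
    var value: Double = 0.0
    var unit: String = ""
}

class Another : RealmObject {
    var price: Quantity? = null
}

class Example : RealmObject {
    var field1: String = ""
    var anotherRealmObjectRef: Quantity? = null
    var anotherRealmObjectRef2: Another? = null
}

// Hypothetical helper: return the managed Quantity with this id, inserting it only if absent.
fun MutableRealm.findOrCreate(candidate: Quantity): Quantity =
    query<Quantity>("id == $0", candidate.id).first().find()
        ?: copyToRealm(candidate) // default policy: plain insert, no update

// Hypothetical import routine: swap every Quantity reference for its managed twin,
// then insert the parent without UpdatePolicy.ALL.
fun importExamples(realm: Realm, data: List<Example>) {
    realm.writeBlocking {
        data.forEach { example ->
            example.anotherRealmObjectRef =
                example.anotherRealmObjectRef?.let { findOrCreate(it) }
            example.anotherRealmObjectRef2?.let { another ->
                another.price = another.price?.let { findOrCreate(it) }
            }
            copyToRealm(example)
        }
    }
}
```

Every field that can hold a Quantity still has to be listed by hand, which is the verbosity the question worries about. With around 10k unique quantities, keeping a map from id to managed Quantity inside the write block would avoid repeating the query for every row.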
The other option is to attempt to write each object - if an object with that primary key already exists, an error will be thrown. Handle the error gracefully (pretty much do nothing) and then move on to the next object.