How does a Java deep-copy utility preserve objects' inner relationships?
Question
I was using Apache's SerializationUtils to deep copy objects and found something surprising. For example, object A has two members, B1 and B2, and both of them have the same member C (referring to the same object).
After the deep copy, A' was created, and I expected B1' to have a member C1' and B2' to have a member C2'. Instead, B1' and B2' both have the same member C'.
It seems that after a deep copy, the object hierarchy and relationships are maintained. How is that implemented?
Answer 1
Score: 1
I don't know the Apache library, but most probably it keeps a map of the instances copied so far. When it encounters an instance C to be copied, it first checks whether a copy C' of that instance already exists. If so, it uses the existing copy. If not, it creates a deep copy of C, giving C', and stores that copy in the map.
One fine point to consider: I guess that Apache bases the existence test on the == operator and not on the equals() method, as == gives the cleanest result, best resembling the original reference structure. Otherwise, two distinct instances C1 and C2 that just happen to satisfy the equals() test would end up as a single copy C'.
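The bookkeeping described in this answer can be sketched as follows. This is only an illustration of the idea, not the Apache implementation: the Node and MemoizedCopier types are hypothetical, and IdentityHashMap is used so that the "already copied?" check is based on == rather than equals().

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical node type, used only for this illustration.
class Node {
    final String name;
    final List<Node> children = new ArrayList<>();

    Node(String name) {
        this.name = name;
    }
}

// Hypothetical copier that remembers which originals it has already copied.
class MemoizedCopier {
    // Maps each original instance (compared with ==) to its copy.
    private final Map<Node, Node> copies = new IdentityHashMap<>();

    Node copy(Node original) {
        if (original == null) {
            return null;
        }
        Node existing = copies.get(original);
        if (existing != null) {
            return existing; // already copied: reuse the existing copy
        }
        Node duplicate = new Node(original.name);
        // Register the copy before recursing so that cycles terminate.
        copies.put(original, duplicate);
        for (Node child : original.children) {
            duplicate.children.add(copy(child));
        }
        return duplicate;
    }
}

class MemoizedCopyDemo {
    public static void main(String[] args) {
        Node c = new Node("C");
        Node b1 = new Node("B1");
        Node b2 = new Node("B2");
        b1.children.add(c);
        b2.children.add(c);
        Node a = new Node("A");
        a.children.add(b1);
        a.children.add(b2);

        Node aCopy = new MemoizedCopier().copy(a);
        Node c1 = aCopy.children.get(0).children.get(0);
        Node c2 = aCopy.children.get(1).children.get(0);
        System.out.println(c1 == c2); // true: both B copies share one C'
        System.out.println(c1 == c);  // false: C' is a new object
    }
}
```

Swapping the IdentityHashMap for a HashMap would make the lookup depend on equals(), which is exactly the case the answer warns about.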
Answer 2
Score: 1
Let's consider two references as having 'the same identity' if they point to the same object. That is, given:
Object a = ...;
Object b = ...;
then a and b are 'identical' if a == b holds, which is only the case if they point at the same object.
Note that a.equals(b) is different; any two references for which that holds can be considered 'equal', but two separate objects may be involved. Trivial example:
String a = new String("Hello");
String b = new String("Hello");
a == b; // this is false
a.equals(b); // this is true
It is possible to figure out whether two references are identical and not just equal.
One easy check is literally what I just showed you: ==, which checks identity, not equality.
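Most likely, the code in SerializationUtils relies on something like IdentityHashMap, which is a map that maps on identity (more or less, 'the pointer'). As far as I know, SerializationUtils.clone goes through standard Java serialization, where ObjectOutputStream keeps its own identity-based table of the objects it has already written, but the principle is the same. IdentityHashMap is rarely needed in everyday code, but note that you can always get the identity hash code via System.identityHashCode, which returns the same value for the same object, even if that object is mutated. In theory, a.hashCode() can return a different value after a mutation (and for mutable objects, it tends to), but System.identityHashCode(a) stays the same for any given instance for its whole lifetime.
A plain HashMap uses a.hashCode() to know which bucket to look at, and then a.equals(b) to scan for equality.
An IdentityHashMap uses System.identityHashCode(a) to know which bucket to look at, and then a == b to scan for equality. (WeakHashMap, despite the name, still compares keys with hashCode() and equals(); what makes it special is that it holds its keys weakly.)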
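A small sketch of the difference between an equality-based and an identity-based map, reusing the two "Hello" strings from the example above. The class and helper names are made up for the illustration.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

class IdentityVsEqualityDemo {
    // Puts both keys into the given map and reports how many entries result.
    static int countEntries(Map<String, Integer> map, String k1, String k2) {
        map.put(k1, 1);
        map.put(k2, 2);
        return map.size();
    }

    public static void main(String[] args) {
        // Two distinct objects that are equal according to equals().
        String a = new String("Hello");
        String b = new String("Hello");

        // HashMap treats equal keys as the same key: the second put overwrites the first.
        System.out.println(countEntries(new HashMap<>(), a, b));         // 1

        // IdentityHashMap treats them as different keys because a != b.
        System.out.println(countEntries(new IdentityHashMap<>(), a, b)); // 2
    }
}
```

For tracking "have I already copied this exact object?", the second behavior is the one you want.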
Armed with that, writing a serializer that preserves hierarchy and relationships is fairly straightforward.
Note also that without such a mechanism, solid serialization is impossible. After all, imagine this structure:
List<Object> list = new ArrayList<Object>();
list.add(list); // the list now contains itself
Without an identity-based lookup such as IdentityHashMap, any naive attempt to serialize this construct will recurse forever and end in a StackOverflowError.
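To see the effect without any third-party library, here is a rough sketch of a serialize-then-deserialize copy using only java.io. The Shared, Holder, Root, and RoundTrip types are hypothetical, and this is an approximation of what a serialization-based clone does, not the Apache source.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

// Hypothetical types mirroring A, B1/B2 and C from the question.
class Shared implements Serializable {
    private static final long serialVersionUID = 1L;
}

class Holder implements Serializable {
    private static final long serialVersionUID = 1L;
    final Shared c;

    Holder(Shared c) {
        this.c = c;
    }
}

class Root implements Serializable {
    private static final long serialVersionUID = 1L;
    final Holder b1;
    final Holder b2;

    Root(Holder b1, Holder b2) {
        this.b1 = b1;
        this.b2 = b2;
    }
}

class RoundTrip {
    // Deep copy by writing the object graph to bytes and reading it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copy(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("copy failed", e);
        }
    }

    public static void main(String[] args) {
        Shared c = new Shared();
        Root a = new Root(new Holder(c), new Holder(c));
        Root aCopy = copy(a);
        System.out.println(aCopy.b1.c == aCopy.b2.c); // true: one shared C'
        System.out.println(aCopy.b1.c == c);          // false: C' is a new object

        // The self-containing list from the answer survives the round trip too.
        ArrayList<Object> list = new ArrayList<>();
        list.add(list);
        ArrayList<Object> listCopy = copy(list);
        System.out.println(listCopy.get(0) == listCopy); // true
    }
}
```

The stream writes each object once and emits a back-reference when it meets the same instance again, which is why both the shared member and the cycle come back intact.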