How does a Java deep-copy utility preserve objects' internal relationships?

Question

I was using Apache's SerializationUtils to deep copy objects and found something surprising. For example, object A has two members, B1 and B2, and both of them have the same member C (referring to the same object).
After the deep copy, A' was created, and I was expecting B1' to have a member C1' and B2' to have a member C2'. Instead, both B1' and B2' have the same member C'.

It seems that after a deep copy, the object hierarchy and relationships are maintained. How is that implemented?
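The behavior described in the question can be reproduced with plain JDK serialization. This is a minimal sketch, not the Apache code itself: it assumes that SerializationUtils relies on the same ObjectOutputStream/ObjectInputStream round trip, and the classes A, B, C and the helper roundTrip are hypothetical names made up for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SharedReferenceDemo {
    // Hypothetical classes mirroring the A / B1, B2 / C layout from the question.
    static class C implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    static class B implements Serializable {
        private static final long serialVersionUID = 1L;
        final C c;
        B(C c) { this.c = c; }
    }

    static class A implements Serializable {
        private static final long serialVersionUID = 1L;
        final B b1;
        final B b2;
        A(B b1, B b2) { this.b1 = b1; this.b2 = b2; }
    }

    // Deep copy by serializing to a byte array and reading it back (assumed helper).
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        C shared = new C();
        A a = new A(new B(shared), new B(shared));
        A copy = roundTrip(a);

        System.out.println(copy != a);              // true: a new A' was created
        System.out.println(copy.b1 != copy.b2);     // true: B1' and B2' are distinct
        System.out.println(copy.b1.c == copy.b2.c); // true: both point at one C'
        System.out.println(copy.b1.c != shared);    // true: C' is not the original C
    }
}
```

The stream writes C once and then only a back-reference to it, which is why a single C' comes out on the other side.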

Answer 1

Score: 1

I don't know the Apache library, but most probably it keeps a map of the instances copied so far. When it encounters an instance C to be copied, it first checks whether a copy C' of that instance already exists. If so, it uses the existing copy. If not, it creates a deep copy of C, giving C', and stores that copy in the map.

One fine point to consider: I guess that Apache bases the existence test on the == operator and not the equals() method, as the == operator gives the cleanest result, best resembling the original reference structure. Otherwise, two distinct instances C1 and C2 that just happen to satisfy the equals() test would end up as a single copy C'.
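The map-of-copies idea can be sketched as follows. This is only an illustration of the technique the answer guesses at, not the Apache implementation; Node, IdentityCopier, and deepCopy are hypothetical names, and the map is an IdentityHashMap so that lookups use == rather than equals().

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class IdentityCopier {
    // Hypothetical node type used only for this illustration.
    static class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    static Node deepCopy(Node root) {
        // Keys are compared with ==, so equal-but-distinct nodes stay distinct.
        return deepCopy(root, new IdentityHashMap<>());
    }

    private static Node deepCopy(Node original, Map<Node, Node> copies) {
        if (original == null) {
            return null;
        }
        Node existing = copies.get(original);
        if (existing != null) {
            return existing; // already copied: reuse the same copy
        }
        Node copy = new Node(original.name);
        // Register the copy before recursing so that cycles terminate.
        copies.put(original, copy);
        for (Node child : original.children) {
            copy.children.add(deepCopy(child, copies));
        }
        return copy;
    }

    public static void main(String[] args) {
        Node c = new Node("C");
        Node b1 = new Node("B1");
        Node b2 = new Node("B2");
        b1.children.add(c);
        b2.children.add(c);
        Node a = new Node("A");
        a.children.add(b1);
        a.children.add(b2);

        Node copy = deepCopy(a);
        Node c1 = copy.children.get(0).children.get(0);
        Node c2 = copy.children.get(1).children.get(0);
        System.out.println(c1 == c2); // true: B1' and B2' share one C'
        System.out.println(c1 != c);  // true: C' is a new object
    }
}
```

Storing the copy in the map before visiting the children is the design choice that also makes cyclic graphs safe to copy.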

Answer 2

Score: 1

Let's consider two references as having 'the same identity' if they point to the same object. That is, given:

Object a = ...;
Object b = ...;

Then a and b are 'identical' if this holds: a == b, which will only hold if they point at the same object.

Note that a.equals(b) is different; any two references for which that holds can be considered 'equal,' but 2 objects may be involved. Trivial example:

String a = new String("Hello");
String b = new String("Hello");
System.out.println(a == b);      // false: two distinct objects
System.out.println(a.equals(b)); // true: same character content

It is possible to figure out whether two references are identical and not just equal.

One easy check is literally what I just showed you: ==, which tests identity, not equality.

Most likely, the code behind SerializationUtils uses something like IdentityHashMap, which is a map that keys on identity (more or less, 'the pointer') rather than on equals(). IdentityHashMap is mostly used for internal bookkeeping of this kind, but note that you can always get the identity hash code via System.identityHashCode, which returns the same value for the same object, even if that object is mutated. In theory, a.hashCode() can return a different value after a mutation (and for mutable objects, it tends to), but System.identityHashCode(a) is the same value for any given instance for the lifetime of a VM.

A plain HashMap uses a.hashCode() to pick the bucket to look in, and then a.equals(b) to find the matching key.

An IdentityHashMap uses System.identityHashCode(a) to pick the bucket, and then a == b to find the matching key.
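The difference between the two lookup strategies is easy to observe with the standard java.util classes. A small sketch, where countKeys and IdentityMapDemo are hypothetical names introduced only for this example:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class IdentityMapDemo {
    // Puts both keys into the given map and reports how many entries result.
    static int countKeys(Map<String, String> map, String first, String second) {
        map.put(first, "first");
        map.put(second, "second");
        return map.size();
    }

    public static void main(String[] args) {
        String a = new String("Hello");
        String b = new String("Hello");

        // HashMap treats equal keys as one key; IdentityHashMap keeps them apart.
        System.out.println(countKeys(new HashMap<>(), a, b));         // 1
        System.out.println(countKeys(new IdentityHashMap<>(), a, b)); // 2

        // hashCode() of a mutable object follows its contents,
        // while the identity hash code stays with the instance.
        List<String> list = new ArrayList<>();
        int hashBefore = list.hashCode();
        int identityBefore = System.identityHashCode(list);
        list.add("x");
        System.out.println(list.hashCode() != hashBefore);                  // true
        System.out.println(System.identityHashCode(list) == identityBefore); // true
    }
}
```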

Armed with that, writing a serializer that preserves hierarchy and relationship is then trivial.

Note also that without such a mechanism, solid serialization is impossible. After all, imagine this structure:

List<Object> list = new ArrayList<Object>();
list.add(list); // ooooh, recursion!

Without tools like IdentityHashMap, any attempt to serialize this construct will result in a StackOverflowError, for obvious reasons: the serializer would keep descending into the list forever.
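For comparison, standard Java serialization copes with the self-containing list because it tracks the objects it has already written by identity. A minimal sketch using only JDK classes; roundTrip and CyclicListDemo are hypothetical names for the illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

class CyclicListDemo {
    // Serializes the object to memory and reads it back as a fresh copy.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        ArrayList<Object> list = new ArrayList<>();
        list.add(list); // the list contains itself

        ArrayList<Object> copy = roundTrip(list);

        System.out.println(copy != list);        // true: a new list was created
        System.out.println(copy.get(0) == copy); // true: the copy contains itself
        // Avoid calling hashCode() on such a list: it recurses into itself.
    }
}
```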


huangapple
  • Posted on July 28, 2020, 16:10:12
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63129699.html