最佳更新集合的集合方式

huangapple go评论71阅读模式
英文:

Best way to update set of set

问题

以下是您提供的代码的翻译部分:

有一组连接顶点的边我正在尝试将这些顶点分成彼此连接的组

static class Edge {
    final String from, to;

    Edge(String from, String to) {
        this.from = from;
        this.to = to;
    }
}

static Set<String> find(String k, Set<Set<String>> sets) {
    for (Set<String> set : sets)
        if (set.contains(k))
            return set;
    return null;
}

static Set<Set<String>> connectedVertices(List<Edge> edges) {
    Set<Set<String>> result = new HashSet<>();
    for (Edge e : edges) {
        Set<String> from = find(e.from, result);
        Set<String> to = find(e.to, result);
        if (from == null && to == null) {
            result.add(new HashSet<>(Set.of(e.from, e.to)));
        } else if (from == null) {
            to.add(e.from);
        } else if (to == null) {
            from.add(e.to);
        } else if (from != to) {
            result.remove(to);
            from.addAll(to);
        }
    }
    return result;
}


List<Edge> edges = List.of(
    new Edge("a", "b"),
    new Edge("c", "b"),
    new Edge("c", "d"),
    new Edge("a", "c"),
    new Edge("e", "f"),
    new Edge("x", "y"),
    new Edge("y", "d"));
System.out.println(connectedVertices(edges));

但结果与我预期的不符

预期

[[e, f], [a, b, c, d, x, y]]

实际

[[a, b, c, d, x, y], [a, b, c, d], [e, f]]

通过以下更改我可以获得我期望的结果但这样太冗长是否有更好的方法

static Set<Set<String>> connectedVertices(List<Edge> edges) {
    Set<Set<String>> result = new HashSet<>();
    for (Edge e : edges) {
        Set<String> from = find(e.from, result);
    // 其余部分已省略

请注意,这只是代码的翻译部分,与您提供的原始代码相同。如果您需要任何进一步的帮助,请随时提问。

英文:

There is a list of edges connecting the vertices. I'm trying to divide these vertices into groups that are connected to each other.

static class Edge {
final String from, to;
Edge(String from, String to) {
this.from = from;
this.to = to;
}
}
static Set&lt;String&gt; find(String k, Set&lt;Set&lt;String&gt;&gt; sets) {
for (Set&lt;String&gt; set : sets)
if (set.contains(k))
return set;
return null;
}
static Set&lt;Set&lt;String&gt;&gt; connectedVertices(List&lt;Edge&gt; edges) {
Set&lt;Set&lt;String&gt;&gt; result = new HashSet&lt;&gt;();
for (Edge e : edges) {
Set&lt;String&gt; from = find(e.from, result);
Set&lt;String&gt; to = find(e.to, result);
if (from == null &amp;&amp; to == null) {
result.add(new HashSet&lt;&gt;(Set.of(e.from, e.to)));
} else if (from == null) {
to.add(e.from);
} else if (to == null) {
from.add(e.to);
} else if (from != to) {
result.remove(to);
from.addAll(to);
}
}
return result;
}

And

    List&lt;Edge&gt; edges = List.of(
new Edge(&quot;a&quot;, &quot;b&quot;),
new Edge(&quot;c&quot;, &quot;b&quot;),
new Edge(&quot;c&quot;, &quot;d&quot;),
new Edge(&quot;a&quot;, &quot;c&quot;),
new Edge(&quot;e&quot;, &quot;f&quot;),
new Edge(&quot;x&quot;, &quot;y&quot;),
new Edge(&quot;y&quot;, &quot;d&quot;));
System.out.println(connectedVertices(edges));

But the result is not what I expected.

Expected:

[[e, f], [a, b, c, d, x, y]]

Actual:

[[a, b, c, d, x, y], [a, b, c, d], [e, f]]

I get the results I expect with the following changes, but it's verbose. Is there a better way?

static Set&lt;Set&lt;String&gt;&gt; connectedVertices(List&lt;Edge&gt; edges) {
Set&lt;Set&lt;String&gt;&gt; result = new HashSet&lt;&gt;();
for (Edge e : edges) {
Set&lt;String&gt; from = find(e.from, result);
Set&lt;String&gt; to = find(e.to, result);
if (from == null &amp;&amp; to == null) {
result.add(new HashSet&lt;&gt;(Set.of(e.from, e.to)));
} else if (from == null) {
result.remove(to);
to.add(e.from);
result.add(to);
} else if (to == null) {
result.remove(from);
from.add(e.to);
result.add(from);
} else if (from != to) {
result.remove(from);
result.remove(to);
from.addAll(to);
result.add(from);
}
}
return result;
}

答案1

得分: 1

问题在于哈希集合中的元素不应该是可变的,但在这里,您正在更改外部集合的内部集合,从而改变了它们的哈希码。改变后的哈希码导致 result.remove 调用无法删除 to

remove 尝试使用新的哈希码查找 to,但是 to 存储在不同的桶中,因为在添加时其哈希码不同。

来自文档

> 注意:如果将可变对象用作集合元素,则必须非常小心。如果在对象作为集合中的元素时以影响相等比较的方式更改对象的值,则集合的行为未经指定。这个禁止的特例是,集合不能包含自身作为元素。

我只会使用一系列集合,而不是一组集合。您正在执行的唯一操作,集合确实很擅长的,就是 remove。使用列表将会使 remove 成为一个 O(n) 操作。但由于算法的总体复杂度要比这高得多,额外的 O(n) 操作在长期内不会有太大影响。我会考虑条件 else if (from != to) 是否经常触发,以及您的“继续删除、更改,然后再添加回去”的方法是否对使用基准测试的小输入实际上更快。

static Set<String> find(String k, List<Set<String>> sets) {
    for (Set<String> set : sets)
        if (set.contains(k))
            return set;
    return null;
}

static List<Set<String>> connectedVertices(List<Edge> edges) {
    List<Set<String>> result = new ArrayList<>();
    for (Edge e : edges) {
        Set<String> from = find(e.from, result);
        Set<String> to = find(e.to, result);
        if (from == null && to == null)
            result.add(new HashSet<>(Set.of(e.from, e.to)));
        else if (from == null) {
            to.add(e.from);
        } else if (to == null) {
            from.add(e.to);
        } else if (from != to) {
            result.remove(to);
            from.addAll(to);
        }
    }
    return result;
}

如果您想要一个集合作为结果,您始终可以执行:

new HashSet(connectedVertices(edges))
英文:

The problem is that the elements of a hash set should not be mutable, but here you are changing the inner sets of the outer set, which changes their hash codes. The changed hash code causes the result.remove call to fail to remove to.

remove tries to look for to with the new hash code, but to is stored in a different bucket, because when it was added, its hash code is different.

From the docs:

> Note: Great care must be exercised if mutable objects are used as set elements. The behavior of a set is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is an element in the set. A special case of this prohibition is that it is not permissible for a set to contain itself as an element.

I would just use a list of sets, rather than a set of sets. The only operation that you are doing, that sets are really good at, is remove. Using a list would turn remove into an O(n) operation. But since the algorithm overall has a way higher complexity than that, an extra O(n) operation is not going to matter in the long term. I would consider whether the condition else if (from != to) is hit very often, and whether your approach of "keep on removing, changing, then adding back in" is actually faster for small inputs using a benchmark.

static Set&lt;String&gt; find(String k, List&lt;Set&lt;String&gt;&gt; sets) {
for (Set&lt;String&gt; set : sets)
if (set.contains(k))
return set;
return null;
}
static List&lt;Set&lt;String&gt;&gt; connectedVertices(List&lt;Edge&gt; edges) {
List&lt;Set&lt;String&gt;&gt; result = new ArrayList&lt;&gt;();
for (Edge e : edges) {
Set&lt;String&gt; from = find(e.from, result);
Set&lt;String&gt; to = find(e.to, result);
if (from == null &amp;&amp; to == null)
result.add(new HashSet&lt;&gt;(Set.of(e.from, e.to)));
else if (from == null) {
to.add(e.from);
} else if (to == null) {
from.add(e.to);
} else if (from != to) {
result.remove(to);
from.addAll(to);
}
}
return result;
}

If you would like a set as a result, you could always do

new HashSet(connectedVertices(edges))

huangapple
  • 本文由 发表于 2020年9月25日 12:02:20
  • 转载请务必保留本文链接:https://go.coder-hub.com/64057629.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定