从multimap中删除元素的最佳方法

huangapple go评论73阅读模式
英文:

the best way to remove an element from list in multimap

问题

现在只是简单地遍历列表,如果列表包含该值,则从中删除。

在Scala中:

trait MultiMap[K, V] {

  final val map: mutable.Map[K, ArrayBuffer[V]] = ...


  def values(): Seq[V] = ...

  def remove(value: V) = 
    for {
      (_, buff) <- map if buff.contains(value)
    } 
      buff -= value
}

在Java中:


class MultiMap<K, V> {
    
    private final Map<K, ArrayList<V>> map = ...;

    public void remove(V value) {
        for (ArrayList<V> list: map.values()) {
            if (list.contains(value)) {
                list.remove(value);
                list.trimToSize();
            } 
        }
    }

}

这两种方法的时间复杂度均为O(n^2)。如何改进算法的复杂度?

英文:

now it's simply iterate over the lists and if list contains the value, removes it from

In Scala:

trait MultiMap[K, V] {

  final val map: mutable.Map[K, ArrayBuffer[V] = ...


  def values(): Seq[V] = ...

  def remove(value: V) = 
    for {
      (_, buff) &lt;- map if buff.contains(value)
    } 
      buff -= value
}

In Java:


class MultiMap&lt;K, V&gt; {
    
    private final Map&lt;K, ArrayList&lt;V&gt;&gt; map = ...;

    public void remove(V value) {
        for (ArrayList&lt;V&gt; list: map.values()) {
            if (list.contains(value)) {
                list.remove(value);
                list.trimToSize();
            } 
        }
    }

}

both of the methods have O(n^2) complexity. how to improve the complexity of the algorithm?

答案1

得分: 1

不要删除ArrayBuffer中的项目,然后再检查它是否存在,您可以尝试使用filter操作代替:

trait MultiMap[K, V] {
    private final val map: mutable.Map[K, ArrayBuffer[V]] = mutable.Map.empty[K, ArrayBuffer[V]]
    def values(): Seq[V] = map.values.flatten.toSeq
    def remove(value: V): Unit = map.values.foreach(_.filter(_ != value))
}

如果总值的数量为N,这不会是O(N^2),但由于filter每次都会创建新的缓冲实例,这可能是性能下降的另一种可能性。

更新

为了实现几乎是O(1),更精确地说是O(M),其中M是值重复的数量,另一个选项是创建另一个索引或映射,用于值和它们所在的位置信息(主映射中的键和缓冲区中的索引):

import scala.collection.mutable._

trait MultiMap[K, V] {
    private final val map: Map[K, ArrayBuffer[V]] = Map.empty
    private final val valuesIndex: Map[V, ArrayBuffer[(K, Int)]] = Map.empty

    def add(key: K, value: V): Unit = {
      map += key -> map.get(key).fold(ArrayBuffer(value))(_ += value)
      valuesIndex += value -> valuesIndex.get(value).fold(ArrayBuffer(key -> 0))(_ += (key -> (map(key).length - 1)))
    }

    def remove(value: V): Unit = {
      valuesIndex.getOrElse(value, ArrayBuffer.empty).foreach {
        case (key, index) => map(key).remove(index)
      }
      valuesIndex.remove(value)
    }
}

希望这对您有所帮助!

英文:

Instead of removing item from ArrayBuffer after checking if it exists, you can try use filter operation instead:

trait MultiMap[K, V] {
    private final val map: mutable.Map[K, ArrayBuffer[V]] = mutable.Map.empty[K, ArrayBuffer[V]]
    def values(): Seq[V] = map.values.flatten.toSeq
    def remove(value: V): Unit = map.values.foreach(_.filter(_ != value))
}

it won't be O(N^2) if N total number of values, but because filter creates new buffer instance each time, this might be possibly another performance downside.

UPDATE

Another option, in order to achieve almost O(1) or more precisely O(M), where M amount of value duplicates, is to create another index or map, but for values and information where they are located (key in main map and index in buffer) :

import scala.collection.mutable._

trait MultiMap[K, V] {
    private final val map: Map[K, ArrayBuffer[V]] = Map.empty
    private final val valuesIndex: Map[V, ArrayBuffer[(K, Int)]] = Map.empty

    def add(key: K, value: V): Unit = {
      map += key -&gt; map.get(key).fold(ArrayBuffer(value))(_ += value)
      valuesIndex += value -&gt; valuesIndex.get(value).fold(ArrayBuffer(key -&gt; 0))(_ += (key -&gt; (map(key).length - 1)))
    }

    def remove(value: V): Unit = {
      valuesIndex.getOrElse(value, ArrayBuffer.empty).foreach {
        case (key, index) =&gt; map(key).remove(index)
      }
      valuesIndex.remove(value)
    }
}

Hope this helps!

答案2

得分: 0

你可以通过使用红黑树(Red-black tree)将时间复杂度降低到O(log n)(其中n是树中的值的数量)。C++中的std::multimap实现也在做相同的事情。

顺便提一下,O(n^2)可能不是非常准确。如果你有n个键,每个键最多有m个值数组中的元素,则你的时间复杂度为O(n * m)

public void remove(V value) {
  for (List<V> list: map.values()) { // O(n)
    list.removeIf(element -> element.equals(value)); // O(m)
  }
}

你可以切换到Map<K, HashSet<V>>,这将把时间复杂度降低到O(n)... 但这不满足multiset的要求,无法在一个键下允许重复的值:

public void remove(V value) {
  for (HashSet<V> set: map.values()) { // O(n)
    set.remove(value); // O(1)
  }
}
英文:

You could bring your time complexity down to O(log n) (where n is number of values in tree) by using Red-black tree. Same thing is doing c++ in it's std::multimap implementation.

On a side note O(n^2) might not be exactly precise. If you have n keys, each with at most m elements in value array, your time complexity is O(n * m).

public void remove(V value) {
  for (List&lt;V&gt; list: map.values()) { // O(n)
    list.removeIf(element -&gt; element.equals(value)); // O(m)
  }
}

You could switch to Map&lt;K, HashSet&lt;V&gt;&gt; that would bring you down to O(n)... but that does not satisfy the multiset requirement to allow duplicate values under one key:

public void remove(V value) {
  for (HashSet&lt;V&gt; set: map.values()) { // O(n)
    set.remove(value); // O(1)
  }
}

huangapple
  • 本文由 发表于 2020年4月8日 02:34:31
  • 转载请务必保留本文链接:https://go.coder-hub.com/61087010.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定