What is the optimal data structure for storing objects with a string key and a bool auxiliary value?
Question
I need a data structure like the one below, but I need to be able to change the bool value. The other two stay as they were when they were initialized. What would you use for best performance?
Dictionary<string, (object, bool)> dic = new Dictionary<string, (object, bool)>();
I was thinking of a Hashtable. But a Hashtable is like a dictionary with key/value pairs. The object and bool in my example are conceptually not a key/value pair, because other values of the outer dictionary can have the same object (or better yet ... object type). I don't want someone looking at my code later on to think that the object and bool are more closely related than they really are.
EDIT: object in this example is just a placeholder. In reality it's a complex object with other objects in it and so on. The procedure before this one creates a bunch of these objects, and some of them are deep copies of the others. They are passed to this procedure. All of the objects are named here by some rules and stored in the dictionary. Names are obviously unique. The procedure that comes after will take this dictionary and switch the bool values on and off based on the values in the objects themselves and on the values of the other bools. That procedure will be recursive until some state is reached.
The number of objects (or dictionary entries) is arbitrary but expected to be more than 100 and fewer than 500. Time complexity is O(n).
I am targeting .NET 7 (standard).
Answer 1
Score: 2
> but I need to be able to change the bool value.
You can just reassign the value for the key:
var tuples = new Dictionary<string, (object Obj, bool Bool)>
{
    { "1", (new object(), true) }
};
// Build a new tuple that keeps the object and replaces the bool
tuples["1"] = (tuples["1"].Obj, false); // or tuples["1"] = (tuples["1"].Item1, false);
Or, to look the key up only once on the read side and skip missing keys:
if (tuples.TryGetValue("1", out var c))
{
    tuples["1"] = (c.Obj, false);
}
Personally I would leave it at that, but for really high-performance scenarios you can look into CollectionsMarshal instead of the second snippet:
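// Requires: using System.Runtime.InteropServices; using System.Runtime.CompilerServices;
ref var v = ref CollectionsMarshal.GetValueRefOrNullRef(tuples, "1");
if (!Unsafe.IsNullRef(ref v)) // a null ref means the key was not found
{
    v.Bool = false; // writes directly into the tuple stored in the dictionary
}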
A bit more info - here.
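As a rough, self-contained sketch of the in-place approach above: the helper name TrySetFlag, the field name Flag, and the keys "a", "b" and "missing" are made up for illustration and are not part of any library. It assumes .NET 6 or later, where CollectionsMarshal.GetValueRefOrNullRef is available.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical entries; the keys are placeholders.
var entries = new Dictionary<string, (object Obj, bool Flag)>
{
    ["a"] = (new object(), true),
    ["b"] = (new object(), false),
};

// Illustrative local helper (an assumption, not a library API):
// sets the flag in place and reports whether the key existed.
static bool TrySetFlag(Dictionary<string, (object Obj, bool Flag)> dict, string key, bool value)
{
    // Returns a reference to the stored tuple, or a null ref if the key is absent.
    ref var slot = ref CollectionsMarshal.GetValueRefOrNullRef(dict, key);
    if (Unsafe.IsNullRef(ref slot))
    {
        return false; // nothing is added for a missing key
    }
    slot.Flag = value; // mutates the tuple inside the dictionary, no copy written back
    return true;
}

bool changedA = TrySetFlag(entries, "a", false);
bool changedMissing = TrySetFlag(entries, "missing", true);

// prints "True False False 2"
Console.WriteLine($"{changedA} {entries["a"].Flag} {changedMissing} {entries.Count}");
```

The reference is only safe to use as long as the dictionary is not resized or otherwise modified in between, so it is best kept local to a small helper like this one.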
Answer 2
Score: 1
For the 'performance' aspect:
The .NET Dictionary uses hashes to look up the item you need, which is very fast (comparable to a Hashtable). I don't expect many performance issues related to this, or at least nothing that can be improved on with other data structures.
Also, you shouldn't worry about performance unless you are doing things a million times in a row and it turns out (in practice) that something is taking a measurable amount of time.
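If you do want to check whether it takes a measurable amount of time, a crude measurement could look like the sketch below. The entry count of 500 comes from the question; the key naming scheme and the pass count are arbitrary assumptions, and Stopwatch gives only a rough figure (a dedicated benchmarking tool would be more reliable).

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

const int EntryCount = 500; // upper bound mentioned in the question
const int Passes = 1_000;   // arbitrary repeat count for the measurement

var dict = new Dictionary<string, (object Obj, bool Flag)>();
var keys = new string[EntryCount];
for (int i = 0; i < EntryCount; i++)
{
    keys[i] = $"item{i}"; // hypothetical naming rule
    dict[keys[i]] = (new object(), false);
}

var sw = Stopwatch.StartNew();
for (int pass = 0; pass < Passes; pass++)
{
    foreach (var key in keys)
    {
        var entry = dict[key];                // read a copy of the tuple
        dict[key] = (entry.Obj, !entry.Flag); // write back with the bool flipped
    }
}
sw.Stop();

int updates = EntryCount * Passes;
Console.WriteLine($"{updates} updates took {sw.Elapsed.TotalMilliseconds:F2} ms");
```

The printed time depends on the machine, so no expected value is shown here.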
For the 'changing a bool' aspect:
... that is quite a long story.
There are 2 tuple variants in .NET:
- The value tuple, created by doing var x = (myObj, myBool), like you are doing. x is a struct, and therefore a value type. You can actually change x.Item1 or x.Item2 to a new value just fine.
  However... if you put x into a Dictionary then you actually put a copy of x (with a copy of its values, that is, the object reference and the bool) into the dictionary, because that is the nature of value types.
  When you retrieve it again from the Dictionary, yet another copy is made, which makes modifying the actual tuple inside the Dictionary through the indexer impossible; any attempt to do so would only modify the last copy you got.
  Side story: the C# compiler knows this, which is why it refuses to compile code like dic[yourKey].Item2 = newBool; because such code wouldn't do what you might hope it would do. You're basically telling the compiler to create a copy, modify the copy, and then... discard the copy. The compiler requires a variable to store the copy before the rest can even start, but we provided no variable.
- The Tuple generic class, or rather a range of generic classes, an instance of which can be created using calls like var x = Tuple.Create(myObj, myBool). These classes however forbid changing any of their properties; they are always read-only. Tuple class instances can be put in a Dictionary, but they will still be read-only.
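The copy behaviour of value tuples described above can be seen in a small sketch. The key "abc" and the field names Obj and Flag are placeholders chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

var shared = new object();
var dict = new Dictionary<string, (object Obj, bool Flag)>
{
    ["abc"] = (shared, true),
};

var copy = dict["abc"]; // the indexer hands back a copy of the stored struct
copy.Flag = false;      // changes only the local copy

bool storedFlagAfterCopyEdit = dict["abc"].Flag;              // the stored tuple is untouched
bool sameObject = ReferenceEquals(copy.Obj, dict["abc"].Obj); // the reference was copied, not the object

// dict["abc"].Flag = false; // does not compile: error CS1612

dict["abc"] = copy;     // writing the copy back is what updates the dictionary
bool storedFlagAfterWriteBack = dict["abc"].Flag;

// prints "True True False"
Console.WriteLine($"{storedFlagAfterCopyEdit} {sameObject} {storedFlagAfterWriteBack}");
```

Only the small struct is copied on each read and write; the complex object itself is shared by reference, so the copies are cheap.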
So what options are there really to 'modify a value in a tuple' inside a Dictionary?
- Keep using a value tuple, but accept that in order to "change" the tuple inside the Dictionary you'll have to make a new instance (either a copy, or from scratch), set its fields to the values that you want, and put that instance (or actually a copy...) into the dictionary:
// initialize it
var dict = new Dictionary<string, (object, bool)>();
var obj = new object();
dict["abc"] = (obj, true);
// change it
var tmpTuple = dict["abc"]; // get copy
tmpTuple.Item2 = false;     // alter copy
dict["abc"] = tmpTuple;     // store another copy
// or if you want to avoid the tmp variable
dict["abc"] = (dict["abc"].Item1, false);
- Use a custom class instead of the value tuple or a Tuple class, and then put that into the Dictionary:
public class MyPair
{
    public object O { get; set; }
    public bool B { get; set; }
}
// initialize it
var dict = new Dictionary<string, MyPair>();
var obj = new object();
dict["abc"] = new MyPair { O = obj, B = true };
// change it (MyPair is a reference type, so this updates the stored instance)
dict["abc"].B = false;
So both types of Tuples are OK for objects that you don't want to do a lot with. But both have certain limits in their usage, and sooner or later you may need to start using classes.