How to use haskey() and in() functions with complex dictionary keys in Julia?

Question

The haskey() and in() functions are very useful for testing the contents of dictionaries in Julia:

julia> dict = Dict("a" => 1, "b" => 2, "c" => 3, "d" => 4, "e" => 5)
Dict{String,Int64} with 5 entries:
  "c" => 3
  "e" => 5
  "b" => 2
  "a" => 1
  "d" => 4

julia> haskey(dict, "a")
true

julia> in(("a" => 1), dict)
true

But I was surprised by their behavior with complex keys:

julia> immutable MyT
           A::String
           B::Int64
       end

julia> a = Dict(MyT("Tom",191)=>1,MyT("Bob",20)=>1,MyT("Jo",315)=>1,MyT("Luc",493)=>1)
Dict{MyT,Int64} with 4 entries:
  MyT("Tom",191) => 1
  MyT("Jo",315)  => 1
  MyT("Bob",20)  => 1
  MyT("Luc",493) => 1

julia> keys(a)
Base.KeyIterator for a Dict{MyT,Int64} with 4 entries. Keys:
  MyT("Tom",191)
  MyT("Jo",315)
  MyT("Bob",20)
  MyT("Luc",493)

julia> haskey(a, MyT("Tom",191))
false

julia> in((MyT("Tom",191) => 1), a)
false

What did I do wrong?
Thank you very much for your comments!

Thanks to @Michael K. Borregaard, I can propose this solution:

a = Dict{MyT, Int64}()

keyArray = Array{MyT,1}()
keyArray = [MyT("Tom",191),MyT("Bob",20),MyT("Jo",315),MyT("Luc",493)]

for i in keyArray
    a[i] = 1
end

println(a)
# Dict(MyT("Tom",191)=>1,MyT("Tom",191)=>1,MyT("Luc",493)=>1,MyT("Jo",315)=>1,MyT("Luc",493)=>1,MyT("Bob",20)=>1,MyT("Jo",315)=>1,MyT("Bob",20)=>1)

keyArray[1]            # MyT("Tom",191)
haskey(a, keyArray[1]) # true

But I have to store the keys in a separate array. This means I can't guarantee the uniqueness of the keys, which is the strength of dictionaries and the reason I chose to use one.

So I need an extra step:

unique(keyArray)

Another, better solution:

function CompareKeys(k1::MyT, k2::MyT)
    if k1.A == k2.A &&  k1.B == k2.B
        return true
    else 
        return false
    end
end

function ExistKey(k::MyT, d::Dict{MyT, Int64})
    for i in keys(d)
        if CompareKeys(k, i)
            return true
        end
    end
    return false
end

a = Dict(MyT("Tom",191)=>1,MyT("Bob",20)=>1,MyT("Jo",315)=>1,MyT("Luc",493)=>1)

ExistKey(MyT("Tom",192),a) # false

ExistKey(MyT("Tom",191),a) # true
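
The two helper functions above can be condensed. This is a minimal sketch written for current Julia (1.x), where `immutable` has been renamed `struct`. The name `exist_key` is hypothetical, and the scan is still linear in the number of keys:

```julia
# Julia 1.x syntax: `struct` replaces the older `immutable` keyword.
struct MyT
    A::String
    B::Int64
end

# Hypothetical helper: field-by-field comparison against every key.
# This is O(n) per lookup, unlike a hash-based haskey.
exist_key(k::MyT, d::Dict{MyT,Int64}) =
    any(i -> i.A == k.A && i.B == k.B, keys(d))

a = Dict(MyT("Tom", 191) => 1, MyT("Bob", 20) => 1)

exist_key(MyT("Tom", 192), a)  # false
exist_key(MyT("Tom", 191), a)  # true
```

Because every lookup walks all the keys, this only makes sense for small dictionaries.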

Compared to Julia, Go is more straightforward for this problem:

package main

import (
	"fmt"
)

type MyT struct {
	A string
	B int
}

func main() {

	dic := map[MyT]int{MyT{"Bob", 10}: 1, MyT{"Jo", 21}: 1}
	
	if _, ok := dic[MyT{"Bob", 10}]; ok {
		fmt.Println("key exists")
	}
}
// answer is "key exists"

Answer 1

Score: 6

You just need to teach your MyT type that you want it to consider equality in terms of its composite fields:

julia> immutable MyT
           A::String
           B::Int64
       end
       import Base: ==, hash
       ==(x::MyT, y::MyT) = x.A == y.A && x.B == y.B
       hash(x::MyT, h::UInt) = hash(x.A, hash(x.B, hash(0x7d6979235cb005d0, h)))

julia> a = Dict(MyT("Tom",191)=>1,MyT("Bob",20)=>1,MyT("Jo",315)=>1,MyT("Luc",493)=>1)
Dict{MyT,Int64} with 4 entries:
  MyT("Jo", 315)  => 1
  MyT("Luc", 493) => 1
  MyT("Tom", 191) => 1
  MyT("Bob", 20)  => 1

julia> haskey(a, MyT("Tom",191))
true

julia> in((MyT("Tom",191) => 1), a)
true
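
For readers on current Julia (1.x), the same idea might be sketched as follows. It assumes the 1.x spelling: `struct` instead of `immutable`, and methods added through the qualified names `Base.:(==)` and `Base.hash`. The `:MyT` symbol mixed into the hash is an arbitrary choice standing in for the constant used above.

```julia
struct MyT
    A::String
    B::Int64
end

# Equality in terms of the fields.
Base.:(==)(x::MyT, y::MyT) = x.A == y.A && x.B == y.B

# hash must agree with ==: equal values must produce equal hashes.
Base.hash(x::MyT, h::UInt) = hash(x.A, hash(x.B, hash(:MyT, h)))

a = Dict(MyT("Tom", 191) => 1, MyT("Bob", 20) => 1)

haskey(a, MyT("Tom", 191))     # true
in(MyT("Tom", 191) => 1, a)    # true
haskey(a, MyT("Tom", 192))     # false
```

Defining both methods together matters because a Dict lookup hashes the key first and only then compares candidates for equality.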

Answer 2

Score: 3

There are lots of good answers here, I'd just like to add a subtlety: this is partly because == calls === rather than recursively calling == when checking for structural equality, and partly because equal (==) strings are not generally identical (===) currently. Specifically, the fact that MyT("foo", 1) != MyT("foo", 1) is because "foo" !== "foo".

Strings are only "immutable by convention": they are technically mutable, but Julia doesn't expose APIs for mutating them and encourages you not to mutate them. You can, however, access their underlying bytes and mutate those, which allows you to write a program that distinguishes two strings by mutating one and not the other. That means that they cannot be === in the sense of Henry Baker's "EGAL" predicate (also here). If you have an immutable type with only "primitive" type fields, then this does not happen:

julia> immutable MyT2 # `struct MyT2` in 0.6
           A::Float64
           B::Int
       end

julia> x = MyT2(1, 1)
MyT2(1.0, 1)

julia> y = MyT2(1, 1)
MyT2(1.0, 1)

julia> x == y
true

julia> x === y
true

I have already proposed that we change this and have == recursively call ==. This should be fixed, someone just needs to do the work. Moreover, in Julia 1.0 we could make Strings truly immutable rather than merely immutable by convention, and therefore have "foo" === "foo" be true. I've created an issue to discuss and track this change.

Answer 3

Score: 2

You're creating a new object in the haskey call. But two objects created by MyT("Tom", 191) are just two different MyT objects with the same field values.

Instead, do:

key1 = MyT("Tom", 191)
a = Dict(key1 => 1)
haskey(a, key1)

See also:

key2 = MyT("Tom", 191)
key1 == key2 # false
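
In current Julia (1.x), this identity-based behavior is easiest to reproduce with a `mutable struct`, whose instances are compared by identity unless `==` is defined. A small sketch; the type name `MKey` is made up for illustration:

```julia
mutable struct MKey
    A::String
    B::Int64
end

key1 = MKey("Tom", 191)
key2 = MKey("Tom", 191)   # same field values, but a distinct object

a = Dict(key1 => 1)

key1 == key2      # false: the default == falls back to ===
haskey(a, key1)   # true: the very same object
haskey(a, key2)   # false: a different object
```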

An idiomatic Julia way to deal with this would be to define an == method for MyT objects (together with a matching hash method, since Dict lookups hash the key before comparing), so that two objects are equal if they have the same field values. That would allow you to use them the way you do.

It depends whether you need the type to be complex. Another easy and performant way to do what you want is to use a Tuple as the key:

a = Dict(("Tom", 191) => 1)
haskey(a, ("Tom", 191)) # true
a[("Tom", 191)] # 1

Answer 4

Score: 0

My approach would be similar to Matt's, but a bit simpler(?). Tuples are perfectly valid dictionary keys, so I would simply overload the relevant functions to convert your type back and forth to a tuple:

julia> immutable M; A::String; B::Int64; end
julia> import Base: =>, haskey, in
julia> =>(a::M, b) = (a.A, a.B)=>b
julia> haskey(a::Dict, b::M) = haskey(a, (b.A, b.B))
julia> in(a::Pair{M, Int64}, b::Dict) = in((a.first.A,a.first.B)=>a.second, b)

julia> a = Dict(M("Dick", 10)=>1, M("Harry", 20)=>2)
Dict{Tuple{String,Int64},Int64} with 2 entries:
  ("Dick", 10)  => 1
  ("Harry", 20) => 2

julia> haskey(a, M("Dick", 10))
true

julia> in(M("Dick", 10)=>1, a)
true
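
If overloading `=>` and `haskey` feels too invasive, the same tuple idea could be written with an explicit conversion instead. A sketch in current Julia (1.x) syntax; `askey` is a hypothetical helper, not part of Base:

```julia
struct M
    A::String
    B::Int64
end

# Hypothetical helper: the tuple of fields serves as the dictionary key.
askey(m::M) = (m.A, m.B)

a = Dict(askey(M("Dick", 10)) => 1, askey(M("Harry", 20)) => 2)

haskey(a, askey(M("Dick", 10)))     # true
in(askey(M("Dick", 10)) => 1, a)    # true
a[askey(M("Harry", 20))]            # 2
```

The trade-off is that every call site has to remember to convert, whereas the overloads above hide the conversion.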

> "Compared to Julia, Go is more straightforward for this problem"

True. It also happens to be more error-prone (depending on your perspective). If you wanted to differentiate between two objects (used as keys) that do not correspond to the same object in memory, then Go's approach of simply testing 'value equality' would have landed you in trouble here (though one could argue 'value equality' generally makes more sense when comparing 'keys').

huangapple
  • Published on June 2, 2017, 16:59:17
  • When reposting, please keep the link to this article: https://go.coder-hub.com/44324800.html