Enum implementing traits - memory issues?

Question

I'm trying to DRY up my code, and for the first time I'm using traits to enhance my enums.

What I want to do: for a given array of strings, find all the enums matching at least one keyword (case-insensitively).

The code below seems to work fine, but I think it leaks memory when the method getSymbolFromIndustries is called thousands of times.

In a VisualVM capture taken after about 10 minutes of running, the Live Objects column keeps increasing after each snapshot, and the number of items is huge compared to the second line.

My heap size keeps increasing too.

The trait:

trait BasedOnCategories {
    String[] categories

    static getSymbolFromIndustries(Collection<String> candidates) {
        values().findAll { value ->
            !value.categories.findAll { categorie ->
                candidates.any { candidate ->
                    categorie.equalsIgnoreCase(candidate)
                }
            }
            .unique()
            .isEmpty()
        }
    }
}

One of the several enums I have that implement the trait:

enum KTC implements BasedOnCategories, BasedOnValues {
    KTC_01([
            'industries': ['Artificial Intelligence','Machine Learning','Intelligent Systems','Natural Language Processing','Predictive Analytics','Google Glass','Image Recognition', 'Apps' ],
            'keywords': ['AI','Voice recognition']
    ]),
    // ... more values
    KTC_43 ([
            'industries': ['Fuel','Oil and Gas','Fossil Fuels'],
            'keywords': ['Petroleum','Oil','Petrochemicals','Hydrocarbon','Refining']
    ]),
    // ... more values
    KTC_60([
            'industries': ['App Discovery','Apps','Consumer Applications','Enterprise Applications','Mobile Apps','Reading Apps','Web Apps','App Marketing','Application Performance Management', 'Apps' ],
            'keywords': ['App','Application']
    ])

    KTC(value) {
        this.categories = value.industries
        this.keywords = value.keywords
    }
}

My data-driven tests:

    def "GetKTCsFromIndustries"(Collection<String> actual, Enum[] expected) {
        expect:
        assert expected == KTC.getSymbolFromIndustries(actual)

        where:
        actual                                              | expected
        [ 'Oil and Gas' ]                                   | [KTC.KTC_43]
        [ 'oil and gas' ]                                   | [KTC.KTC_43]
        [ 'oil and gas', 'Fossil Fuels' ]                   | [KTC.KTC_43]
        [ 'oil and gas', 'Natural Language Processing' ]    | [KTC.KTC_01, KTC.KTC_43]
        [ 'apps' ]                                          | [KTC.KTC_01, KTC.KTC_60]
        [ 'xyo' ]                                           | []
    }

My questions:

  • Does anyone have any clues to help me fix these leaks?
  • Is there a more elegant way to write the getSymbolFromIndustries method?

Thanks.

Answer 1

Score: 1


I'm not sure about the performance issues, but I would redesign your trait like this:

https://groovyconsole.appspot.com/script/5205045624700928

trait BasedOnCategories {

    Set<String> categories

    void setCategories( Collection<String> cats ) {
        categories = new HashSet( cats*.toLowerCase() ).asImmutable()
    }

    @groovy.transform.Memoized
    static getSymbolFromIndustries(Collection<String> candidates) {
        def lowers = candidates*.toLowerCase()
        values().findAll{ value -> !lowers.disjoint( value.categories ) }
    }
}
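
To make the matching step easier to follow in isolation, here is a minimal sketch of the same idea on plain collections. The data and the `matches` closure are hypothetical stand-ins for illustration, not part of the trait. It only shows that once the categories are stored lower-cased, a single `disjoint` call replaces the nested `findAll`/`any`/`equalsIgnoreCase` loops.

```groovy
// Illustrative data only: a lower-cased category set, as setCategories would store it
Set<String> categories = ['Fuel', 'Oil and Gas', 'Fossil Fuels']*.toLowerCase() as Set

// Hypothetical helper (not in the original answer): true when at least one candidate matches
def matches = { Collection<String> candidates ->
    def lowers = candidates*.toLowerCase()
    // disjoint() returns true when the two collections share no element
    !lowers.disjoint(categories)
}

assert matches(['oil and gas'])
assert matches(['OIL AND GAS', 'xyo'])
assert !matches(['xyo'])
assert !matches([])
```

Lower-casing the categories once, when they are set, moves the case handling out of the lookup, so each call only lower-cases the candidates.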

Now the rest of the context:

trait BasedOnValues {
    Set<String> keywords
}

enum KTC implements BasedOnCategories, BasedOnValues  {
    KTC_01([
            'industries': ['Artificial Intelligence','Machine Learning','Intelligent Systems','Natural Language Processing','Predictive Analytics','Google Glass','Image Recognition'],
            'keywords': ['AI','Voice recognition']
    ]),
    // ... more values
    KTC_43 ([
            'industries': ['Fuel','Oil and Gas','Fossil Fuels'],
            'keywords': ['Petroleum','Oil','Petrochemicals','Hydrocarbon','Refining']
    ]),
    // ... more values
    KTC_60([
            'industries': ['App Discovery','Apps','Consumer Applications','Enterprise Applications','Mobile Apps','Reading Apps','Web Apps','App Marketing','Application Performance Management'],
            'keywords': ['App','Application']
    ])

    KTC(value) {
        this.categories = value.industries
        this.keywords = value.keywords
    }
}

// some tests

[
    [ [ 'Oil and Gas' ], [KTC.KTC_43] ],
    [ [ 'oil and gas' ], [KTC.KTC_43] ],
    [ [ 'oil and gas', 'Fossil Fuels' ], [KTC.KTC_43] ],
    [ [ 'oil and gas', 'Natural Language Processing' ], [KTC.KTC_01, KTC.KTC_43] ],
    [ [ 'xyo' ], [] ],
].each{
    assert KTC.getSymbolFromIndustries( it[ 0 ] ) == it[ 1 ]
}

Then measure the performance.
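
One rough way to do that measurement is sketched below. It is only an assumption-laden outline: `categoriesBySymbol` and `lookup` are hypothetical stand-ins for the enum and for `getSymbolFromIndustries`, and the heap figure is approximate because `System.gc()` is only a hint to the JVM. A profiler such as VisualVM remains the more reliable tool.

```groovy
// Hypothetical stand-in for the enum: symbol name -> lower-cased categories
def categoriesBySymbol = [
    KTC_01: ['natural language processing', 'machine learning'] as Set,
    KTC_43: ['fuel', 'oil and gas', 'fossil fuels'] as Set
]

// Hypothetical stand-in for getSymbolFromIndustries
def lookup = { Collection<String> candidates ->
    def lowers = candidates*.toLowerCase()
    categoriesBySymbol.findAll { symbol, cats -> !lowers.disjoint(cats) }.keySet().toList()
}

// Approximate used heap in bytes; System.gc() is a request, not a guarantee
def usedHeap = {
    System.gc()
    def rt = Runtime.runtime
    rt.totalMemory() - rt.freeMemory()
}

def before = usedHeap()
def start = System.nanoTime()
def result = null
100_000.times {
    result = lookup(['oil and gas', 'Natural Language Processing'])
}
def elapsedMs = (System.nanoTime() - start) / 1_000_000
def after = usedHeap()

println "last result: ${result}"
println "elapsed: ${elapsedMs} ms, heap delta: ${after - before} bytes"
```

If the heap delta stays roughly flat when the iteration count is raised, the lookup itself is probably not what retains the objects.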

huangapple
  • Posted on 2020-10-07 05:55:31
  • When reposting, please keep the link to this article: https://go.coder-hub.com/64234339.html