Enum implementing traits - memory issues?
Question
I'm trying to DRY up my code and, for the first time, I'm using traits to enhance my enums.
What I want to do: for a given array of strings, find all the enum values matching at least one keyword (case-insensitively).
The code below seems to work fine, but I think it leaks memory when the method getSymbolFromIndustries is called thousands of times.
Here is a capture from VisualVM after about 10 minutes of running: the Live Objects column keeps increasing after each snapshot, and the number of items is huge compared to the second line...
My heap size keeps increasing too...
<!-- language-all: lang-groovy -->
The trait:

    trait BasedOnCategories {
        String[] categories

        static getSymbolFromIndustries(Collection<String> candidates) {
            values().findAll { value ->
                !value.categories.findAll { categorie ->
                    candidates.any { candidate ->
                        categorie.equalsIgnoreCase(candidate)
                    }
                }
                .unique()
                .isEmpty()
            }
        }
    }
One of the multiple enums I have implementing the trait:

    enum KTC implements BasedOnCategories, BasedOnValues {
        KTC_01([
            'industries': ['Artificial Intelligence','Machine Learning','Intelligent Systems','Natural Language Processing','Predictive Analytics','Google Glass','Image Recognition', 'Apps' ],
            'keywords': ['AI','Voice recognition']
        ]),
        // ... more values
        KTC_43 ([
            'industries': ['Fuel','Oil and Gas','Fossil Fuels'],
            'keywords': ['Petroleum','Oil','Petrochemicals','Hydrocarbon','Refining']
        ]),
        // ... more values
        KTC_60([
            'industries': ['App Discovery','Apps','Consumer Applications','Enterprise Applications','Mobile Apps','Reading Apps','Web Apps','App Marketing','Application Performance Management', 'Apps' ],
            'keywords': ['App','Application']
        ])

        KTC(value) {
            this.categories = value.industries
            this.keywords = value.keywords
        }
    }
My data-driven tests:

    def "GetKTCsFromIndustries"(Collection<String> actual, Enum[] expected) {
        expect:
        assert expected == KTC.getSymbolFromIndustries(actual)

        where:
        actual                                           | expected
        [ 'Oil and Gas' ]                                | [KTC.KTC_43]
        [ 'oil and gas' ]                                | [KTC.KTC_43]
        [ 'oil and gas', 'Fossil Fuels' ]                | [KTC.KTC_43]
        [ 'oil and gas', 'Natural Language Processing' ] | [KTC.KTC_01, KTC.KTC_43]
        [ 'apps' ]                                       | [KTC.KTC_01, KTC.KTC_60]
        [ 'xyo' ]                                        | []
    }
My questions:
- If anyone has some clues to help me fix those leaks...
- Is there a more elegant way to write the getSymbolFromIndustries method?
Thanks.
Answer 1
Score: 1
I'm not sure about the performance issues, but I would redesign your trait like this:
https://groovyconsole.appspot.com/script/5205045624700928

    trait BasedOnCategories {
        Set<String> categories

        // Lowercase the categories once, when the enum value is constructed
        void setCategories( Collection<String> cats ) {
            categories = new HashSet( cats*.toLowerCase() ).asImmutable()
        }

        // Results are cached per distinct argument
        @groovy.transform.Memoized
        static getSymbolFromIndustries(Collection<String> candidates) {
            def lowers = candidates*.toLowerCase()
            // Keep the values whose categories overlap with the candidates
            values().findAll{ value -> !lowers.disjoint( value.categories ) }
        }
    }
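
The matching idea in the trait above can be tried in isolation. The following is a minimal sketch that uses a plain map instead of the enum; `categoriesById` and `idsMatching` are hypothetical names introduced only for illustration, and the data is a shortened stand-in for the real values. It shows how lowercasing once and testing with `disjoint` replaces the nested `findAll`/`any`/`equalsIgnoreCase` closures.

```groovy
// Hypothetical stand-in for the enum values: id -> set of lowercased categories
def categoriesById = [
    KTC_01: ['natural language processing', 'image recognition'] as Set,
    KTC_43: ['fuel', 'oil and gas', 'fossil fuels'] as Set,
    KTC_60: ['mobile apps', 'web apps'] as Set,
]

// Returns the ids whose category set shares at least one entry with the candidates
def idsMatching = { Collection<String> candidates ->
    def lowers = candidates*.toLowerCase()      // normalise the input once per call
    categoriesById.findAll { id, categories ->
        !lowers.disjoint(categories)            // not disjoint = at least one common entry
    }.keySet().toList()
}

assert idsMatching(['Oil and Gas']) == ['KTC_43']
assert idsMatching(['oil and gas', 'Natural Language Processing']) == ['KTC_01', 'KTC_43']
assert idsMatching(['xyo']) == []
```

Because the categories are lowercased ahead of time, each call only has to lowercase the candidates, and no intermediate list of matching categories is built.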
Now the rest of the context:

    trait BasedOnValues {
        Set<String> keywords
    }

    enum KTC implements BasedOnCategories, BasedOnValues {
        KTC_01([
            'industries': ['Artificial Intelligence','Machine Learning','Intelligent Systems','Natural Language Processing','Predictive Analytics','Google Glass','Image Recognition'],
            'keywords': ['AI','Voice recognition']
        ]),
        // ... more values
        KTC_43 ([
            'industries': ['Fuel','Oil and Gas','Fossil Fuels'],
            'keywords': ['Petroleum','Oil','Petrochemicals','Hydrocarbon','Refining']
        ]),
        // ... more values
        KTC_60([
            'industries': ['App Discovery','Apps','Consumer Applications','Enterprise Applications','Mobile Apps','Reading Apps','Web Apps','App Marketing','Application Performance Management'],
            'keywords': ['App','Application']
        ])

        KTC(value) {
            this.categories = value.industries
            this.keywords = value.keywords
        }
    }

    // some tests
    [
        [ [ 'Oil and Gas' ], [KTC.KTC_43] ],
        [ [ 'oil and gas' ], [KTC.KTC_43] ],
        [ [ 'oil and gas', 'Fossil Fuels' ], [KTC.KTC_43] ],
        [ [ 'oil and gas', 'Natural Language Processing' ], [KTC.KTC_01, KTC.KTC_43] ],
        [ [ 'xyo' ], [] ],
    ].each{
        assert KTC.getSymbolFromIndustries( it[ 0 ] ) == it[ 1 ]
    }

Then measure the performance.
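
One rough way to do that measurement from a script is sketched below. It is only an illustration under stated assumptions: `lookup` is a hypothetical stand-in for `KTC.getSymbolFromIndustries`, `usedHeap` is a helper invented here, and the heap figures are approximate because `System.gc()` is only a hint to the JVM. Each call deliberately uses a distinct argument list, since a memoized method keeps one result per distinct argument and its cache growth is worth watching.

```groovy
// Hypothetical lookup under test; substitute KTC.getSymbolFromIndustries(...) here
def categories = ['fuel', 'oil and gas', 'fossil fuels'] as Set
def lookup = { Collection<String> candidates ->
    !candidates*.toLowerCase().disjoint(categories)
}

// Approximate used heap in bytes, after suggesting a GC (the JVM may ignore the hint)
def usedHeap = {
    System.gc()
    def rt = Runtime.runtime
    rt.totalMemory() - rt.freeMemory()
}

def before = usedHeap()
def start = System.nanoTime()
int hits = 0
10_000.times { i ->
    // A distinct second candidate on every call, so no two argument lists are equal
    if (lookup(['Oil and Gas', "candidate-${i}".toString()])) hits++
}
def elapsedMs = (System.nanoTime() - start) / 1_000_000
def after = usedHeap()

println "hits=${hits}, elapsed=${elapsedMs} ms, heap delta=${after - before} bytes"
```

Running the loop several times and comparing the heap delta between runs gives a first indication of whether memory is really retained or simply not yet collected; a profiler such as VisualVM remains the better tool for finding what holds the references.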