Is there a performance penalty when using a Select() inside a lambda Where() clause?
Question
I stumbled upon something while trying to work out which events I need to create.
I had code like this:
var eventsToBeCreated =
requiredEventDates.Where(d => !events.Select(e => e.eventDay).Contains(d));
But it made me wonder whether this is a bad idea performance-wise, because I believe (I am not sure) that the Select() gets evaluated individually for every element, so I changed it to:
var existingEventDays =
events.Select(e => e.eventDay);
var eventsToBeCreated =
requiredEventDates.Where(d => !existingEventDays.Contains(d));
But I was not sure about this either. As existingEventDays is an IEnumerable<DateTime>, I guess this would still lead to the enumerable being enumerated multiple times? So I changed it to:
var existingEventDays =
events.Select(e => e.eventDay).ToList();
var eventsToBeCreated =
requiredEventDates.Where(d => !existingEventDays.Contains(d));
...to make sure that existingEventDays gets calculated only once.
Are my assumptions correct, or is this unnecessary and the first version would offer the same performance as the third?
Answer 1
Score: 3
I'll assume you actually consume the whole query created with Where, for example by calling ToList(). If you don't consume it, nothing in the Where lambda is executed; you're just creating a bunch of IEnumerable<T>s. See Deferred Execution.
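To make the deferred-execution point concrete, here is a minimal sketch that counts how often the Where predicate runs. The sample dates, the predicate and the counter names (predicateCalls, callsBeforeConsuming, callsAfterConsuming) are made up for illustration and are not from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data, not from the question.
var requiredEventDates = new List<DateTime>
{
    new DateTime(2024, 1, 1),
    new DateTime(2024, 1, 2),
    new DateTime(2024, 1, 3),
};

int predicateCalls = 0;

// Building the query runs nothing; it only describes the work to do.
var query = requiredEventDates.Where(d =>
{
    predicateCalls++;
    return d.Day != 2;
});
int callsBeforeConsuming = predicateCalls; // predicate has not run yet

// Consuming the query (here with ToList) runs the predicate once per element.
var result = query.ToList();
int callsAfterConsuming = predicateCalls;

Console.WriteLine($"{callsBeforeConsuming} calls before, {callsAfterConsuming} calls after consuming");
```

With three dates in the list, the counter stays at 0 until ToList() is called and is 3 afterwards.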
Regarding the second snippet: you extracted the Select call into a variable, which indeed causes events.Select to be called only once instead of once for every element in requiredEventDates. But again, due to deferred execution, calling Select itself is not very expensive. It is the looping that Contains does that is usually expensive, and that still happens for every element, because Contains re-enumerates the deferred Select each time.
Regarding the third snippet: you first made a list out of the dates from the events, which loops through the entirety of events. On top of that, Contains loops through the list for each element in requiredEventDates. So you essentially looped through the whole list one more time than necessary.
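The difference between the second and third snippets can be observed by counting how often the Select lambda runs. This is only a sketch under assumptions: the anonymous-type events, the sample dates and the selectorCalls counter are hypothetical, and none of the required dates exists in events, so every Contains call scans to the end.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: two existing events, three required dates, no overlap.
var events = new[]
{
    new { eventDay = new DateTime(2024, 1, 1) },
    new { eventDay = new DateTime(2024, 1, 2) },
};
var requiredEventDates = new List<DateTime>
{
    new DateTime(2024, 1, 3),
    new DateTime(2024, 1, 4),
    new DateTime(2024, 1, 5),
};

int selectorCalls = 0;

// Second snippet: the Select is stored in a variable but is still deferred,
// so every Contains call enumerates it again and re-runs the selector.
var deferredDays = events.Select(e => { selectorCalls++; return e.eventDay; });
var missingDeferred = requiredEventDates.Where(d => !deferredDays.Contains(d)).ToList();
int deferredSelectorCalls = selectorCalls; // 3 dates x 2 events = 6

// Third snippet: ToList runs the selector once per event, up front.
selectorCalls = 0;
var materializedDays = events.Select(e => { selectorCalls++; return e.eventDay; }).ToList();
var missingMaterialized = requiredEventDates.Where(d => !materializedDays.Contains(d)).ToList();
int materializedSelectorCalls = selectorCalls; // 2

Console.WriteLine($"deferred: {deferredSelectorCalls}, materialized: {materializedSelectorCalls}");
```

Both variants return the same dates. Materializing saves the repeated selector calls, but Contains still scans the list linearly for every required date.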
To avoid all this looping, you can instead put the dates into a HashSet:
var existingEventDays =
events.Select(e => e.eventDay).ToHashSet();
var eventsToBeCreated =
requiredEventDates.Where(d => !existingEventDays.Contains(d));
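Now you only loop through events once, to create the set. Contains then looks up d in the set, which can be a lot faster than looking things up in a list: a hash lookup takes roughly constant time on average, while List<T>.Contains scans the elements one by one.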
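For completeness, here is a small end-to-end sketch of the HashSet approach. The events are modeled as anonymous objects with an eventDay property and the dates are invented; the real event type in the question is not shown. ToHashSet() is assumed to be available (.NET Core 2.0+ or .NET Framework 4.7.2+); on older targets, new HashSet<DateTime>(events.Select(e => e.eventDay)) should do the same job.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical sample data: events already exist for Jan 1 and Jan 3.
var events = new[]
{
    new { eventDay = new DateTime(2024, 1, 1) },
    new { eventDay = new DateTime(2024, 1, 3) },
};
var requiredEventDates = new List<DateTime>
{
    new DateTime(2024, 1, 1),
    new DateTime(2024, 1, 2),
    new DateTime(2024, 1, 3),
    new DateTime(2024, 1, 4),
};

// One pass over events to build the set.
var existingEventDays = events.Select(e => e.eventDay).ToHashSet();

// One hash lookup per required date; ToList consumes the query.
var eventsToBeCreated = requiredEventDates
    .Where(d => !existingEventDays.Contains(d))
    .ToList();

foreach (var day in eventsToBeCreated)
    Console.WriteLine(day.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
// prints 2024-01-02 and 2024-01-04
```

For a handful of dates the difference is unlikely to matter; the set pays off as events and requiredEventDates grow.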