Fastest way to split object into smaller objects in Javascript?
Question
I have a big object like the one below, containing 1 million properties.
{
"Name1":{"some":"object"},
"Name2":{"some":"object"},
"Name1000000":{"some":"object"}
}
I want to split this object into N parts, so I wrote the following code:
var bigObject = {
  "Name1": {
    "some": "object"
  },
  "Name2": {
    "some": "object"
  },
  "Name1000000": {
    "some": "object"
  }
};
const names = Object.keys(bigObject);
const partsCount = 4;
const parts = names
  .reduce((acc, name, idx) => {
    // Deal the properties out round-robin: property idx goes to part idx % partsCount.
    const reduceIndex = idx % partsCount;
    if (acc[reduceIndex] == null) {
      acc[reduceIndex] = {};
    }
    // bigObject stands in for request.body, the parsed payload in the real code.
    acc[reduceIndex][name] = bigObject[name];
    return acc;
  }, new Array(Math.min(partsCount, names.length)));
While this works, the problem is performance: it currently takes 1.2 to 1.5 seconds! Is there a more performant way to write this code? My expectation is that this should take barely double-digit milliseconds on a modern processor running at ~3 GHz. Is my expectation wrong?
Update:
I don't quite understand why people look for solutions outside the given question by asking "why" and "what" I am doing with this code. Some have assumed it is a database use case. There are literally thousands of possible use cases; mine is IoT sensor data ingestion, where data from up to 1 million sensors is aggregated into a single bulk REST call that needs to be processed. That raises a second issue: as soon as I reveal the use case, people try to optimize that instead. So let me be clear: I cannot change anything apart from the code shown above. It is already a parsed object, and now I have to divide it and process it.
Answer 1
Score: 1
Testing if (acc[reduceIndex] == null) in every iteration seems inefficient, and since it sometimes accesses properties that don't exist, it might considerably slow down your code. Try pre-populating the array:
const parts = names.reduce((acc, name, idx) => {
  // bigObject is the parsed object from the question (request.body in the real code).
  acc[idx % partsCount][name] = bigObject[name];
  return acc;
}, Array.from({length: Math.min(partsCount, names.length)}, () => ({})));
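As a rough, self-contained sketch of the same idea, the pre-populated parts can also be filled with a plain loop. The helper name splitRoundRobin and the small sample object are made up for illustration, and the sketch assumes partsCount is at least 1; whether a loop is actually faster than reduce here has not been measured.

```javascript
// Hypothetical helper (not from the original answer): splits obj into
// up to partsCount objects, dealing properties out round-robin.
// Assumes partsCount >= 1.
function splitRoundRobin(obj, partsCount) {
  const names = Object.keys(obj);
  // Create every part up front so the loop never needs a null check.
  const parts = Array.from(
    { length: Math.min(partsCount, names.length) },
    () => ({})
  );
  for (let idx = 0; idx < names.length; idx++) {
    const name = names[idx];
    parts[idx % parts.length][name] = obj[name];
  }
  return parts;
}

const small = { a: 1, b: 2, c: 3, d: 4, e: 5 };
console.log(splitRoundRobin(small, 2));
// → [ { a: 1, c: 3, e: 5 }, { b: 2, d: 4 } ]
```

The grouping is the same as in the question's code; only the per-iteration existence check is gone.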
Answer 2
Score: 0
My approach here is to use Object.entries to divide the object into an array of its properties, then divide those properties into N chunks using Array.prototype.slice, then reassemble each chunk into an object using Object.fromEntries, and finally push each object into an array of parts.
function partitionProperties(obj, partsCount) {
  const entries = Object.entries(obj);
  const parts = [];
  // a and b are the start and end indexes of the current chunk.
  for (let i = 1, a = 0, b = 0; i <= partsCount; ++i) {
    a = b;
    // End of the i-th chunk, truncated to an integer with | 0.
    b = i / partsCount * entries.length | 0;
    parts.push(Object.fromEntries(entries.slice(a, b)));
  }
  return parts;
}
It's important to acknowledge that, while this does divide the object into parts, it doesn't arrange the properties into the same groups that you did. You distributed them into groups one at a time, the way you would deal cards: [{a, d, g}, {b, e, h}, {c, f, i}], whereas this just partitions them into consecutive runs: [{a, b, c}, {d, e, f}, {g, h, i}]. But it's not clear that you were attached to the order or grouping.
I also don't know if this is the fastest, but it is relatively simple and straightforward, which has its own advantages.
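To make the difference between the two groupings concrete, here is a minimal sketch that works on key lists only. The helper names dealKeys and sliceKeys are hypothetical, and the chunk boundaries use integer arithmetic with Math.floor, a small variation on the expression in the function above.

```javascript
// Deals keys out one at a time, as the question's code does.
function dealKeys(keys, partsCount) {
  const groups = Array.from({ length: partsCount }, () => []);
  keys.forEach((key, idx) => groups[idx % partsCount].push(key));
  return groups;
}

// Cuts the key list into consecutive runs, as the slice-based approach does.
function sliceKeys(keys, partsCount) {
  const groups = [];
  for (let i = 0; i < partsCount; i++) {
    // Multiply before dividing so the boundaries are computed on integers.
    const start = Math.floor((i * keys.length) / partsCount);
    const end = Math.floor(((i + 1) * keys.length) / partsCount);
    groups.push(keys.slice(start, end));
  }
  return groups;
}

const keys = ["a", "b", "c", "d", "e", "f", "g", "h", "i"];
console.log(dealKeys(keys, 3));  // groups: [a, d, g], [b, e, h], [c, f, i]
console.log(sliceKeys(keys, 3)); // groups: [a, b, c], [d, e, f], [g, h, i]
```

Both produce parts of near-equal size; they differ only in which keys end up together.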
Answer 3
Score: 0
I tried a lot of optimizations on the code in the question, starting with sticking to a single hidden class type for V8 and so on. After profiling, I could not get it below 600 ms.
It seems Object.keys/Object.entries alone takes roughly 300 ms. You can get some savings by pre-populating the parts array with objects that already have 250,000 properties, but in total it was still not under 100 ms as I was originally expecting.
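One rough way to check where the time goes is to time the key enumeration and the distribution separately. This is only a sketch: buildBigObject and timeSplit are hypothetical helpers, the generated object merely imitates the shape shown in the question, and the numbers will vary with engine and hardware.

```javascript
// Hypothetical helper: builds an object shaped like the one in the question.
function buildBigObject(count) {
  const obj = {};
  for (let i = 1; i <= count; i++) {
    obj[`Name${i}`] = { some: "object" };
  }
  return obj;
}

// Hypothetical helper: splits a generated object and times the two phases.
function timeSplit(count, partsCount) {
  const sample = buildBigObject(count);

  // Phase 1: enumerating the keys.
  console.time("Object.keys");
  const names = Object.keys(sample);
  console.timeEnd("Object.keys");

  // Phase 2: dealing the properties into pre-created parts.
  console.time("distribute");
  const parts = Array.from(
    { length: Math.min(partsCount, names.length) },
    () => ({})
  );
  for (let i = 0; i < names.length; i++) {
    parts[i % parts.length][names[i]] = sample[names[i]];
  }
  console.timeEnd("distribute");
  return parts;
}

timeSplit(1000000, 4);
```

Running it a few times and discarding the first run gives a steadier picture, since the first pass includes warm-up cost.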
So I tried the array approach suggested in the comments (changing bigObject to an array of objects), using the code below, and I was able to split the whole thing in 73 ms.
Conclusion: I am not sure whether I can change the original format to the new array format, but one thing is very clear: stick to arrays in JS for any collection with more than about 1K elements. I had my reasons for using an object, namely so that I don't get repeated property names, but at 73 ms I guess merging them on the server side would not be that bad.
function main(names) {
  // Pre-create one bucket per part so the loop never has to check for a missing bucket.
  let parts = Array.from({ length: Math.min(partsCount, names.length) }, _ => []);
  for (let index = 0; index < names.length; index++) {
    parts[index % partsCount].push(names[index]);
  }
  return parts;
}
// `date` was not defined in the original snippet; a single timestamp is assumed here.
const date = Date.now();
const bigArray = Array.from({ length: 1000000 }, (_, idx) => ({ "tag": `Name${idx}`, "value": idx, "timestamp": date, "qualityh": idx, "qualityl": idx }));
const partsCount = 4;
console.time("B");
main(bigArray);
console.timeEnd("B");
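The server-side merge of repeated names mentioned in the conclusion could look roughly like the sketch below. The helper name mergeByTag is hypothetical, and "the last reading for a tag wins" is an assumption, chosen because that is also what happens to duplicate keys when a JSON object is parsed.

```javascript
// Hypothetical helper: keeps only the last reading seen for each tag.
// Assumption: when a tag repeats, the later reading replaces the earlier one.
function mergeByTag(readings) {
  const latest = new Map();
  for (const reading of readings) {
    latest.set(reading.tag, reading);
  }
  return Array.from(latest.values());
}

const readings = [
  { tag: "Name1", value: 1 },
  { tag: "Name2", value: 2 },
  { tag: "Name1", value: 3 },
];
console.log(mergeByTag(readings));
// → [ { tag: 'Name1', value: 3 }, { tag: 'Name2', value: 2 } ]
```

A Map keeps each tag in the position where it first appeared, so the merged array preserves first-seen order while carrying the latest values.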