Is there a faster way of summing up values for keys across a large array of objects?
Question
I have a large dataset in the form:
data = [{ a: 12, b: 8 }, { a: 2, c: 4, d: 14 }, { c: 2, e: 4, f: 14 }]
What I want is an object with all keys (here a-f) and the sum of their values across the data set, like so:
{ a: 14, b: 8, c: 6, d: 14, e: 4, f: 14 }
I can get the desired result like this:
function sum(a, b) { return a + b };
function countTotal(n) {
  let ndata = data.filter((i) => Object.keys(i).includes(n))
  let cnt = Object.assign(ndata.map((i) => i[n])).reduce(sum);
  return {[n]:cnt};
};
let names = 'abcdef'.split('')
let res = Array.from(names).map((n) => countTotal(n))
res = Object.assign({}, ...res);
My problem is that this takes quite long for the actual data set I have (which is quite big). Is there a way to do the same more efficiently?
Below is some code to create a large dummy data set approximating the real data set:
// Build a pool of up to 2000 unique random 5-character names.
let dummy_names = [];
for (let i = 0; i < 2000; i++) {
  dummy_names.push((Math.random() + 1).toString(36).slice(2,7));
};
dummy_names = [...new Set(dummy_names)];
names = new Set();
// Create one object with a few random names mapped to values from 0 to 19.
function makeResponses() {
  let responses = {};
  let idx = 0;
  for (let j = 0; j <= Math.floor(Math.random() * 7); j++) {
    idx = Math.floor(Math.random()*dummy_names.length);
    const inam = dummy_names[idx];
    names.add(inam);
    responses[inam] = Math.floor(Math.random()*20);
  };
  return responses;
};
let data = [];
for (let i = 0; i < 20000; i++) {
  data.push(makeResponses());
};
Answer 1
Score: 4
I'd use a helper object to keep track of the sums and loop through the objects in the array.
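The most important thing is to look at each value only once, which keeps the complexity (in big-O terms) linear in the total number of key/value pairs. The code in the question, by contrast, filters the whole array once for every key. There are many ways to iterate; I'm not sure whether a for loop or .forEach is faster.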
Here is a rough solution:
<!-- begin snippet: js hide: false console: true babel: false -->
<!-- language: lang-js -->
const data = [{a: 12, b: 8}, {a: 2, c: 4, d: 14}, {c: 2, e: 4, f: 14}];
const sums = {};
// Visit every key/value pair exactly once.
data.forEach(object => {
  Object.entries(object).forEach(([key, value]) => {
    if (sums.hasOwnProperty(key)) {
      sums[key] += value;
    } else {
      // First time this key is seen.
      sums[key] = value;
    }
  });
});
console.log(sums);
<!-- end snippet -->
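For comparison, the same single-pass idea can be written with plain for...of loops and wrapped in a reusable function. This is only a sketch: the name `sumByKey` is made up for illustration and is not part of the answer above, and the null-prototype accumulator is an extra assumption to avoid clashes with inherited property names such as "constructor". Whether it runs faster than the .forEach version likely depends on the engine and the data, so it is worth timing both on the real data set.

```javascript
// Hypothetical helper (name chosen for illustration only).
function sumByKey(rows) {
  // Null-prototype object: no inherited keys can collide with data keys.
  const sums = Object.create(null);
  for (const row of rows) {
    for (const key of Object.keys(row)) {
      // Start from 0 the first time a key is seen, then add the value.
      sums[key] = (sums[key] ?? 0) + row[key];
    }
  }
  return sums;
}

const sample = [{ a: 12, b: 8 }, { a: 2, c: 4, d: 14 }, { c: 2, e: 4, f: 14 }];
const totals = sumByKey(sample);
console.log({ ...totals }); // { a: 14, b: 8, c: 6, d: 14, e: 4, f: 14 }
```

To compare the variants on the large dummy data set from the question, one option is to wrap each call in `console.time("label")` and `console.timeEnd("label")` and run it a few times.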