去重优化

huangapple go评论64阅读模式
英文:

Deduplication optimization

问题

以下是您提供的代码的翻译部分:

typedef struct n_s
{
	int val;
	struct n_s *next;
}
n_t;

// 删除链表中所有值等于del的元素
n_t * delete(n_t *h, int del);
// 在链表中查找第一个与find相等的元素,如果找不到则返回NULL
n_t * find(n_t *h, int find);

n_t *
delFromList(n_t *h, int x)
{
	int val;
	n_t *el, *posInter;

    // 空链表情况
	if (h == NULL)
		return NULL;

    // 处理第一个元素
	val = h->val;
	if ((posInter = find(h->next, val)) 
        && (find(posInter->next, val)))
		h = delete(h, val);

    // 从第二个元素开始循环
	el = h;
	while (el->next)
	{
		val = el->next->val;
		// 检查是否要删除下一个元素,
        // 如果是,则再次检查“新”的下一个元素
        if ((posInter = find(el->next->next, val))                   
            && (find(posInter->next, val)))
		    el->next = delete(el->next, val);
        // 如果没有删除下一个节点,可以继续前进
		else 
			el = el->next;

	}

	return h;
}

请注意,我只翻译了您提供的代码部分,没有包括问题描述和其他内容。如果您需要任何进一步的帮助,请随时告诉我。

英文:

The problem is as follows. I want a function that, given a list and a max number of occurrences "x", deletes all elements of the list that appear more than x times or x times.

I found a pretty straightforward solution, which is to check for each of the elements. This said, to repeat the find and delete functions many times seems computationally-wise not optimal to me.

I was wondering whether you could provide a better algorithm (i excluded allocating memory for a matrix from the min to the max... just too much for the task... say you have few very big numbers and your memory won't do it.)

My code follows.

typedef struct n_s
{
	int val;
	struct n_s *next;
}
n_t;

// deletes all elements equal to del in list with head h
n_t * delete(n_t *h, int del);
// returns the first occurrence of find in list with head h, otherwise gives NULL
n_t * find(n_t *h, int find);

n_t *
delFromList(n_t *h, int x)
{
	int val;
	n_t *el, *posInter;

    // empty list case
	if (h == NULL)
		return NULL;

    // first element
	val=h->val;
	if ( (posInter = find(h -> next,val)) 
        && (find(posInter -> next, val)))
		h = delete(h, val);

    // loop from second element
	el = h;
	while (el -> next)
	{
		val = el -> next -> val;
		// check whether you want to delete the next one, 
        // and then if you do so, check again on the "new" next one
        if ((posInter = find(el -> next -> next, val))                   
            && (find(posInter -> next, val)))
		    el -> next = delete(el -> next, val);
        // in case you did not delete the nexy node, you can move on
		else 
			el = el -> next;

	}

	return h;
}

I know that the el->next->next may look confusing, but I find it less intuitive to use variables such as "next", "past"... so, sorry for your headache.

答案1

得分: 2

  • 定义一个数据结构 D,其中包含两个成员,一个用于列表元素的值,另一个用于计算它出现的次数。
  • 初始化一个按值有序的空平衡树。
  • 遍历列表。对于列表中的每个项,查找它是否在树中存在。如果不存在,则在该树中插入一个 D 结构,其值成员从列表元素复制而来,计数设置为一。如果它在树中存在,则增加其计数。如果其计数等于或超过阈值,则从列表中删除它。

在平衡树中的查找和插入的时间复杂度为 O(log n)。n 个项目的链表使用了其中的 n 个,从链表中删除的时间复杂度为 O(1)。因此,总时间复杂度为 O(n log n)。

英文:

One option for an algorithm with improved performance is:

  • Define a data structure D with two members, one for the value of a list element and one to count the number of times it appears.
  • Initialize an empty balanced tree ordered by value.
  • Iterate through the list. For each item in the list, look it up in the tree. If it is not present, insert a D structure into that tree with its value member copied from the list element and its count set to one. If it is present in the tree, increments its count. If its count equals or exceeds the threshold, remove it from the list.

Lookups and insertions in a balanced tree are O(log n). A linked list of n items uses n of them, and deletions from a linked list are O(1). So the total time is O(n log n).

答案2

得分: 1

使用计数映射来统计每个元素出现的次数。键是元素,值是计数。

然后,再次遍历数组,删除满足阈值的任何元素。

O(n) 时间,O(n) 额外空间。

英文:

Use a counting map to count the number of times each element appears. The keys are the elements, and the values are the counts.

Then, go through your array a second time, deleting anything which meets your threshold.

O(n) time, O(n) extra space.

huangapple
  • 本文由 发表于 2023年1月8日 22:52:06
  • 转载请务必保留本文链接:https://go.coder-hub.com/75048726.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定