How to find the point that gives the maximum value fast? Java or c++ code please
Question
I need a fast way to find the maximum value when intervals overlap. Unlike finding the point where the most intervals overlap, there is an "order" here. I have an int[][] data in which each int[] holds 2 values: the first number is the center and the second is the radius. The closer a point is to the center, the larger the value at that point. For example, given data like:
int[][] data = new int[][]{
    {1, 1},
    {3, 3},
    {2, 4}};
Then on a number line, this is how it looks:
x axis: -2 -1  0  1  2  3  4  5  6  7
1 1:           1  2  1
3 3:           1  2  3  4  3  2  1
2 4:     1  2  3  4  5  4  3  2  1
So for the value of my point to be as large as possible, I need to pick the point x = 2, which gives a total value of 1 + 3 + 5 = 9, the largest possible value. Is there a way to do this fast, say with a time complexity of O(n) or O(n log n)?
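For reference, the value rule in the example can be written down directly. The following is only a brute-force sketch of the problem statement, not a fast solution: it assumes integer points and that each pair contributes radius + 1 at its center, dropping by 1 per step (which is what the number line above shows). The names value_at and brute_force_best are made up for illustration.

```python
def value_at(x, data):
    """Total value at integer point x.

    Each (center, radius) pair contributes radius + 1 at its center
    and 1 less for every step away, never going below 0.
    """
    return sum(max(0, radius + 1 - abs(x - center)) for center, radius in data)


def brute_force_best(data):
    """Try every integer point covered by at least one interval."""
    lo = min(center - radius for center, radius in data)
    hi = max(center + radius for center, radius in data)
    best_x = max(range(lo, hi + 1), key=lambda x: value_at(x, data))
    return best_x, value_at(best_x, data)


data = [(1, 1), (3, 3), (2, 4)]
print(brute_force_best(data))  # (2, 9)
```

This takes time proportional to n times the covered range, so it is mainly useful as a correctness check for a faster method.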
Answer 1
Score: 3
This can be done with a simple O(n log n) algorithm.
Consider the value function v(x), and then consider its discrete derivative dv(x) = v(x) - v(x-1). Suppose you only have one interval, say {3,3}. With the values from the question (1 at both ends, 4 at the center), dv(x) is 0 from -infinity to -1, then 1 from 0 to 3, then -1 from 4 to 7, then 0 from 8 to infinity. That is, the derivative changes by 1 "just after" -1, by -2 just after 3, and by 1 just after 7.
For n intervals, there are 3*n derivative changes (some of which may occur at the same point). So build the list of all derivative changes (x, change), sort them by x, and then iterate through the list, accumulating v as you go.
Behold:
intervals = [(1,1), (3,3), (2,4)]
events = []
for mid, width in intervals:
    before_start = mid - width - 1  # last point where this interval is still 0
    after_end = mid + width + 1     # first point where it is 0 again
    events += [(before_start, 1), (mid, -2), (after_end, 1)]
events.sort()
prev_x = events[0][0]
v = 0
dv = 0
best_v = 0
best_x = None
for x, change in events:
    dx = x - prev_x
    v += dv * dx    # v is now the total value at x
    if v > best_v:
        best_v = v
        best_x = x
    dv += change    # slope that applies just after x
    prev_x = x
print(best_x, best_v)
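The derivative argument can be checked numerically for the single interval {3,3}. This is a small sketch that tabulates dv(x) = v(x) - v(x-1), assuming the question's values (radius + 1 at the center, 1 at the ends); single_value and discrete_derivative are hypothetical helper names.

```python
def single_value(x, center, radius):
    """Value contributed by one (center, radius) pair at integer point x."""
    return max(0, radius + 1 - abs(x - center))


def discrete_derivative(center, radius, lo, hi):
    """Map each x in [lo, hi] to dv(x) = v(x) - v(x - 1)."""
    return {
        x: single_value(x, center, radius) - single_value(x - 1, center, radius)
        for x in range(lo, hi + 1)
    }


dv = discrete_derivative(3, 3, -2, 9)
for x in sorted(dv):
    print(x, dv[x])
```

Under that assumption the slope is +1 for x = 0..3, -1 for x = 4..7, and 0 everywhere else, so one interval produces exactly three slope changes.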
And also the Java code:
import java.util.Map;
import java.util.TreeMap;

// cows is assumed to be a List<int[]> of {center, radius} pairs.
// Keys are shifted by one compared to the Python version: the running
// value computed at key k is the total value at x = k - 1.
TreeMap<Integer, Integer> ts = new TreeMap<>();
for (int[] cow : cows) {
    ts.merge(cow[0] - cow[1], 1, Integer::sum);      // rising side starts
    ts.merge(cow[0] + 1, -2, Integer::sum);          // falling side starts
    ts.merge(cow[0] + cow[1] + 2, 1, Integer::sum);  // contribution is back to 0
}
int value = 0;
int best = 0;
int change = 0;
int indexBefore = ts.isEmpty() ? 0 : ts.firstKey();
for (Map.Entry<Integer, Integer> entry : ts.entrySet()) {
    int index = entry.getKey();
    value += (index - indexBefore) * change;  // advance to this key
    best = Math.max(value, best);
    change += entry.getValue();
    indexBefore = index;
}
where cows is the data and best ends up holding the maximum value.
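The TreeMap variant merges slope changes that fall on the same key before sweeping. A rough Python equivalent, which also reports where the maximum occurs, might look like the sketch below. It assumes the same key convention as the Java code (keys at center - radius, center + 1 and center + radius + 2, so the running value at key k is the total at x = k - 1); best_point is a name chosen here for illustration.

```python
from collections import defaultdict


def best_point(cows):
    """Return (best_x, best_value) for a list of (center, radius) pairs."""
    slope_change = defaultdict(int)
    for center, radius in cows:
        slope_change[center - radius] += 1      # rising side starts
        slope_change[center + 1] -= 2           # falling side starts
        slope_change[center + radius + 2] += 1  # contribution is back to 0

    value = 0
    slope = 0
    best_x, best_value = None, 0
    prev_key = None
    for key in sorted(slope_change):
        if prev_key is not None:
            value += (key - prev_key) * slope
        # value is now the total at x = key - 1
        if value > best_value:
            best_x, best_value = key - 1, value
        slope += slope_change[key]
        prev_key = key
    return best_x, best_value


print(best_point([(1, 1), (3, 3), (2, 4)]))  # (2, 9)
```

Sorting the keys dominates the cost, so this stays at O(n log n). On ties it returns the leftmost best point.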
Answer 2
Score: 1
Hmmm, a general O(n log n) or better algorithm would be tricky. It is probably solvable via linear programming, but that can get rather complex.
After a bit of wrangling, I think this can be solved via line intersections and summation of functions (represented by line segment intersections). Basically, think of each input as a triangle on top of a line. If the input is (C,R), the triangle is centered on C and has a radius of R. The points on the line are C-R (value 0), C (value R) and C+R (value 0). Each line segment of the triangle represents a value.
Consider any 2 such "triangles". The max value occurs in one of 2 places:
- The peak of one of the triangles
- The intersection point of the triangles, or the point where the two triangles overlap.
Multiple triangles just mean more possible intersection points. Sadly, the number of possible intersections grows quadratically, so O(N log N) or better may be impossible with this method (unless some good optimizations are found) unless the number of intersections is O(N) or less.
To find all the intersection points, we can use a standard algorithm for that, but we need to modify things in one specific way. We need to add a line that extends from each peak high enough that it is higher than any other line, so basically from (C,R) to (C,Max_R). We then run the algorithm; output-sensitive intersection-finding algorithms are O(N log N + k), where k is the number of intersections. Sadly, k can be as high as O(N^2) (consider the case (1,100), (2,100), (3,100)... and so on to (50,100): every line would intersect every other line). Once you have the O(N + K) intersections, at every intersection you can calculate the value by summing the values of all lines in the queue. The running sum can be kept as a cached value so it only changes O(K) times, though that might not be possible, in which case it would be O(N*K) instead, making it potentially O(N^3) (in the worst case for K) :(. Though that seems reasonable: for each intersection you need to sum up to O(N) lines to get the value for that point, though in practice the performance would likely be better.
There are optimizations that could be done, considering that you aim for the max and not just to find intersections. There are likely intersections not worth pursuing; however, I could also see a situation where it is so close that you can't cut it down. It reminds me of convex hull: in many cases you can easily discard 90% of the data, but there are cases where you see the worst-case results (every point or almost every point is a hull point). For example, in practice there are certainly cases where you can be sure that the sum is going to be less than the current known max value.
Another optimization might be building an interval tree.
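A simple quadratic baseline in the same spirit is to sum all triangles at a small set of candidate points. The sketch below relies on one assumption worth stating: the summed value is piecewise linear and only changes slope at each triangle's two feet and its peak, so over the integers its maximum is attained at one of those 3N points. It uses the question's values (radius + 1 at the center), so the feet sit at C - R - 1 and C + R + 1 rather than C - R and C + R; total_at and best_over_breakpoints are illustrative names.

```python
def total_at(x, data):
    """Sum of all triangle heights at integer point x (question's values)."""
    return sum(max(0, radius + 1 - abs(x - center)) for center, radius in data)


def best_over_breakpoints(data):
    """Evaluate the total at every foot and peak; O(N) work per candidate."""
    candidates = set()
    for center, radius in data:
        candidates.update((center - radius - 1, center, center + radius + 1))
    best_x = max(sorted(candidates), key=lambda x: total_at(x, data))
    return best_x, total_at(best_x, data)


print(best_over_breakpoints([(1, 1), (3, 3), (2, 4)]))  # (2, 9)
```

With 3N candidates and O(N) work each, this is O(N^2) overall: slower than a sorted sweep, but with no intersection search and easy to verify by hand.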