英文:
Efficient search for longest ascending subsequence in large lists of integers with Python
问题
如何在Python中开发一个高效可扩展的算法实现来查找大型整数列表中的最长递增子序列?我有一个包含数百万个元素的列表,而我的当前解决方案速度太慢了。以下是我当前的Python代码作为起点:
def longest_increasing_subsequence(nums):
n = len(nums)
lis = [1] * n
for i in range(1, n):
for j in range(0, i):
if nums[i] > nums[j] and lis[i] < lis[j] + 1:
lis[i] = lis[j] + 1
max_length = max(lis)
result = []
for i in range(n - 1, -1, -1):
if lis[i] == max_length:
result.append(nums[i])
max_length -= 1
return result[::-1]
我注意到这段代码对于大型输入列表非常慢。是否有优化此算法的方法,甚至使用其他更适合查找大型整数列表中的最长递增子序列的算法?我愿意听取任何想法来提高运行时间和内存要求。提前感谢!
英文:
How can I develop an efficient and scalable algorithm implementation in Python to find the longest ascending subsequence in a large list of integers? I have a list with millions of elements, and my current solution is too slow. Here is my current Python code as a starting point:
def longest_increasing_subsequence(nums):
n = len(nums)
lis = [1] * n
for i in range(1, n):
for j in range(0, i):
if nums[i] > nums[j] and lis[i] < lis[j] + 1:
lis[i] = lis[j] + 1
max_length = max(lis)
result = []
for i in range(n - 1, -1, -1):
if lis[i] == max_length:
result.append(nums[i])
max_length -= 1
return result[::-1]
I have noticed that this code is very slow for large input lists. Is there any way to optimize this algorithm or even use another algorithm that is better suited to find the longest ascending subsequence in a large list of integers? I am open to any ideas to improve the running time and memory requirements. Thanks in advance!
答案1
得分: 0
import bisect
def longest_increasing_subsequence(nums: list[int]) -> list[int]:
if not nums:
return []
piles = []
indices = [0 for _ in range(len(nums))]
predecessors = [None for _ in range(len(nums))]
for i, num in enumerate(nums):
pile = bisect.bisect_left(piles, num)
if pile == len(piles):
piles.append(num)
else:
piles[pile] = num
indices[pile] = i
predecessors[i] = indices[pile - 1] if pile else None
last_index = indices[len(piles) - 1]
longest_subsequence = []
while last_index is not None:
longest_subsequence.append(nums[last_index])
last_index = predecessors[last_index]
return longest_subsequence[::-1]
示例用法:
>>> longest_increasing_subsequence([10, 9, 2, 5, 3, 7, 101, 18])
[2, 3, 7, 18]
>>> longest_increasing_subsequence([0, 1, 0, 3, 2, 3])
[0, 1, 2, 3]
>>> longest_increasing_subsequence([7, 7, 7, 7, 7, 7])
[7]
英文:
You can find the longest increasing subsequence in O(n log n)
time with an approach that combines dynamic programming (your current approach) with binary search (in this case via the <code>bisect.<b>bisect_left</b></code> method):
import bisect
def longest_increasing_subsequence(nums: list[int]) -> list[int]:
if not nums:
return []
piles = []
indices = [0 for _ in range(len(nums))]
predecessors = [None for _ in range(len(nums))]
for i, num in enumerate(nums):
pile = bisect.bisect_left(piles, num)
if pile == len(piles):
piles.append(num)
else:
piles[pile] = num
indices[pile] = i
predecessors[i] = indices[pile - 1] if pile else None
last_index = indices[len(piles) - 1]
longest_subsequence = []
while last_index is not None:
longest_subsequence.append(nums[last_index])
last_index = predecessors[last_index]
return longest_subsequence[::-1]
Example Usage:
>>> longest_increasing_subsequence([10, 9, 2, 5, 3, 7, 101, 18])
[2, 3, 7, 18]
>>> longest_increasing_subsequence([0, 1, 0, 3, 2, 3])
[0, 1, 2, 3]
>>> longest_increasing_subsequence([7, 7, 7, 7, 7, 7])
[7]
答案2
得分: 0
如上所述,您的解决方案具有二次时间复杂度。然而,根据维基百科的1,最优解可以在O(n log(n))的时间复杂度内找到。
我修复了我在2中找到的代码,以获得一个在最优时间复杂度内运行的函数,并测试结果与您的原始函数一致:
# 二分查找
def GetCeilIndex(arr, T, l, r, key):
while r - l > 1:
m = l + (r - l) // 2
if arr[T[m]] >= key:
r = m
else:
l = m
return r
def LongestIncreasingSubsequence(arr):
n = len(arr)
# 初始化为0
tailIndices = [0 for i in range(n + 1)]
# 初始化为-1
prevIndices = [-1 for i in range(n + 1)]
# 它将始终指向空位置
length = 1
for i in range(1, n):
if arr[i] < arr[tailIndices[0]]:
# 新的最小值
tailIndices[0] = i
elif arr[i] > arr[tailIndices[length - 1]]:
# arr[i] 想要扩展
# 最大子序列
prevIndices[i] = tailIndices[length - 1]
tailIndices[length] = i
length += 1
else:
# arr[i] 想要成为
# 未来子序列的潜在候选
# 它将替换tailIndices中的ceil值
pos = GetCeilIndex(arr, tailIndices, -1, length - 1, arr[i])
prevIndices[i] = tailIndices[pos - 1]
tailIndices[pos] = i
res = []
i = tailIndices[length - 1]
while len(res) < length:
res.append(arr[i])
i = prevIndices[i]
res.reverse()
return res
英文:
As said above, your solution is in quadratic time complexity. However according to wikipedia the optimal solution can be found in O(n log(n)) time complexity.
I fixed a code I found here to get a function that runs in optimal time complexity and I tested that the results are consistent with your original function:
# Binary search
def GetCeilIndex(arr, T, l, r, key):
while r - l > 1:
m = l + (r - l) // 2
if arr[T[m]] >= key:
r = m
else:
l = m
return r
def LongestIncreasingSubsequence(arr):
n = len(arr)
# Initialized with 0
tailIndices = [0 for i in range(n + 1)]
# Initialized with -1
prevIndices = [-1 for i in range(n + 1)]
# it will always point
# to empty location
length = 1
for i in range(1, n):
if arr[i] < arr[tailIndices[0]]:
# new smallest value
tailIndices[0] = i
elif arr[i] > arr[tailIndices[length - 1]]:
# arr[i] wants to extend
# largest subsequence
prevIndices[i] = tailIndices[length - 1]
tailIndices[length] = i
length += 1
else:
# arr[i] wants to be a
# potential condidate of
# future subsequence
# It will replace ceil
# value in tailIndices
pos = GetCeilIndex(arr, tailIndices, -1, length - 1, arr[i])
prevIndices[i] = tailIndices[pos - 1]
tailIndices[pos] = i
res = []
i = tailIndices[length - 1]
while len(res) < length:
res.append(arr[i])
i = prevIndices[i]
res.reverse()
return res
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论