How can I smoothly stream frames from a video that is currently being recorded (Python)

Question
I am trying to stream a video (into numpy arrays) while recording it in Python. Currently, I am recording with ffmpeg and reading the file back with OpenCV, but it is not going smoothly.
GOAL: Receive and record a video stream in real-time, into a video file with reasonable compression, while being able to have fast random access to any frame received up to that point (minus perhaps an acceptable buffer of e.g. 30 frames / 1 second).
The frame read from a given index lookup must always be identical; i.e., if we open the video later and read from it again, we must get frames identical to those we read "live".
Here is my current code (running on a Mac), which attempts to save a video stream from the camera while reading from the video file as it is being written.
```python
import datetime, os, subprocess, time, cv2

video_path = os.path.expanduser(f"~/Downloads/stream_{datetime.datetime.now()}.mkv")

# Start a process that records the video with ffmpeg
process = subprocess.Popen(['ffmpeg', '-y', '-f', 'avfoundation', '-framerate', '30', '-i', '0:none',
                            '-preset', 'fast', '-crf', '23', '-b:v', '8000k', video_path])
time.sleep(4)  # Let it start
assert os.path.exists(video_path), "Video file not created"

# Let's simulate a process that gets random frames from the video
cap = cv2.VideoCapture(video_path)
try:
    while True:
        ret, frame = cap.read()
        # print(f"Captured frame of shape {frame.shape}" if frame is not None else "No frame available")
        if frame is not None:
            print(f"Got frame of shape {frame.shape}")
            cv2.imshow("Q to Quit", frame)
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
        else:
            print(f"No frame available on {video_path}.. waiting for more frames...")
            time.sleep(0.1)
except KeyboardInterrupt:
    pass
process.terminate()
cap.release()
```
This code never loads a frame.
Things I have discovered:
- If I renew the cap with `cap = cv2.VideoCapture(video_path)` in the else block, it does start showing frames from the start of the recording. After using up all frames, it stops. If I keep track of the frames seen so far and do

```python
else:
    print(f"No frame available on {video_path}")
    time.sleep(0.1)
    cap = cv2.VideoCapture(video_path)
    print(f"Restarting video capture from frame {frames_seen_so_far}")
    cap.set(cv2.CAP_PROP_POS_FRAMES, frames_seen_so_far)  # DOES NOT WORK
```

it does not work (the `cap.set(...)` call makes no difference).
Answer 1

Score: 1
FFmpeg CLI supports multiple outputs in one command.
We may set the first output to the recorded video file, and the second output to `pipe:`.
That way we can record the video, and also get the raw video frames from the `stdout` pipe of the FFmpeg sub-process.
My [following answer][1] shows something similar, while also grabbing the audio.
We may apply the same concept to grabbing the video while recording it.
For fast random access to the recorded video frames, we may use the [segment muxer][2] to split the recorded input into multiple files of 1 second duration each.
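For reference, such a multi-output command has the general shape below: the first output gets the encoding options and a file name, and the second output gets the rawvideo options and `pipe:` (`INPUT` and `[encoding options]` are placeholders; the concrete commands follow):

`ffmpeg -i INPUT [encoding options] recorded_video.mkv -f rawvideo -pix_fmt bgr24 pipe:`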
---
Before grabbing a live video, it is better to test the solution by grabbing video from the following MP4 [input file][3].
Code sample:
```python
import subprocess
import cv2
import numpy as np

video_path = 'recorded_video.mkv'  # Output video file name
input_file_name = 'small_bunny_1080p_60fps.mp4'

# We may skip the following part if we know the resolution in advance
cap = cv2.VideoCapture(input_file_name)  # Open video stream for capturing (just for getting the video resolution)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
cap.release()

# Start a process that records the video with ffmpeg and also passes raw video frames to stdout
process = subprocess.Popen(['ffmpeg', '-y', '-an', '-i', input_file_name, '-preset', 'fast', '-crf', '23', '-b:v', '8000k', video_path,
                            '-f', 'rawvideo', '-pix_fmt', 'bgr24', 'pipe:'], stdout=subprocess.PIPE)

while True:
    raw_frame = process.stdout.read(width*height*3)  # Read a raw video frame as a bytes array
    if len(raw_frame) != (width*height*3):
        break  # Too few bytes were read - assume end of file (or the camera was turned off).

    # Transform the bytes read into a NumPy array, and reshape it to video frame dimensions
    frame = np.frombuffer(raw_frame, np.uint8).reshape((height, width, 3))

    cv2.imshow("Q to Quit", frame)  # Show frame for testing
    if cv2.waitKey(1) == ord('q'):
        break

process.stdout.close()  # Close stdout pipe (ffmpeg exits on the broken pipe)
try:
    process.wait(1)  # Wait up to 1 second before terminating the sub-process.
except subprocess.TimeoutExpired:
    pass
process.terminate()
cv2.destroyAllWindows()
```
---
Grabbing from a video camera:
When grabbing from a video camera, we add the `-rtbufsize 100M` argument to reduce the chance of losing frames (a larger "real-time buffer" allows FFmpeg to store buffered frames instead of dropping them).
In the following example I am using my webcam for testing (and `-f dshow` instead of `-f avfoundation`, since I am using Windows 10):
```python
import subprocess
import cv2
import numpy as np

video_path = 'recorded_video.mkv'  # Output video file name

# We may skip the following part if we know the resolution in advance
cap = cv2.VideoCapture(0)  # Open video stream for capturing (just for getting the video resolution)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
cap.release()

# process = subprocess.Popen(['ffmpeg', '-y', '-f', 'avfoundation', '-framerate', '30', '-i', '0:none', '-preset', 'fast', '-crf', '23', '-b:v', '8000k', video_path])

# Start a process that records the video with ffmpeg and also passes raw video frames to stdout
# Example for reading from a webcam; "-rtbufsize 100M" reduces the chance of losing frames.
process = subprocess.Popen(['ffmpeg', '-y', '-an', '-f', 'dshow', '-rtbufsize', '100M', '-framerate', '30', '-i', 'video=Microsoft® LifeCam HD-3000',
                            '-preset', 'fast', '-crf', '23', '-b:v', '8000k', video_path,
                            '-f', 'rawvideo', '-pix_fmt', 'bgr24', 'pipe:'], stdout=subprocess.PIPE)

while True:
    raw_frame = process.stdout.read(width*height*3)  # Read a raw video frame as a bytes array
    if len(raw_frame) != (width*height*3):
        break  # Too few bytes were read - assume end of file (or the camera was turned off).

    # Transform the bytes read into a NumPy array, and reshape it to video frame dimensions
    frame = np.frombuffer(raw_frame, np.uint8).reshape((height, width, 3))

    cv2.imshow("Q to Quit", frame)  # Show frame for testing
    if cv2.waitKey(1) == ord('q'):
        break

process.stdout.close()  # Close stdout pipe (ffmpeg exits on the broken pipe)
try:
    process.wait(1)  # Wait up to 1 second before terminating the sub-process.
except subprocess.TimeoutExpired:
    pass
process.terminate()
cv2.destroyAllWindows()
```
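On the asker's Mac, the capture part should map back to the `avfoundation` input from the question. An untested sketch (I am not able to verify it on macOS):

```python
# Untested sketch for macOS, reusing the avfoundation input from the question:
process = subprocess.Popen(['ffmpeg', '-y', '-an', '-f', 'avfoundation', '-rtbufsize', '100M',
                            '-framerate', '30', '-i', '0:none',
                            '-preset', 'fast', '-crf', '23', '-b:v', '8000k', video_path,
                            '-f', 'rawvideo', '-pix_fmt', 'bgr24', 'pipe:'], stdout=subprocess.PIPE)
```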
---
**Having fast random access to recorded video frames:**
Since we can't capture frames from a video file that is not yet closed, we may use the [segment muxer][2] to split the recorded input into multiple files of 1-second duration (using the FLV container instead of MKV).
That way, we have access (and seeking capability) to the recorded video frames.
Code sample using the segment muxer:
```python
import subprocess
import cv2
import numpy as np

# We may skip the following part if we know the resolution in advance
cap = cv2.VideoCapture(0)  # Open video stream for capturing (just for getting the video resolution)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
cap.release()

# process = subprocess.Popen(['ffmpeg', '-y', '-f', 'avfoundation', '-framerate', '30', '-i', '0:none', '-preset', 'fast', '-crf', '23', '-b:v', '8000k', video_path])

# Start a process that records the video with ffmpeg and also passes raw video frames to stdout
# Example for reading from a webcam; "-rtbufsize 100M" reduces the chance of losing frames.
# Set the segment time to 1 second.
# The output files are going to be 00000000.flv, 00000001.flv, 00000002.flv, 00000003.flv... (each file is 1 second).
# Use the "-g 30" encoding argument to add a key frame every 30 frames (and "-flags +cgop" to assure a closed GOP).
process = subprocess.Popen(['ffmpeg', '-y', '-an', '-f', 'dshow', '-rtbufsize', '100M', '-framerate', '30', '-i', 'video=Microsoft® LifeCam HD-3000',
                            '-vcodec', 'libx264', '-flags', '+cgop', '-preset', 'fast', '-crf', '23', '-b:v', '8000k', '-g', '30',
                            '-f', 'segment', '-segment_time', '1', '-reset_timestamps', '0',
                            '-segment_list', 'list.txt', '-segment_list_type', 'ffconcat', '%08d.flv',
                            '-f', 'rawvideo', '-pix_fmt', 'bgr24', 'pipe:'], stdout=subprocess.PIPE)

while True:
    raw_frame = process.stdout.read(width*height*3)  # Read a raw video frame as a bytes array
    if len(raw_frame) != (width*height*3):
        break  # Too few bytes were read - assume end of file (or the camera was turned off).

    # Transform the bytes read into a NumPy array, and reshape it to video frame dimensions
    frame = np.frombuffer(raw_frame, np.uint8).reshape((height, width, 3))

    cv2.imshow("Q to Quit", frame)  # Show frame for testing
    if cv2.waitKey(1) == ord('q'):
        break

process.stdout.close()  # Close stdout pipe (ffmpeg exits on the broken pipe)
try:
    process.wait(1)  # Wait up to 1 second before terminating the sub-process.
except subprocess.TimeoutExpired:
    pass
process.terminate()
cv2.destroyAllWindows()
```
---
After finishing the recording, we may merge the segments into a single video file using the FFmpeg concat demuxer:
`ffmpeg -y -f concat -safe 0 -i list.txt -c copy recorded_video.mkv`
---
We can verify that seeking works:

```python
import cv2

cap = cv2.VideoCapture('00000003.flv')
cap.set(cv2.CAP_PROP_POS_FRAMES, 10)
ret, frame = cap.read()

if ret:
    cv2.imshow('frame', frame)
    cv2.waitKey()
    cv2.destroyAllWindows()

cap.release()
```
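Building on that check, here is a minimal sketch of random access to any recorded frame by its global index. The helper name `read_frame` is my own, and it assumes a constant 30 fps with exactly 30 frames per segment file; both assumptions need verification against the actual recording:

```python
import cv2

# A sketch of random access by global frame index across the 1-second segments.
# ASSUMPTIONS: constant 30 fps and exactly 30 frames per segment file.
def read_frame(global_index, fps=30):
    segment_index = global_index // fps   # which 1-second file the frame lives in
    local_index = global_index % fps      # frame offset inside that file
    cap = cv2.VideoCapture(f'{segment_index:08d}.flv')
    cap.set(cv2.CAP_PROP_POS_FRAMES, local_index)
    ret, frame = cap.read()
    cap.release()
    return frame if ret else None

frame = read_frame(100)  # frame 100 -> file 00000003.flv, local frame 10
```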
---
Notes:
- Ensuring that the recorded video is synchronized with the live video is challenging.
Thorough testing and parameter tuning are required.
- The benefit of grabbing and recording with one FFmpeg command is not that large.
For ensuring frame synchronization, we may consider grabbing the frames (say, using `cv2.VideoCapture`) and recording the grabbed frames using an FFmpeg sub-process, as sketched below.
That way, even when captured frames are "lost", frame synchronization is ensured.
We may also open a new video file every second instead of using the segment muxer.
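A minimal sketch of that alternative, assuming a 30 fps camera delivering BGR frames (the output file name and encoding parameters are illustrative):

```python
import subprocess
import cv2

# Grab frames with OpenCV, and encode/record them with an FFmpeg sub-process fed through stdin.
cap = cv2.VideoCapture(0)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

writer = subprocess.Popen(['ffmpeg', '-y', '-f', 'rawvideo', '-pix_fmt', 'bgr24',
                           '-video_size', f'{width}x{height}', '-framerate', '30', '-i', 'pipe:',
                           '-vcodec', 'libx264', '-preset', 'fast', '-crf', '23',
                           'recorded_video.mkv'], stdin=subprocess.PIPE)

while True:
    ret, frame = cap.read()
    if not ret:
        break
    writer.stdin.write(frame.tobytes())  # the frame we processed is exactly the frame that gets recorded
    cv2.imshow("Q to Quit", frame)
    if cv2.waitKey(1) == ord('q'):
        break

cap.release()
writer.stdin.close()  # Signal end of input, letting FFmpeg finalize the file
writer.wait()
cv2.destroyAllWindows()
```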
[1]: https://stackoverflow.com/a/76443636/4926757
[2]: https://ffmpeg.org/ffmpeg-formats.html#segment_002c-stream_005fsegment_002c-ssegment
[3]: https://drive.google.com/file/d/1fEvbcsLqrj4AN-gOiV1PezMmMvuQqzyh/view