Is it possible to do H.264 encoding before streaming a video with GStreamer, to save time when serving frames?
Question
I'm somewhat new to GStreamer and wanted to set up an RTSP server in Python that streams either MP4s or raw H.264 chunks. I noticed that the streams can take up a lot of CPU time, and I assume that's due to encoding each frame in H.264. The video files are already H.264 encoded (even if they weren't, I could just encode them with ffmpeg beforehand), so wouldn't it make sense to just parse them and pass them on to the next point in the pipe? But I'm not entirely certain how the GStreamer launch string operates.
The launch string I currently have working is as follows:
```python
launch_string = 'appsrc name=source block=false format=GST_FORMAT_TIME ' \
                'caps=video/x-raw,format=BGR,width={},height={},framerate={}/1 ' \
                '! videoconvert ! video/x-raw,format=I420 ' \
                '! x264enc speed-preset=veryfast tune=zerolatency ' \
                '! queue min-threshold-time=300000000 max-size-time=10000000000 max-size-bytes=0 max-size-buffers=0 ' \
                '! rtph264pay config-interval=1 name=pay0 pt=96 '.format(opt.image_width, opt.image_height, self.fps)
```
I've tried a few others like:
```python
launch_string = 'appsrc name=source block=false format=GST_FORMAT_TIME ' \
                'caps=video/x-h264,format=BGR,width={},height={},framerate={}/1 ' \
                '! h264parse ' \
                '! queue min-threshold-time=300000000 max-size-time=10000000000 max-size-bytes=0 max-size-buffers=0 ' \
                '! rtph264pay config-interval=1 name=pay0 pt=96 '.format(opt.image_width, opt.image_height, self.fps)
```
but I end up with empty H.264 RTP packets. I think this is just because I don't understand how the pipeline works, so any explanation would be helpful.
This is also my main loop:
```python
def on_need_data(self, src, length):
    if self.number_frames >= self.max_frame:  # Loop the video or exit
        if LOOP:
            self.reset_video()
        else:
            Gloop.quit()
    if self.cap.isOpened():
        ret, frame = self.cap.read()
        if ret:
            # Resize the decoded frame if it doesn't match the advertised caps
            if frame.shape[:2] != (self.height, self.width):
                if self.debug >= 2:
                    print("Resizing frame")
                    print(frame.shape[:2])
                    print((self.height, self.width))
                frame = cv2.resize(frame, (self.width, self.height))
            data = frame.tobytes()
            buf = Gst.Buffer.new_allocate(None, len(data), None)
            buf.fill(0, data)
            buf.duration = self.duration
            timestamp = self.timestamp_frame * self.duration
            buf.pts = buf.dts = int(timestamp)
            buf.offset = timestamp
            self.number_frames += 1
            self.timestamp_frame += 1
            retval = src.emit('push-buffer', buf)
            if self.debug >= 2:
                print('pushed buffer to {}, frame {}, duration {} ns, duration {} s'.format(
                    self.device_id, self.timestamp_frame,
                    self.duration, self.duration / Gst.SECOND))
            if retval != Gst.FlowReturn.OK:
                print("[INFO]: retval not OK: {}".format(retval))
            if retval == Gst.FlowReturn.FLUSHING:
                print('Offline')
        else:
            if self.debug > 0:
                print("[INFO]: Unable to read frame from cap:")
                print(self.device_id)
                print(self.number_frames)
                print(self.max_frame)
            Gloop.quit()
```
Answer 1 (Score: 1)
I ended up using filesrc to solve my issue, but it was a little more finicky than I expected. I eventually got my solution by combining this answer: https://stackoverflow.com/questions/53747278/seamless-video-loop-in-gstreamer with https://stackoverflow.com/questions/61604103/where-are-gstreamer-bus-log-messages.
The key is to issue a Seek event with the segment start at the beginning of the video and the segment end at the end of the video:
```python
def seek_video(self):
    if opt.debug >= 1:
        print("Seeking...")
    self.my_player.seek(1.0,
                        Gst.Format.TIME,
                        Gst.SeekFlags.SEGMENT,
                        Gst.SeekType.SET, 0,
                        Gst.SeekType.SET, self.video_length * Gst.SECOND)
```
This causes a "SEGMENT_DONE" message to be posted on the bus, which can then be intercepted:
```python
if message.type == Gst.MessageType.SEGMENT_DONE:
    self.seek_video()
```
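For context, here is a minimal sketch of how the bus watch and the initial segment seek might fit together. The `LoopingPlayer` class, the `filesrc`-based pipeline string, and the `on_message` handler name are illustrative assumptions, not the exact code from the linked answers:

```python
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

Gst.init(None)

class LoopingPlayer:
    """Sketch: loop a file with non-flushing segment seeks instead of restarts."""

    def __init__(self, location, video_length):
        self.video_length = video_length  # clip length in seconds, assumed known
        self.my_player = Gst.parse_launch(
            'filesrc location={} ! qtdemux ! h264parse '
            '! rtph264pay config-interval=1 name=pay0 pt=96'.format(location))
        # Watch the bus so SEGMENT_DONE messages reach on_message()
        bus = self.my_player.get_bus()
        bus.add_signal_watch()
        bus.connect('message', self.on_message)

    def seek_video(self):
        # SEGMENT flag: post SEGMENT_DONE instead of EOS when the segment ends
        self.my_player.seek(1.0,
                            Gst.Format.TIME,
                            Gst.SeekFlags.SEGMENT,
                            Gst.SeekType.SET, 0,
                            Gst.SeekType.SET, self.video_length * Gst.SECOND)

    def on_message(self, bus, message):
        if message.type == Gst.MessageType.SEGMENT_DONE:
            self.seek_video()  # jump back to the start without a flush

    def play(self):
        self.my_player.set_state(Gst.State.PLAYING)
        # In practice you may need to wait for preroll (e.g. ASYNC_DONE)
        # before the first segment seek will succeed.
        self.seek_video()
```

Note that `add_signal_watch` only delivers the `message` signal while a GLib main loop (the `Gloop` from the question) is running.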
I've also made an example available on my GitHub, as well as a short blog post talking about what I've learned.
Answer 2 (Score: 0)
```python
if self.cap.isOpened():
    ret, frame = self.cap.read()
```
This is where your data is coming from. The snippet you have included does not show what `cap` is, but it is probably some kind of capture source. Perhaps OpenCV?
In any case, you need to modify your data source to provide H.264 packets instead of decoded frames. After that, your second pipeline has a better chance of working.
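To make that concrete, here is a minimal, hypothetical sketch of what the question's `on_need_data` could look like if it pushed pre-encoded Annex-B H.264 bytes instead of raw BGR frames, letting `h264parse` find the NAL boundaries. The caps string, the `self.h264_file` handle (a file opened in binary mode), and the chunk size are assumptions for illustration, not code from the question:

```python
# Hypothetical appsrc caps: pre-encoded byte-stream H.264; no raw-video
# format/width/height fields and no x264enc stage are needed.
launch_string = ('appsrc name=source is-live=true do-timestamp=true '
                 'format=GST_FORMAT_TIME '
                 'caps=video/x-h264,stream-format=byte-stream '
                 '! h264parse ! rtph264pay config-interval=1 name=pay0 pt=96')

CHUNK_SIZE = 4096  # arbitrary read size; h264parse re-chunks the byte stream

def on_need_data(self, src, length):
    # self.h264_file is assumed to be something like open('video.h264', 'rb')
    data = self.h264_file.read(CHUNK_SIZE)
    if not data:
        Gloop.quit()  # end of file: stop here, or seek back to loop
        return
    buf = Gst.Buffer.new_wrapped(data)
    src.emit('push-buffer', buf)
```

With `do-timestamp=true`, appsrc stamps buffers itself, so the per-frame pts/dts bookkeeping from the question is no longer needed, and the CPU cost of re-encoding disappears along with x264enc.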