Rich Freeman via plug on 31 Aug 2020 16:46:13 -0700
Re: [PLUG] Transcoding MTS vids to something else for frame extraction
On Mon, Aug 31, 2020 at 4:50 PM JP Vossen via plug
<plug@lists.phillylinux.org> wrote:
>
> By the way, the videos are short clips of dolphins just off the coast of
> Kitty Hawk, NC from a recent vacation. We're not fast enough with a
> camera to capture them so we took vids with the express idea of
> extracting frames for pics. The camera is a relatively recent Sony
> point-n-shoot.
>

So, you're dealing with a couple of things here.

The only thing you can correct at this point is the processing -
extracting the best still that you can from the data the camera
captured. I don't have any specific advice here, other than that doing
multiple conversions probably isn't a good idea. You really want
something that will construct the best image it can in memory, and
convert it once to jpeg. I suspect most of your problems were with the
original capture, and there is no way you can fix those now.

I do a fair bit of photography these days but I'm not really an expert
on the video side - I know enough to be dangerous. Your loss of image
quality when extracting a still is going to come from two sources: how
each frame of the video is captured, and how the video is encoded.
I'll elaborate on each.

When it comes to capturing a frame of video, you need to understand
that video is generally captured in a way that is intended to render a
good image when it is played back continuously.

The first issue you'll have is rolling shutter - a video camera
basically captures lines from top to bottom in semi-parallel, but not
synchronously across the entire frame. It will start capturing line
one, then two, then three, and so on, and then after a sufficient
delay it will finish capturing line one, then two, then three. So at
any time many lines are being exposed, with this area of capture
moving down the frame. If the scan rate is slow enough it could end up
wrapping right around, so that it is capturing line one of the next
frame before it has finished the last line of the previous frame. This
causes artifacts when you have panning or rapidly moving objects
in-frame, like video of an aircraft propeller that looks bent. Better
cameras can scan the sensor faster, so that more of it can be run in
parallel to finish the entire frame in time, and you get less rolling
shutter.

The next capture issue is interlacing. I have no idea if your video
was actually interlaced, but if it was then that will cause obvious
artifacts when you want to capture only one "frame" - which is really
two separate fields captured at slightly different times. If you're
panning, they're going to give you that screen-door-like pattern, with
every other line shifted. Interlacing isn't so common these days, but
maybe cheaper cameras still do it.

Then you get issues with how the pixels themselves are sampled. Many
cameras record video at a lower resolution than stills. A 1080p frame
is 1920x1080 - only about 2 megapixels, which is not an impressive
spec by today's standards. So, you're not going to get more resolution
than that from a video frame. Then there can be issues around how
color is sampled relative to luminosity (chroma subsampling), and in
general the cheaper the camera the worse this is. Then there is bit
depth - cheaper cameras are probably only going to sample 8 bits,
while better ones will do more, which is an issue if you have
highlights and shadows to recover.
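As a practical aside, ffprobe will tell you whether the clip was
actually interlaced and how the pixels were sampled, and ffmpeg can
deinterlace and pull a single still in one decode. This is just a
sketch - I haven't tested it on MTS specifically, and clip.MTS and the
timestamp are placeholders:

  ffprobe -v error -select_streams v:0 \
    -show_entries stream=width,height,pix_fmt,field_order \
    -of default=noprint_wrappers=1 clip.MTS

  # one deinterlaced frame at ~12.5s; writing PNG avoids stacking a
  # second round of lossy jpeg compression on top of the camera's
  ffmpeg -ss 00:00:12.5 -i clip.MTS -vf yadif -frames:v 1 still.png

If field_order comes back "progressive" you can drop the yadif filter.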
Now we'll talk about encoding. The first issue there, which you
touched on, is inter-frame compression. First, every frame is
compressed with a discrete cosine transform like a jpeg, which of
course reduces quality (and the lower the bitrate the worse this
gets).

Then the stream is divided into I/P/B frames. I frames, or keyframes,
are basically just jpegs - if you extract one you get a clean jpeg
(within the limits above around how the frame was captured). P frames
are basically diffs vs one previous frame, and B frames are
bidirectional diffs vs two frames, one before and one after. The
software will do the best it can to encode this, but in general the
further you are from a keyframe the worse a still will look.

A higher-end camera will record only I frames or raw frames, at a
very high bitrate. rtjpeg is basically all I frames. Storing that
obviously requires a very large memory card, and a fast one,
especially for 4k+ resolution or higher frame rates (the newest
memory cards are basically NVMe drives, and priced like them).

Your fairly cheap point/shoot camera is probably checking all the
boxes above in terms of things that make the camera less expensive to
build, and which greatly reduce the quality of the video. Really,
though, I'm not sure what alternatives you have. If the camera doesn't
have a mechanical shutter (clicking sound) then it will have rolling
shutter even for stills, and of course the latency/etc is such that
timing a shot is probably not going to be great.

Your best bet is if the camera has a burst mode of some kind - some
point/shoot cameras will basically fill the buffer with a series of
jpgs captured at a higher-than-normal rate when you hit the button.
This will probably get you much better quality than trying to pull
frames from video.

Alternatively you could get a better camera - either a still camera
that has high fps and a mechanical shutter, or a camera that records
higher-quality video - but you're going to spend closer to $1k or more
either way (something that does really good 4k video like the A7S III
is over $3k). Unfortunately in photography you tend to get what you
pay for.

You might still get a somewhat better image out of your existing
camera - I'm just not sure what the best FOSS solution for this is. I
don't deal with much video.

-- 
Rich
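P.S. One FOSS thing that might be worth trying: ffmpeg can dump only
the keyframes, which sidesteps the P/B-frame problem entirely - you
just get fewer moments in time to choose from. Again untested, with
clip.MTS as a placeholder:

  # keep only I frames; -vsync vfr drops the timestamps of the
  # discarded frames so the outputs are numbered consecutively
  ffmpeg -i clip.MTS -vf "select='eq(pict_type,I)'" -vsync vfr \
    keyframe-%03d.png

Then pick the best-looking frame by eye and crop from there.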