Scenarios to be supported:
Example - three colors which play for 32 sec each are different videos and it has text to differentiate it. So we need to get the text and assert it.
Validate specific frames during single video playback.
When playing a single video, pausing at a specific time and validate its time and the frame(image). This also depends on getting the text and assert it
Video playback Smoothness(It won't stop or buffer during complete playback).
Example approach would be to capture the entire playback during training and saving as a baseline. Whenever it runs on the cloud, it should compare with the baseline. Baseline video should not change unless the user wants.
Audio playing or not.