Recovering Audio

The most typical use of the synchronizing function is to capture a video frame as you normally would, then call:

NDIlib_avsync_ret_e NDIlib_avsync_synchronize(
    NDIlib_avsync_instance_t p_avsync, 
    const NDIlib_video_frame_v2_t *p_video_frame, 
    NDIlib_audio_frame_v3_t* p_audio_frame
);

This function is deceptively powerful, and the key to using it well is passing the correct values in the NDIlib_audio_frame_v3_t* p_audio_frame parameter. The values that may be specified are described below.

Parameter | Description

no_samples

If this value is “0” when you pass it into the function and audio can be recovered, the synchronization function automatically returns the exact length of audio that matches the video frame passed in as p_video_frame. The number of samples returned this way will closely, though not always exactly, match the count implied by the timestamps, and can be assumed to accurately reflect the audio that matches the frame. Because timestamps are often subject to noise when frames are stamped, the number of samples might vary slightly. This results in a return code of NDIlib_avsync_ret_success.

If this value is a specific count (e.g., 1601 or 1602), the AV sync will attempt to return that number of samples, provided it is close to the true number of audio samples aligned with this frame. This lets the synchronizer recover audio that must follow some external constraint on the number of samples used with each video frame. If the number of samples requested is sufficiently close to the number of audio samples that match this frame, a return code of NDIlib_avsync_ret_success is returned. If the requested number could not be returned because it did not closely match the number of samples truly associated with this video frame, the function instead returns the correct audio (which might be too many or too few samples) with a return code of NDIlib_avsync_ret_success_num_samples_not_matched. It is then the responsibility of the caller to determine how best to handle this condition, which is normally caused by trying to synchronize audio and video that are not on the same clock.

A specific number of audio samples is normally computed outside this function, under the assumption of some known audio sample rate. Because incoming audio might change sample rate, which would invalidate the requested number of samples, please review the description below of the sample_rate parameter of p_audio_frame.
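As an illustration of where constants such as 1601 and 1602 come from, the following is a minimal sketch of computing a per-frame sample count outside the SDK. The helper name samples_for_frame and its parameters are hypothetical and not part of the NDI API; it assumes the frame rate is given as a fraction fps_n / fps_d and that the running total of requested samples should track the exact audio duration.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the NDI SDK).
 * Returns the number of audio samples to request for video frame
 * `frame_index`, so that the running total over all frames follows
 * sample_rate * fps_d / fps_n samples per frame without drifting.
 */
static int samples_for_frame(int64_t frame_index, int sample_rate,
                             int fps_n, int fps_d)
{
    /* Samples contained in fps_n consecutive frames. */
    int64_t per_n = (int64_t)sample_rate * fps_d;

    /* Whole samples elapsed before this frame and through this frame. */
    int64_t start = frame_index * per_n / fps_n;
    int64_t end   = (frame_index + 1) * per_n / fps_n;

    return (int)(end - start);
}
```

At 48000 Hz and 30000/1001 frames per second this produces the repeating pattern 1601, 1602, 1601, 1602, 1602 (8008 samples every five frames). The result would be placed in no_samples before calling the synchronizer.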

sample_rate

When this value is “0”, there is no assumed sample rate, and the function returns an audio frame that specifies the current audio sample rate.

If you specify a non-zero number of samples to capture, that number was most likely computed for a given audio sample rate. If you specify that sample rate on p_audio_frame as input and the sample rate of the audio source does not match, the function will not capture any audio; it returns NDIlib_avsync_ret_format_changed and fills in only the audio format in the returned structure. You can then recompute the number of required audio samples and call the function again to capture the audio for that video frame.
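The recompute-and-retry flow described above can be sketched as follows. This is a simulation only: demo_ret_e, demo_audio_request_t, demo_synchronize, samples_per_frame and capture_for_frame are hypothetical stand-ins written for this example, not NDI SDK types or functions, and the enum values are arbitrary. The stub also assumes that no_samples is reset to zero when the format has changed, which the text does not state.

```c
#include <stdint.h>

/* Hypothetical stand-ins for this sketch only; NOT NDI SDK definitions. */
typedef enum {
    demo_ret_success,
    demo_ret_format_changed
} demo_ret_e;

typedef struct {
    int sample_rate; /* 0 means no assumed sample rate */
    int no_samples;  /* 0 means let the synchronizer pick the length */
} demo_audio_request_t;

/* Simulated synchronizer: mimics only the sample-rate check from the text. */
static demo_ret_e demo_synchronize(int source_rate, demo_audio_request_t *req)
{
    if (req->sample_rate != 0 && req->sample_rate != source_rate) {
        req->sample_rate = source_rate; /* report the actual format */
        req->no_samples = 0;            /* assumption: no audio captured */
        return demo_ret_format_changed;
    }
    req->sample_rate = source_rate;
    return demo_ret_success;
}

/* Samples spanning one video frame at fps_n / fps_d, rounded to nearest. */
static int samples_per_frame(int sample_rate, int fps_n, int fps_d)
{
    int64_t num = (int64_t)sample_rate * fps_d;
    return (int)((num + fps_n / 2) / fps_n);
}

/* Request one frame of audio, recomputing the count once if the rate differs. */
static demo_ret_e capture_for_frame(int source_rate, int assumed_rate,
                                    int fps_n, int fps_d,
                                    demo_audio_request_t *req)
{
    req->sample_rate = assumed_rate;
    req->no_samples = samples_per_frame(assumed_rate, fps_n, fps_d);

    demo_ret_e ret = demo_synchronize(source_rate, req);
    if (ret == demo_ret_format_changed) {
        /* req->sample_rate now holds the rate reported by the source. */
        req->no_samples = samples_per_frame(req->sample_rate, fps_n, fps_d);
        ret = demo_synchronize(source_rate, req);
    }
    return ret;
}
```

In real code the two calls to demo_synchronize would be calls to NDIlib_avsync_synchronize with the same video frame, and the comparison would be against NDIlib_avsync_ret_format_changed.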

If the NDIlib_video_frame_v2_t *p_video_frame is not specified (is NULL), the audio parameter can be used either to capture all current audio (no_samples is zero) or a specified number of audio samples (no_samples is not zero). It behaves exactly as described above, except that all audio is handled rather than only the audio associated with the current video frame.

Please note that this function fills in all parameters of the returned frame, including the timecode, timestamp, and metadata. The metadata is taken from the closest matching audio frame and is returned only once. If the audio frames are all much shorter in duration than the corresponding video frames, some metadata fields might be missing because they no longer have corresponding audio data to be assigned to.

It is important that you call NDIlib_avsync_free_audio to release the frames returned by NDIlib_avsync_synchronize.

The full set of return codes from this function is documented below. Please note that these have integer values that are positive for success and negative for failure.

Error code | Meaning

*_success

This function succeeded and returned audio that matches the frame. If you specified sample_rate or no_samples, the returned audio correctly matches those constraints.

*_success_num_samples_not_matched

This function succeeded, but you specified a no_samples value that could not be matched exactly because doing so would push the audio and video frame too far out of alignment. The full set of audio samples associated with this video frame is returned, and its length does not match no_samples. It might be a higher or lower number.

*_no_audio_stream_received

This indicates that there is currently no audio stream and so it would not be possible to return any audio data. This might indicate a video only stream.

*_ret_no_samples_found

This indicates that this video frame did not have matching audio data. This might be because there was no corresponding audio at the time this video frame was sent. It might also indicate that the audio and video streams are being sent so far out of alignment that they cannot easily be resynchronized; for instance, the audio data is arriving more than a second out of sync with the corresponding video data.

*_format_changed

A sample_rate and no_samples were specified; however, the sample rate did not match, so this function has returned and correctly filled in the audio format in the returned structure.

*_ret_internal_error

This function was called with incorrect
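Given the sign convention noted above (positive for success, negative for failure), a caller can classify any return code without enumerating each value. The helpers below are a minimal sketch; their names are hypothetical and not part of the NDI SDK, and the text does not say how a value of zero should be treated, so here it counts as neither success nor failure.

```c
/*
 * Hypothetical helpers (not part of the NDI SDK).
 * They rely only on the documented sign convention: return codes are
 * positive for success and negative for failure. Zero is not described
 * in the text, so it is treated as neither (an assumption).
 */
static int avsync_call_succeeded(int ret)
{
    return ret > 0;
}

static int avsync_call_failed(int ret)
{
    return ret < 0;
}
```

A caller would cast the NDIlib_avsync_ret_e result to int, pass it to these helpers, and only read the returned audio frame when the call succeeded.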
