Warning
TorchAudio’s C++ API is a prototype feature. API/ABI backward compatibility is not guaranteed.
Note
The top-level namespace has been changed from torchaudio to torio.
StreamWriter has been renamed to StreamingMediaEncoder.
torio::io::StreamingMediaEncoder
StreamingMediaEncoder is the implementation used by its Python equivalent and provides a similar interface.
When working with custom I/O, such as in-memory data, the StreamingMediaEncoderCustomIO class can be used.
Both classes define the same methods, so they are used in the same way.
Constructors
StreamingMediaEncoder
class StreamingMediaEncoder
Encode and write audio/video streams chunk by chunk.
Subclassed by torio::io::StreamingMediaEncoderCustomIO.
explicit torio::io::StreamingMediaEncoder::StreamingMediaEncoder(const std::string &dst, const c10::optional<std::string> &format = c10::nullopt)
Construct StreamingMediaEncoder from destination URI.
Parameters:
- dst – Destination where encoded data are written.
- format – Specify output format. If not provided, it is guessed from dst.
 
 
StreamingMediaEncoderCustomIO
class StreamingMediaEncoderCustomIO : private detail::CustomOutput, public torio::io::StreamingMediaEncoder
A subclass of StreamingMediaEncoder which works with a custom write function. Can be used for encoding media into memory or a custom object.
torio::io::StreamingMediaEncoderCustomIO::StreamingMediaEncoderCustomIO(void *opaque, const c10::optional<std::string> &format, int buffer_size, int (*write_packet)(void *opaque, uint8_t *buf, int buf_size), int64_t (*seek)(void *opaque, int64_t offset, int whence) = nullptr)
Construct StreamingMediaEncoderCustomIO with custom write and seek functions.
Parameters:
- opaque – Custom data used by the write_packet and seek functions.
- format – Specify output format.
- buffer_size – The size of the intermediate buffer, which FFmpeg uses to pass data to the write_packet function.
- write_packet – Custom write function that is called from FFmpeg to actually write data to the custom destination.
- seek – Optional seek function that is used to seek the destination.
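As an illustration of the callback signatures above, here is a minimal sketch of a write_packet and seek pair that targets an in-memory buffer. MemoryBuffer, mem_write_packet, mem_seek and kSeekSize are hypothetical names introduced for this example. kSeekSize mirrors FFmpeg's AVSEEK_SIZE query value so that the sketch compiles without FFmpeg headers, and the return-value conventions (bytes written, new position, negative on error) are assumptions based on FFmpeg's custom I/O callbacks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>   // SEEK_SET, SEEK_CUR, SEEK_END
#include <cstring>  // std::memcpy
#include <vector>

// Hypothetical in-memory destination; a pointer to it is passed as `opaque`.
struct MemoryBuffer {
  std::vector<uint8_t> data;
  int64_t pos = 0;  // current write position
};

// Assumption: same value as FFmpeg's AVSEEK_SIZE ("report the total size").
constexpr int kSeekSize = 0x10000;

// Matches: int (*write_packet)(void *opaque, uint8_t *buf, int buf_size)
int mem_write_packet(void* opaque, uint8_t* buf, int buf_size) {
  auto* mem = static_cast<MemoryBuffer*>(opaque);
  if (buf_size < 0) return -1;
  if (buf_size == 0) return 0;
  const std::size_t end = static_cast<std::size_t>(mem->pos) + buf_size;
  // Grow the buffer when writing past the current end.
  if (end > mem->data.size()) mem->data.resize(end);
  std::memcpy(mem->data.data() + mem->pos, buf, buf_size);
  mem->pos = static_cast<int64_t>(end);
  return buf_size;
}

// Matches: int64_t (*seek)(void *opaque, int64_t offset, int whence)
int64_t mem_seek(void* opaque, int64_t offset, int whence) {
  auto* mem = static_cast<MemoryBuffer*>(opaque);
  const int64_t size = static_cast<int64_t>(mem->data.size());
  int64_t target = 0;
  switch (whence) {
    case kSeekSize: return size;  // size query, position unchanged
    case SEEK_SET: target = offset; break;
    case SEEK_CUR: target = mem->pos + offset; break;
    case SEEK_END: target = size + offset; break;
    default: return -1;
  }
  if (target < 0) return -1;
  mem->pos = target;
  return target;
}
```

Under these assumptions, a pointer to a MemoryBuffer would be passed as opaque, with mem_write_packet and mem_seek as the write_packet and seek arguments of the constructor.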
 
 
Config methods
add_audio_stream
void torio::io::StreamingMediaEncoder::add_audio_stream(int sample_rate, int num_channels, const std::string &format, const c10::optional<std::string> &encoder = c10::nullopt, const c10::optional<OptionDict> &encoder_option = c10::nullopt, const c10::optional<std::string> &encoder_format = c10::nullopt, const c10::optional<int> &encoder_sample_rate = c10::nullopt, const c10::optional<int> &encoder_num_channels = c10::nullopt, const c10::optional<CodecConfig> &codec_config = c10::nullopt, const c10::optional<std::string> &filter_desc = c10::nullopt)
Add an output audio stream.
Parameters:
- sample_rate – The sample rate.
- num_channels – The number of channels.
- format – Input sample format, which determines the dtype of the input tensor.
  - "u8": The input tensor must be torch.uint8 type.
  - "s16": The input tensor must be torch.int16 type.
  - "s32": The input tensor must be torch.int32 type.
  - "s64": The input tensor must be torch.int64 type.
  - "flt": The input tensor must be torch.float32 type.
  - "dbl": The input tensor must be torch.float64 type.
  Default: "flt" in the Python equivalent; the C++ signature above declares no default for this argument.
- encoder – The name of the encoder to be used. When provided, use the specified encoder instead of the default one. To list the available encoders, you can use the ffmpeg -encoders command.
- encoder_option – Options passed to the encoder. To list the options for an encoder, you can use the ffmpeg -h encoder=<ENCODER> command.
- encoder_format – Format used to encode media. When the encoder supports multiple formats, passing this argument overrides the format used for encoding. To list the supported formats for the encoder, you can use the ffmpeg -h encoder=<ENCODER> command.
- encoder_sample_rate – If provided, perform resampling before encoding.
- encoder_num_channels – If provided, change channel configuration before encoding.
- codec_config – Codec configuration.
- filter_desc – Additional processing to apply before encoding the input data.
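The sample format list above can be captured in a small lookup, for example to check a buffer size before building the input tensor. This is a sketch only: bytes_per_sample and chunk_bytes are hypothetical helpers and not part of torio.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: width in bytes of one sample for each input sample
// format listed above. Returns 0 for a format that is not in the list.
inline std::size_t bytes_per_sample(const std::string& format) {
  if (format == "u8") return 1;   // torch.uint8
  if (format == "s16") return 2;  // torch.int16
  if (format == "s32") return 4;  // torch.int32
  if (format == "s64") return 8;  // torch.int64
  if (format == "flt") return 4;  // torch.float32
  if (format == "dbl") return 8;  // torch.float64
  return 0;
}

// Hypothetical helper: size in bytes of a (frame, channel) chunk.
inline std::size_t chunk_bytes(
    const std::string& format,
    std::size_t num_frames,
    std::size_t num_channels) {
  const std::size_t width = bytes_per_sample(format);
  assert(width != 0 && "unsupported sample format");
  return num_frames * num_channels * width;
}
```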
 
 
add_video_stream
void torio::io::StreamingMediaEncoder::add_video_stream(double frame_rate, int width, int height, const std::string &format, const c10::optional<std::string> &encoder = c10::nullopt, const c10::optional<OptionDict> &encoder_option = c10::nullopt, const c10::optional<std::string> &encoder_format = c10::nullopt, const c10::optional<double> &encoder_frame_rate = c10::nullopt, const c10::optional<int> &encoder_width = c10::nullopt, const c10::optional<int> &encoder_height = c10::nullopt, const c10::optional<std::string> &hw_accel = c10::nullopt, const c10::optional<CodecConfig> &codec_config = c10::nullopt, const c10::optional<std::string> &filter_desc = c10::nullopt)
Add an output video stream.
Parameters:
- frame_rate – Frame rate of the input video.
- width – Width of the input frames.
- height – Height of the input frames.
- format – Input pixel format, which determines the color channel order of the input tensor.
  - "gray8": One channel, grayscale.
  - "rgb24": Three channels in the order of RGB.
  - "bgr24": Three channels in the order of BGR.
  - "yuv444p": Three channels in the order of YUV.
  In all cases, the input tensor must be of torch.uint8 type and its shape must be (frame, channel, height, width).
- encoder – See add_audio_stream().
- encoder_option – See add_audio_stream().
- encoder_format – See add_audio_stream().
- encoder_frame_rate – If provided, change frame rate before encoding.
- encoder_width – If provided, resize image before encoding.
- encoder_height – If provided, resize image before encoding.
- hw_accel – Enable hardware acceleration. When video is encoded on CUDA hardware, for example encoder="h264_nvenc", passing a CUDA device indicator to hw_accel (i.e. hw_accel="cuda:0") makes StreamingMediaEncoder expect the video chunk to be a CUDA Tensor. Passing a CPU Tensor results in an error. If not provided, the video chunk Tensor has to be a CPU Tensor.
- codec_config – Codec configuration.
- filter_desc – Additional processing to apply before encoding the input data.
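The shape requirement can be checked before a chunk is handed to the encoder. The sketch below derives the expected (frame, channel, height, width) shape from the pixel format; channels_for_format and expected_chunk_shape are hypothetical helpers and not part of torio.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: number of channels for each input pixel format listed
// above. Returns -1 for a format that is not in the list.
inline int channels_for_format(const std::string& format) {
  if (format == "gray8") return 1;
  if (format == "rgb24" || format == "bgr24" || format == "yuv444p") return 3;
  return -1;
}

// Hypothetical helper: the (frame, channel, height, width) shape that a video
// chunk is expected to have for a stream configured with the given values.
inline std::array<int64_t, 4> expected_chunk_shape(
    const std::string& format,
    int64_t num_frames,
    int64_t height,
    int64_t width) {
  const int channels = channels_for_format(format);
  assert(channels > 0 && "unsupported pixel format");
  return {num_frames, channels, height, width};
}
```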
 
 
set_metadata
void torio::io::StreamingMediaEncoder::set_metadata(const OptionDict &metadata)
Set file-level metadata.
Parameters:
- metadata – The file-level metadata to set.
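A sketch of building the metadata argument, assuming OptionDict is a string-to-string map (the alias is reproduced locally so that the snippet compiles on its own). make_metadata is a hypothetical helper, and the title and artist keys are illustrative; which keys are honored depends on the output container.

```cpp
#include <cassert>
#include <map>
#include <string>

// Assumption: OptionDict is a map from option name to option value.
// The alias is redefined here only to keep the sketch self-contained.
using OptionDict = std::map<std::string, std::string>;

// Hypothetical helper that collects two common file-level tags.
inline OptionDict make_metadata(
    const std::string& title,
    const std::string& artist) {
  return OptionDict{{"title", title}, {"artist", artist}};
}
```

The resulting map would then be passed to set_metadata() before open() is called, since open() writes the header.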
 
Write methods
open
void torio::io::StreamingMediaEncoder::open(const c10::optional<OptionDict> &opt = c10::nullopt)
Open the output file / device and write the header.
Parameters:
- opt – Private options for protocol, device and muxer.
 
close
void torio::io::StreamingMediaEncoder::close()
Close the output file / device and finalize metadata.
write_audio_chunk
void torio::io::StreamingMediaEncoder::write_audio_chunk(int i, const torch::Tensor &frames, const c10::optional<double> &pts = c10::nullopt)
Write audio data.
Parameters:
- i – Stream index.
- frames – Waveform tensor. Shape: (frame, channel). The dtype must match what was passed to the add_audio_stream() method.
- pts – Presentation timestamp. If provided, it overwrites the PTS of the first frame with the provided one. Otherwise, PTS values are incremented by the inverse of the sample rate. Only values that exceed the PTS values already processed internally are accepted.
  NOTE: The provided value is converted to an integer value expressed in the basis of the sample rate. Therefore, it is truncated to the nearest value of n / sample_rate.
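The note above implies that a requested PTS lands on a grid of n / sample_rate. A minimal sketch of that conversion follows; pts_to_ticks and ticks_to_seconds are hypothetical helpers, and rounding to the nearest tick is an assumption, since the exact rounding mode is not specified here.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: index n of the grid point n / rate for a timestamp
// given in seconds. Rounding to nearest is an assumption; the library may
// truncate instead.
inline int64_t pts_to_ticks(double pts_seconds, int rate) {
  assert(rate > 0);
  return static_cast<int64_t>(std::llround(pts_seconds * rate));
}

// Hypothetical helper: the timestamp in seconds that a grid index stands for.
inline double ticks_to_seconds(int64_t ticks, int rate) {
  assert(rate > 0);
  return static_cast<double>(ticks) / rate;
}
```

With a sample rate of 8000, for instance, a requested PTS of 1.00004 seconds falls on tick 8000, so the value that takes effect is 1.0 seconds rather than the one requested.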
 
 
write_video_chunk
void torio::io::StreamingMediaEncoder::write_video_chunk(int i, const torch::Tensor &frames, const c10::optional<double> &pts = c10::nullopt)
Write video data.
Parameters:
- i – Stream index.
- frames – Video/image tensor. Shape: (time, channel, height, width). The dtype must be torch.uint8. The shape (height, width and the number of channels) must match what was configured when calling add_video_stream().
- pts – Presentation timestamp. If provided, it overwrites the PTS of the first frame with the provided one. Otherwise, PTS values are incremented by the inverse of the frame rate. Only values that exceed the PTS values already processed internally are accepted.
  NOTE: The provided value is converted to an integer value expressed in the basis of the frame rate. Therefore, it is truncated to the nearest value of n / frame_rate.
 
 
flush
void torio::io::StreamingMediaEncoder::flush()
Flush the frames from encoders and write the frames to the destination.