StreamWriter Advanced Usage¶
Author: Moto Hira
This tutorial shows how to use torchaudio.io.StreamWriter to
play audio and video.
Note
This tutorial uses hardware devices, so it is not portable across different operating systems.
The tutorial was written and tested on a MacBook Pro (M1, 2020).
Note
This tutorial requires a torchaudio nightly build and FFmpeg libraries (>=4.1, <4.4).
To install a torchaudio nightly build, please refer to https://pytorch.org/get-started/locally/ .
There are multiple ways to install FFmpeg libraries.
If you are using the Anaconda Python distribution,
conda install 'ffmpeg<4.4' will install the required FFmpeg libraries.
However, that build does not include the SDL plugin, so it cannot play
video.
Warning
TorchAudio dynamically loads compatible FFmpeg libraries installed on the system. The types of supported formats (media format, encoder, encoder options etc) depend on the libraries.
To check the available muxers, encoders, devices and protocols, you can use the following commands:
ffmpeg -muxers
ffmpeg -encoders
ffmpeg -devices
ffmpeg -protocols
Preparation¶
import torch
import torchaudio
print(torch.__version__)
print(torchaudio.__version__)
from torchaudio.io import StreamWriter
from torchaudio.utils import download_asset
AUDIO_PATH = download_asset("tutorial-assets/Lab41-SRI-VOiCES-src-sp0307-ch127535-sg0042.wav")
VIDEO_PATH = download_asset("tutorial-assets/stream-api/NASAs_Most_Scientifically_Complex_Space_Observatory_Requires_Precision-MP4_small.mp4")
Device Availability¶
StreamWriter takes advantage of FFmpeg’s IO abstraction and
writes the data to media devices such as speakers and GUI.
To write to devices, provide the format option to the constructor
of StreamWriter.
Device options differ between operating systems, and their availability depends on the actual FFmpeg installation.
To check which devices are available, you can use the ffmpeg -devices command.
On the machine used for this tutorial, “audiotoolbox” (speaker) and “sdl” (video GUI) are available.
$ ffmpeg -devices
...
Devices:
 D. = Demuxing supported
 .E = Muxing supported
 --
  E audiotoolbox    AudioToolbox output device
 D  avfoundation    AVFoundation input device
 D  lavfi           Libavfilter virtual input device
  E opengl          OpenGL output
  E sdl,sdl2        SDL2 output device
For details about what devices are available on which OS, please check the official FFmpeg documentation. https://ffmpeg.org/ffmpeg-devices.html
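When a script needs to pick an output device automatically, the listing above can be parsed programmatically. The following is a minimal sketch, assuming the table layout shown in the sample output (other FFmpeg versions may format it differently). parse_ffmpeg_devices is a hypothetical helper, not part of torchaudio or FFmpeg.

```python
import re

# Matches rows such as "  E audiotoolbox    AudioToolbox output device".
# Column 2 holds "D" (demuxing) or a space, column 3 holds "E" (muxing) or a space.
_ROW = re.compile(r"^ ([D ])([E ]) (\S+)\s+(.*)$")


def parse_ffmpeg_devices(text):
    """Return {name: {"demux": bool, "mux": bool, "description": str}}."""
    devices = {}
    in_table = False
    for line in text.splitlines():
        if line.strip() == "--":
            # Everything before the separator is the legend, not device rows.
            in_table = True
            continue
        if not in_table:
            continue
        match = _ROW.match(line)
        if match is None:
            continue
        demux, mux, names, description = match.groups()
        # A row may list aliases separated by commas, e.g. "sdl,sdl2".
        for name in names.split(","):
            devices[name] = {
                "demux": demux == "D",
                "mux": mux == "E",
                "description": description.strip(),
            }
    return devices


# Sample text copied from the listing in this tutorial.
SAMPLE = """\
Devices:
 D. = Demuxing supported
 .E = Muxing supported
 --
  E audiotoolbox    AudioToolbox output device
 D  avfoundation    AVFoundation input device
 D  lavfi           Libavfilter virtual input device
  E opengl          OpenGL output
  E sdl,sdl2        SDL2 output device
"""

devices = parse_ffmpeg_devices(SAMPLE)
output_devices = sorted(name for name, info in devices.items() if info["mux"])
print(output_devices)  # ['audiotoolbox', 'opengl', 'sdl', 'sdl2']
```

On a real system the text would come from running ffmpeg -devices, for example through subprocess.run with capture_output=True and text=True.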
Playing audio¶
With the format="audiotoolbox" option, StreamWriter writes
data to the speaker device.
# Prepare sample audio
waveform, sample_rate = torchaudio.load(AUDIO_PATH, channels_first=False, normalize=False)
num_frames, num_channels = waveform.shape
# Configure StreamWriter to write to speaker device
s = StreamWriter(dst="-", format="audiotoolbox")
s.add_audio_stream(sample_rate, num_channels, format="s16")
# Write audio to the device
with s.open():
    for i in range(0, num_frames, 256):
        s.write_audio_chunk(0, waveform[i:i+256])
Note
Writing to “audiotoolbox” is a blocking operation, but it does not wait for the audio playback to finish. The device must be kept open while audio is being played.
The following code will close the device as soon as the audio is
written and before the playback is completed.
Adding time.sleep() will help keep the device open until
the playback is completed.
with s.open():
    s.write_audio_chunk(0, waveform)
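One way to size the time.sleep() call is to derive the clip duration from the number of frames and the sample rate. The sketch below is only an illustration of that arithmetic, not a torchaudio API; remaining_playback_seconds and its margin parameter are hypothetical names.

```python
import time


def remaining_playback_seconds(num_frames, sample_rate, elapsed, margin=0.5):
    """Seconds to keep the device open after the last write returns.

    num_frames / sample_rate is the clip duration, `elapsed` is the time
    already spent writing, and `margin` is an arbitrary safety buffer.
    """
    duration = num_frames / sample_rate
    return max(0.0, duration - elapsed) + margin


# Example: a 3-second clip at 16 kHz whose writes took 1 second.
wait = remaining_playback_seconds(48000, 16000, elapsed=1.0)
print(wait)  # 2.5

# Intended use with the StreamWriter from this tutorial (not executed here,
# because it needs the "audiotoolbox" device):
#
#   with s.open():
#       start = time.monotonic()
#       s.write_audio_chunk(0, waveform)
#       time.sleep(remaining_playback_seconds(
#           num_frames, sample_rate, time.monotonic() - start))
```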
Playing Video¶
To play video, you can use format="sdl" or format="opengl".
Again, you need a version of FFmpeg with the corresponding integration
enabled. The available devices can be checked with ffmpeg -devices.
Here, we use the SDL device (https://ffmpeg.org/ffmpeg-devices.html#sdl).
# note:
#  SDL device does not support specifying frame rate, and it has to
#  match the refresh rate of display.
frame_rate = 120
width, height = 640, 360
Here we define a helper function that delegates video loading to a background thread and yields the chunks.
running = True
def video_streamer(path, frames_per_chunk):
    import queue, threading
    from torchaudio.io import StreamReader
    q = queue.Queue()
    # Streaming process that runs in background thread
    def _streamer():
        streamer = StreamReader(path)
        streamer.add_basic_video_stream(
            frames_per_chunk, format="rgb24",
            frame_rate=frame_rate, width=width, height=height)
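        for (chunk_, ) in streamer.stream():
            q.put(chunk_)
            if not running:
                break
        # Mark the end of the stream so the consumer loop can exit
        q.put(None)
    # Start the background thread and fetch chunks
    t = threading.Thread(target=_streamer)
    t.start()
    while running:
        # q.get() blocks, so it never raises queue.Empty; stop on the end marker instead
        chunk = q.get()
        if chunk is None:
            break
        yield chunk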
    t.join()
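The background-thread pattern used by this helper can be tried without any media files. The following is a minimal sketch using only the standard library, with integers standing in for video chunks. background_iter, the _END marker and the stop event are choices made for this sketch and are not taken from torchaudio.

```python
import queue
import threading

_END = object()  # end-of-stream marker; a choice made for this sketch


def background_iter(make_iterable):
    """Run make_iterable() in a background thread and yield its items."""
    q = queue.Queue()
    stop = threading.Event()

    def _producer():
        try:
            for item in make_iterable():
                if stop.is_set():
                    break
                q.put(item)
        finally:
            # Always signal completion so the consumer never blocks forever.
            q.put(_END)

    t = threading.Thread(target=_producer)
    t.start()
    try:
        while True:
            item = q.get()
            if item is _END:
                break
            yield item
    finally:
        # Runs on normal exhaustion and when the consumer stops early.
        stop.set()
        t.join()


# Integers stand in for video chunks.
result = list(background_iter(lambda: range(5)))
print(result)  # [0, 1, 2, 3, 4]
```

Using an explicit end marker lets the consumer rely on a blocking q.get() without a timeout, and the threading.Event replaces the module-level flag so the producer can be stopped from inside the generator.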
Now we start streaming. Pressing “Q” will stop the video.
Note
A write_video_chunk call against the SDL device blocks until SDL finishes playing the video.
# Set output device to SDL
s = StreamWriter("-", format="sdl")
# Configure video stream (RGB24)
s.add_video_stream(frame_rate, width, height, format="rgb24", encoder_format="rgb24")
# Play the video
with s.open():
    for chunk in video_streamer(VIDEO_PATH, frames_per_chunk=256):
        try:
            s.write_video_chunk(0, chunk)
        except RuntimeError:
            running = False
            break
Streaming Video¶
So far, we looked at how to write to hardware devices. There are some alternative methods for video streaming.
RTMP (Real-Time Messaging Protocol)¶
Using RTMP, you can stream media (video and/or audio) to a single client. This does not require a hardware device, but it requires a separate player.
To use RTMP, specify the protocol and route in the dst argument of the
StreamWriter constructor, then pass the {"listen": "1"} option when opening
the destination.
StreamWriter will listen to the port and wait for a client to request the video.
The call to open blocks until a request is received.
s = StreamWriter(dst="rtmp://localhost:1935/live/app", format="flv")
s.add_audio_stream(sample_rate=sample_rate, num_channels=num_channels, encoder="aac")
s.add_video_stream(frame_rate=frame_rate, width=width, height=height)
with s.open(option={"listen": "1"}):
    for video_chunk, audio_chunk in generator():
        s.write_audio_chunk(0, audio_chunk)
        s.write_video_chunk(1, video_chunk)
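The generator() call above is not defined in this section. As one possible shape for it, the sketch below pairs each block of video frames with the audio samples that cover the same time span. paired_chunks and its parameters are hypothetical; it accepts any sliceable sequences with time as the first dimension, so plain lists stand in for tensors here.

```python
def paired_chunks(video_frames, audio_samples, frame_rate, sample_rate, frames_per_chunk):
    """Yield (video_chunk, audio_chunk) pairs that span the same duration."""
    # Number of audio samples played during `frames_per_chunk` video frames.
    samples_per_chunk = round(frames_per_chunk * sample_rate / frame_rate)
    v = a = 0
    while v < len(video_frames):
        yield (
            video_frames[v:v + frames_per_chunk],
            audio_samples[a:a + samples_per_chunk],
        )
        v += frames_per_chunk
        a += samples_per_chunk


# Toy data: 6 "frames" at 2 fps and 12 "samples" at 4 Hz, i.e. 3 seconds each.
video = list(range(6))
audio = list(range(12))
pairs = list(paired_chunks(video, audio, frame_rate=2, sample_rate=4, frames_per_chunk=2))
print(len(pairs))  # 3
print(pairs[0])    # ([0, 1], [0, 1, 2, 3])
```

If the audio is shorter than the video, the later audio chunks come out empty, so a real generator would need to decide how to handle that case.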
UDP (User Datagram Protocol)¶
Using UDP, you can stream media (video and/or audio) to a socket. This does not require a hardware device, but it requires a separate player.
Unlike RTMP, the streaming and client processes are disconnected. The streaming process is not aware of the client process.
s = StreamWriter(dst="udp://localhost:48550", format="mpegts")
s.add_audio_stream(sample_rate=sample_rate, num_channels=num_channels, encoder="aac")
s.add_video_stream(frame_rate=frame_rate, width=width, height=height)
with s.open():
    for video_chunk, audio_chunk in generator():
        s.write_audio_chunk(0, audio_chunk)
        s.write_video_chunk(1, video_chunk)
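The fire-and-forget nature of UDP can be seen with the standard socket module alone. This sketch is independent of StreamWriter and FFmpeg; it only illustrates that sending a datagram succeeds whether or not anything is listening. The send_datagram helper, the payload and the OS-assigned port are choices made for the example.

```python
import socket


def send_datagram(host, port, payload):
    """Send one UDP datagram and return the number of bytes handed to the OS."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        return sender.sendto(payload, (host, port))


# Case 1: a receiver is listening, so the datagram arrives.
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
    receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    receiver.settimeout(2.0)
    port = receiver.getsockname()[1]
    sent = send_datagram("127.0.0.1", port, b"chunk")
    received, _ = receiver.recvfrom(1024)

# Case 2: the receiver has been closed; the sender is not told about it.
sent_without_listener = send_datagram("127.0.0.1", port, b"chunk")
print(sent, received, sent_without_listener)
```

This is why a player can attach to or detach from a UDP stream at any time without affecting the process that writes it.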
Tag: torchaudio.io