Sending a custom video stream through WebRTC

2015/09/06

Tags: jingle c++ p2p video streaming remote video stream webrtc custom video stream libjingle peerconnection client rtsp

WebRTC is used to create video call enabled p2p applications. By default it only supports sending local webcam and audio input to a peer. However, it might be useful to send a remote video stream to a peer instead - for example an RTSP stream from an IP camera.
In this post I’ll focus on modifying the peerconnection_client example to send a remote RTSP stream to another peer.

The first step is to build the peerconnection_client and peerconnection_server applications.
This can be done by following the instructions on the WebRTC site regarding fetching the source code, the dependencies and the build tools.
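
At the time of writing, with depot_tools installed and on your PATH, getting the code boils down to something like this (a sketch, check the WebRTC site for the current steps):

mkdir webrtc && cd webrtc
fetch webrtc
gclient sync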

After you get the build tools and the source code, you have to configure the build environment. Define the following environment variables:

export GYP_DEFINES="build_with_libjingle=1 build_with_chromium=0 libjingle_java=1 OS=linux enable_tracing=1"
export GYP_GENERATOR_FLAGS="output_dir=out"

This isn’t a required step, but you may want to do it if you don’t want to build the whole Chromium source.
Next you need to generate the build scripts and build peerconnection_client and peerconnection_server:

cd [webrtc checkout dir]/src
gclient runhooks
ninja -C out/Debug peerconnection_client
ninja -C out/Debug peerconnection_server

After these steps you should have the peerconnection_client and peerconnection_server executables in the out/Debug directory. You should try to run the default peer connection example before we start modifying the code, to see that everything is ok. You can follow these instructions to get the sample running: http://www.webrtc.org/native-code/development#TOC-Peerconnection-.
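
For reference, a local test session looks something like this (the signalling server listens on port 8888 by default):

./out/Debug/peerconnection_server &
./out/Debug/peerconnection_client   #first peer
./out/Debug/peerconnection_client   #second peer, connect to the first one from the UI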

At this point the base sample should be in place and working, and we can start modifying the code.
In my example I am using the OpenCV libraries to read an RTSP stream and to convert it to a format accepted by WebRTC.
On Ubuntu you can install OpenCV like this (note that besides the core module, the code below also uses the videoio, imgproc and highgui modules, so you may prefer the full libopencv-dev package):

sudo apt-get install libopencv-core-dev
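
Before wiring OpenCV into WebRTC, it is worth checking that your OpenCV build can actually open and read the RTSP stream. A minimal standalone check could look like this (check_rtsp.cpp is just an example file name; the URL is the same test stream used later in CustomVideoCapturer.cpp):

#include <iostream>
#include "opencv2/opencv.hpp"

int main() {
    //open the test RTSP stream
    cv::VideoCapture capture("rtsp://184.72.239.149/vod/mp4:BigBuckBunny_175k.mov");
    if (!capture.isOpened()) {
        std::cout << "Failed to open the RTSP stream" << std::endl;
        return 1;
    }
    //grab a bounded number of frames to prove decoding works
    cv::Mat frame;
    int frames = 0;
    while (frames < 100 && capture.read(frame))
        ++frames;
    std::cout << "Read " << frames << " frames, last one was "
              << frame.cols << "x" << frame.rows << std::endl;
    return 0;
}

You can build it with something like g++ check_rtsp.cpp -o check_rtsp $(pkg-config --cflags --libs opencv).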

With the prerequisites installed, we need to modify the code in order to add our custom stream. This is done by implementing a CustomVideoCapturer class that extends the existing cricket::VideoCapturer class.
Go to examples/peerconnection/client and create two files, CustomVideoCapturer.h and CustomVideoCapturer.cpp, with the following contents:
CustomVideoCapturer.h:

#pragma once  
  
#include "opencv2/videoio.hpp"  
#include "talk/media/base/videocapturer.h"  
  
namespace videocapture {  
  
class CustomVideoCapturer :  
        public cricket::VideoCapturer  
{  
public:  
    CustomVideoCapturer(int deviceId);  
    virtual ~CustomVideoCapturer();  
  
    // cricket::VideoCapturer implementation.  
    virtual cricket::CaptureState Start(const cricket::VideoFormat& capture_format) override;  
    virtual void Stop() override;  
    virtual bool IsRunning() override;  
    virtual bool GetPreferredFourccs(std::vector<uint32>* fourccs) override;  
    virtual bool GetBestCaptureFormat(const cricket::VideoFormat& desired, cricket::VideoFormat* best_format) override;  
    virtual bool IsScreencast() const override;  
  
private:  
    DISALLOW_COPY_AND_ASSIGN(CustomVideoCapturer);  
  
    static void* grabCapture(void* arg);  
  
    //to call the SignalFrameCaptured call on the main thread  
    void SignalFrameCapturedOnStartThread(const cricket::CapturedFrame* frame);  
  
    cv::VideoCapture m_VCapture; //opencv capture object  
    rtc::Thread*  m_startThread; //video capture thread  
};  
  
  
class VideoCapturerFactoryCustom : public cricket::VideoDeviceCapturerFactory  
{  
public:  
    VideoCapturerFactoryCustom() {}  
    virtual ~VideoCapturerFactoryCustom() {}  
  
    virtual cricket::VideoCapturer* Create(const cricket::Device& device) {  
  
        // XXX: WebRTC uses the device id to instantiate the capturer; in this example it is always 0.  
        return new CustomVideoCapturer(atoi(device.id.c_str()));  
    }  
};  
  
} // namespace videocapture  

CustomVideoCapturer.cpp:

#include "customvideocapturer.h"  
#include <iostream>  
#include <pthread.h>  
#include <sys/time.h>  
  
#include "opencv2/opencv.hpp"  
#include <opencv2/core/core.hpp>  
#include <opencv2/highgui/highgui.hpp>  
  
#include "webrtc/common_video/libyuv/include/webrtc_libyuv.h"  
#include "talk/media/webrtc/webrtcvideocapturer.h"  
  
#include <memory>  
  
using std::endl;  
  
namespace videocapture {  
  
#define STREAM "rtsp://184.72.239.149/vod/mp4:BigBuckBunny_175k.mov"  
  
pthread_t g_pthread;  
  
CustomVideoCapturer::CustomVideoCapturer(int deviceId)  
{  
    m_VCapture.open(STREAM);  
}  
  
CustomVideoCapturer::~CustomVideoCapturer()  
{  
}  
  
cricket::CaptureState CustomVideoCapturer::Start(const cricket::VideoFormat& capture_format)  
{  
    std::cout << "Start" << endl;  
    if (capture_state() == cricket::CS_RUNNING) {  
        std::cout << "Start called when it's already started." << endl;  
        return capture_state();  
    }  
  
    while(!m_VCapture.isOpened()){  
        std::cout << "Capturer is not open -> will try to reopen" << endl;  
        m_VCapture.open(STREAM);  
    }  
    //get a reference to the current thread so we can send the frames to webrtc  
    //on the same thread on which the capture was started  
    m_startThread = rtc::Thread::Current();  
  
    //start frame grabbing thread  
    pthread_create(&g_pthread, NULL, grabCapture, (void*)this);  
  
    SetCaptureFormat(&capture_format);  
    return cricket::CS_RUNNING;  
}  
  
void CustomVideoCapturer::Stop()  
{  
    std::cout << "Stop" << endl;  
    if (capture_state() == cricket::CS_STOPPED) {  
        std::cout << "Stop called when it's already stopped." << endl;  
        return;  
    }  
  
    //flip the state first so the grab thread's IsRunning() check fails  
    //and the loop stops using m_startThread before we drop the reference  
    SetCaptureFormat(NULL);  
    SetCaptureState(cricket::CS_STOPPED);  
  
    m_startThread = nullptr;  
}  
  
/*static */void* CustomVideoCapturer::grabCapture(void* arg)  
{  
    CustomVideoCapturer *vc = (CustomVideoCapturer*)arg;  
    cv::Mat frame;  
  
    if(nullptr == vc){  
        std::cout << "VideoCapturer pointer is null" << std::endl;  
        return 0;  
    }  
  
    while(vc->m_VCapture.read(frame) && vc->IsRunning()){  
        cv::Mat bgra(frame.rows, frame.cols, CV_8UC4);  
        //opencv reads the stream in BGR format by default  
        cv::cvtColor(frame, bgra, CV_BGR2BGRA);  
  
        webrtc::VideoFrame vframe;  
        if(0 != vframe.CreateEmptyFrame(bgra.cols, bgra.rows, bgra.cols, (bgra.cols+1) /2, (bgra.cols+1) /2) )  
        {  
            std::cout << "Failed to create empty frame" << std::endl;  
        }  
        //convert the frame to I420, which is the supported format for webrtc transport  
        if(0 != webrtc::ConvertToI420(webrtc::kBGRA, bgra.ptr(), 0, 0, bgra.cols, bgra.rows, 0, webrtc::kVideoRotation_0, &vframe) ){  
            std::cout << "Failed to convert frame to i420" << std::endl;  
        }  
        std::vector<uint8_t> capture_buffer_;  
        size_t length = webrtc::CalcBufferSize(webrtc::kI420, vframe.width(), vframe.height());  
        capture_buffer_.resize(length);  
        webrtc::ExtractBuffer(vframe, length, &capture_buffer_[0]);  
        std::shared_ptr<cricket::WebRtcCapturedFrame> webrtc_frame(new cricket::WebRtcCapturedFrame(vframe, &capture_buffer_[0], length));  
  
        //forward the frame to the video capture start thread  
        if (vc->m_startThread->IsCurrent()) {  
            vc->SignalFrameCaptured(vc, webrtc_frame.get());  
        } else {  
            vc->m_startThread->Invoke<void>(rtc::Bind(&CustomVideoCapturer::SignalFrameCapturedOnStartThread, vc, webrtc_frame.get()));  
        }  
    }  
    return 0;  
}  
  
void CustomVideoCapturer::SignalFrameCapturedOnStartThread(const cricket::CapturedFrame* frame)  
{  
    SignalFrameCaptured(this, frame);  
}  
  
bool CustomVideoCapturer::IsRunning()  
{  
    return capture_state() == cricket::CS_RUNNING;  
}  
  
bool CustomVideoCapturer::GetPreferredFourccs(std::vector<uint32>* fourccs)  
{  
    if (!fourccs)  
        return false;  
    fourccs->push_back(cricket::FOURCC_I420);  
    return true;  
}  
  
bool CustomVideoCapturer::GetBestCaptureFormat(const cricket::VideoFormat& desired, cricket::VideoFormat* best_format)  
{  
    if (!best_format)  
        return false;  
  
    // Use the desired format as the best format.  
    best_format->width = desired.width;  
    best_format->height = desired.height;  
    best_format->fourcc = cricket::FOURCC_I420;  
    best_format->interval = desired.interval;  
    return true;  
}  
  
bool CustomVideoCapturer::IsScreencast() const  
{  
    return false;  
}  
  
} // namespace videocapture  

The code should be somewhat self-explanatory. As a high level picture: we are reading the hardcoded RTSP stream with OpenCV, frame by frame, converting each frame from BGR to BGRA and then to I420, which is the supported format for WebRTC video transfer (the intermediate BGRA step is needed because WebRTC doesn’t provide utilities to convert directly from BGR to I420).
To use the custom stream we need to modify the conductor.cc file, and rewrite the OpenVideoCaptureDevice method as follows:

cricket::VideoCapturer* Conductor::OpenVideoCaptureDevice() {  
  rtc::scoped_ptr<cricket::DeviceManagerInterface> dev_manager(  
      cricket::DeviceManagerFactory::Create());  
  if (!dev_manager->Init()) {  
    LOG(LS_ERROR) << "Can't create device manager";  
    return NULL;  
  }  
  
  //Inject our video capturer  
  cricket::DeviceManager* device_manager = static_cast<cricket::DeviceManager*>(dev_manager.get());  
  device_manager->SetVideoDeviceCapturerFactory(new videocapture::VideoCapturerFactoryCustom());  
  
  cricket::VideoCapturer* capturer = NULL;  
  
  cricket::Device dummyDevice;  
  dummyDevice.name = "custom dummy device";  
  capturer = dev_manager->CreateVideoCapturer(dummyDevice);  
  if (capturer == NULL){  
      LOG(LS_ERROR) << "Capturer is NULL!";  
  }  
  
  return capturer;  
}  

Basically we’re injecting our VideoCapturerFactoryCustom using a dummy cricket::Device object.
I’ve also disabled the small local video window in my example by commenting out this line in Conductor::AddStreams:

main_wnd_->StartLocalRenderer(video_track);  

After this we have to modify the build scripts. In libjingle_examples.gyp, find the peerconnection_client target for the linux OS and update the sources tag with the new files, the cflags tag with the additional include directories, and the libraries tag with the additional libraries for linking; a sketch of these changes is shown below.
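
The exact values depend on your OpenCV version and install locations, so the following is only a sketch of the relevant additions (the include path and library names are assumptions for a typical Ubuntu install):

'target_name': 'peerconnection_client',
...
'sources': [
  ...
  'examples/peerconnection/client/CustomVideoCapturer.cpp',
  'examples/peerconnection/client/CustomVideoCapturer.h',
],
'cflags': [
  '-I/usr/local/include',   # OpenCV include directory, adjust as needed
],
'libraries': [
  '-lopencv_core',
  '-lopencv_imgproc',       # cvtColor
  '-lopencv_videoio',       # cv::VideoCapture (OpenCV 3.x)
  '-lopencv_highgui',
],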
Regenerate the build scripts and re-build peerconnection_client:

gclient runhooks
ninja -C out/Debug peerconnection_client

You should now have a working peerconnection_client sending the custom RTSP stream to the other peer. There is one problem though: only the first frame is received, and the console keeps printing this log:

webrtc: (video_capture_input.cc:108): Same/old NTP timestamp for incoming frame. Dropping.

All the other frames are dropped because they have the same NTP timestamp. This is a bug in WebRTC: the timestamps of the video frames are not taken into consideration. The only fix I managed to make work is to update the timestamp before the check is done.
In video_capture_input.cc, at the top of VideoCaptureInput::IncomingCapturedFrame, add the following line just before the NTP timestamp check:

void VideoCaptureInput::IncomingCapturedFrame(const VideoFrame& video_frame) {  
...  
//overwrite the NTP timestamp so the "Same/old NTP timestamp" check below passes  
incoming_frame.set_ntp_time_ms(Clock::GetRealTimeClock()->CurrentNtpInMilliseconds());  

After this change you should have the remote video working on each peer connection.
The end result is two peerconnection_client instances running on the same machine, each rendering the remote RTSP stream.
