Troubleshooting SRS, Note 1: RTC playback stalls when publishing over RTMP

Problem description:

SRS version 4.0.126. Configuration file:

listen              1935;
max_connections     100;
daemon              off;
srs_log_tank        console;

http_server {
    enabled         on;
    listen          8080;
    dir             ./objs/nginx/html;
}

http_api {
    enabled         on;
    listen          1985;
}
stats {
    network         0;
}
rtc_server {
    enabled on;
    # Listen at udp://8000
    listen 8000;
    #
    # The $CANDIDATE means fetch from env, if not configed, use * as default.
    #
    # The * means retrieving server IP automatically, from all network interfaces,
    # @see https://github.com/ossrs/srs/wiki/v4_CN_RTCWiki#config-candidate
    candidate 192.168.71.35; # local machine IP
}

vhost __defaultVhost__ {
    rtc {
        enabled     on;
    }
    http_remux {
        enabled     on;
        mount       [vhost]/[app]/[stream].flv;
    }
    http_hooks {
        enabled on;
        on_rtc_play http://127.0.0.1/control/relay/subscriber;
    }
}

Note: on_rtc_play http://127.0.0.1/control/relay/subscriber; is a custom extension. When an RTC player pulls a stream that does not yet exist in SRS, the callback asks Nginx-RTMP to publish that stream to SRS over RTMP.

1. Publish a stream with ffmpeg:

ffmpeg -re -i ~/Downloads/test.mp4 -c copy -f flv rtmp://127.0.0.1/live/livestream

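2. Play the stream with the SRS player at http://127.0.0.1:8080/. Playback works.

3. Open a second page and play the same stream with the SRS player at http://127.0.0.1:8080/. The second page cannot play, and the page from step 2 freezes as well.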

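As long as rtc enabled on; is set, the following problem occurs even when RTC playback is not used:
  1. Publish a stream with ffmpeg.
  2. Play the stream with ffplay.
  3. Publish the same stream again with ffmpeg.
  4. The playback from step 2 keeps working, but after closing it, playing the stream again with ffplay fails.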

Analysis

1. The log contains:

cleanup when unpublish
cleanup when unpublish, created=%u, deliver=%u

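This shows that an unpublish was triggered. My guess: the second RTC play called back http://127.0.0.1/control/relay/subscriber, which made the Nginx-RTMP service publish to SRS again with the same app and stream name. Because this second RTMP publish duplicates an existing stream, it fails and then enters the unpublish call chain.

2. The function void SrsLiveSource::on_unpublish() frees bridger_, the object that converts RTMP to RTC.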

void SrsLiveSource::on_unpublish()
{
    // ... (omitted)
    if (bridger_) {
        bridger_->on_unpublish();
        srs_freep(bridger_);
    }
    // ... (omitted)
}
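
To illustrate why freeing the bridger stalls RTC players, here is a minimal sketch. BridgerModel and LiveSourceModel are hypothetical stand-ins, not SRS classes, and the sketch assumes that frames reach RTC only while a bridger is attached to the live source.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for the RTMP-to-RTC bridger: it counts forwarded frames.
struct BridgerModel {
    int* forwarded;
    explicit BridgerModel(int* counter) : forwarded(counter) {}
    void on_frame() { ++(*forwarded); }
};

// Hypothetical, simplified model of a live source that owns an optional bridger.
class LiveSourceModel {
public:
    void set_bridger(std::unique_ptr<BridgerModel> b) { bridger_ = std::move(b); }
    // Models the effect of on_unpublish(): the bridger is destroyed.
    void on_unpublish() { bridger_.reset(); }
    // Assumption: frames are forwarded to RTC only while a bridger exists.
    void on_frame() { if (bridger_) bridger_->on_frame(); }
    bool has_bridger() const { return bridger_ != nullptr; }
private:
    std::unique_ptr<BridgerModel> bridger_;
};
```

In this model, any frame that arrives after on_unpublish() is silently dropped on the RTC side, which matches the symptom of an RTC page that freezes while the RTMP publisher is still connected.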

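3. Duplicate publishes are common with RTMP. Nginx-RTMP rejects only the second, duplicate publish and leaves the stream from the first publisher untouched. The RTMP service in SRS should have similar logic, so I looked at the code: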

srs_error_t SrsRtmpConn::publishing(SrsLiveSource* source)
{
    srs_error_t err = srs_success;
    
    SrsRequest* req = info->req;
    
    if (_srs_config->get_refer_enabled(req->vhost)) {
        if ((err = refer->check(req->pageUrl, _srs_config->get_refer_publish(req->vhost))) != srs_success) {
            return srs_error_wrap(err, "rtmp: referer check");
        }
    }
    
    if ((err = http_hooks_on_publish()) != srs_success) {
        return srs_error_wrap(err, "rtmp: callback on publish");
    }
    
    // TODO: FIXME: Should refine the state of publishing.
    if ((err = acquire_publish(source)) == srs_success) {
        // use isolate thread to recv,
        // @see: https://github.com/ossrs/srs/issues/237
        SrsPublishRecvThread rtrd(rtmp, req, srs_netfd_fileno(stfd), 0, this, source, _srs_context->get_id());
        err = do_publishing(source, &rtrd);
        rtrd.stop();
    }
    
    // whatever the acquire publish, always release publish.
    // when the acquire error in the midlle-way, the publish state changed,
    // but failed, so we must cleanup it.
    // @see https://github.com/ossrs/srs/issues/474
    // @remark when stream is busy, should never release it.
    if (srs_error_code(err) != ERROR_SYSTEM_STREAM_BUSY) {
        release_publish(source);
    }
    
    http_hooks_on_unpublish();
    
    return err;
}

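As the code shows, release_publish(source) is called whenever srs_error_code(err) != ERROR_SYSTEM_STREAM_BUSY. In other words, ERROR_SYSTEM_STREAM_BUSY is the only error code that leaves the existing publish untouched.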
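
The release rule can be reduced to a single predicate. This is only a sketch: the ErrorCode enumerators are hypothetical stand-ins for the SRS error codes, and their numeric values are not the ones SRS defines.

```cpp
// Hypothetical stand-ins for the SRS error codes. The numeric values are
// illustrative and are not the values SRS defines.
enum ErrorCode { kSuccess = 0, kStreamBusy = 1, kRtcSourceBusy = 2 };

// Mirrors the condition at the end of SrsRtmpConn::publishing():
// every result except "stream busy" leads to release_publish(source).
bool should_release(ErrorCode acquire_result) {
    return acquire_result != kStreamBusy;
}
```

Under this rule, a duplicate publish that fails with any other busy code, such as the RTC one, still triggers the release path.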

4. Set a breakpoint in the code that generates the error code

srs_error_t SrsRtmpConn::acquire_publish(SrsLiveSource* source)
{
    srs_error_t err = srs_success;
    
    SrsRequest* req = info->req;
	
    // @see https://github.com/ossrs/srs/issues/2364
    // Check whether GB28181 stream is busy.
#if defined(SRS_GB28181)
    if (_srs_gb28181 != NULL) {
        SrsGb28181RtmpMuxer* gb28181 = _srs_gb28181->fetch_rtmpmuxer(req->stream);
        if (gb28181 != NULL) {
            return srs_error_new(ERROR_SYSTEM_STREAM_BUSY, "gb28181 stream %s busy", req->get_stream_url().c_str());
        }
    }
#endif

    // Check whether RTC stream is busy.
#ifdef SRS_RTC
    SrsRtcSource *rtc = NULL;
    bool rtc_server_enabled = _srs_config->get_rtc_server_enabled();
    bool rtc_enabled = _srs_config->get_rtc_enabled(req->vhost);
    if (rtc_server_enabled && rtc_enabled && !info->edge) {
        if ((err = _srs_rtc_sources->fetch_or_create(req, &rtc)) != srs_success) {
            return srs_error_wrap(err, "create source");
        }

        if (!rtc->can_publish()) {
            return srs_error_new(ERROR_RTC_SOURCE_BUSY, "rtc stream %s busy", req->get_stream_url().c_str());
        }
    }
#endif

    // Check whether RTMP stream is busy.
    if (!source->can_publish(info->edge)) {
        return srs_error_new(ERROR_SYSTEM_STREAM_BUSY, "rtmp: stream %s is busy", req->get_stream_url().c_str());
    }

    // Bridge to RTC streaming.
#if defined(SRS_RTC) && defined(SRS_FFMPEG_FIT)
    if (rtc) {
        SrsRtcFromRtmpBridger *bridger = new SrsRtcFromRtmpBridger(rtc);
        if ((err = bridger->initialize(req)) != srs_success) {
            srs_freep(bridger);
            return srs_error_wrap(err, "bridger init");
        }

        source->set_bridger(bridger);
    }
#endif

    // Start publisher now.
    if (info->edge) {
        return source->on_edge_start_publish();
    } else {
        return source->on_publish();
    }
}

The function returns an error directly at this step:

        if (!rtc->can_publish()) {
            return srs_error_new(ERROR_RTC_SOURCE_BUSY, "rtc stream %s busy", req->get_stream_url().c_str());
        }

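This happens because the first publish, through the RTMP-to-RTC bridger_, already set is_created_ = true in the SrsRtcSource, so the check reports that an RTC publish already exists. The error code is therefore ERROR_RTC_SOURCE_BUSY, not ERROR_SYSTEM_STREAM_BUSY, the code for a duplicate RTMP publish. Since publishing() skips release_publish only for ERROR_SYSTEM_STREAM_BUSY, the failed duplicate publish goes on to release the source that the first publisher is still using.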
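
A minimal sketch of the flag involved, assuming can_publish() simply reports whether the RTC source has already been created. RtcSourceModel is a hypothetical stand-in and may not match SrsRtcSource line for line.

```cpp
// Hypothetical model of the per-stream RTC source.
class RtcSourceModel {
public:
    // Assumption: the source accepts a publisher only if it was never created.
    bool can_publish() const { return !is_created_; }
    // Called when the bridger of the first RTMP publisher starts feeding RTC.
    void on_publish() { is_created_ = true; }
private:
    bool is_created_ = false;
};
```

Once the first RTMP publisher has bridged into RTC, every later RTMP publish of the same stream sees can_publish() return false before the RTMP check is ever reached.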

Fix

In acquire_publish, swap the order of the RTC duplicate-publish check and the RTMP duplicate-publish check, as follows:

srs_error_t SrsRtmpConn::acquire_publish(SrsLiveSource* source)
{
    srs_error_t err = srs_success;
    
    SrsRequest* req = info->req;
	
    // @see https://github.com/ossrs/srs/issues/2364
    // Check whether GB28181 stream is busy.
#if defined(SRS_GB28181)
    if (_srs_gb28181 != NULL) {
        SrsGb28181RtmpMuxer* gb28181 = _srs_gb28181->fetch_rtmpmuxer(req->stream);
        if (gb28181 != NULL) {
            return srs_error_new(ERROR_SYSTEM_STREAM_BUSY, "gb28181 stream %s busy", req->get_stream_url().c_str());
        }
    }
#endif

    // Check whether RTMP stream is busy.
    if (!source->can_publish(info->edge)) {
        return srs_error_new(ERROR_SYSTEM_STREAM_BUSY, "rtmp: stream %s is busy", req->get_stream_url().c_str());
    }

    // Check whether RTC stream is busy.
#ifdef SRS_RTC
    SrsRtcSource *rtc = NULL;
    bool rtc_server_enabled = _srs_config->get_rtc_server_enabled();
    bool rtc_enabled = _srs_config->get_rtc_enabled(req->vhost);
    if (rtc_server_enabled && rtc_enabled && !info->edge) {
        if ((err = _srs_rtc_sources->fetch_or_create(req, &rtc)) != srs_success) {
            return srs_error_wrap(err, "create source");
        }

        if (!rtc->can_publish()) {
            return srs_error_new(ERROR_RTC_SOURCE_BUSY, "rtc stream %s busy", req->get_stream_url().c_str());
        }
    }
#endif

    // Bridge to RTC streaming.
#if defined(SRS_RTC) && defined(SRS_FFMPEG_FIT)
    if (rtc) {
        SrsRtcFromRtmpBridger *bridger = new SrsRtcFromRtmpBridger(rtc);
        if ((err = bridger->initialize(req)) != srs_success) {
            srs_freep(bridger);
            return srs_error_wrap(err, "bridger init");
        }

        source->set_bridger(bridger);
    }
#endif

    // Start publisher now.
    if (info->edge) {
        return source->on_edge_start_publish();
    } else {
        return source->on_publish();
    }
}
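
The effect of the reordering can be checked with a small model that combines the busy checks with the release rule from publishing(). All names here (ErrorCode, StreamState, check_busy, publish_attempt) are hypothetical, and the model ignores edge mode and GB28181.

```cpp
// Hypothetical stand-ins for success and the two busy error codes.
enum ErrorCode { kSuccess, kStreamBusy, kRtcSourceBusy };

// Shared per-stream state seen by every RTMP connection.
struct StreamState {
    bool rtmp_publishing = false;  // a publisher already holds the live source
    bool rtc_created = false;      // the bridger already created the RTC source
};

// Busy checks only. rtc_first selects the original order (true)
// or the fixed order (false).
ErrorCode check_busy(const StreamState& s, bool rtc_first) {
    if (rtc_first && s.rtc_created) return kRtcSourceBusy;
    if (s.rtmp_publishing) return kStreamBusy;
    if (s.rtc_created) return kRtcSourceBusy;
    return kSuccess;
}

// One publish attempt, including the release rule from publishing().
ErrorCode publish_attempt(StreamState& s, bool rtc_first) {
    ErrorCode code = check_busy(s, rtc_first);
    if (code == kSuccess) {
        // First publisher: takes the live source and bridges into RTC.
        s.rtmp_publishing = true;
        s.rtc_created = true;
        return code;
    }
    if (code != kStreamBusy) {
        // Models release_publish(): the shared state is torn down.
        s.rtmp_publishing = false;
        s.rtc_created = false;
    }
    return code;
}
```

With rtc_first set to true, a second publish_attempt on the same StreamState returns kRtcSourceBusy and clears the state that the first publisher relies on. With rtc_first set to false, it returns kStreamBusy and the first publisher's state stays intact.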
