Planning audio and video network bandwidth requirements

Video streams are shared via networks and can use significant network resources. This topic explains how IBM Sametime® video uses bandwidth and how to calculate the amount of bandwidth you need in your environment.

The audio and video bandwidth that your organization needs depends on two factors.

The first factor is concurrency: the number of users participating in video conferences at the same time. Estimating it requires an educated guess based on the organization's existing data, such as planning assumptions, usage culture, usage patterns for similar technology, or pilot programs. If the estimate is too high, you pay for bandwidth that goes unused. Conversely, if the estimate is too low, audio and video quality might not be acceptable, and other networked applications might suffer when bandwidth capacity is exceeded. For this part of the assessment, poll your users, collect metrics on their video use, or find another way to estimate how extensively they use the video feature.

The second factor is how the media is generated and packaged. Each communication session carries either audio only or audio and video, so the network bandwidth required for a user is essentially the total bitrate of the codecs used in the session. Audio codecs generally require lower bitrates than video codecs because audio carries less data than video. The first step in determining the bandwidth cost of a given session is therefore to know which codecs are in use.
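The two factors can be combined into a first-order estimate. The following Python sketch is illustrative only: the function name and the sample figures are assumptions, and it counts codec bitrates alone, as the text describes, without packet header overhead.

```python
def required_bandwidth_kbps(concurrent_users, audio_kbps, video_kbps=0):
    """Estimate one-way media bandwidth for a number of simultaneous users.

    The per-user cost is the sum of the codec bitrates in the session.
    """
    if concurrent_users < 0:
        raise ValueError("concurrent_users must be non-negative")
    per_user_kbps = audio_kbps + video_kbps
    return concurrent_users * per_user_kbps


# Hypothetical planning input: 50 simultaneous users,
# 48 kbps audio and 384 kbps video each.
estimate = required_bandwidth_kbps(50, audio_kbps=48, video_kbps=384)
print(f"{estimate} kbps ({estimate / 1000:.1f} Mbps)")
```

Replace the sample figures with the concurrency estimate and codec bitrates that apply to your own deployment.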

Sametime provides six audio codecs (SAC (Siren-LPR Scalable), Siren-LPR, G.722.1C, G.722.1, G.729, and G.711) and three video codecs (H.264-SVC, H.264, and H.263). Each codec requires a different amount of network bandwidth to operate. Within a video codec, many attributes affect the data payload size and bitrate. For example, HD resolutions require more bandwidth than SD resolutions.

Sametime defaults to SAC (Siren-LPR Scalable) for audio and H.264-SVC for video during SIP session negotiation between two Sametime endpoints, either client to client or client to the Video MCU. However, in an environment integrated with an external audio/video bridge, endpoints can select codecs other than SAC (Siren-LPR Scalable) and H.264-SVC to establish the call. This flexibility affects bandwidth and can be configured and controlled from the external bridge.

Sametime provides capabilities to protect the network from being overrun by audio and video data packets if usage concurrency is higher than expected. When these capabilities are deployed, each audio and video call is monitored to control bandwidth usage based on user class and location policies. A call can be allowed, rejected, or modified to stay within the network bandwidth constraint imposed for audio and video.

The following sections describe in detail the Sametime audio and video codecs usage and network bandwidth management.

Audio Codecs

Sametime uses Siren-LPR Scalable (Scalable Audio Codec) for audio communication. The available bandwidth is split among all users in the call. More bandwidth is allocated to the active speaker (approximately 48 kbps) and less to each background speaker (approximately 10 kbps).
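As a rough illustration of this split, the sketch below adds the approximate per-speaker figures quoted above. It assumes that the figures are in kbps and that there is a single active speaker at a time. The function name is hypothetical, and the result is an estimate, not a value reported by Sametime.

```python
ACTIVE_SPEAKER_KBPS = 48      # approximate figure quoted in the text
BACKGROUND_SPEAKER_KBPS = 10  # approximate figure quoted in the text


def sac_audio_estimate_kbps(speakers, active_speakers=1):
    """Approximate audio bandwidth for a call that uses SAC.

    speakers: total number of speakers in the call.
    active_speakers: how many of them are active (assumed to be 1).
    """
    if active_speakers < 0 or speakers < active_speakers:
        raise ValueError("speakers must be >= active_speakers >= 0")
    background = speakers - active_speakers
    return (active_speakers * ACTIVE_SPEAKER_KBPS
            + background * BACKGROUND_SPEAKER_KBPS)


# One active speaker and five background speakers.
print(sac_audio_estimate_kbps(6))
```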

The following table lists the bandwidth requirements for each of the audio codecs supported by Sametime.

Table 1. Sametime audio codecs, bitrates and sampling rates

Codec Name                  Bitrate (kbps)       Sampling Rate (kHz)
SAC (Siren-LPR Scalable)    32 / 48 / 64         48
Siren-LPR                   24 / 32 / 48 / 64    48
G.722.1C                    24 / 32 / 48         32
G.722.1                     16 / 24 / 32         16
G.729 (only used in SUT)    8                    8
G.711                       64                   8

Video Codecs

Sametime uses H.264-SVC (Scalable Video Coding) to enhance video conferencing. SVC sends video in layers. A less capable client requests only the lower, lower-quality layers, while a more capable client receives multiple layers, which combine to display higher-quality video. The benefit over previous technologies is that SVC degrades video more gracefully when bandwidth is low or CPU usage is high. Like its predecessor H.264/AVC, SVC covers a wide application range, from low-bitrate mobile applications to High-Definition Television (HDTV) broadcasting. For more information on SVC, refer to RFC 6190.

The video conference experience is dictated by the client line rate, which the administrator sets as part of the user policy. Different groups of users can have different video policies. The policy also has a provision for adding conference templates. Each conference room owned by the user maps to a conference template in the video policy. The template specifies three important settings: Conference Mode, Conference Experience, and Conference Line Rate. These settings, along with the client line rate, determine the overall video experience for a conference.
  • Default setting for client line rate -- 384 kbps
  • Default setting for mobile client line rate -- 384 kbps
  • Default setting for a conference template:
    • Conference Mode - Mixed AVC+SVC
    • Conference Experience - Optimized for mobile devices
    • Conference Line Rate - 384 kbps
The client line rate defines the highest bandwidth that can be allocated to the client. The conference line rate determines the maximum bandwidth allowed for any user in the call.
Note: The conference line rate is not the aggregate of the bandwidth of all the users in the conference but a bandwidth per user.
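One way to read these two limits together is that a user's bandwidth in a given conference is capped by whichever rate is lower. The sketch below works under that assumption. The function names are hypothetical, and the calculation is an interpretation of the text, not documented Sametime behavior.

```python
def effective_line_rate_kbps(client_line_rate, conference_line_rate):
    """Per-user cap, assuming both limits apply and the lower one wins."""
    return min(client_line_rate, conference_line_rate)


def worst_case_conference_kbps(client_line_rates, conference_line_rate):
    """Upper bound for a conference: the sum of the per-user caps.

    The conference line rate is a per-user limit, so it is applied to
    each participant separately rather than to the total.
    """
    return sum(effective_line_rate_kbps(rate, conference_line_rate)
               for rate in client_line_rates)


# Three users with different client line rates in a 384 kbps conference.
print(effective_line_rate_kbps(1024, 384))
print(worst_case_conference_kbps([1024, 384, 256], 384))
```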

Sametime 8.5.2 clients use the video resolution parameter in the video policy to determine the maxbitrate, framerate, and video resolution.

Depending on the number of participants in the conference, a Sametime client can receive remote video streams of different resolutions. The video resolution of these streams is determined by the line rate assigned in the policy. For example, if there are ten participants in the conference, a client can receive a maximum of six remote video streams. With a line rate of 1024 kbps, two of those streams are 180p@30fps and four are 180p@15fps.

Table 2. Number of Remote Video Streams: 6

Bit Rate     Downlink resolution
1920 kbps    180p@30fps x 6
1024 kbps    180p@30fps x 2 + 180p@15fps x 4
768 kbps     180p@15fps x 4 + 180p@7.5fps x 2
512 kbps     180p@15fps x 5
384 kbps     180p@15fps x 1 + 180p@7.5fps x 2
256 kbps     180p@7.5fps x 2

Table 3. Number of Remote Video Streams: 5

Bit Rate     Downlink resolution
1920 kbps    180p@30fps x 5
1024 kbps    180p@30fps x 4 + 180p@15fps x 1
768 kbps     180p@15fps x 5
512 kbps     180p@7.5fps x 5
384 kbps     180p@15fps x 1 + 180p@7.5fps x 2
256 kbps     180p@7.5fps x 2

Table 4. Number of Remote Video Streams: 4

Bit Rate     Downlink resolution
1920 kbps    360p@30fps x 4
1024 kbps    180p@30fps x 4
768 kbps     360p@15fps x 1 + 180p@15fps x 3
512 kbps     180p@7.5fps x 4
384 kbps     180p@15fps x 1 + 180p@7.5fps x 2
256 kbps     180p@7.5fps x 2

Table 5. Number of Remote Video Streams: 3

Bit Rate     Downlink resolution
1920 kbps    360p@30fps x 3
1024 kbps    360p@30fps x 2 + 360p@15fps x 1
768 kbps     180p@30fps x 3
512 kbps     180p@15fps x 3
384 kbps     180p@15fps x 1 + 180p@7.5fps x 2
256 kbps     180p@7.5fps x 2

Table 6. Number of Remote Video Streams: 2

Bit Rate     Downlink resolution
1920 kbps    360p@30fps x 2
1024 kbps    360p@30fps x 2
768 kbps     360p@30fps x 1 + 360p@15fps x 1
512 kbps     180p@30fps x 2
384 kbps     180p@15fps x 2
256 kbps     180p@7.5fps x 2

Table 7. Number of Remote Video Streams: 1

Bit Rate     Downlink resolution
1920 kbps    720p@30fps x 1
1024 kbps    720p@30fps x 1
768 kbps     360p@30fps x 1
512 kbps     360p@30fps x 1
384 kbps     360p@30fps x 1
256 kbps     360p@30fps x 1
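For planning scripts, Tables 2 to 7 can be held in a single lookup structure. The Python sketch below copies the rows from those tables. The function name is hypothetical, and rounding an in-between line rate down to the nearest listed rate is an assumption, not documented behavior. The 1024 kbps entry for three streams is a reading of a cell that is garbled in the source.

```python
# Line rates listed in Tables 2 to 7, highest first.
LINE_RATES_KBPS = (1920, 1024, 768, 512, 384, 256)

# Remote stream count -> layouts, in the same order as LINE_RATES_KBPS.
DOWNLINK_LAYOUTS = {
    6: ("180p@30fps x 6",
        "180p@30fps x 2 + 180p@15fps x 4",
        "180p@15fps x 4 + 180p@7.5fps x 2",
        "180p@15fps x 5",
        "180p@15fps x 1 + 180p@7.5fps x 2",
        "180p@7.5fps x 2"),
    5: ("180p@30fps x 5",
        "180p@30fps x 4 + 180p@15fps x 1",
        "180p@15fps x 5",
        "180p@7.5fps x 5",
        "180p@15fps x 1 + 180p@7.5fps x 2",
        "180p@7.5fps x 2"),
    4: ("360p@30fps x 4",
        "180p@30fps x 4",
        "360p@15fps x 1 + 180p@15fps x 3",
        "180p@7.5fps x 4",
        "180p@15fps x 1 + 180p@7.5fps x 2",
        "180p@7.5fps x 2"),
    3: ("360p@30fps x 3",
        "360p@30fps x 2 + 360p@15fps x 1",  # reading of a garbled source cell
        "180p@30fps x 3",
        "180p@15fps x 3",
        "180p@15fps x 1 + 180p@7.5fps x 2",
        "180p@7.5fps x 2"),
    2: ("360p@30fps x 2",
        "360p@30fps x 2",
        "360p@30fps x 1 + 360p@15fps x 1",
        "180p@30fps x 2",
        "180p@15fps x 2",
        "180p@7.5fps x 2"),
    1: ("720p@30fps x 1",
        "720p@30fps x 1",
        "360p@30fps x 1",
        "360p@30fps x 1",
        "360p@30fps x 1",
        "360p@30fps x 1"),
}


def downlink_layout(remote_streams, line_rate_kbps):
    """Return (listed_rate, layout) for a stream count and line rate.

    Assumption: a line rate between two listed values is rounded down
    to the nearest listed rate.
    """
    if remote_streams not in DOWNLINK_LAYOUTS:
        raise ValueError("remote_streams must be between 1 and 6")
    layouts = DOWNLINK_LAYOUTS[remote_streams]
    for rate, layout in zip(LINE_RATES_KBPS, layouts):
        if rate <= line_rate_kbps:
            return rate, layout
    raise ValueError("line rate is below the lowest listed rate (256 kbps)")


print(downlink_layout(6, 1024))
```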

The Sametime client sends three temporal layers (T0, T1, and T2) of 7.5, 15, and 30 frames per second (fps) for each of the spatial resolutions of 180p, 360p, and 720p. The bandwidth requirement for each temporal layer is listed in the following table.

Table 8. Video Resolutions and Bandwidth Requirement for Uplink from Client to Video MCU

Temporal Layer          180p Resolution    360p Resolution    720p Resolution
Base layer, 7.5 fps     86 kbps            173 kbps           346 kbps
First layer, 15 fps     128 kbps           256 kbps           512 kbps
Second layer, 30 fps    192 kbps           384 kbps           768 kbps
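The Table 8 values can be used to check whether a set of uplink streams fits within a line rate. The sketch below assumes that each value is the total rate of a stream sent at that frame rate, rather than an increment over the layer beneath it. The table does not state this explicitly, and the function name is hypothetical.

```python
# Table 8: resolution -> {frame rate (fps): bitrate (kbps)}.
LAYER_KBPS = {
    "180p": {7.5: 86, 15: 128, 30: 192},
    "360p": {7.5: 173, 15: 256, 30: 384},
    "720p": {7.5: 346, 15: 512, 30: 768},
}


def uplink_cost_kbps(streams):
    """Sum the Table 8 bitrates for a list of (resolution, fps) streams."""
    return sum(LAYER_KBPS[resolution][fps] for resolution, fps in streams)


# A client sending 180p@30fps, 360p@15fps and 720p@15fps.
streams = [("180p", 30), ("360p", 15), ("720p", 15)]
total = uplink_cost_kbps(streams)
print(total, "kbps; fits in 1024 kbps:", total <= 1024)
```

Under this reading, the three streams total 192 + 256 + 512 = 960 kbps, which fits within a 1024 kbps line rate.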

The Sametime client can send multiple temporal layers of each resolution, depending on available bandwidth. Table 9 lists the uplink resolutions for a given bandwidth. For example, with 1024 kbps, a client can send three streams: 180p@30fps, 360p@15fps, and 720p@15fps. However, to save bandwidth, a client sends a stream only if at least one remote client in the conference is receiving it. Therefore, if no remote client is receiving the 720p@15fps stream, it is not sent, and the client sends only the 180p@30fps and 360p@15fps streams.

Table 9. Uplink resolutions per bandwidth

Bit Rate     Uplink resolution
1920 kbps    180p@30fps + 360p@30fps + 720p@30fps
1024 kbps    180p@30fps + 360p@15fps + 720p@15fps
768 kbps     180p@30fps + 360p@30fps
512 kbps     180p@30fps + 360p@15fps
384 kbps     180p@15fps + 270p@15fps
256 kbps     180p@30fps
128 kbps     180p@7.5fps
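The rule that a stream is sent only when at least one remote client receives it can be sketched as a simple filter. This is a simplified illustration with hypothetical names. It treats each stream as an opaque label and does not model how SVC layers are shared between receivers.

```python
def streams_to_send(offered, requested_by_remotes):
    """Keep only the offered streams that some remote client receives.

    offered: streams the client could send at its line rate (Table 9).
    requested_by_remotes: one collection of stream labels per remote client.
    """
    wanted = set().union(*requested_by_remotes)
    return [stream for stream in offered if stream in wanted]


# Table 9 row for 1024 kbps; no remote client receives the 720p stream.
offered = ["180p@30fps", "360p@15fps", "720p@15fps"]
remotes = [["180p@30fps"], ["360p@15fps", "180p@30fps"]]
print(streams_to_send(offered, remotes))
```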

In this example, a video conference has six participants: two on mobile devices, two on mid-range laptops, and two on high-end laptops with large screens and powerful processors.

Table 10. Bandwidth consumption metrics for each type of user in the example

Device                 Video Resolution@Frame Rate (fps) x Number of Users   Possible Number of Remote Videos   Downlink Bandwidth Consumed by Client (kbps)   Uplink Bandwidth Consumed by Client (kbps)
Mobile 1               180p@7.5fps x 2                                       2                                  256                                            256
Mobile 2               180p@15fps x 1 + 180p@7.5fps x 2                      3                                  384                                            256
Mid range Desktop 1    180p@7.5fps x 5                                       5                                  512                                            256
Mid range Desktop 2    180p@15fps x 5                                        5                                  768                                            256
High end Desktop 1     180p@30fps x 4 + 180p@15fps x 1                       5                                  1024                                           256
High end Desktop 2     180p@30fps x 5                                        5                                  1920                                           256

Aggregate bandwidth consumed for this conference is 4864 kbps in downlink and 1536 kbps in uplink.
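The aggregate figures are the column sums of Table 10. The short Python check below reproduces them from the per-client values in the table. The variable names are illustrative.

```python
# (device, downlink kbps, uplink kbps) for each client in Table 10.
CLIENTS = [
    ("Mobile 1", 256, 256),
    ("Mobile 2", 384, 256),
    ("Mid range Desktop 1", 512, 256),
    ("Mid range Desktop 2", 768, 256),
    ("High end Desktop 1", 1024, 256),
    ("High end Desktop 2", 1920, 256),
]

downlink_total = sum(down for _, down, _ in CLIENTS)
uplink_total = sum(up for _, _, up in CLIENTS)
print(f"downlink: {downlink_total} kbps, uplink: {uplink_total} kbps")
```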

Bandwidth Management

Moderate all audio and video data rates to protect the network for other business-critical applications and to provide enough bandwidth for acceptable voice and visual quality.

Sametime uses SIP to negotiate media sessions. Embedded in the SIP message is an SDP (Session Description Protocol, RFC 4566) section that contains the desired session bandwidth attribute, which the Bandwidth Manager uses to monitor transmission rates on the managed network.
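In SDP, bandwidth is carried on lines of the form b=<bwtype>:<bandwidth>, where RFC 4566 defines the bandwidth types CT and AS and gives the value in kilobits per second. The Python sketch below extracts those lines from an SDP body. The sample SDP is invented for illustration, and which bandwidth type Sametime actually writes is not stated here.

```python
def sdp_bandwidth_lines(sdp_text):
    """Return (section, bwtype, kbps) for each b= line in an SDP body.

    section is "session" for lines before the first m= line, otherwise
    the media type of the enclosing m= section ("audio", "video", ...).
    """
    section = "session"
    found = []
    for raw_line in sdp_text.splitlines():
        line = raw_line.strip()
        if line.startswith("m="):
            section = line[2:].split()[0]
        elif line.startswith("b="):
            bwtype, _, value = line[2:].partition(":")
            found.append((section, bwtype, int(value)))
    return found


# Invented SDP body, for illustration only.
SAMPLE_SDP = "\n".join([
    "v=0",
    "o=- 0 0 IN IP4 192.0.2.10",
    "s=-",
    "c=IN IP4 192.0.2.10",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0",
    "b=AS:64",
    "m=video 51372 RTP/AVP 96",
    "b=AS:384",
    "a=rtpmap:96 H264/90000",
])

for entry in sdp_bandwidth_lines(SAMPLE_SDP):
    print(entry)
```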

The following graphic shows Bandwidth Manager deployed as part of the signalling path, performing CAC (Call Access Control) based on the available bandwidth.


[Figure: Signalling path between endpoint A and endpoint B, with Bandwidth Manager performing Call Access Control based on available bandwidth.]

Depending on user policy, locations of the call, and available bandwidth, the Bandwidth Manager might accept the call, reject the call, or modify the media or the bandwidth attribute in the SDP. The action ensures that the total transmission rate for audio and video does not exceed the available bandwidth allocated for audio and video usage in the system configuration.
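The three possible outcomes can be illustrated with a simplified decision function. This is a hypothetical sketch, not the Bandwidth Manager's actual algorithm. The function name, the minimum_kbps threshold, and the choice to reduce a call to exactly the available bandwidth are all assumptions, and user policy and location are left out.

```python
def admit_call(requested_kbps, available_kbps, minimum_kbps):
    """Return (decision, granted_kbps) for a call request.

    accept: the requested rate fits in the available bandwidth.
    modify: the request does not fit, but a reduced rate that is still
            at least minimum_kbps does.
    reject: not even minimum_kbps is available.
    """
    if requested_kbps <= available_kbps:
        return ("accept", requested_kbps)
    if available_kbps >= minimum_kbps:
        return ("modify", available_kbps)
    return ("reject", 0)


print(admit_call(384, 1000, 128))
print(admit_call(384, 256, 128))
print(admit_call(384, 64, 128))
```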

Calls are recorded with details such as call locations and required bandwidth. Organizations can use this information to measure audio and video usage and network capacity utilization for future planning. Use the data captured by the Sametime Bandwidth Manager to calculate the impact that the deployment of audio and video has on the network.

Audio and video codecs differ in bandwidth usage because of how the Media Manager processes the data in Sametime. Base the calculation of your organization's required network bandwidth on the values given in the tables in this topic, and make it part of capacity planning so that the network provides optimal conditions for audio and video. Organizations should consider deploying Bandwidth Manager to protect the network and to ensure quality audio and video calls. Data captured by the Bandwidth Manager enables organizations to plan for future capacity.