Video over IP

The following technical article was current at the time it was published. However, due to changing technologies and standards updates, some of the information contained in this article may no longer be accurate or up to date.

Executive Summary

Executives face constant pressure to keep costs down and help their companies survive in a challenging market. Let us assume that company A has 10 locations and wishes to hold a corporate meeting. Flying 2 people from each of the 10 locations at an average of US $250.00 per ticket comes to a total of US $5,000.00. Add rooms for 20 people and meals while on the road, and one can quickly see why less costly alternatives are becoming mainstream. Recent advancements in technology have brought video conferencing from its jerky infancy to a viable and mature product. Other developments in the video industry such as Video on Demand (VOD), digitized video, interactive video, streaming video and real-time audio/video are enabling companies in ways that were not possible a year or two ago.

The burden that these technologies place on an infrastructure varies with the amount of their usage, but no one will argue that the bandwidth demand increases. Companies have placed applications on their networks only to realize that the bandwidth is not there. A robust, well-planned and properly installed infrastructure system such as 10G ip™ can provide enough bandwidth for these applications to run while providing peace of mind when implementing other applications within a single infrastructure in the future. Provided below is an overview of Video over IP systems and how 10G ip™ can address the needs of this bandwidth-hungry application.

Video over IP Technology and Market Trends

Legacy video signals are based on analog technology and are carried via expensive transmission circuits. We now, however, live in a digital world. Through advancements in digital video compression, composite audio and video signals can now be carried over typical network circuits both on the LAN and across the WAN, and even over the Internet. Video over IP and IP Streaming Video are newer technologies that allow video signals to be captured, digitized, streamed and managed over IP networks.

The first step is capturing the video content. This can be accomplished via several means. The content is processed, compressed, stored and edited on a video server. The content can either be “live” (captured and processed in real-time) or pre-recorded and stored. These transmissions can then be sent via the network to either one or several stations for viewing individually or simultaneously. The viewing station will need either a hardware or software viewer or in some cases both. Emerging applications provide the viewer and video over Java with no special requirements on the end station.

Video presentations can be grouped into three categories: Video Broadcasting, Video on Demand, and Video Conferencing. Of the three, only video conferencing is full duplex; the others are essentially one-way transmissions. These video over IP transmissions are scalable, cost effective, and very flexible. These new business tools bring disparate offices together on one enterprise network and are being deployed rapidly. According to The Gartner Group, IP video applications will be utilized in 80% of Fortune 2000 companies by the year 2006. The applications are rapidly replacing the legacy ISDN video conferencing applications. According to In-Stat/MDR (March, 2003), video conferencing endpoints are expected to reach US$875 million in sales in 2007, and the video conferencing services total is expected to reach US$5.5 billion in the same year.

Video Broadcast over IP

Video broadcast over IP is a network-based one-way transmission of video file content. The endpoint is merely a passive viewer with no control over the session. Video broadcast can be either Unicast or Multicast from the server. In a Unicast configuration, the transmission is replicated by the server for each endpoint viewer. In a Multicast configuration, the same signal is sent over the network as one transmission, but to multiple endpoints or, simply, a group of users.
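The difference between the two configurations shows up most clearly in the load on the server's network connection. The following is a minimal sketch of that arithmetic, not a sizing tool: the function name and the example figures (a 1.5 Mbps stream, 200 viewers) are hypothetical and chosen only for illustration.

```python
def server_egress_mbps(stream_mbps: float, viewers: int, multicast: bool) -> float:
    """Return the bandwidth leaving the video server, in Mbps.

    Simplified model: ignores protocol overhead and assumes every
    viewer receives the same stream at the same bit rate.
    """
    if viewers < 0:
        raise ValueError("viewers must be non-negative")
    if viewers == 0:
        return 0.0
    # Unicast: the server replicates the transmission once per viewer.
    # Multicast: the server sends a single transmission to the group.
    copies = 1 if multicast else viewers
    return stream_mbps * copies


if __name__ == "__main__":
    # Hypothetical example: a 1.5 Mbps stream watched by 200 viewers.
    print("Unicast:  ", server_egress_mbps(1.5, 200, multicast=False), "Mbps")
    print("Multicast:", server_egress_mbps(1.5, 200, multicast=True), "Mbps")
```

Under these assumed figures the Unicast server must push 300 Mbps, while the Multicast server sends a single 1.5 Mbps transmission and leaves the replication to the network.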

This technology is being implemented in corporate environments as a means to distribute training, presentations, meeting minutes and speeches. It is also being utilized by universities, continuing education or technical education centers, broadcasters and webcast providers, to name a few. Three factors determine how much bandwidth this technology will require: the number of users, their bandwidth to your server, and the length of the presentation or video. Broadcast video is typically considered to be an “open pipe”.

Video on Demand (VOD) over IP

Generally speaking, VOD allows a user to request on demand a streamed video stored on a server. This technology differs from broadcast video in that the service is interactive: the user has the option to stop, start, fast-forward or rewind the video. VOD is also generally accompanied by usage data, allowing tracking and billing of video services or video time. While VOD can be used for real-time viewing, it is generally used for stored video files. This technology is used for e-learning, training, marketing, entertainment, broadcasting, and other areas where the end user needs to view the files based on their own schedule and not the schedule of the video supplier.

In a typical VOD over IP Network, the following components are implemented:

  • The Video Server (may be an archive server or cluster of servers)
  • The Application Control Server which initiates the transmission (may be included in the archive server)
  • An endpoint with a converter to submit the viewing request and control playback
  • Management Software and/or billing software
  • PC or Network-based device to record/convert the video files

Videoconferencing over IP

Videoconferencing (VC) is a combination of full duplex audio and video transmissions which allows people in two different locations to see and hear each other as if participating in a face-to-face conversation. A camera is utilized at both endpoints to capture and send the video signals. Microphones are used at each endpoint to capture and transmit speech which is then played through speakers. The communications are real-time and generally not stored.

The first videoconferencing technology was introduced to the market by AT&T in 1964. The legacy standard for the communications is ITU H.320. Under this standard, users incurred usage charges and had to maintain dedicated equipment in a single location. New standards released in 1996 (H.323) allow for IP-based VC. IP-based services are far more flexible, as the conference can be initiated from any PC on the network equipped with the proper equipment, and the signals travel over the regular network infrastructure and equipment, eliminating the need for dedicated lines and usage charges.

These services can be used for applications including corporate communications, telemedicine, telehealth, training, e-learning, telecommuting and customer service. Videoconferencing can be point-to-point (one user to one user), or multipoint (multiple users participating in the same session). Users in the latter are viewed in separate windows. Videoconferencing has also introduced a new concept in communications through collaboration. An electronic whiteboard can be included in the conference, allowing users to write notes on the same board and/or view each other's presentations and notes while speaking.

An MCU (Multipoint Control Unit) is generally maintained at a central location. This unit allows the multiple video feeds to be viewed simultaneously. A component called a Gatekeeper is normally included for multipoint conferences. It controls the bandwidth, addressing, identification and security measures for the conferences. Gatekeepers are typically software applications that reside on a separate PC, but newer model equipment has the gatekeeper functionality built in.

Standards for Video over IP

Open systems requirements specify that the communications must happen within the predefined IP packet structures and that one vendor’s equipment must interoperate with another’s in a non-proprietary fashion. The two most important protocol components are H.323 and SIP (Session Initiation Protocol). Four major elements, namely terminals, gateways, gatekeepers, and multipoint control units, are defined in the H.323 standard and its addendums. SIP was developed by the IETF (Internet Engineering Task Force) in the mid-1990s and is a signaling protocol for Internet conferencing, telephony, presence, events notification and instant messaging. It originated within the IETF MMUSIC (Multiparty Multimedia Session Control) working group, with work proceeding since September 1999 in the IETF SIP working group. Today’s video applications utilize video compression and video coding technology to carry the video portion with reduced bandwidth consumption. MPEG (Moving Picture Experts Group) is the predominant developer of compression standards for video feeds, with MPEG-4 being the latest.

Bandwidth Bits and Bytes

When an analog signal is converted to a digital signal (as in video or voice transmissions), the conversion is performed by what is known as sampling. Sampling, as the name implies, refers to measuring the signal a fixed number of times per second (the sampling rate) and recording each measurement at a given sampling depth (bits per sample). The higher the sampling rate and depth, the larger the file will be. The number of distinct values a sample can take equals the number of states per bit (two: on or off) raised to the power of the number of bits per sample. In simpler terms, a music CD is sampled at a rate of about 44 thousand samples per second at 16 bits per sample, which works out to roughly 5 MB of data per channel per minute of listening time.
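The sampling arithmetic can be sketched in a few lines. This is an illustrative calculation only: the function names are hypothetical, and the CD figures used as inputs (44,100 samples per second, 16 bits per sample) are the standard CD audio parameters rather than values taken from a particular product.

```python
def quantization_levels(bits_per_sample: int) -> int:
    """Number of distinct values one sample can take: 2 ** bits."""
    return 2 ** bits_per_sample


def pcm_bytes_per_minute(sample_rate_hz: int, bits_per_sample: int,
                         channels: int = 1) -> int:
    """Uncompressed audio data produced per minute, in bytes."""
    bits_per_minute = sample_rate_hz * bits_per_sample * channels * 60
    return bits_per_minute // 8


if __name__ == "__main__":
    # CD audio: 44,100 samples per second, 16 bits per sample.
    print(quantization_levels(16))             # distinct values per sample
    print(pcm_bytes_per_minute(44_100, 16))    # one channel
    print(pcm_bytes_per_minute(44_100, 16, 2)) # stereo
```

With these inputs a single channel produces about 5.3 million bytes per minute, and a stereo pair about twice that.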

The same technique is used for video, although it is a bit more complex. The difference here is that what is now being transmitted is a raster image made up of picture elements, also known as pixels. The MPEG standard uses what is called “lossy” compression. That is, some of the image is “lost”, but not enough to diminish comprehension by the human eye, as the human brain fills in the gaps. The video is processed in segments. The first frame (the index frame) is transmitted entirely, and the remaining frames transmit only the changes as compared to the initial index frame. The greater the compression, the greater the “lossiness” of the frame will be. In a congested network, samples can be received out of sequence and a phenomenon known as pixelation occurs. Pixelation is when the pixels seem out of place when compared to the original index frame and the image is skewed. A raw video feed (non-compressed) fully sampled requires 165 Mbps for D1 quality. D1 resolution is full screen TV resolution: 720 x 480 as defined by the National Television System Committee (NTSC) and 720 x 576 for Phase Alternating Line (PAL). There are two ways to compress the feed. One is to lower the resolution and the other is to lower the sampling rate. Compressed, the feed will obviously consume fewer resources, but there is a trade-off between compression and video quality.
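The 165 Mbps figure for raw D1 video can be reproduced with simple arithmetic. The sketch below is illustrative only and rests on two assumptions not stated in the text: a nominal frame rate of 30 frames per second for NTSC (25 for PAL), and 16 bits per pixel, which corresponds to 8-bit 4:2:2 sampling. The function name is hypothetical.

```python
def raw_video_mbps(width: int, height: int, fps: int, bits_per_pixel: int) -> float:
    """Uncompressed video bit rate in Mbps (1 Mbps = 1,000,000 bits/s)."""
    return width * height * fps * bits_per_pixel / 1_000_000


if __name__ == "__main__":
    # Assumed: 16 bits per pixel (8-bit 4:2:2) and nominal frame rates.
    ntsc = raw_video_mbps(720, 480, 30, 16)
    pal = raw_video_mbps(720, 576, 25, 16)
    print(f"NTSC D1: {ntsc:.1f} Mbps")
    print(f"PAL D1:  {pal:.1f} Mbps")

    # Lowering the resolution is one of the two compression levers:
    # halving both dimensions cuts the raw rate to one quarter.
    print(f"Quarter-size NTSC: {raw_video_mbps(360, 240, 30, 16):.1f} Mbps")
```

Under these assumptions both the NTSC and PAL variants come to roughly 166 Mbps, in line with the figure quoted above, because PAL's larger frame is offset by its lower frame rate.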

Bottlenecks and Hurdles

In order to implement real-time video in the network, the network must be in excellent working order. Traffic shapers can assist by prioritizing video traffic and voice traffic utilizing the Quality of Service bits. All IP headers have a section called the ToS or Type of Service byte. This was built into the protocol several years ago. Quality of Service is a term which refers to a set of parameters for both connection-mode (TCP) and connectionless-mode (IP) transmissions which provide for performance in terms of transmission quality and availability of service. It encompasses maximum delay, throughput and priority of the packets being transmitted. The first bits of the ToS byte are set to carry the QoS information. Prioritizing network traffic puts time-sensitive packets ahead of data packet transmissions. This same method is used in VoIP (Voice over IP) networks.
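As an illustration of how the ToS byte carries priority information, the sketch below assumes a network that uses the Differentiated Services (DiffServ) interpretation of the byte, in which the first six bits hold a DSCP code point and the first three of those bits correspond to the older IP precedence value. The helper names are hypothetical. The code points EF (46) and AF41 (34) are standard DiffServ values often used for voice and interactive video, but the marking policy on any given network is a local decision. The socket option shown is honored on typical Linux and macOS hosts; other platforms may ignore it.

```python
import socket

DSCP_EF = 46    # Expedited Forwarding, often used for voice
DSCP_AF41 = 34  # Assured Forwarding class 4, often used for interactive video


def tos_byte_from_dscp(dscp: int) -> int:
    """Place a 6-bit DSCP value in the upper six bits of the ToS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


def precedence_from_tos(tos: int) -> int:
    """Extract the legacy 3-bit IP precedence from the top of the ToS byte."""
    return (tos >> 5) & 0x7


def open_marked_udp_socket(dscp: int) -> socket.socket:
    """Create a UDP socket whose outgoing packets carry the given DSCP.

    Assumption: the platform exposes and honors socket.IP_TOS.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos_byte_from_dscp(dscp))
    return sock


if __name__ == "__main__":
    tos = tos_byte_from_dscp(DSCP_AF41)
    print(f"AF41 -> ToS byte 0x{tos:02X}, precedence {precedence_from_tos(tos)}")
```

Marking packets only labels them. The switches and routers along the path must also be configured to give the marked traffic priority.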

Much like email, companies will begin to rely heavily on these services in the near future. Every administrator is aware of the resulting disaster when critical systems go down. Companies are striving for the same uptime as service providers, 99.999%. Downtime is expensive. Uptime becomes harder to achieve as networks carry additional loads with increased sensitivity to quality of service. There is a single common denominator for all applications, namely the infrastructure.

A strong infrastructure with plenty of headroom, bandwidth and capacity will be the single greatest factor in any installation of converged IP services. 10G ip™ was developed to answer such a need. Why? Technology does not stand still. Ability fosters ingenuity. New standards are being developed each day to address new applications. Five years ago, no one thought that there would be a need for 10G transmissions. The reality is that today, 10G exists, and is being used. It makes sense to supply your infrastructure with the necessary room to grow. 10G ip™ does just this today. It is the best cabling technology available today and will work for you for tomorrow’s applications.