
wg-multicast - Re: Is Multicast a Real Service?

Subject: All things related to multicast



  • From: Russ Hobby
  • Subject: Re: Is Multicast a Real Service?
  • Date: Thu, 27 Oct 2005 01:39:16 -0700

It seems sad that it is the users who have to test performance to find these problems with multicast. Is there some way that network operators can do regular testing to maintain a performance baseline for multicast, as has been started for unicast? Perhaps using the multicast beacons? There is also the issue of what to measure. This example shows that setup time is one factor (one that is not usually considered in unicast). Are there other parameters unique to multicast that need to be measured?

Russ
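
One way to take the setup-time measurement discussed here from an end host is to time the gap between joining a group and receiving the first datagram. The sketch below is only an illustration, not a tool mentioned in this thread. The function names, the group 233.252.0.1 (taken from the documentation range), and port 5004 are assumptions, and it presumes a source is already sending steadily to the group, so each result also includes up to one inter-packet gap.

```python
import socket
import statistics
import time


def build_membership_request(group, interface="0.0.0.0"):
    """Pack an ip_mreq: group address followed by the local interface."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def measure_setup_time(group, port, timeout=60.0):
    """Join `group` (ASM join) and return seconds until the first datagram.

    Returns None if nothing arrives within `timeout` seconds.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        sock.settimeout(timeout)
        start = time.monotonic()
        # The join makes the host report membership; the routers then
        # build the tree branch toward the source.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        build_membership_request(group))
        try:
            sock.recvfrom(65535)
        except socket.timeout:
            return None
        return time.monotonic() - start
    finally:
        sock.close()


def summarize(samples):
    """Summarize probe results; None entries count as timeouts."""
    ok = sorted(s for s in samples if s is not None)
    return {
        "probes": len(samples),
        "timeouts": len(samples) - len(ok),
        "min": ok[0] if ok else None,
        "median": statistics.median(ok) if ok else None,
        "max": ok[-1] if ok else None,
    }
```

A periodic job could call measure_setup_time("233.252.0.1", 5004) and feed the collected values to summarize() to track a baseline over time. Probes should probably be spaced well apart, since router state left over from a previous join may shorten the next measurement.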


At 12:27 AM 10/27/2005, hoerdt Mickael wrote:
Hi,

I don't know if this is related, but some months ago I measured branch setup
time from a receiver toward a source in a multicast tree. I observed a very
high delay in the hop-by-hop propagation of the PIM join message through the
network (from 700 ms to 900 ms for 5 hops). Measuring a well-known router
implementation, I observed a join taking 50 ms to traverse the router. Pavan
Namburi and Kamil Sarac observed the same results in their SSM-Ping paper [1].

I informed the IETF community about this problem, to find out whether it was
specified somewhere, but the problem still seems to be there; to me it is an
implementation problem.

Cheers,

Hoerdt Mickaël


[1] SSM-Ping: A Ping Utility for Source Specific Multicast <http://www.utdallas.edu/%7Eksarac/research/publications/CIIT04-1.pdf>, with K. Almeroth and P. Namburi, 3rd IASTED International Conference on Communications, Internet, and Information Technology (CIIT), St. Thomas, US Virgin Islands, USA, November 2004.

Steven Senger wrote:

Well it has been a while since I got out of class but other things
interrupted.

One caveat. I'm not a network engineer and I almost certainly don't
use the correct terminology but here goes.

Lea has described the problem we were seeing in CENIC and Stanford.
From my point of view the symptoms were a little more exciting than
the short description implies. When we first started this we would
see setup delays in excess of 10 minutes before traffic would start
flowing. The path from Abilene to CENIC in LA crossed from Juniper to
Cisco routers, and when we finally got an answer from Cisco, it had to
do with a difference in how the two implementations handled the
encapsulated and non-encapsulated cases. The result was that traffic
sourced in Wisconsin would not show delays going to Stanford but
traffic sourced from Michigan would. The solution was that Cisco was
changing their behavior to agree with Juniper. When tech support
finally told us this, after several months of on and off testing, it
was rather anticlimactic but did fix the problem in CENIC.

Once that problem was solved we found that there is still a small
amount of setup delay (0 - 45 sec, measured every hour) between the
source in Michigan and Merit. It presumably has a similar cause but
has not yet been solved.

We are also seeing a weird situation in WiscNet. We first stumbled
upon this using Access Grid between Wisconsin and Stanford. Things
would work correctly for weeks at a time and then one day the video
traffic from Wisconsin would not get to Stanford but the audio
traffic would or vice versa. I eventually figured out that I could
source traffic from Wisconsin on a group and have it arrive at
Stanford without problems. But starting a receiver for the group on
my campus would block the traffic from getting through WiscNet. So
far we have determined that there is a Juniper box in WiscNet that is
misbehaving. Restarting the rpd process on the box corrects the
situation and it will stay working for a month or more but will
eventually fail. When it is failing, in addition to the basic
behavior of local receivers blocking outbound traffic, there seem to
be a number of other odd behaviors that I have not fully cataloged.

I don't think we are trying to do the impossible but any help would
be greatly appreciated.

- steve

On Oct 25, 2005, at 11:39 AM, Lea Roberts wrote:

hello everyone!

the simple answer is that it was an MSDP propagation issue for an ASM
application.

the bug was in Cisco IOS (I can look back for the bug ID if there's any
interest) where the remote source address would take tens of seconds to
propagate from one router to the next... once we zeroed in on what was
happening, it took a while to get the answer but once the code was all
upgraded things have been working fine in the normal path. Now I'll let
Steve Senger comment on how they've continued to find other sites that
need to upgrade their Cisco routers...

Lea Roberts
Stanford Network Operations

On Tue, 25 Oct 2005, Russ Hobby wrote:


I wasn't directly involved with the problem or the solution but I will get
the details and report back. (or if any of you on the list were involved
you can report directly ;-)

Russ

At 06:01 AM 10/25/2005, Russ Hobby wrote:

Hi All,

I have been asked a question by a researcher whose project has been
trying to use multicast for some time with many problems, and she asked me
"Are we trying to do the impossible?"

Their particular use of multicast relies on multiple multicast connections
being set up in a timely fashion. It is the setup time for a connection
with which they have been having trouble. Connections can take up to
several minutes to be established. They have worked the problem out with
a couple of campuses and a gigapop to get it to work (which had to wait
for new router code to be installed in several places). However, every
time they go to a new site, they are likely to have the problem again and
they have to work it through with those network engineers.

So the researcher is asking if they should give up on multicast and
redesign their application. What do you think?

Russ





