
wg-multicast - Re: multicast issue = 5828

Subject: All things related to multicast

Re: multicast issue = 5828
  • From: Bruce Curtis <>
  • To: "White, Craig (Level 3)" <>
  • Cc: "Marshall Eubanks" <>, "Matthew Davy" <>, "TechnicalSupport" <>, "Hicks John" <>, "wg-multicast" <>
  • Subject: Re: multicast issue = 5828
  • Date: Mon, 29 Nov 2004 16:04:57 -0600

It's working again. Now the multicast streams stay active, with no freezing. However, I am still seeing 1% to 2% packet loss intermittently.



On Thursday, November 11, 2004, at 06:17 AM, White, Craig (Level 3) wrote:

FYI:

-----Original Message-----
From: Vaughan, Graham
Sent: Thursday, November 11, 2004 3:59 AM
To: White, Craig (Level 3)
Subject: FW: multicast issue = 5828

Fwded to TCAM:

-----Original Message-----
From: Smith, Chad
Sent: Thursday, November 11, 2004 3:37 AM
To:
''
Cc: Vaughan, Graham
Subject: FW: multicast issue = 5828

Hello Matthew,
I was just forwarded this e-mail regarding possible multicast BGP issues
on the Level3 network. As it stands right now, Indiana University is the
only Level3 customer of record listed in this e-mail string. Based on
your reply within the mail string, it sounds like you are only
experiencing the MBGP issue over your MPLS link, but not your direct
link.

Would you like to open a formal trouble ticket with Level3 to
investigate this? If so, please call us at 877-884-8930 or forward this
e-mail on to
.
Please include as many details as possible.


Regards,

Chad Smith
Level 3 Communications
IP-TCAM
720-888-0334

-----Original Message-----
From: Marshall Eubanks
[mailto:]
Sent: Thursday, November 04, 2004 7:00 PM
To: White, Craig (Level 3); Matthew Davy; Marshall Eubanks
Cc: TechnicalSupport; Hicks John; Curtis Bruce; wg-multicast
Subject: Re: multicast issue = 5828

Craig;

I am going to bring L3 into this. There seems to be a real problem at
the L3 / I2 multicast peering at NASA Ames.

Symptom: multicast data comes (L3 into I2) for about 5 seconds, dies,
and then revives after 3 minutes, only to repeat the cycle.

This sounds to me like something is messed up with PIM keepalive
messages.
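The 5-seconds-then-3-minutes rhythm is consistent with multicast soft state timing out and being rebuilt. As a rough sketch of that reading (the 180 s figure is an assumed Cisco-style (S,G) expiry default, not something measured in this thread):

```python
# Toy timeline of the observed cycle, assuming soft state must fully
# expire (EXPIRY_S) before a fresh join rebuilds the tree and another
# short burst (BURST_S) of data gets through. Both numbers are assumptions.
EXPIRY_S = 180  # assumed (S,G) soft-state expiry, roughly Cisco's default
BURST_S = 5     # observed in the thread: data flows ~5 s before dying

def burst_times(cycles: int = 3) -> list:
    """Return the start times (seconds) of the first `cycles` data bursts."""
    t, times = 0.0, []
    for _ in range(cycles):
        times.append(t)
        t += BURST_S + EXPIRY_S  # burst, then wait out the expiry timer
    return times

print(burst_times())  # bursts ~185 s apart, i.e. "revives after 3 minutes"
```

If the revival interval tracked a different default (say, a 210 s join/prune holdtime), the spacing would shift accordingly, so the observed period is a clue to which timer is involved.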

Marshall


On Thu, 4 Nov 2004 15:34:19 -0500
Matthew Davy
<>
wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I'm looking at this right now. I'm seeing the same behavior here at
IU. I see good, consistent PIM state all the way back to the
Abilene/Level3 peering at NASA Ames. Traffic comes for 5 seconds, then
stops for a long while, 3-5 minutes maybe. But when it stops, it stops
even at the first Abilene interfaces. So it must be getting hung up
somewhere before there.

We just changed our setup at the Ames MIX to use a GIGE over MPLS
thing. I suspect the PIM joins or multicast packets might be getting
lost on that link.

Okay, I just changed the preference on the IU peering with Level3 to
prefer the direct path to Level3 over the path through Abilene and then
to Level3. This resolved the problem, which really makes me suspect
that Ethernet over MPLS link to NASA Ames. We'll look into this.

- - Matt

Matthew Davy
Chief Network Engineer, Indiana University
University Information Technology Services / Abilene Network Operations Center
2711 East 10th Street, Bloomington IN, 47403
/ 812.855.7728
PGP key fingerprint: A84D DFB6 9DD5 BEB4 1EF7 D713 956F F85C 6422 CBEB


On Nov 4, 2004, at 1:03 PM, Marshall Eubanks wrote:

I am going to take the liberty of forwarding this on to this list:

Symptoms: from Nodak.edu, he can join an americafree.tv group. It dies
after 5 seconds or so, and comes back after 3 minutes.

The timing makes me think of PIM soft-state intervals, but nothing
specific comes to mind at present. It's not caused by either the source
or the receiver. He says he can get other sources (not from here) just
fine.

Can someone else report whether or not they see this? Any ideas?

Regards
Marshall


--- the forwarded message follows ---

From: "Bruce Curtis" <>
Date: November 4, 2004 12:11:40 PM EST
To: "Marshall Eubanks" <>
Cc: , "Marshall Eubanks" <>, "John M Hicks" <>, ,
Subject: Re: multicast issue = 5828
Reply-To:




On Thu, November 4, 2004 8:24 am, Marshall Eubanks said:
There are several possibilities. What OS / player are you using?

I'm using QuickTime on Mac OS X. I also used QuickTime under Windows
with the same result. Both had worked fine to view this stream
previously.

QuickTime, for example, had a bug a good while ago that meant it would
not refresh IGMP membership messages (or maybe it was the Mac stack,
but that was the effect). So groups would be joined but would then time
out and die.

What I was trying to make sure was that the problem was not in the
receiver computer - thus the request to use rtpdump. (rtpqual uses the
same code, so that means the problem is not in the player.)

Right.


The times to stop you are talking about are much too long to be due to
passing encapsulated data but rejecting multicast, so it has to be
either a data stoppage or a local problem (which includes the player,
the player host, and your local DR).

The next step would be to use tcpdump to see if IGMP membership
messages are being sent as they should be.
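For concreteness, this is roughly what those membership messages look like on the wire. A sketch that builds an IGMPv2 Membership Report, the 8-byte payload a `tcpdump igmp` capture would show the host refreshing; the packet layout follows RFC 2236, and the group address is the one from this thread:

```python
import struct

def inet_checksum(data: bytes) -> int:
    """One's-complement Internet checksum (RFC 1071) over 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return (~total) & 0xFFFF

def igmpv2_report(group: str) -> bytes:
    """Build an IGMPv2 Membership Report (type 0x16, RFC 2236):
    type, max-response-time, checksum, group address -- 8 bytes total."""
    addr = bytes(int(octet) for octet in group.split("."))
    unsummed = struct.pack("!BBH4s", 0x16, 0, 0, addr)
    return struct.pack("!BBH4s", 0x16, 0, inet_checksum(unsummed), addr)

pkt = igmpv2_report("233.64.133.120")
assert inet_checksum(pkt) == 0  # a valid IGMP packet checksums to zero
```

If the DR keeps (*, G) state alive, reports like this must be arriving from the host; if they stop, the group times out at the DR, which is exactly the distinction Marshall draws below.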

I'm out of town today, but I have checked the router and it shows the
correct state, indicating that my host is sending IGMP membership
messages correctly. It works fine for other streams.



Also, you could look in your DR AFTER the data stops. If the problem is
from the receiver, the group will be missing. If it is due to a data
stoppage, then the group state will still be there, as the DR will
still be receiving IGMP membership messages from the group.

Multicast state is still there and correct in the DR and in the routers
in the path towards the source, including the Abilene core routers.

A "show ip mroute x count" on my Cisco router that peers with the
Northern Lights GigaPOP shows that traffic has stopped and is not
reaching our network.

Looking at the Abilene core routers, they show 0 pps and 0 Kbps but
have the correct multicast state, indicating that the traffic stops
before reaching the Abilene backbone.
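The elimination logic running through these last few messages can be condensed into a tiny decision helper (the function and its strings are mine, summarizing the thread, not anything the participants wrote):

```python
def diagnose(state_present: bool, pps: int) -> str:
    """Classify a stalled multicast stream from two observations at a
    router: is the (S,G)/(*,G) state still there, and is traffic flowing?"""
    if not state_present:
        # Group timed out: the receiver stopped refreshing IGMP membership.
        return "receiver problem: group state timed out in the DR"
    if pps == 0:
        # State is refreshed by joins, but no data arrives from upstream.
        return "data stoppage: state intact but no traffic arriving"
    return "traffic flowing here: look further downstream"

# The case in this thread: correct state everywhere, 0 pps at the core.
print(diagnose(state_present=True, pps=0))
```

Applied at each router back toward the source, this pinpoints the first hop where "data stoppage" turns into "traffic flowing", which is how the fault got localized to the Level3/Abilene peering.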


(Actually, QuickTime sends RTCP traffic from each receiver, so that
should be visible going outbound if you can use QuickTime.)

Thanks
Marshall


On Wed, 3 Nov 2004 22:32:42 -0600 (CST)
"Bruce Curtis"
<>
wrote:

On Wed, November 3, 2004 9:53 pm, Marshall Eubanks said:
On Wed, 3 Nov 2004 15:45:35 -0600
Bruce Curtis
<>
wrote:


Yesterday I got all of my Abilene routes back.

Today I'm receiving MBGP routes for 63.105.122.28, so the multicast
session starts, but now we are back to the other problem that I only
receive a few packets and then the session stops.


Get a copy of rtpdump and issue

rtpdump -F ascii 233.64.133.120/8022

before you start the player up.

You should see a flood of packets come through. If the player _still_
stops, then whether or not rtpdump stops will tell whether this is a
player issue or a network issue.
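In the same spirit, a minimal Python stand-in for rtpdump: join the group and count raw UDP packets, taking the player out of the loop entirely. The group and port are the ones above; the 10-second idle timeout is an arbitrary choice of mine:

```python
import socket

def mreq_for(group: str, iface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte ip_mreq (group address + local interface) that
    IP_ADD_MEMBERSHIP expects; joining is what triggers IGMP reports."""
    return socket.inet_aton(group) + socket.inet_aton(iface)

def listen(group: str = "233.64.133.120", port: int = 8022,
           idle_timeout: float = 10.0) -> int:
    """Join `group` and print packets until the stream goes idle.
    Returns how many packets arrived before the stall."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    mreq_for(group))
    sock.settimeout(idle_timeout)
    count = 0
    try:
        while True:
            data, src = sock.recvfrom(2048)
            count += 1
            print(f"{len(data)}-byte packet from {src[0]}")
    except socket.timeout:
        print(f"{count} packets, then nothing for {idle_timeout} s")
    finally:
        sock.close()
    return count
```

Run `listen()` alongside the player: if this stalls at the same moment rtpqual does, the player is exonerated and the loss is upstream, which is the conclusion the thread reaches.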

Marshall

The Abilene core routers show 0 Kbps and 0 pps after the initial flow.


I used rtpqual; I got the output below fairly quickly and then stopped
receiving packets. Earlier I also verified with tcpdump that it was not
a problem with the viewer; I could see that the packets stopped. Also,
the viewer works fine with other streams.

$ rtpqual 233.64.133.120 8022
Defaulting to: rtpqual 233.64.133.120 8022 rtp
Report from: rtpqual 233.64.133.120 8022 rtp at Wed Nov 3 22:25:36 2004
T  Pkts Loss %  Late Bytes | Pkts Loss %  Late kB   Sender
36 44   2    4  1    46366 | 44   2    4  1    45   63.105.122.28
37 29   2    6  0    29195 | 73   4    5  1    73   63.105.122.28
38 28   3    9  0    27462 | 101  7    6  1    100  63.105.122.28
39 30   2    6  0    31166 | 131  9    6  1    131  63.105.122.28
40 31   0    0  0    30988 | 162  9    5  1    161  63.105.122.28

As the time stamp shows, there was a three-minute pause and then I
received another group of packets.

41 9    0    0  0    8650  | 171  9    5  1    169  63.105.122.28
Report from: rtpqual 233.64.133.120 8022 rtp at Wed Nov 3 22:28:32 2004
T  Pkts Loss %  Late Bytes | Pkts Loss %  Late kB   Sender
31 4    12   75 0    4185  | 175  21   10 1    173  63.105.122.28
32 40   2    4  0    44018 | 215  23   9  1    216  63.105.122.28
33 34   0    0  0    40881 | 249  23   8  1    256  63.105.122.28
34 40   0    0  0    41966 | 289  23   7  1    297  63.105.122.28
35 40   0    0  0    44122 | 329  23   6  1    340  63.105.122.28
36 36   0    0  0    39468 | 365  23   5  1    379  63.105.122.28
37 33   0    0  0    37322 | 398  23   5  1    415  63.105.122.28

And then it stopped again; it will likely have another burst in 3
minutes...
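The layout of those rtpqual lines (per-interval counters left of the `|`, cumulative counters right of it, sender last) can be pulled apart mechanically. A small parser, assuming the column order shown in the header line of the output above:

```python
def parse_rtpqual(line: str) -> dict:
    """Split one rtpqual data line into per-interval and cumulative halves.
    Assumed columns: T Pkts Loss % Late Bytes | Pkts Loss % Late kB Sender."""
    left, right = line.split("|")
    t, pkts, loss, pct, late, nbytes = (int(x) for x in left.split())
    fields = right.split()
    cum = [int(x) for x in fields[:5]]
    return {
        "t": t, "pkts": pkts, "loss": loss, "loss_pct": pct,
        "late": late, "bytes": nbytes,
        "cum_pkts": cum[0], "cum_loss": cum[1], "cum_loss_pct": cum[2],
        "sender": fields[5],
    }

row = parse_rtpqual("36 44 2 4 1 46366 | 44 2 4 1 45 63.105.122.28")
print(row["cum_loss_pct"], row["sender"])  # 4 63.105.122.28
```

Feeding a whole capture through this makes the pattern easy to chart: the cumulative loss column climbs during each stall and flattens while traffic flows.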

Today I looked at the whole list of MBGP routes from Level 3 and found
another site with an MSDP entry that came through Level 3. I used
rtpqual for that site and it worked fine. But it looked like that site
was in Chicago, if I remember correctly.








On Tuesday, November 2, 2004, at 01:25 PM, Bruce Curtis wrote:


On Tuesday, November 2, 2004, at 01:14 PM, John M Hicks wrote:

Bruce,
Back at it again today. Can you send me a snapshot of the multicast
state for what you are looking at? Also, where are you getting the SAs
from? Do you have a level3 peering?
Thanks,
-john

We don't have a Level 3 peering, but the source 63.105.122.28 is on
Level 3. Our only multicast peering is with the Northern Lights
GigaPOP.

The problem is different than it was last week. Now I get no multicast
packets from 63.105.122.28 for group 233.64.133.110, which makes sense
since I'm not receiving any routes for 63.105.122.28 in MBGP anymore,
and so there is an RPF failure. The email I sent yesterday that
included the entries from the Level 3 looking glass showed that there
is a problem within the Level 3 network, and the MBGP announcements for
63.105.122.28 aren't making it to the West Coast portion of Level 3's
network.



i2.ndsu>show ip mroute 233.64.133.110
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report, Z - Multicast Tunnel,
       Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 233.64.133.110), 00:00:50/stopped, RP 134.129.65.254, flags: SP
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list: Null

(63.105.122.28, 233.64.133.110), 00:00:50/00:02:40, flags: P
  Incoming interface: FastEthernet1/0/0, RPF nbr 134.129.107.3
  Outgoing interface list: Null



i2.ndsu>show ip mbgp 63.105.122.28
% Network not in table



i2.ndsu>show ip msdp sa 233.64.133.110
MSDP Source-Active Cache - 2 entries for 233.64.133.110
(63.105.122.28, 233.64.133.110), RP 206.61.163.252, MBGP/AS 1239, 1w3d/00:05:13, Peer 192.42.152.174

Even though whois shows that 63.105.122.28 belongs to UUNET and
206.61.163.252 belongs to Sprint, traceroutes show that both are on
Level 3's network.


traceroute 206.61.163.252
traceroute to 206.61.163.252 (206.61.163.252), 30 hops max, 40 byte packets
 1  a095.not-a-bridge.ndsu.nodak.edu (134.129.95.100)  0.859 ms  0.779 ms  0.321 ms
 2  fast100.i1.ndsu.nodak.edu (134.129.107.3)  0.647 ms  0.444 ms  0.576 ms
 3  router.gig.hecn.ndsu.nodak.edu (134.129.29.41)  1.166 ms  1.136 ms  0.739 ms
 4  165.234.165.131 (165.234.165.131)  1.634 ms  1.004 ms  1.287 ms
 5  165.234.165.129 (165.234.165.129)  1.748 ms  1.677 ms  1.833 ms
 6  sl-gw33-chi-0-0.sprintlink.net (144.223.34.197)  37.837 ms  38.19 ms  41.226 ms
 7  sl-bb22-chi-4-0.sprintlink.net (144.232.26.21)  38.921 ms  40.394 ms  38.611 ms
 8  sl-st21-chi-13-0.sprintlink.net (144.232.20.91)  39.264 ms  39.245 ms  41.346 ms
 9  sl-st20-chi-1-0.sprintlink.net (144.232.8.102)  39.151 ms  40.284 ms  39.616 ms
10  so-2-1-0.edge1.chicago1.level3.net (209.0.225.21)  39.285 ms  40.818 ms  38.925 ms
11  so-2-1-0.bbr2.chicago1.level3.net (209.244.8.13)  51.131 ms  88.688 ms  40.646 ms
12  ge-0-3-0.bbr2.washington1.level3.net (64.159.0.229)  58.12 ms  so-2-0-0.bbr2.washington1.level3.net (209.247.10.130)  58.892 ms  61.364 ms
13  ge-7-1.ipcolo1.washington1.level3.net (4.68.121.75)  58.171 ms  ge-7-0.ipcolo1.washington1.level3.net (4.68.121.11)  58.313 ms  ge-9-1.ipcolo1.washington1.level3.net (4.68.121.107)  58.163 ms
14  unknown.level3.net (63.210.25.154)  59.39 ms ^C



---
Bruce Curtis

Certified NetAnalyst II
701-231-8527
North Dakota State University



---
Bruce Curtis

Certified NetAnalyst II 701-231-8527
North Dakota State University





---
Bruce Curtis

Certified NetAnalyst II
701-231-8527
North Dakota State University





---
Bruce Curtis

Certified NetAnalyst II 701-231-8527
North Dakota State University




Matthew Davy
Chief Network Engineer, Indiana University
University Information Technology Services / Abilene Network Operations Center
2711 East 10th Street, Bloomington IN, 47403
/ 812.855.7728
PGP key fingerprint: A84D DFB6 9DD5 BEB4 1EF7 D713 956F F85C 6422 CBEB
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (Darwin)

iD8DBQFBipJKlW/4XGQiy+sRAl1vAKCpGrfWX0xjMAVC2qczuOb5XrZ42wCg0M13
W8VwCT0C2Q35av2Zg5mtA6U=
=6Gc+
-----END PGP SIGNATURE-----





---
Bruce Curtis

Certified NetAnalyst II 701-231-8527
North Dakota State University




Archive powered by MHonArc 2.6.16.
