
wg-multicast - Re: multicast issue = 5828

Subject: All things related to multicast

Re: multicast issue = 5828
  • From: Bruce Curtis <>
  • To: "White, Craig (Level 3)" <>
  • Cc: "Marshall Eubanks" <>, "Matthew Davy" <>, "TechnicalSupport" <>, "Hicks John" <>, "wg-multicast" <>
  • Subject: Re: multicast issue = 5828
  • Date: Mon, 29 Nov 2004 16:04:57 -0600

It's working again. Now the multicast streams stay active, with no freezing. However, I am still seeing 1% to 2% packet loss intermittently.



On Thursday, November 11, 2004, at 06:17 AM, White, Craig (Level 3) wrote:

FYI:

-----Original Message-----
From: Vaughan, Graham
Sent: Thursday, November 11, 2004 3:59 AM
To: White, Craig (Level 3)
Subject: FW: multicast issue = 5828

Fwded to TCAM:

-----Original Message-----
From: Smith, Chad
Sent: Thursday, November 11, 2004 3:37 AM
To:
''
Cc: Vaughan, Graham
Subject: FW: multicast issue = 5828

Hello Matthew,
I was just forwarded this e-mail regarding possible multicast BGP issues
on the Level3 network. As it stands right now, Indiana University is the
only Level3 customer of record listed in this e-mail string. Based on
your reply within the mail string, it sounds like you are only
experiencing the MBGP issue over your MPLS link, but not your direct
link.

Would you like to open a formal trouble ticket with Level3 to
investigate this? If so, please call us at 877-884-8930 or forward this
e-mail on to
.
Please include as many details as possible.


Regards,

Chad Smith
Level 3 Communications
IP-TCAM
720-888-0334

-----Original Message-----
From: Marshall Eubanks
[mailto:]
Sent: Thursday, November 04, 2004 7:00 PM
To: White, Craig (Level 3); Matthew Davy; Marshall Eubanks
Cc: TechnicalSupport; Hicks John; Curtis Bruce; wg-multicast
Subject: Re: multicast issue = 5828

Craig;

I am going to bring L3 into this. There seems to be a real problem at
the L3 / I2 multicast peering at NASA Ames.

Symptom: multicast data comes (L3 into I2) for about 5 seconds, dies,
and then revives after 3 minutes, only to repeat the cycle.

This sounds to me like something is messed up with PIM keepalive
messages.
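The 5-seconds-then-3-minutes rhythm is consistent with multicast soft state timing out and being rebuilt. As a rough sketch of that reading (the 180 s figure is an assumed Cisco-style (S,G) expiry default, not something measured in this thread):

```python
# Toy timeline of the observed cycle, assuming soft state must fully
# expire (EXPIRY_S) before a fresh join rebuilds the tree and another
# short burst (BURST_S) of data gets through. Both numbers are assumptions.
EXPIRY_S = 180  # assumed (S,G) soft-state expiry, roughly Cisco's default
BURST_S = 5     # observed in the thread: data flows ~5 s before dying

def burst_times(cycles: int = 3) -> list:
    """Return the start times (seconds) of the first `cycles` data bursts."""
    t, times = 0.0, []
    for _ in range(cycles):
        times.append(t)
        t += BURST_S + EXPIRY_S  # burst, then wait out the expiry timer
    return times

print(burst_times())  # bursts ~185 s apart, i.e. "revives after 3 minutes"
```

If the revival interval tracked a different default (say, a 210 s join/prune holdtime), the spacing would shift accordingly, so the observed period is a clue to which timer is involved.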

Marshall


On Thu, 4 Nov 2004 15:34:19 -0500
Matthew Davy
<>
wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I'm looking at this right now. I'm seeing the same behavior here at
IU. I see good, consistent PIM state all the way back to the
Abilene/Level3 peering at NASA Ames. Traffic comes for 5 seconds, then
stops for a long while, 3-5 minutes maybe. But when it stops, it stops
even at the first Abilene interfaces. So it must be getting hung up
somewhere before there.

We just changed our setup at the Ames MIX to use a GIGE over MPLS
thing. I suspect the PIM joins or multicast packets might be getting
lost on that link.

Okay, I just changed the preference on the IU peering with Level3 to
prefer the direct path to Level3 over the path through Abilene and then
to Level3. This resolved the problem, which really makes me suspect
that Ethernet over MPLS link to NASA Ames. We'll look into this.

- - Matt

Matthew Davy
Chief Network Engineer, Indiana University
University Information Technology Services / Abilene Network Operations Center
2711 East 10th Street, Bloomington IN, 47403
/ 812.855.7728
PGP key fingerprint: A84D DFB6 9DD5 BEB4 1EF7 D713 956F F85C 6422 CBEB


On Nov 4, 2004, at 1:03 PM, Marshall Eubanks wrote:

I am going to take the liberty of forwarding this on to this list:

Symptoms: from Nodak.edu, he can join an americafree.tv group. It dies
after 5 seconds or so, and comes back after 3 minutes.

The timing makes me think of PIM soft-state intervals, but nothing
specific comes to mind at present. It's not caused by either the source
or the receiver. He says he can get other sources (not from here) just
fine.

Can someone else report whether or not they see this? Any ideas?

Regards
Marshall


--- the forwarded message follows ---

From: "Bruce Curtis" <>
Date: November 4, 2004 12:11:40 PM EST
To: "Marshall Eubanks" <>
Cc: , "Marshall Eubanks" <>, "John M Hicks" <>, ,
Subject: Re: multicast issue = 5828
Reply-To:




On Thu, November 4, 2004 8:24 am, Marshall Eubanks said:
There are several possibilities. What OS / player are you using?

I'm using QuickTime on Mac OS X. I also used QuickTime under Windows
with the same result. Both had worked fine to view this stream
previously.

QuickTime, for example, had a bug a good while ago that meant it would
not refresh IGMP membership messages (or maybe it was the Mac stack,
but that was the effect). So groups would be joined but would then time
out and die.

What I was trying to make sure was that the problem was not in the
receiver computer - thus the request to use rtpdump. (rtpqual uses the
same code, so that means the problem is not in the player.)

Right.


The times to stop you are talking about are much too long to be due to
passing encapsulated data but rejecting multicast, so it has to be
either a data stoppage or a local problem (which includes the player,
the player host, and your local DR).

The next step would be to use tcpdump to see if IGMP membership
messages are being sent as they should be.
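For concreteness, this is roughly what those membership messages look like on the wire. A sketch that builds an IGMPv2 Membership Report, the 8-byte payload a `tcpdump igmp` capture would show the host refreshing; the packet layout follows RFC 2236, and the group address is the one from this thread:

```python
import struct

def inet_checksum(data: bytes) -> int:
    """One's-complement Internet checksum (RFC 1071) over 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return (~total) & 0xFFFF

def igmpv2_report(group: str) -> bytes:
    """Build an IGMPv2 Membership Report (type 0x16, RFC 2236):
    type, max-response-time, checksum, group address -- 8 bytes total."""
    addr = bytes(int(octet) for octet in group.split("."))
    unsummed = struct.pack("!BBH4s", 0x16, 0, 0, addr)
    return struct.pack("!BBH4s", 0x16, 0, inet_checksum(unsummed), addr)

pkt = igmpv2_report("233.64.133.120")
assert inet_checksum(pkt) == 0  # a valid IGMP packet checksums to zero
```

If the DR keeps (*, G) state alive, reports like this must be arriving from the host; if they stop, the group times out at the DR, which is exactly the distinction Marshall draws below.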

I'm out of town today, but I have checked the router and it shows the
correct state, indicating that my host is sending IGMP membership
messages correctly. It works fine for other streams.



Also, you could look in your DR AFTER the data stops. If the problem is
from the receiver, the group will be missing. If it is due to a data
stoppage, then the group state will still be there, as the DR will
still be receiving IGMP membership messages from the group.

Multicast state is still there and correct in the DR and in the routers
in the path towards the source, including the Abilene core routers.

A "show ip mroute x count" on my Cisco router that peers with the
Northern Lights GigaPOP shows that traffic has stopped and is not
reaching our network.

Looking at the Abilene core routers, they show 0 pps and 0 Kbps but
have the correct multicast state, indicating that the traffic stops
before reaching the Abilene backbone.
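The elimination logic running through these last few messages can be condensed into a tiny decision helper (the function and its strings are mine, summarizing the thread, not anything the participants wrote):

```python
def diagnose(state_present: bool, pps: int) -> str:
    """Classify a stalled multicast stream from two observations at a
    router: is the (S,G)/(*,G) state still there, and is traffic flowing?"""
    if not state_present:
        # Group timed out: the receiver stopped refreshing IGMP membership.
        return "receiver problem: group state timed out in the DR"
    if pps == 0:
        # State is refreshed by joins, but no data arrives from upstream.
        return "data stoppage: state intact but no traffic arriving"
    return "traffic flowing here: look further downstream"

# The case in this thread: correct state everywhere, 0 pps at the core.
print(diagnose(state_present=True, pps=0))
```

Applied at each router back toward the source, this pinpoints the first hop where "data stoppage" turns into "traffic flowing", which is how the fault got localized to the Level3/Abilene peering.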


(Actually, QuickTime sends RTCP traffic from each receiver, so that
should be visible going outbound if you can use QuickTime.)

Thanks
Marshall


On Wed, 3 Nov 2004 22:32:42 -0600 (CST)
"Bruce Curtis"
<>
wrote:

On Wed, November 3, 2004 9:53 pm, Marshall Eubanks said:
On Wed, 3 Nov 2004 15:45:35 -0600
Bruce Curtis
<>
wrote:


Yesterday I got all of my Abilene routes back.

Today I'm receiving MBGP routes for 63.105.122.28, so the multicast
session starts, but now we are back to the other problem that I only
receive a few packets and then the session stops.


Get a copy of rtpdump and issue

rtpdump -F ascii 233.64.133.120/8022

before you start the player up.

You should see a flood of packets come through. If the player _still_
stops, then whether or not rtpdump stops will tell whether this is a
player issue or a network issue.
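In the same spirit, a minimal Python stand-in for rtpdump: join the group and count raw UDP packets, taking the player out of the loop entirely. The group and port are the ones above; the 10-second idle timeout is an arbitrary choice of mine:

```python
import socket

def mreq_for(group: str, iface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte ip_mreq (group address + local interface) that
    IP_ADD_MEMBERSHIP expects; joining is what triggers IGMP reports."""
    return socket.inet_aton(group) + socket.inet_aton(iface)

def listen(group: str = "233.64.133.120", port: int = 8022,
           idle_timeout: float = 10.0) -> int:
    """Join `group` and print packets until the stream goes idle.
    Returns how many packets arrived before the stall."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    mreq_for(group))
    sock.settimeout(idle_timeout)
    count = 0
    try:
        while True:
            data, src = sock.recvfrom(2048)
            count += 1
            print(f"{len(data)}-byte packet from {src[0]}")
    except socket.timeout:
        print(f"{count} packets, then nothing for {idle_timeout} s")
    finally:
        sock.close()
    return count
```

Run `listen()` alongside the player: if this stalls at the same moment rtpqual does, the player is exonerated and the loss is upstream, which is the conclusion the thread reaches.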

Marshall

The Abilene core routers show 0 Kbps and 0 pps after the initial flow.


I used rtpqual; I got the output below fairly quickly and then stopped
receiving packets. Earlier I also verified with tcpdump that it was not
a problem with the viewer; I could see that the packets stopped. Also,
the viewer works fine with other streams.

$ rtpqual 233.64.133.120 8022
Defaulting to: rtpqual 233.64.133.120 8022 rtp
Report from: rtpqual 233.64.133.120 8022 rtp at Wed Nov 3 22:25:36 2004
T  Pkts Loss %  Late Bytes | Pkts Loss %  Late kB   Sender
36 44   2    4  1    46366 | 44   2    4  1    45   63.105.122.28
37 29   2    6  0    29195 | 73   4    5  1    73   63.105.122.28
38 28   3    9  0    27462 | 101  7    6  1    100  63.105.122.28
39 30   2    6  0    31166 | 131  9    6  1    131  63.105.122.28
40 31   0    0  0    30988 | 162  9    5  1    161  63.105.122.28

As the time stamp shows, there was a three-minute pause and then I
received another group of packets.

41 9    0    0  0    8650  | 171  9    5  1    169  63.105.122.28
Report from: rtpqual 233.64.133.120 8022 rtp at Wed Nov 3 22:28:32 2004
T  Pkts Loss %  Late Bytes | Pkts Loss %  Late kB   Sender
31 4    12   75 0    4185  | 175  21   10 1    173  63.105.122.28
32 40   2    4  0    44018 | 215  23   9  1    216  63.105.122.28
33 34   0    0  0    40881 | 249  23   8  1    256  63.105.122.28
34 40   0    0  0    41966 | 289  23   7  1    297  63.105.122.28
35 40   0    0  0    44122 | 329  23   6  1    340  63.105.122.28
36 36   0    0  0    39468 | 365  23   5  1    379  63.105.122.28
37 33   0    0  0    37322 | 398  23   5  1    415  63.105.122.28

And then it stopped again; it will likely have another burst in 3
minutes...
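The layout of those rtpqual lines (per-interval counters left of the `|`, cumulative counters right of it, sender last) can be pulled apart mechanically. A small parser, assuming the column order shown in the header line of the output above:

```python
def parse_rtpqual(line: str) -> dict:
    """Split one rtpqual data line into per-interval and cumulative halves.
    Assumed columns: T Pkts Loss % Late Bytes | Pkts Loss % Late kB Sender."""
    left, right = line.split("|")
    t, pkts, loss, pct, late, nbytes = (int(x) for x in left.split())
    fields = right.split()
    cum = [int(x) for x in fields[:5]]
    return {
        "t": t, "pkts": pkts, "loss": loss, "loss_pct": pct,
        "late": late, "bytes": nbytes,
        "cum_pkts": cum[0], "cum_loss": cum[1], "cum_loss_pct": cum[2],
        "sender": fields[5],
    }

row = parse_rtpqual("36 44 2 4 1 46366 | 44 2 4 1 45 63.105.122.28")
print(row["cum_loss_pct"], row["sender"])  # 4 63.105.122.28
```

Feeding a whole capture through this makes the pattern easy to chart: the cumulative loss column climbs during each stall and flattens while traffic flows.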

Today I looked at the whole list of MBGP routes from Level 3 and found
another site with an MSDP entry that came through Level 3. I used
rtpqual for that site and it worked fine. But it looked like that site
was in Chicago, if I remember correctly.








On Tuesday, November 2, 2004, at 01:25 PM, Bruce Curtis wrote:


On Tuesday, November 2, 2004, at 01:14 PM, John M Hicks wrote:

Bruce,
Back at it again today. Can you send me a snapshot of the multicast
state for what you are looking at? Also, where are you getting the SAs
from? Do you have a level3 peering?
Thanks,
-john

We don't have a Level 3 peering, but the source 63.105.122.28 is on
Level 3. Our only multicast peering is with the Northern Lights
GigaPOP.

The problem is different than it was last week. Now I get no multicast
packets from 63.105.122.28 for group 233.64.133.110, which makes sense
since I'm not receiving any routes for 63.105.122.28 in MBGP anymore,
and so there is an RPF failure. The email I sent yesterday that
included the entries from the Level 3 looking glass showed that there
is a problem within the Level 3 network, and the MBGP announcements for
63.105.122.28 aren't making it to the West Coast portion of Level 3's
network.



i2.ndsu>show ip mroute 233.64.133.110
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report, Z - Multicast Tunnel,
       Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 233.64.133.110), 00:00:50/stopped, RP 134.129.65.254, flags: SP
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list: Null

(63.105.122.28, 233.64.133.110), 00:00:50/00:02:40, flags: P
  Incoming interface: FastEthernet1/0/0, RPF nbr 134.129.107.3
  Outgoing interface list: Null



i2.ndsu>show ip mbgp 63.105.122.28
% Network not in table



i2.ndsu>show ip msdp sa 233.64.133.110
MSDP Source-Active Cache - 2 entries for 233.64.133.110
(63.105.122.28, 233.64.133.110), RP 206.61.163.252, MBGP/AS 1239, 1w3d/00:05:13, Peer 192.42.152.174

Even though whois shows that 63.105.122.28 belongs to UUNET and
206.61.163.252 belongs to Sprint, traceroutes show that both are on
Level 3's network.


traceroute 206.61.163.252
traceroute to 206.61.163.252 (206.61.163.252), 30 hops max, 40 byte packets
 1  a095.not-a-bridge.ndsu.nodak.edu (134.129.95.100)  0.859 ms  0.779 ms  0.321 ms
 2  fast100.i1.ndsu.nodak.edu (134.129.107.3)  0.647 ms  0.444 ms  0.576 ms
 3  router.gig.hecn.ndsu.nodak.edu (134.129.29.41)  1.166 ms  1.136 ms  0.739 ms
 4  165.234.165.131 (165.234.165.131)  1.634 ms  1.004 ms  1.287 ms
 5  165.234.165.129 (165.234.165.129)  1.748 ms  1.677 ms  1.833 ms
 6  sl-gw33-chi-0-0.sprintlink.net (144.223.34.197)  37.837 ms  38.19 ms  41.226 ms
 7  sl-bb22-chi-4-0.sprintlink.net (144.232.26.21)  38.921 ms  40.394 ms  38.611 ms
 8  sl-st21-chi-13-0.sprintlink.net (144.232.20.91)  39.264 ms  39.245 ms  41.346 ms
 9  sl-st20-chi-1-0.sprintlink.net (144.232.8.102)  39.151 ms  40.284 ms  39.616 ms
10  so-2-1-0.edge1.chicago1.level3.net (209.0.225.21)  39.285 ms  40.818 ms  38.925 ms
11  so-2-1-0.bbr2.chicago1.level3.net (209.244.8.13)  51.131 ms  88.688 ms  40.646 ms
12  ge-0-3-0.bbr2.washington1.level3.net (64.159.0.229)  58.12 ms  so-2-0-0.bbr2.washington1.level3.net (209.247.10.130)  58.892 ms  61.364 ms
13  ge-7-1.ipcolo1.washington1.level3.net (4.68.121.75)  58.171 ms  ge-7-0.ipcolo1.washington1.level3.net (4.68.121.11)  58.313 ms  ge-9-1.ipcolo1.washington1.level3.net (4.68.121.107)  58.163 ms
14  unknown.level3.net (63.210.25.154)  59.39 ms ^C



---
Bruce Curtis

Certified NetAnalyst II
701-231-8527
North Dakota State University



---
Bruce Curtis

Certified NetAnalyst II 701-231-8527
North Dakota State University





---
Bruce Curtis

Certified NetAnalyst II
701-231-8527
North Dakota State University





---
Bruce Curtis

Certified NetAnalyst II 701-231-8527
North Dakota State University




Matthew Davy
Chief Network Engineer, Indiana University
University Information Technology Services / Abilene Network Operations Center
2711 East 10th Street, Bloomington IN, 47403
/ 812.855.7728
PGP key fingerprint: A84D DFB6 9DD5 BEB4 1EF7 D713 956F F85C 6422 CBEB
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (Darwin)

iD8DBQFBipJKlW/4XGQiy+sRAl1vAKCpGrfWX0xjMAVC2qczuOb5XrZ42wCg0M13
W8VwCT0C2Q35av2Zg5mtA6U=
=6Gc+
-----END PGP SIGNATURE-----





---
Bruce Curtis

Certified NetAnalyst II 701-231-8527
North Dakota State University




Archive powered by MHonArc 2.6.16.
