Thanks, David. This has been a great discussion and definitely justifies the need for continued iteration on the architecture as we move forward. Most of what's been discussed is probably a good second step after we get through the "bake-off" period here with SDN and native VLANs. There are definitely more interesting things we can do after we're more comfortable with the native functionality of the AL2S network.
On May 31, 2013, at 2:38 PM, "David Crowe, Jr." wrote:
hi grover, et al,
i wanted to bring the conversation back to the main short term issue chris is trying to get consensus on and this note here seems to be a good one to work from. the other discussion on SDN, tools, path visibility, etc is all good as well but we need to get a recommendation soon that chris can work with.
to that end...
On 05/30/2013 06:20 AM, Grover Browning wrote:
I agree. Maybe something like:
*) Total connector (member?) availability (with a target of 100%)
*) Impaired availability (I had to fail over from my 100g to my 10g and ran that way for 4 hours)
This, within the context of a L3 full mesh using VLAN/SDN L2 paths and MPLS FRR. Does that sound like a decent plan to everyone?
on the backbone side, MPLS FRR+BFD (last option in the document) coupled with the full mesh between IP nodes would be my end-state choice.
in the initial rollout, i would suggest the partial mesh plan with MPLS+BFD with parallel static VLAN and OESS/OF configured paths. this would give us operational information without the overhead of configuring the entire full mesh up front.
for the FRR path priority i would suggest:
1) either static VLAN or OESS/OF path
2) current/traditional inter-node LAG connections
3) opposite selection from #1
then we can gather stability data over a few months' time and see where to go from there. if the data supports it we could remove the static VLAN option from the equation to simplify that issue and move toward more of a full mesh deployment.
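The path priority suggested above can be sketched in a few lines of Python. This is only an illustration of the ordering, not router or OESS configuration: the path-type labels, the `Path` class, and the `up` flag (standing in for whatever BFD or link state reports) are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical labels for the three path types named in the list above.
STATIC_VLAN = "static_vlan"
OESS_OF = "oess_of"
LAG = "lag"


@dataclass
class Path:
    kind: str   # one of the labels above
    up: bool    # assumed liveness flag, e.g. as reported by BFD


def priority_order(first_choice: str) -> List[str]:
    """Return the preference order: first choice, then LAG, then the opposite of #1."""
    if first_choice not in (STATIC_VLAN, OESS_OF):
        raise ValueError("first choice must be the static VLAN or the OESS/OF path")
    opposite = OESS_OF if first_choice == STATIC_VLAN else STATIC_VLAN
    return [first_choice, LAG, opposite]


def select_path(paths: List[Path], first_choice: str) -> Optional[Path]:
    """Pick the highest-priority path that is currently up, or None if all are down."""
    for kind in priority_order(first_choice):
        for path in paths:
            if path.kind == kind and path.up:
                return path
    return None
```

For example, with the static VLAN path as first choice and currently down, `select_path` falls back to the LAG connection before trying the OESS/OF path.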
on the customer/connector side, we should have a similar selection of paths available if desired by the individual connector. BFD should be employed where possible on the connector side for AL2S path failure detection and the connector will need to have the knobs available to select preference of any individual path for traffic to flow.
more bashing, arguments for/against, alternatives, etc are welcome.
thanks,
David
On May 29, 2013, at 1:58 PM, John Moore wrote:
I think the member availability idea is important, certainly not a distraction. It's a good differentiator for us, particularly with new constituents like healthcare and public safety that some of the regionals are aggressively pursuing.
I'd be OK with counting a sub-100 ms restoration event as impaired but not down, so that the member availability number doesn't see a hit. A metric that indicates how long we were in the impaired (or non-optimal) state might be useful, perhaps from the latency matrix.
John M
On Wed, May 29, 2013 at 7:54 AM, Grover Browning wrote:
VLANs/TRILL gets you the link failure faster but FRR gets you the path restoration faster, correct?
We seem to have three issues here: the topology, the L2 mechanism, and the L3 restoration mechanism. I believe the L2 mechanism was decided in the calls over the last couple of months: a combination of VLAN paths and SDN paths.
On the topology, I agree with the full mesh approach, Michael. Mimicking the in/out behavior of L3 doesn't seem to have any real advantages and has at least one major disadvantage.
L3 restoration is a more interesting discussion, I think. FRR gets the I2 L3 system up to the state of modern communication networks. If it's not a distraction then I'd also like to see a greater effort put into a kind of "member availability" figure: some way to calculate availability and "impaired but not down" figures for a set of BGP connections. I.e., move the discussion from individual restoration technologies to a more holistic view of member availability. While reducing path restoration to <100ms is a great goal, I'd also like to see us move toward 100% member availability.
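As a rough illustration of the two figures, here is a minimal Python sketch of how availability and "impaired but not down" time might be computed for one member over a reporting period. Everything in it is an assumption made for the example: the `Event` record, the "outage"/"degraded" labels, and in particular the rule that a restoration event shorter than 100 ms counts as impaired rather than down, which is one possible policy and not an agreed definition.

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumed cut-off: outages shorter than this count as "impaired", not "down".
RESTORATION_THRESHOLD_S = 0.1  # 100 ms


@dataclass
class Event:
    kind: str          # "outage" (no traffic passes) or "degraded" (on a backup path)
    duration_s: float  # how long the condition lasted, in seconds


def availability_report(events: List[Event], period_s: float) -> Dict[str, float]:
    """Summarize down time and impaired time as percentages of the period."""
    down_s = 0.0
    impaired_s = 0.0
    for event in events:
        if event.kind == "outage" and event.duration_s >= RESTORATION_THRESHOLD_S:
            down_s += event.duration_s
        elif event.kind in ("outage", "degraded"):
            # Sub-threshold outages and backup-path operation are impaired time.
            impaired_s += event.duration_s
        else:
            raise ValueError("unknown event kind: " + event.kind)
    return {
        "availability_pct": 100.0 * (period_s - down_s) / period_s,
        "impaired_pct": 100.0 * impaired_s / period_s,
    }
```

Under these assumptions, four hours spent failed over from a 100G to a 10G connection would add to the impaired figure without reducing the availability figure.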
On May 24, 2013, at 11:50 AM, Michael H Lambert wrote:
On 24 May 2013, at 10:04, Grover Browning wrote:
Full mesh with MPLS FRR/BFD.
Trying to see past my knee-jerk, religious, "MPLS is Evil" initial reaction...
Let's assume that at some point in the not-too-distant future we have solid, full-featured implementations of OpenFlow version x.y, where ((what we need) <= x.y < (what we want)). What do we want the IP-over-AL2S topology to look like at that time? If it's a full mesh of SDN-configured paths, then why not start out with a topology that mimics that goal. If it's not, then we might want to go in some other direction in the interim. I would conjecture that a full mesh is a reasonable approach, at least until we have a tight(er) coupling among SDN, routing protocols and real-time flow switching.
As an alternative to MPLS/FRR/BFD, how about traditional VLANs on top of TRILL rather than spanning tree?
Michael
--
John H. Moore
Senior Director, Innovation and Strategy
Chief Security Officer
MCNC
919.248.4186 (desk)
434.284.3990 (mobile)