transport - Re: [transport] Re: Need Advice on Multicasting Large File Data Sets
Subject: Transport protocols and bulk file transfer
- From: Larry Dunn <>
- To: , Larry Dunn <>
- Cc: Transport WG <>, Michael Laufer <>
- Subject: Re: [transport] Re: Need Advice on Multicasting Large File Data Sets
- Date: Fri, 22 Feb 2013 09:19:01 -0600
Bill,
I'll toss out a couple quick thoughts,
but others on the list may have better advice.
1. multicast
could do it, but would need "reliable multicast",
where some (often TCP) unicast flow is used to "fill in"
whatever was missed on the initial multicast send.
Some consider this highly experimental,
but you can probably find others who consider it "no big deal".
I'd at least think through the options mentioned below
before committing to a multicast path...
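To make the "fill in" step concrete, here is a minimal sketch. It is not any particular reliable-multicast protocol: the receiver notes which numbered blocks arrived on the multicast send, then works out the missing ranges it would ask for over a separate unicast (e.g. TCP) channel. The block numbering scheme and the name missing_ranges are assumptions made up for illustration.

```python
def missing_ranges(received, total_blocks):
    """Return inclusive (start, end) ranges of block numbers not yet received.

    `received` is an iterable of block numbers seen on the multicast send.
    Blocks are assumed to be numbered 0 .. total_blocks - 1 (hypothetical scheme).
    """
    have = set(received)
    ranges = []
    start = None
    for block in range(total_blocks):
        if block not in have:
            if start is None:
                start = block  # open a new gap
        elif start is not None:
            ranges.append((start, block - 1))  # close the current gap
            start = None
    if start is not None:
        ranges.append((start, total_blocks - 1))  # gap runs to the last block
    return ranges


# Example: 10 blocks sent; blocks 3, 4 and 9 were lost on the multicast send.
gaps = missing_ranges([0, 1, 2, 5, 6, 7, 8], 10)
print(gaps)  # [(3, 4), (9, 9)]
```

Each range would then be fetched over the unicast repair path; how the request is encoded and who serves it depends on the protocol chosen.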
2. hierarchical distributed cache.
Take a look at what CERN is doing for LHC.
They spent several years working out how to do something
very similar to what you describe (though the real-time aspect
is not quite the same, the notion of "lots of data to multiple
partners" is a good fit).
Primary source distributes to geographically-sensible
caches. Since your data is time-sensitive, those caches would
not need to be very large (maybe hold only 2-3 orbits' worth of data)?
Maybe 1-2 per continent, and you could do a unicast feed
to one of them, have it feed the others nearby.
And end-users pick from best-cache.
Main advantage is that it spreads out the bandwidth pattern,
rather than having a single source need huge pipe to
feed several receivers concurrently, for example.
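A rough back-of-envelope sketch of that bandwidth argument, using the figures from Michael's note (~250 Gbytes/orbit, ~10 minute goal). The counts of 30 direct feeds versus 3 regional caches are hypothetical, picked only to show the scaling, and the function names are mine. It assumes decimal gigabytes and ignores protocol overhead and retransmissions.

```python
def required_gbps(gbytes, minutes):
    """Sustained rate in Gbit/s needed to move `gbytes` (decimal GB) in `minutes`."""
    return gbytes * 8 / (minutes * 60)


def source_gbps(gbytes, minutes, direct_feeds):
    """Rate the source must sustain when sending `direct_feeds` unicast copies at once."""
    return direct_feeds * required_gbps(gbytes, minutes)


ORBIT_GBYTES = 250  # from Michael's note: ~250 Gbytes/orbit
WINDOW_MIN = 10     # goal: distribute within ~10 minutes

one_copy = required_gbps(ORBIT_GBYTES, WINDOW_MIN)
print(f"one copy in {WINDOW_MIN} min: {one_copy:.2f} Gbit/s")

# Hypothetical comparison: the source feeds 30 partners directly,
# versus feeding 3 regional caches that fan out to the partners.
for feeds in (30, 3):
    rate = source_gbps(ORBIT_GBYTES, WINDOW_MIN, feeds)
    print(f"{feeds} direct feeds: {rate:.1f} Gbit/s at the source")
```

The single-copy figure matches the "~3+ Gbits/sec" in the original note; the point is only that the source's load grows with the number of copies it sends itself, not with the total number of end-users.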
3. A related system has been pioneered by one of the folks
on this list, and Internet2 already supports it in many
locations.
That is "Phoebus", from Martin Swany and co.
This might be a made-to-order fit for your data
distribution. Or maybe not- but worth having a chat w/ Martin!
Feel free to drop me a note if you'd like to chat further...
Larry Dunn
desk = +1-651-638-6155
--
On Feb 22, 2013, at 8:14 AM, Bill Owens wrote:
> Transport folks, this note came to the Multicast working group yesterday,
> but after a little bit of digestion the overall opinion was that multicast
> probably wouldn't cut it. I wonder if there might be better answers to be
> found here?
>
> Bill.
>
> On Thu, Feb 21, 2013 at 08:06:38PM -0500, Michael Laufer wrote:
>> I recently joined this working group and would like some advice.
>>
>> My organization is considering the possibility of using new methodologies
>> to distribute near real time satellite weather data streams to
>> international & domestic partners/users, probably via Internet2 and
>> international peers. The previous generation of satellites would produce
>> ~30 Gbytes/day of data products but the new generation produces 3-4
>> Terabytes/day. Existing distribution methods will not economically scale,
>> especially internationally. The data streams are all files and any packet
>> loss would cause a file to become corrupted and unusable.
>>
>> We would like to investigate the possible use of multicasting to implement
>> this. To do this we would need a very large file distribution capability.
>> I have been researching some possibilities but would like your suggestions
>> on what software/applications/systems could be used for this by us and our
>> partners/users, especially with our specific requirements. We would also
>> welcome any other suggested methodologies that could achieve similar
>> results.
>>
>> The following are some details that may be useful:
>> This information is for one satellite only (in a polar orbit). In a few
>> years we may need multiples of this.
>> Number of source distribution sites: 1 (with a possible backup)
>> Number of destination partners/users: 10-30 (may grow if this is
>> successful but not > 100)
>> Each file has a companion checksum-type file (a few bytes only) associated
>> with it. The numbers of files listed below do not include these checksum
>> files.
>> Number of product streams: ~100+. Partners/users must be able to
>> subscribe to individual streams; some may want all streams, others subsets.
>> Number of files/product stream: 2 - ~200, most streams with large files
>> have ~70. All files in a stream are ~same size.
>> How often to distribute: Every ~100 minutes (~14 times/day).
>> Data per distribution: ~250 Gbytes/orbit (~3.5 Tbytes/day)
>> Files per distribution: ~11000/orbit (~150000/day)
>> Number of files (approximate) per file size:
>>   < 10 Kbytes:               1500
>>   10 Kbytes - 100 Kbytes:    2500
>>   100 Kbytes - 1 Mbyte:      1000
>>   1 Mbyte - 10 Mbytes:       1500
>>   10 Mbytes - 100 Mbytes:    4000
>>   100 Mbytes - ~600 Mbytes:   500
>> Time for distribution: Goal is <= 10 minutes (10% of orbit). Can tolerate
>> up to ~50 minutes (50% of orbit). [Cannot be longer as need to allow
>> catch-up time for missed orbits.]
>> Bandwidth required: ~3+ Gbits/sec for ~10 minute distribution.
>> Current I2 connection: 2 x 5 Gbits/sec (We may be able to add more if
>> needed).
>> Security: UDP would make security easier. Must be able to permit/deny each
>> separate partner/user request. Prefer separate sending system and any
>> return/feedback system (for missed files/retransmission notification).
>> Time frame: ~ Summer to start initial testing.
>>
>> We would start initially with some of the smaller product streams and add
>> additional larger streams as partners/users request. Would probably start
>> and test with domestic partners/users directly on I2.
>>
>> We have discussed bandwidth needs with I2 and GEANT and they will support
>> this effort.
>> Please NOTE however: THIS IS ONLY EXPLORATORY AND NOT ANY DECISION OR
>> COMMITMENT TO IMPLEMENT THIS DATA DISTRIBUTION!!!!!
>> Thanks in advance for any help and/or suggestions you can offer.
>> Michael
>>
>> --
>> Michael Laufer
>> NOAA/NESDIS/OSD/GSD Systems Engineering - Network & Security Architect
>> Contractor, Columbus Technologies & Services
>> NOAA - NSOF 4231 Suitland Road Suitland MD 20746
>> Office: (301) 817-4410 Mobile: (301) 340-8772
>> Note: I am not a government employee and have no legal authority to
>> obligate any federal, state, or local government to perform any action or
>> payment.