[Udpcast] what is the speed bottleneck?
Felix Rauch
rauch at inf.ethz.ch
Fri Sep 24 06:36:16 CEST 2004
On Thu, 23 Sep 2004, Ramon Bastiaans wrote:
> I was wondering if anyone knows or could tell what the bottleneck is for
> udpcast multicast speeds?
[...]
> Is it the udp protocol, or the multicast technique, or could it still be
> a hardware issue?
>
> Any opinions on the subject are appreciated, perhaps some of the authors
> of udpcast could give some insight?
Disclaimer: I'm not an author of udpcast, but I have experience with
multicasting large amounts of data in clusters. Furthermore, I wrote a
reliable multicast protocol many years ago and, more recently, a tool
similar to udpcast that works in a technically different way (Dolly [1]).
There are a number of possible bottlenecks in such a scenario: First,
there are the trivial bottlenecks, like disk speed and network
throughput. With Gigabit Ethernet the network will almost certainly
not be the bottleneck. Second, there are the more complex bottlenecks,
like CPU, memory, the PCI bus, or protocol complexity.
Personally, I think that when using IP multicast (as udpcast does), the
complexity of the whole protocol might be a limiting factor, because a
single sender has to coordinate so many receivers. However, I don't
have any data to substantiate this claim. The problem is that the
sender has to send the data at the speed of the slowest receiver. The
slowest receiver is not necessarily known in advance and it might also
change during the transmission. Adapting the speed correctly is not an
easy task.
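The slowest-receiver effect can be made concrete with a toy calculation (the node names and rates below are made up for illustration): with a single multicast sender, the sustainable sending rate is capped by the minimum receiver rate, so one slow node slows down the whole group.

```python
# Toy illustration (hypothetical numbers, not a real protocol): a single
# multicast sender must pace itself to the slowest receiver.
receiver_rates_mb_s = {"node01": 90.0, "node02": 85.0, "node03": 30.0}

# The effective multicast rate is bounded by the minimum receiver rate,
# even though most receivers could go much faster.
multicast_rate = min(receiver_rates_mb_s.values())

# Time to distribute a 1000-MB image to the whole group at that rate.
image_size_mb = 1000.0
transfer_time_s = image_size_mb / multicast_rate

print(multicast_rate)   # 30.0
print(transfer_time_s)  # ~33.3 seconds, dictated by node03 alone
```

And because the slowest receiver can change mid-transfer (cache effects, disk load), the sender would have to track this minimum continuously, which is what makes correct rate adaptation hard.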
Thus, for our own cloning tool Dolly -- I'm sorry for the shameless
plug on this mailing list -- we use TCP to transfer large data files
(like whole partitions or disks) to many nodes in a cluster. Since TCP
works only between a single sender and a single receiver, it can adapt
much better to the maximal transmission throughput as well as to
changing conditions. To link all the participating nodes together, we
simply form a virtual ring with TCP connections. The data is then sent
around this ring concurrently. It sounds counterintuitive, but it
works remarkably well (and in fact better than any IP-multicast-based
approach I have heard of so far).
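The ring idea can be sketched in a few lines (this is a simplified simulation with threads and queues, not Dolly's actual code, which uses TCP sockets): each node receives a block from its predecessor, keeps a copy, and immediately forwards it to its successor. Because different blocks are in flight on different links at the same time, the whole chain runs at roughly the speed of one point-to-point connection.

```python
import queue
import threading

def clone_over_ring(blocks, n_nodes):
    """Simulate pipelined cloning along a chain of nodes.

    Each bounded queue stands in for the TCP connection feeding node i;
    the bound forces backpressure, like TCP flow control would.
    """
    links = [queue.Queue(maxsize=4) for _ in range(n_nodes)]
    received = [[] for _ in range(n_nodes)]  # stand-in for each node's disk

    def node(i):
        last = (i == n_nodes - 1)
        while True:
            block = links[i].get()
            if block is None:            # end-of-stream marker
                if not last:
                    links[i + 1].put(None)
                return
            received[i].append(block)    # "write to disk"
            if not last:                 # forward to the next node in the ring
                links[i + 1].put(block)

    threads = [threading.Thread(target=node, args=(i,)) for i in range(n_nodes)]
    for t in threads:
        t.start()
    for block in blocks:                 # the server feeds the first node
        links[0].put(block)
    links[0].put(None)
    for t in threads:
        t.join()
    return received

copies = clone_over_ring([b"block%d" % i for i in range(8)], n_nodes=4)
assert all(copy == copies[0] for copy in copies)  # every node got the full image
```

In the real tool the last node's connection back to the server closes the ring, which also gives the sender end-to-end feedback; the sketch omits that and just shows the pipelining.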
For example, in a cluster of 16 nodes with 1-GHz Pentium II processors
interconnected by Gigabit Ethernet, we got up to approximately
60 MByte/s throughput with Dolly (without actually accessing the
disks, in order to eliminate that trivial bottleneck for benchmarking
purposes). With udpcast we got about 45 MByte/s (also without
accessing the disks) after tweaking the parameters (sometimes udpcast
simply stopped transmitting).
Please note that I'm not saying udpcast is bad; it just has different
application areas. Udpcast is much better (or even the only solution)
if the network is not switched, is asymmetric, or is even
unidirectional. On a tightly interconnected, switched high-speed
cluster network, Dolly usually achieves better throughput.
Incidentally, that is why Dolly is used as the cloning tool for the
128-node Xibalba cluster at ETH Zurich [2].
In short, finding the bottleneck in such a scenario is more complex
than it might seem at first. If you are interested, you will find some
research papers at [3].
- Felix
[1] http://www.cs.inf.ethz.ch/CoPs/patagonia/#dolly
[2] http://www.xibalba.inf.ethz.ch/
[3] http://www.cs.inf.ethz.ch/CoPs/patagonia/#relmat
--
Felix Rauch | Email: rauch at inf.ethz.ch