[Udpcast] UDPCast

Erik Jacobson erik.jacobson at hpe.com
Tue Feb 19 15:45:56 CET 2019


> 1) Is it possible to do the clone right through the server? I realize that
> there's dhcp configuration. When can I use it?

This depends on how you are using it. At its heart, in my opinion,
udpcast is like a big network pipe that works one-to-many. What you do
with the pipe is up to you.

You could use it to clone a complete disk, I suppose, but then you'd
want the clone source and clone target to be nearly identical hardware,
I'd assume -- both because of the drivers in your initial ram disks and
because the image has to fit block-for-block without exceeding the
available space. I think people have used it this way, but unless you
have multiple disks or partitions, you still need some sort of
environment to host the udp-receiver on the client side, whether that
means putting it into an initrd or using something else to fill that
role.

What has typically been done in cases where you use udpcast as an image
transfer engine is to integrate it with some sort of cluster manager.
SystemImager integrates it as a component in
systemimager-server-flamethrowerd. Flamethrower manages the udp-sender
processes on the admin node and maps content to instances of the server.
Through its install environment, the clients boot into the
SystemImager initrd/environment, where a udp-receiver is installed. If
you boot the nodes at the same time, they can all join the same stream
for the disk image. In the SystemImager setups I've seen, they would
typically stage the image as a tarball on the node in tmpfs, then
extract it after the transfer. This works better because the transfer
can be done to memory, reducing retries (a slow hard disk can induce
retries). I believe this is configurable, of course. In this use case,
the image is typically a tar file created on the fly through the
pipeline. However, there is still pre- and post-configuration to get
this going. Pre-configuration might be making disk partitions and
filesystems, and post-configuration may be things like ensuring the
newly installed nodes install bootloaders and configure themselves.
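
As a rough sketch of the stream-based flow described above (this is not
SystemImager's or Flamethrower's actual code), the sender side can be
pictured as tar writing to stdout with udp-sender reading from stdin,
and the receiver side as staging the tarball in memory before extracting
it. The helper names, the paths, the port base, and the idea that
/dev/shm is a tmpfs mount in the install environment are all assumptions
made for illustration. The udpcast options shown (--nokbd, --portbase,
--min-receivers, --file) should be checked against your installed
version.

```python
import shlex


def sender_pipeline(image_dir, portbase=9000, min_receivers=1):
    """Build a shell pipeline that tars image_dir on the fly and sends it.

    udp-sender reads from stdin when no --file is given, so no tarball
    is ever written to disk on the admin node.
    """
    tar = ["tar", "-C", image_dir, "-cf", "-", "."]
    send = ["udp-sender", "--nokbd", "--portbase", str(portbase),
            "--min-receivers", str(min_receivers)]
    return " | ".join(shlex.join(cmd) for cmd in (tar, send))


def receiver_commands(staging="/dev/shm/image.tar", target="/mnt/target",
                      portbase=9000):
    """Build the two client-side steps: stage in memory, then extract."""
    recv = ["udp-receiver", "--nokbd", "--portbase", str(portbase),
            "--file", staging]
    extract = ["tar", "-C", target, "-xpf", staging]
    return [shlex.join(recv), shlex.join(extract)]


if __name__ == "__main__":
    # Hypothetical image directory and receiver count.
    print(sender_pipeline("/images/compute", min_receivers=4))
    for step in receiver_commands():
        print(step)
```

The functions only build command strings, so they can be inspected
without udpcast installed. Partitioning, mkfs, and bootloader setup
would still have to wrap around these steps.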

In my area, we use udpcast in a case similar to SystemImager, although
we have our own environment that replaces the SystemImager function. As
I mentioned, we're still using it with current distros and now even with
the aarch64 architecture. I prefer the udpcast stream approach to other
multicast approaches that use file block lists. This is because,
whenever possible, I want to avoid having an image file that is a copy
of an image directory layout. It is painful to have to maintain files
that represent directory layouts, push them around to multiple
administrative servers on a big system, and re-create and re-push all
that stuff each time the master image is changed. So with udpcast, the
tar is created and piped on demand. We even integrated encryption into
the stream using openssl in some cases where customers have wanted that.
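
To make the encrypted variant concrete, here is one way such a pipeline
could be assembled. It is a sketch under assumptions, not the setup
described above: the cipher (AES-256-CBC with PBKDF2 key derivation,
which needs OpenSSL 1.1.1 or newer), the shared key file, the paths,
and the helper names are all made up for illustration.

```python
import shlex


def encrypted_sender_pipeline(image_dir, key_file, portbase=9000):
    """tar -> openssl enc -> udp-sender, all streamed through pipes."""
    stages = [
        ["tar", "-C", image_dir, "-cf", "-", "."],
        ["openssl", "enc", "-aes-256-cbc", "-pbkdf2", "-salt",
         "-pass", f"file:{key_file}"],
        ["udp-sender", "--nokbd", "--portbase", str(portbase)],
    ]
    return " | ".join(shlex.join(stage) for stage in stages)


def decrypting_receiver_pipeline(target, key_file, portbase=9000):
    """udp-receiver -> openssl enc -d -> tar, the mirror image of the above."""
    stages = [
        ["udp-receiver", "--nokbd", "--portbase", str(portbase)],
        ["openssl", "enc", "-d", "-aes-256-cbc", "-pbkdf2",
         "-pass", f"file:{key_file}"],
        ["tar", "-C", target, "-xpf", "-"],
    ]
    return " | ".join(shlex.join(stage) for stage in stages)


if __name__ == "__main__":
    # Hypothetical image directory, target mount point, and key file.
    print(encrypted_sender_pipeline("/images/compute", "/etc/image.key"))
    print(decrypting_receiver_pipeline("/mnt/target", "/etc/image.key"))
```

Because every stage is a pipe, the image still never exists as a file
on the admin node. The key file has to reach the clients by some other
trusted path.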

On really big systems, we have multiple administrative servers (leader
nodes) that all have udp-sender instances, currently managed by
flamethrower. We tend to handle 288 nodes per leader with this
mechanism, although we've had higher ratios. udp-sender currently
supports up to 1024 receivers per sender.

One challenge area with udpcast, in my opinion, is multicast support in
switches. Switches -- especially smart switches -- have been a pain
point for us in large clusters. It can be painful to get groups of
switches from different vendors to behave well with multicast in some
cases. We've even had to resort to "flood all ports" in some desperate
scenarios. In some
cases where the management switches had been decided ahead of time by
other teams, we've had to switch to using bittorrent instead of udpcast to
avoid multicast issues. Bittorrent means you need a file instead of a stream,
so there is added maintenance in making sure the tarball represents the
current image copy and is the same on all the admin servers. Still, with
everything properly configured on a clean network, it's hard to beat the
magic of having 10,000 nodes all installing at once through their
respective administrative (leader) nodes and VLANs using udpcast.
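
For a sense of scale, the figures in this message (288 nodes per leader
as a typical ratio, 1024 as the udp-sender ceiling, 10,000 nodes in
total) translate into leader counts with simple ceiling division. The
function name is made up for illustration, and real sizing would also
depend on the network layout.

```python
def leaders_needed(total_nodes, nodes_per_leader):
    """Smallest number of leader nodes that covers total_nodes."""
    if total_nodes < 0 or nodes_per_leader <= 0:
        raise ValueError("need total_nodes >= 0 and nodes_per_leader > 0")
    # Integer ceiling division avoids any floating-point rounding.
    return (total_nodes + nodes_per_leader - 1) // nodes_per_leader


if __name__ == "__main__":
    print(leaders_needed(10_000, 288))   # 35 leaders at the typical ratio
    print(leaders_needed(10_000, 1024))  # 10 leaders at the sender ceiling
```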

Best wishes,

Erik

