[mirror-admin] push mirroring - who owns the SSH keys?

Sat Jun 20 12:15:17 EDT 2009

Some thoughts on SSH keys:

1) It adds an additional user to the mirror machines. This could be 
problematic from a policy perspective, and should the private keys 
become known it poses certain security risks for the remote mirrors. 
Even using individual key pairs has this risk, since gaining access to 
one private key is no more or less difficult than gaining access to all 
of them: the keys will be in the same location.  The only advantage to 
using multiple key pairs is that the daemon has to use one key pair per 
machine/site, which helps sandbox its own interaction with the remote 
side and verify the remote side's 'identity'.

2) It adds the complexity of dealing with a user specific to this 
purpose on all of the machines, and given that there are hundreds of 
Fedora mirrors this is bound to get someone burned.

3) SSH provides no queuing mechanism, and as we've run into similar 
issues with queuing (with respect to rsyncFilter) there might be better 
alternatives.  This is particularly bad if there's a transient error 
reaching the master servers that does not affect the local geographic 
users.

I'm not saying that push mirroring is bad; it has some advantages over 
pull mirroring.  But using SSH as your trigger mechanism has some hefty 
downsides.  When we (kernel.org) were looking into providing push 
mirroring we considered SSH, but ultimately went with e-mail as the 
trigger mechanism.  It queues both remotely and locally, can be GPG 
signed if you're worried about authenticity, and doesn't force an 
additional user to be created.  I would also guess that setting up even 
a machine-specific e-mail address alias has a lower barrier to entry 
than creating a user account.  Some thoughts to consider.
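
To make the e-mail trigger idea concrete, here is a minimal sketch in 
Python of building and parsing such a message.  It is not what 
kernel.org actually runs: the subject marker, the one-path-per-line body 
and both function names are assumptions made up for illustration, and 
GPG signing/verification is left out entirely (it would have to happen 
before the body is trusted).

```python
from email import message_from_string, policy
from email.message import EmailMessage

# Hypothetical subject marker; not an existing convention.
TRIGGER_SUBJECT = "mirror-trigger"


def build_trigger(sender, recipient, paths):
    """Build a plain-text trigger mail listing one changed path per line."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = TRIGGER_SUBJECT
    msg.set_content("\n".join(paths) + "\n")
    return msg


def parse_trigger(raw):
    """Return the changed paths from a raw trigger mail, or [] if it is not one."""
    msg = message_from_string(raw, policy=policy.default)
    if msg["Subject"] != TRIGGER_SUBJECT:
        return []
    body = msg.get_content()
    # Ignore blank lines; a signature check would belong before this point.
    return [line.strip() for line in body.splitlines() if line.strip()]
```

The mail system does the queuing on both ends, so the receiving side 
only has to parse whatever eventually arrives and schedule its pull.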

Other thoughts inline.

- John

Matt Domsch wrote:
> I'm starting to think again about push mirroring, with an eye to
> having something in place for general use by Fedora 12.  Anyone who
> would care to help would be greatly appreciated.
> 
> In the grand scheme, I envision:
> 
> 1.  rel-eng posts new content to the master
> 2.  rel-eng waits for new content to finish replicating to all the
>     masters (for the curious, that's currently done via a NetApp
>     SnapMirror).
> 3.  rel-eng informs MirrorManager that new content is ready.
>     Hopefully with directory paths included (so as to not need to
>     rescan the whole server)
> 
> For N in [0 1 2]:
> 
>   4.  MM informs each Tier N mirror that new content is ready.
>   5.  Each Tier N mirror downloads the new content.
>   6.  Each Tier N mirror runs report_mirror to inform MM that it has the
>       new content.
> 
> For this to work, MM is going to need:
> a) To know the tiering hierarchy (who pulls from whom).  MM has a
>    field for this today (the "upstream" field on each HostCategory
>    page) though it's not used for anything at present, and any values
>    set presently are probably meaningless.

MM could continue not to know about the upstream, as there are times 
when a mirror will change its upstream and may forget to update MM. 
This could be particularly problematic during a release, as mirrors 
change their upstream.  Though if you're doing push mirroring you could 
actually pass which upstream to use as a component of the trigger. 
E.g. if mirrors[12].kernel.org and ibiblio are the first mirrors synced 
that you're allowed to pull from, the list of upstreams that already 
have that content goes out with the trigger.
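
As a rough illustration of a trigger that carries its own upstream 
list, the sketch below packs the changed paths and the available 
upstreams into one payload and lets the downstream pick from it.  The 
JSON layout, the field names and the function names are all assumptions 
for the example, not anything MM does today.

```python
import json


def make_trigger(paths, upstreams):
    """Serialize a trigger: what changed, and who already has it."""
    return json.dumps({"paths": paths, "upstreams": upstreams})


def choose_upstream(trigger_json, preferred):
    """Pick the first locally preferred host that the trigger advertises.

    Falls back to the first advertised upstream, or None if the
    trigger lists no upstreams at all.
    """
    trigger = json.loads(trigger_json)
    available = trigger["upstreams"]
    for host in preferred:
        if host in available:
            return host
    return available[0] if available else None
```

A downstream would keep its own ordered preference list (say, hosts 
that are close to it) and only ever pull from something the trigger 
says is already in sync.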

> b) A trigger method to inform each mirror that their upstream has
> changed.
> 
> For b), Debian uses an 'ssh push' trigger method.  The upstream mirror
> does an outbound SSH to each downstream mirror, using a per-pair SSH
> key, the private half is only known to the upstream mirror.  The
> script itself executed on the downstream mirror simply sets up an
> rsync pull to occur, then exits; the actual pull happens
> independently.
> 
> Other triggers could be email (signed) sent out from MM to the
> robot_email address on the Host page; a message on an AMQP bus (which
> would require such users to have an open connection to the AMQP server
> running in Fedora Infrastructure), or [insert your favorite trigger
> method here].  I'm open to several, it's just a small matter of code.
> 
> Thinking about the SSH push method, and particularly, key management.
> Should MM create the keypairs and maintain them?  This would give a
> lot of flexibility to downstream mirrors, being able to change their
> upstream "at will" (edit the upstream field in MM, and you immediately
> start to get notifications when your upstream changes; no need to have
> your new upstream mirror admin get involved).  But would people feel
> comfortable with this?

How would the upstream mirror deal with granting access to the 
downstream mirror in this case?  MM doesn't provide something like an 
rsync user/pass combination to the [up|down]stream mirrors for this 
interaction, and maintaining a programmatic list of allowed IPs in rsync 
would be a PITA (since that list can't exist in a separate file, you 
would be continually updating and modifying the actual rsync config 
file).
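
To show what "continually updating the config" would look like, here is 
a small sketch that regenerates one rsyncd.conf module stanza from a 
list of downstream IPs.  The module name and path passed in are 
whatever the mirror uses; the function name is made up, and where the 
IP list comes from (MM or otherwise) is left open.

```python
def render_module(name, path, allowed_ips):
    """Render an rsyncd.conf module that only the given IPs may pull from."""
    # De-duplicate and sort so regenerated files diff cleanly.
    hosts = " ".join(sorted(set(allowed_ips)))
    return (
        f"[{name}]\n"
        f"    path = {path}\n"
        f"    read only = yes\n"
        f"    hosts allow = {hosts}\n"
        f"    hosts deny = *\n"
    )
```

Every change in the set of downstreams means rewriting this stanza in 
the live config, which is exactly the maintenance burden in question.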

If you move to using SSH as the rsync transport, that means a private 
key is going to need to exist on the client and the public key on the 
server.

There are some advantages if the trigger mechanism can tell the 
downstream where to go: machines in Asia, Europe, Africa, or Australia 
could treat a mirror that gets the content first as their local 
'upstream' once it has all of the content.

> Can we use one keypair per downstream mirror, or do we need one
> keypair per (upstream, downstream) pair?  The upstream's (private)
> half of the keypair is only known to MM.
> 
> Should a security breach happen to MM, the private half of the keypairs could
> become known.  This can be mitigated by ensuring the keypairs can only
> run one command on the downstream mirror, one that would be relatively
> safe for anyone to run at any time.  But would it be better for MM to
> have all those keypairs, or for each (upstream, downstream) mirroring
> arrangement to have their own keypairs for this purpose, and MM has
> nothing to do with it?  When the upstream runs report_mirror, it then
> runs the ssh push triggers to its downstreams itself...
> 
> Looking for ideas, input, and coders.
> 
> Thanks,
> Matt
> Fedora Mirror Wrangler
> 
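
For reference, the "can only run one command" mitigation quoted above 
maps onto the forced-command option in an OpenSSH authorized_keys 
entry.  A minimal sketch of generating such an entry follows; the 
function name is an assumption, and the command and key are whatever the 
downstream chooses.

```python
def authorized_keys_line(public_key, command):
    """Return an authorized_keys entry that pins a key to one command."""
    if '"' in command:
        # Keep the sketch simple: no escaping of embedded quotes.
        raise ValueError("command must not contain double quotes")
    options = [
        f'command="{command}"',
        "no-port-forwarding",
        "no-X11-forwarding",
        "no-agent-forwarding",
        "no-pty",
    ]
    return ",".join(options) + " " + public_key.strip()
```

With an entry like this, whoever holds the private half can do nothing 
on the downstream except start that one command, whatever they ask for 
on the ssh command line.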

--