tech-userlevel archive


Re: getting creative with cp (was Re: cp -n diff)



> For example, does it help to think of cp as a two-part process: 1)
> compare two directories or trees, 2) resolve differences between the
> directories/trees by creating/destroying links or by making copies.

It's not quite where you seem to be going with that, but I do have a
tool (I call it, imaginatively enough, "compare") that is specifically
designed to compare two or more directory trees and list the
differences.  It also has an option that says "the first one is a
master copy, make the rest just like it".  It supports network-remote
operation, either inetd-style or rsh-style, too; I routinely use it to
update copies of things which I want to be identical on multiple
machines (the first version was written back in the bad old SunOS days,
to compare diskless root filesystems on the server).

It's got deficiencies, certainly.  I'm even aware of a few of them.
But if anyone wants it - whether for adoption or use, or as a starting
point, or just to look at for ideas - feel free.
ftp.rodents-montreal.org:/mouseware/local-src/compare or git clone of
git://git.rodents-montreal.org/compare are the places to look.
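
For readers who just want the flavour of a tree comparison, here is a
minimal Python sketch.  It is not the compare tool mentioned above and
makes no claim about how that tool works; the function name
tree_differences and the difference labels are invented for this
example.  It assumes two local directories, compares file contents byte
for byte, and ignores symlinks, permissions, ownership and timestamps.

```python
import filecmp
import os


def tree_differences(master, other, rel=""):
    """Return a list of (kind, relative_path) differences between two trees.

    Hypothetical helper for illustration only: it looks at names, file
    types (directory vs. non-directory) and file contents, nothing else.
    """
    diffs = []
    left = set(os.listdir(os.path.join(master, rel)))
    right = set(os.listdir(os.path.join(other, rel)))

    # Names present on only one side.
    for name in sorted(left - right):
        diffs.append(("only-in-master", os.path.join(rel, name)))
    for name in sorted(right - left):
        diffs.append(("only-in-other", os.path.join(rel, name)))

    # Names present on both sides: recurse into directories, compare files.
    for name in sorted(left & right):
        path = os.path.join(rel, name)
        a = os.path.join(master, path)
        b = os.path.join(other, path)
        a_is_dir, b_is_dir = os.path.isdir(a), os.path.isdir(b)
        if a_is_dir and b_is_dir:
            diffs.extend(tree_differences(master, other, path))
        elif a_is_dir != b_is_dir:
            diffs.append(("type-differs", path))
        elif not filecmp.cmp(a, b, shallow=False):
            # shallow=False forces a content comparison, not just a stat check.
            diffs.append(("content-differs", path))
    return diffs
```

A "make the rest like the master" mode could walk the returned list and
copy or delete accordingly; that part is left out here.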

> B) Speed it up: couldn't cp copy large directory trees much, much
>    faster if it copied files concurrently instead of sequentially,
>    thus giving the disk scheduler a lot more requests to chew on?

Maybe.  I'd want to test this thoroughly first, because I would expect
there to be a performance-impairing effect from more seeking, and I
don't know which effect would prove to be the greater.
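
One way to run such a test is sketched below in Python.  This is an
experiment harness, not a proposal for how cp should be implemented;
copy_tree and timed_copy are names invented for this example.  It
copies the same tree with a configurable number of worker threads so
the timings can be compared.  Results will depend heavily on the
device, the filesystem and the state of the buffer cache, so any
numbers it produces should be treated with suspicion unless caches are
controlled for.

```python
import os
import shutil
import time
from concurrent.futures import ThreadPoolExecutor


def copy_tree(src, dst, workers=1):
    """Copy every file under src into dst using `workers` threads.

    Directories are created up front in the calling thread; only the
    file copies are handed to the pool.  Returns the number of files
    copied.  workers=1 gives a sequential baseline.
    """
    jobs = []
    for dirpath, _dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            jobs.append((os.path.join(dirpath, name),
                         os.path.join(target_dir, name)))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for every copy and re-raises the first failure.
        list(pool.map(lambda job: shutil.copy2(job[0], job[1]), jobs))
    return len(jobs)


def timed_copy(src, dst, workers):
    """Return (files_copied, elapsed_seconds) for one copy run."""
    start = time.perf_counter()
    count = copy_tree(src, dst, workers)
    return count, time.perf_counter() - start
```

Running timed_copy on the same source with workers set to 1 and then to
some larger value, each time into a fresh destination, gives a first
rough comparison of the two effects.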

> C) Speed it up some more: what if the kernel allowed for "lazy"
>    copies to be made when the source & destination were on the same
>    filesystem, and the filesystem supported it?

Filesystem-level COW would be a very nice thing to have.  I would
question whether it belongs in cp, though, or more precisely whether cp
should be aware of it.  I would prefer to have the filesystem
autodetect whenever a block is written that is identical in content to
a block already present and share the data.  (Yes, what I'm talking
about here borders on a content-addressed disk layer.)
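
To make the content-addressed idea concrete, here is a toy in-memory
sketch in Python.  It is purely illustrative: the BlockStore class, its
method names and the 4096-byte block size are assumptions made for this
example, and it does not model any real filesystem.  Each block is
keyed by its SHA-256 digest, so writing a block whose content is
already present shares the stored data instead of duplicating it.  It
has no reference counting or garbage collection, and it treats hash
collisions as impossible.

```python
import hashlib


class BlockStore:
    """Toy content-addressed block layer (illustration only)."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}  # digest -> block contents, stored once
        self.files = {}   # file name -> ordered list of block digests

    def write(self, name, data):
        """Store data under name, sharing any blocks already present."""
        digests = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).digest()
            # setdefault keeps the existing copy if the content is known.
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.files[name] = digests

    def read(self, name):
        """Reassemble a file from its list of block digests."""
        return b"".join(self.blocks[d] for d in self.files[name])
```

With a layer like this, a "copy" of a file costs only a new list of
digests, and the program doing the copy never needs to know that any
sharing took place.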

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse%rodents-montreal.org@localhost
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

