[nylug-talk] The best way to mirror 6M+ files?

Yusuke Shinyama
Mon May 8 10:42:08 EDT 2006


Hi Peter, thank you for responding.

"Peter C. Norton" <spacey-nylug at lenin.net> wrote:
> Are the files organized into directories? If so (and I have a hard
> time imagining otherwise), and the directory hierarchies are stable,
> you could use rsync by calling it once per directory:
> 
> for target in </path/to/files>/* ; do
>   rsync -vae ssh "$target" <remotehost>:</path/to/files>
> done
> 
> If your directories have fewer than 250k files each, I think you should be able
> to do this in a reasonable timeframe.

Actually, one of the top-level directories already has 3M+ files,
and the structure of the subdirectories might change. Maybe I need to
ask the users to organize them in a more consistent way.
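
In the meantime, one possibility is to run rsync once per subdirectory
at a deeper level, so that no single run has to walk all 3M+ files.
This is only a rough sketch, not something I have tried on the real
data. The function names and the SRC/DEST/DEPTH arguments are made up
for illustration, it assumes a find that supports -mindepth/-maxdepth
and an rsync recent enough to understand the "/./" marker with -R, and
it breaks on directory names containing newlines.

```shell
#!/bin/sh
# Sketch only: list_units and mirror_units are hypothetical helper names.

# Print the directories exactly DEPTH levels below SRC, one per line,
# as paths relative to SRC.
list_units() {
    src=$1 depth=$2
    ( cd "$src" && find . -mindepth "$depth" -maxdepth "$depth" -type d ) |
        sed 's|^\./||' | sort
}

# Run one rsync per directory printed by list_units.
# RSYNC can be overridden (e.g. RSYNC=echo) for a dry look at the commands.
mirror_units() {
    src=$1 dest=$2 depth=$3
    list_units "$src" "$depth" | while IFS= read -r unit; do
        # -R keeps the part of the path after "/./" on the destination side.
        # stdin is redirected so nothing can swallow the directory list.
        ${RSYNC:-rsync} -a -e ssh -R "$src/./$unit/" "$dest/" </dev/null
    done
}
```

For example, "mirror_units /path/to/files remotehost:/path/to/files 2"
would issue one rsync per second-level directory. Files that sit
directly in directories above that depth are not covered by this loop
and would need a separate pass.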

> If you need to have the files be stable while you do this, I suggest
> you look at the volume manager's snapshot capabilities.

They don't have to be stable, but the source disk needs to be
accessible all the time. The users run various experiments on
those files via NFS.
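
Since the disk stays in use during the copy, it might help to make
rsync less aggressive. A minimal sketch, assuming a lower CPU priority
and a bandwidth cap are enough to keep NFS responsive (I have not
measured this). gentle_rsync and BWLIMIT_KB are made-up names, and the
default of 5000 is an arbitrary placeholder, not a recommendation.

```shell
#!/bin/sh
# Sketch only: gentle_rsync is a hypothetical wrapper name.
# nice lowers the CPU priority; --bwlimit caps rsync's transfer rate
# (the value is in kilobytes per second).
gentle_rsync() {
    nice -n 19 ${RSYNC:-rsync} --bwlimit="${BWLIMIT_KB:-5000}" "$@"
}
```

It takes the same arguments as rsync itself, e.g.
"gentle_rsync -a -e ssh /path/to/files/dir remotehost:/path/to/files".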

Yusuke

