[nylug-talk] The best way to mirror 6M+ files?

Peter C. Norton
Mon May 8 10:00:07 EDT 2006


Are the files organized into directories? If so (and I have a hard
time imagining otherwise), and the directory hierarchies are stable,
you could use rsync by calling it once per directory, so that each
run only has to hold one directory's file list in memory:

# Use a glob rather than parsing ls output, and quote $target so
# names containing spaces survive.
for target in </path/to/files>/* ; do
  rsync -vae ssh "$target" <remotehost>:</path/to/files>
done

If your directories have fewer than 250k files each, I think you
should be able to do this in a reasonable timeframe.
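
To check whether the directories fall under that figure, something
like the sketch below could count the files under each top-level
directory before you start. The function names count_files and
report_dirs are made up for illustration, and the 250000 default is
just the rough number above, not a hard rsync limit.

```shell
#!/bin/sh
# Count regular files under each top-level directory of a root and flag
# any that exceed a limit. count_files and report_dirs are hypothetical
# helper names, not part of rsync.

count_files() {
    # Recursively count regular files below directory $1.
    # Note: a file name containing a newline would be counted twice.
    find "$1" -type f | wc -l | tr -d ' '
}

report_dirs() {
    # $1 = root directory, $2 = limit (defaults to 250000)
    root=$1
    limit=${2:-250000}
    for dir in "$root"/*/; do
        # Skip the unexpanded pattern when there are no subdirectories.
        [ -d "$dir" ] || continue
        n=$(count_files "$dir")
        if [ "$n" -gt "$limit" ]; then
            echo "$n $dir OVER"
        else
            echo "$n $dir"
        fi
    done
}
```

Running report_dirs </path/to/files> prints one line per directory;
anything marked OVER is a candidate for splitting one level deeper.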

If you need the files to stay stable while you do this, I suggest
you look at your volume manager's snapshot capabilities.
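
As a rough sketch of the snapshot idea, assuming the files sit on an
LVM logical volume (the original question does not say which volume
manager, if any, is in use): take a snapshot, mount it read-only,
rsync from the snapshot, then drop it. The names vg0 and data, the 5G
snapshot size, and the mount point /mnt/mirror_snap are placeholders
I made up; the snapshot needs enough space to absorb whatever changes
on the source while the copy runs.

```shell
#!/bin/sh
# Sketch: mirror from an LVM snapshot so the source stays stable.
# Assumes the files live on an LVM logical volume; the 5G snapshot
# size and /mnt/mirror_snap are hypothetical placeholders.

run() {
    # Print the command when DRY_RUN=1; otherwise execute it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

snapshot_mirror() {
    # $1 = volume group, $2 = logical volume, $3 = rsync destination
    vg=$1; lv=$2; dest=$3
    snap="${lv}_snap"
    mnt=/mnt/mirror_snap
    status=0

    run lvcreate --snapshot --size 5G --name "$snap" "/dev/$vg/$lv" || return 1
    run mkdir -p "$mnt"
    if run mount -o ro "/dev/$vg/$snap" "$mnt"; then
        # Copy from the frozen snapshot, not the live filesystem.
        run rsync -a -e ssh "$mnt/" "$dest" || status=$?
        run umount "$mnt"
    else
        status=1
    fi
    # Always remove the snapshot so it does not fill up later.
    run lvremove -f "/dev/$vg/$snap"
    return "$status"
}
```

Setting DRY_RUN=1 before calling snapshot_mirror vg0 data
<remotehost>:</path/to/files> only prints the commands, which is a
reasonable way to look them over before running anything as root.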

-Peter

On Mon, May 08, 2006 at 09:46:10AM -0400, Yusuke Shinyama wrote:
> Hello,
> 
> I am trying to mirror two disks that have the same size (320GB).
> The source drive has about 6 million files in ext3 filesystem and
> the number of files keeps growing every day.  I tried rsync and
> found it impossible because it requires so much memory (the
> machine has 3G, but that was not enough).  I could use dd to copy
> raw data, but I am wondering if there is a better way to do
> this. Any hints?
> 
> Thanks,
> Yusuke
> _____________________________________________________________________________
> Hire expert Linux talent by posting jobs here :: http://jobs.nylug.org
> The nylug-talk mailing list is at nylug-talk at nylug.org
> The list archive is at http://nylug.org/pipermail/nylug-talk
> To subscribe or unsubscribe: http://nylug.org/mailman/listinfo/nylug-talk

-- 
The 5 year plan:
In five years we'll make up another plan.
Or just re-use this one.


