Simple speed comparison between cp, mv, and rsync


Jeremiah - Posted on 03 February 2010

One of my clients works with huge GIS data files, and they want to expand their storage space. They currently have 4x500GB drives in RAID 5 for about 1.5TB of space, and they want to replace them with 5x1TB drives in RAID 5 for around 4TB of space. As a separate project, I replaced all the drives in their Buffalo TeraStation Pro II NAS with 1.5TB drives to give them about 5.5TB of formatted space for backups of their GIS data.
 
With the latest firmware for the Buffalo NAS I got NFS support, which works well because their GIS server is Linux. I created an NFS mount on the GIS server and ran rsnapshot to do an initial backup. I used /usr/bin/time just to see how long it would take to rsync that ~1TB of data, and it took 38.5 hours!
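The initial backup amounted to something like the commands below; the NAS address, export path, and rsnapshot interval name are placeholders for illustration rather than the exact values from this setup.

    # Mount the Buffalo NAS export over NFS on the GIS server
    sudo mount -t nfs 192.168.1.50:/mnt/array1 /mnt/nas

    # Run the initial rsnapshot backup under the external /usr/bin/time
    # to get wall-clock and CPU figures for the whole ~1TB transfer
    /usr/bin/time rsnapshot daily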
 
Once I replace the drives in the GIS server, I'm going to have to shut off access to the share so no one can modify the files and take another copy of all their data, so I got to wondering which would be the fastest way. I know mv is faster than cp on local filesystems, but is that the case for a network-attached filesystem (the NAS uses XFS, for what it's worth) on a gigabit network? How much faster would rsync be than cp?
 
So I decided to experiment. I created a ~2GB file (dd if=/dev/zero of=gigtest.file bs=1MB count=2000) and once again used /usr/bin/time (as opposed to the bash built-in 'time') for each run copying or moving it from my ext3 RAID 5 space to my NFS-mounted NAS.
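The runs boiled down to something like the following; the mount point and the rsync flags are illustrative placeholders, not a transcript of the exact commands I used.

    # Create the ~2GB test file (2000 x 1MB = 2,000,000,000 bytes)
    dd if=/dev/zero of=gigtest.file bs=1MB count=2000

    # Time each transfer with the external /usr/bin/time rather than the
    # bash builtin; /mnt/nas stands in for the NFS-mounted NAS
    /usr/bin/time cp gigtest.file /mnt/nas/
    /usr/bin/time rsync -a gigtest.file /mnt/nas/
    /usr/bin/time mv gigtest.file /mnt/nas/

Here's what I found: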

  • cp: 3 minutes 27 seconds, 3% CPU
  • rsync: 4 minutes 8 seconds, 21% CPU
  • mv: 3 minutes 30 seconds, 3% CPU

Huh. mv was actually slower than cp by a few seconds. And rsync, which I've heard is faster than cp on local filesystems, was more than half a minute slower than either cp or mv. So then I wanted to know how it performs on all-local filesystems, so I did essentially the same experiment, but just moving a 2GB file from one directory to another on my GIS server. Here are the results:

  • cp: 1 minute 9 seconds, 13% CPU
  • rsync: 1 minute 43 seconds, 54% CPU
  • mv: 0 minutes 19 seconds, 6% CPU

So again, rsync is the slowest, while mv is clearly the winner for local filesystems. Of course this isn't any kind of scientific test; I could be missing key points, like rsync doing better with lots of small files or with multiple files. But from my limited testing, it doesn't look like it matters whether I use cp or mv to get the data off the server and onto the NAS; it's still going to have to be a weekend job.
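One likely reason mv wins so decisively on the local test is that when the source and destination live on the same filesystem, mv only has to rename the file instead of copying its data. A quick way to check which case applies is to compare the filesystems the two directories sit on; the paths below are placeholders, not my actual directories.

    # If both paths report the same filesystem, mv is a metadata-only rename;
    # if they differ, mv falls back to copying the data and deleting the source
    df /path/to/source /path/to/destination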
 
Two other things I could try: the NAS has an adjustable frame size (jumbo frames) for gigabit networks, so I could bump that up to see what difference it makes, and I could also try mounting the share via SMB or pulling the data with FTP, but I doubt either of those will give me better performance than NFS. Well, unless Buffalo's implementation of NFS is particularly bad. Maybe I'll try it - we'll see.
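If I do play with the frame size, jumbo frames only help when the server's NIC (and any switch in between) is set to a matching MTU, so the Linux side would need something along these lines; the interface name and MTU value are assumptions for illustration.

    # Check the current MTU on the GIS server's interface (eth0 is assumed)
    ip link show eth0

    # Raise it to match a jumbo-frame setting on the NAS (9000 is a common value)
    sudo ip link set dev eth0 mtu 9000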