Synchronize a Folder Across Machines Over the Internet–Mac, PC, Whatever
jason | February 27, 2008 1:52 pm
Here is a useful HowTo for anyone who needs to keep files in sync between PCs. There are many commercial solutions out there (iDrive et al.) that facilitate this, but those can be expensive and often limit space to a few gigabytes at most. This solution allows you to synchronize as much as you need and keep costs down. It is also relatively easy to implement across multiple platforms and operating systems. Here are the basic ingredients of this design:
- A publicly accessible machine, always connected to the Internet, that you can access via ssh. This can be a web host that allows ssh access or a machine of your own hooked up to the Internet with a static IP or dynamic DNS.
- Unison. Don’t worry about this one. It’s a free, rsync-like tool that synchronizes changes in both directions.
- Some time and patience. This method isn’t exactly for the faint of heart–but no worries, walk through step by step and you’ll get there.
If it looks like you can handle those three, read on for the details.
In order to facilitate synchronization between more than two machines (or even two machines that cannot directly communicate due to firewalls, etc.), it is necessary to use a star, or hub and spoke, topology. Choose a machine that is always connected to the Internet with a sufficiently fast connection as the hub. This may or may not be a machine from which you intend to access your files; it can simply be an intermediary. SSH access is also required, because Unison tunnels through SSH. SSH, like Unison, is very portable and available on nearly every platform. However, it is most convenient to choose Linux, BSD, or some other platform that natively supports an SSH server as the hub. Setting up an SSH server on Windows is beyond the scope of this article.
Once the hub is chosen, Unison needs to be installed on the hub and on each of the spokes, the client machines that will be synchronizing the shared folder. Unison is a small 1.4MB standalone executable, but it can be a little tricky to get installed. Below are the steps I took to get it running on three different platforms. Keep in mind that you should install the same version of Unison on each system and upgrade them all at the same time.
My hub is my web host, which is running Linux. The best way to get Unison installed on Linux is to build from source, but first the OCaml Compiler must be built. So to install on my web server, I connected via SSH and typed the steps below. In all cases below, take the \ character to mean you should append the next line to the line above it. Lines beginning with # are comments.
$ curl -O \
http://caml.inria.fr/pub/distrib/ocaml-3.10/ \
ocaml-3.10.1.tar.bz2
# wget, scp, ftp or any other method may be used to
# get the packages onto the server
$ tar jxvf ocaml-3.10.1.tar.bz2
$ cd ocaml-3.10.1
$ ./configure -prefix ~
# set the prefix to home directory
$ make world opt install
$ cd ..
$ curl -O \
http://www.seas.upenn.edu/~bcpierce/unison/download/ \
releases/stable/unison-2.27.57.tar.gz
$ tar zxvf unison-2.27.57.tar.gz
$ cd unison-2.27.57
$ make UISTYLE=text
$ cp unison ~/bin
$ cd ..
# These last commands are optional cleanup steps
$ rm -f ocaml-3.10.1.tar.bz2 && rm -rf ocaml-3.10.1
$ rm -f unison-2.27.57.tar.gz && rm -rf unison-2.27.57
That takes care of installing Unison on Linux, which was the most involved of my three installations. The second system I installed Unison on was an office PC running Windows. Binaries are provided for the Windows platform. Download the latest here and put it anywhere. Use the text version, not the Gtk+ version. I renamed mine to unison.exe and placed it in the C:\Windows\System32 folder.
My third machine is my favorite: a MacBook Pro running OS X. On OS X, install MacPorts and then install Unison by typing the following at the terminal:
$ sudo port install unison
That’s it! MacPorts will automatically download and compile OCaml, then Unison. A beautiful thing, no?
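Since every machine should run the same Unison version, it can be worth comparing them before going further. Here is a minimal sketch of how that might look from a Mac or Linux client. It assumes that unison -version prints a line of the form "unison version 2.27.57"; the function names are my own, and the remote half only works once SSH access to the hub is in place.

```shell
#!/bin/sh
# Sketch: compare Unison versions on two machines.
# Assumption: `unison -version` prints a line like "unison version 2.27.57".

# Extract the version number (third word of the first line).
unison_version() {
    printf '%s\n' "$1" | awk 'NR == 1 { print $3 }'
}

# Succeed only if both version strings are non-empty and identical.
same_unison_version() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage (USERNAME and HOST.COM are the placeholders used in this article):
#   local_v=$(unison_version "$(unison -version)")
#   hub_v=$(unison_version "$(ssh USERNAME@HOST.COM '~/bin/unison -version')")
#   same_unison_version "$local_v" "$hub_v" || echo "mismatch: $local_v vs $hub_v"
```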
Now to get SSH ready for Unison. On Windows, start by downloading PuTTY. Get the whole package, not just putty.exe, and install it. By default, it will install to C:\Program Files\PuTTY.
Now, generate a key pair so that the Windows machine can SSH into the hub without a password. Run PuTTYgen to generate a key. Just click ‘Generate’ and take the defaults. When it finishes, save the public and private keys using the ‘Save public key’ and ‘Save private key’ buttons respectively, acknowledging that you really want to save the private key without a password. Adjust file permissions on the private key so that only your user has access, removing all other users, including SYSTEM. This is necessary to protect your private key, because anyone with access to this key will have access to your hub.
The format of the public key needs to be modified to work with OpenSSH on Linux. Open the public key in WordPad. Do not use Notepad! Delete the first two lines and the last line of the file. All human readable text should now be gone. Now delete the carriage returns, combining all the characters into one long string. At the beginning of the string, type ssh-rsa and a space. The file should now be a single line: ssh-rsa, a space, and then the key string.
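If you would rather not edit the key by hand, the same steps can be sketched in a few lines of shell on any Unix-like machine (the hub, for instance). This is only an illustration of the manual edit described above: the function name and file names are made up, and it assumes the saved file has exactly two header lines and one footer line and holds an RSA key.

```shell
#!/bin/sh
# Sketch: turn a PuTTYgen "Save public key" file into one OpenSSH line.
# Mirrors the manual steps: drop the first two lines and the last line,
# join what remains, and prefix it with "ssh-rsa ".
# Assumption: exactly two header lines, one footer line, RSA key.
putty_pub_to_openssh() {
    key=$(sed -e '1,2d' -e '$d' "$1" | tr -d '\r\n')
    printf 'ssh-rsa %s\n' "$key"
}

# Usage (file names are placeholders):
#   putty_pub_to_openssh putty_key.pub > PUBKEY
```

OpenSSH's own ssh-keygen -i -f FILE can import this file format as well, which may be the more robust route where it is available.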

Save the file and upload it to your hub using PuTTY’s pscp.exe or some other method. SSH into your home folder on the hub and type the following:
$ mkdir .ssh
$ chmod 0700 .ssh
$ cat PUBKEY >> .ssh/authorized_keys
#replace PUBKEY with the name of the public key
#you uploaded
$ rm PUBKEY
$ chmod 0600 .ssh/authorized_keys
$ chmod 0755 $HOME
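Because the same key setup gets repeated for every client, the commands above could be wrapped in a small function that is safe to run more than once. What follows is a sketch rather than anything I used at the time: install_pubkey is a made-up name, it skips the append when the key is already listed, and it removes group/other write access from the home directory instead of forcing 0755.

```shell
#!/bin/sh
# Sketch: append a public key to authorized_keys only if it is not already
# there, then tighten permissions as in the commands above.
# $1 = public key file, $2 = home directory (defaults to $HOME).
install_pubkey() (
    pubkey=$1
    home=${2:-$HOME}
    mkdir -p "$home/.ssh"
    chmod 0700 "$home/.ssh"
    touch "$home/.ssh/authorized_keys"
    # -x whole-line match, -F fixed strings, -f read patterns from the key file
    if ! grep -qxFf "$pubkey" "$home/.ssh/authorized_keys"; then
        cat "$pubkey" >> "$home/.ssh/authorized_keys"
    fi
    chmod 0600 "$home/.ssh/authorized_keys"
    # Variation on the original "chmod 0755 $HOME": only remove write
    # access for group and others, leaving the other bits untouched.
    chmod go-w "$home"
)

# Usage on the hub (PUBKEY is the uploaded key file):
#   install_pubkey PUBKEY && rm PUBKEY
```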
The file permissions are very important. If OpenSSH detects an insecure configuration, it will not allow public key authentication. You should now be able to SSH into the hub by typing the following from the Windows command prompt:
C:\> c:\progra~1\putty\plink.exe -ssh -l USERNAME -i \
PRIVKEY HOST.COM
This assumes PuTTY was installed to the default location. Replace USERNAME, PRIVKEY, and HOST.COM with your username, the path to your private key, and the DNS name or IP address of your hub respectively.
The next step is to set up a couple of batch files to synchronize a folder. The first, which I called unisonssh.bat, provides an SSH interface to Unison. Its contents look like this:
@c:\progra~1\putty\plink.exe -ssh -C -l USERNAME -i \
PRIVKEY HOST.COM "~/bin/unison -server -auto"
This is all on one line, substituting your information as before. The @ symbol is critical: it keeps the batch file from echoing the command, and that echoed text would presumably end up in the stream Unison reads. The -C option tells SSH to use compression, which will make things faster. The part in quotes is the command that will be executed on the hub. The next batch file, which I called sync.bat, looks like this:
C:\Windows\System32\unison.exe c:\SYNC_FOLDER \
ssh://HOST.COM/SYNC_FOLDER -sshcmd unisonssh.bat \
-servercmd ~/bin/unison -batch
Make sure the path to unison.exe matches wherever you actually placed it. SYNC_FOLDER can be any path locally and any path relative to your home folder on HOST.COM–these paths need not be the same. Include the full path to unisonssh.bat, wherever you put it. The -servercmd option points Unison to the executable in your home folder instead of the default. The -batch option just tells Unison you don’t want to be prompted for anything.
You should now take a minute to create the local SYNC_FOLDER and populate it with something to test, then execute sync.bat. SSH into the hub and verify that your test documents synchronized correctly. If anything goes wrong, read the output carefully. It will likely point you to what went wrong. Check your paths and walk back through the instructions to make sure every step was completed. Congrats! Your Windows machine is now synchronizing with the hub.
Next, I’ll detail my OS X configuration, which should be very similar to setting up a Linux client. First, set up SSH with keys, just like we did on Windows:
$ ssh-keygen
#press return until it finishes, taking defaults
$ scp ~/.ssh/id_rsa.pub USERNAME@HOST.COM:
#use this or any other method to copy the public key
#to your hub
$ ssh USERNAME@HOST.COM
#all the following commands are performed on HOST.COM
#if you didn't create .ssh and adjust file permissions
#with a Windows machine, review and perform the
#mkdir and chmod commands in the instructions above.
$ cat id_rsa.pub >> .ssh/authorized_keys
$ rm id_rsa.pub
My username on the hub is different than my local username, so I created a ~/.ssh/config file with the following contents:
Host HOST.COM
User USERNAME
This is a shortcut so that I can just type ssh HOST.COM instead of ssh USERNAME@HOST.COM every time. You should now be able to type ssh HOST.COM and log into your hub without typing a password.
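Before handing things over to Unison, it may help to confirm that the login really is prompt-free, since a password prompt would stall an unattended sync. A small sketch using ssh's BatchMode option, which makes ssh fail instead of asking for a password; the function name and the optional second argument (a stand-in for the ssh program, useful for testing) are my own additions.

```shell
#!/bin/sh
# Sketch: check that key-based login works without any prompt.
# BatchMode=yes makes ssh fail rather than ask for a password.
# $1 = host, $2 = ssh program to run (defaults to "ssh").
can_login_without_password() {
    "${2:-ssh}" -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}

# Usage (HOST.COM is the placeholder used throughout this article):
#   can_login_without_password HOST.COM && echo "key login OK"
```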
My sync script on OS X looks like this:
#!/bin/sh
/opt/local/bin/unison -batch -sshargs=-C \
-servercmd=~/bin/unison ~/SYNC_FOLDER \
ssh://HOST.COM/SYNC_FOLDER
The SYNC_FOLDER path on the server needs to be the same as the one you used on the Windows machine, but the local one can be wherever you want. Create the sync script in your favorite editor, chmod 0755 sync and execute. The files you synchronized from your Windows workstation should now come down to the SYNC_FOLDER you specified. All done! You now have a working cross platform sync solution.
If you want to get fancy you can automate things so that you don’t have to execute a script every time you change files. On Windows, just use Scheduled Tasks in the Control Panel. On Linux, you could use cron. I got really fancy with my MacBook Pro and used launchd. You see, with a laptop, I’m not always connected. So I wanted a solution to automatically synchronize only when online. Here is what I came up with:
First, modify the sync script as follows:
#!/bin/sh
while [ -f /var/run/resolv.conf ]
do
    /opt/local/bin/unison -batch -sshargs=-C \
        -servercmd=~/bin/unison ~/SYNC_FOLDER \
        ssh://HOST.COM/SYNC_FOLDER
    sleep 900
done
This logic synchronizes the shared folder every 15 minutes (900 seconds, feel free to adjust) as long as /var/run/resolv.conf exists. The existence of /var/run/resolv.conf is my poor man’s network detector. OS X creates this file with a list of name servers whenever an interface is configured via DHCP and deletes it when the machine is disconnected from every network. In this way, as long as the script is started after a network connection is established, it will kick off synchronization every 15 minutes until a network connection no longer exists.
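The same loop can be written as a reusable function, which makes the marker file and the interval easy to change and lets you try the logic out with a harmless command first. This is just a sketch of the idea; run_while_exists is a name I made up for illustration.

```shell
#!/bin/sh
# Sketch: run a command repeatedly for as long as a marker file exists.
# $1 = marker file, $2 = seconds to sleep between runs, rest = command.
run_while_exists() {
    marker=$1
    interval=$2
    shift 2
    while [ -f "$marker" ]; do
        "$@"
        sleep "$interval"
    done
}

# Usage (paths as in the script above):
#   run_while_exists /var/run/resolv.conf 900 \
#       /opt/local/bin/unison -batch -sshargs=-C \
#       -servercmd=~/bin/unison ~/SYNC_FOLDER ssh://HOST.COM/SYNC_FOLDER
```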
Starting the script when a network connection is first established is a job for launchd. Download and install Lingon. This tool allows you to easily create launchd jobs. Open Lingon and click the ‘New’ button. I chose ‘My Agents’ from the next menu because I don’t need files to synchronize when I’m not logged into my laptop. Configure a job that runs the sync script.

Adjust the name and path of the script to what you want. The important parameter is where it says, “Run it if this file is modified.” Entering /var/run/resolv.conf for this parameter will ensure that the modified sync script gets launched when a new network connection is established. Launchd takes care of the rest. It won’t run the script twice and once the script is running it will continue to run until the network is disconnected, and resolv.conf is deleted. Now that is spiffy! Close Lingon, saving everything and restart (safest method) to have launchd load the new agent. Use the Console application to monitor the new agent and verify it is working.
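For anyone who prefers to skip the GUI, a launchd agent is ultimately a property list file, and a hand-written equivalent might look roughly like the output of the sketch below. The label com.example.unison-sync and the script path are hypothetical placeholders, and I am assuming the Lingon setting corresponds to launchd's WatchPaths key. The function does no XML escaping, so keep the arguments plain.

```shell
#!/bin/sh
# Sketch: print a launchd agent definition that starts a script whenever
# a watched file changes. Label and paths are hypothetical placeholders.
# $1 = job label, $2 = script path, $3 = file to watch.
make_sync_agent_plist() {
    cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>$1</string>
    <key>ProgramArguments</key>
    <array>
        <string>$2</string>
    </array>
    <key>WatchPaths</key>
    <array>
        <string>$3</string>
    </array>
</dict>
</plist>
EOF
}

# Usage:
#   make_sync_agent_plist com.example.unison-sync "$HOME/sync" /var/run/resolv.conf \
#       > ~/Library/LaunchAgents/com.example.unison-sync.plist
#   launchctl load ~/Library/LaunchAgents/com.example.unison-sync.plist
```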
This configuration can be varied indefinitely. Equally cool automatic synchronization scripts could be imagined for both Windows and Linux. Synchronizing with Unison and SSH compression is a very efficient way to keep files in sync across multiple machines. I hope you enjoy it as much as I do.
Update: An Update on Unison




3 Responses to “Synchronize a Folder Across Machines Over the Internet–Mac, PC, Whatever”
Hi –
This is a great post — thanks for writing it up. Have you come up with a way to handle file naming differences between platforms? For example, “:” is a legal character on Unix but represents a drive delimiter on Windows — how does Unison handle that when trying to sync?
Ramon
Ramon,
I did run into a problem synchronizing files between platforms. One of my goals in all of this was to synchronize my Firefox profile. When I first set it up, I created the profile on OS X and synchronized it to Windows. I noticed Unison failing to synchronize quite a few files in my Internet cache because the filenames were invalid in Windows.
So, to answer your question–Unison appears to skip files that cannot be synchronized because the filename is not valid between operating systems.
I actually got around this by creating my Firefox profile in Windows. When the Windows Firefox profile was synchronized to OS X, files in my OS X cache were named with the Windows convention and because the Windows character set for file names is always valid in OS X/BSD, this worked out. Evidently there is a setting buried deep somewhere within the Firefox profile that tells it which character set to use.
I suspect that this concept can be applied to other cases as well. Try creating the files with the more restrictive filename character set in place and then synchronizing.
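To see ahead of time which files are likely to be skipped, something like the following sketch could be run on the Mac or Linux side. It only looks for the characters Windows rejects in file names (\ : * ? " < > |) and ignores other Windows restrictions; the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: list files whose names contain characters Windows does not allow
# in file names:  \ : * ? " < > |
# $1 = directory to scan.
find_windows_unsafe_names() {
    find "$1" -name '*[\\:*?"<>|]*' -print
}

# Usage:
#   find_windows_unsafe_names ~/SYNC_FOLDER
```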
I hope that helps. Let me know if there are more specifics I can help with.
–jason