Sunday, August 15, 2010

Synchronized Folders across Windows and Linux - (Cloud-)Free Encrypted Cross-Platform Synchronisation with Unison

I always wanted to sync a few folders across all my computers: for example my scripts directory, where I constantly fiddle with my scripts and create new ones, or my documents, which I want to be able to read and edit everywhere. Without sync I would never know where the most current version is without a bothersome comparison of the dates. I would have to check every file and then copy the newer one. But that's now history, thanks to unison and my sync scripts.

What you need
The biggest problem is that you need a Linux server somewhere that is ideally always running and connected. I sync all my files with an online server. This is not a big deal because with SSH the connection is securely encrypted. Also, thanks to compression and the fact that each file is only transferred once, sync is pretty fast even over an ADSL connection. Of course this won't work for slow dial-up connections in combination with large files. But as long as there are only small changes in certain files, even dial-up might work for you. Strictly speaking a Linux server is not needed, but it makes the setup much easier, which is why that's what I demonstrate here. And of course if you only want to sync two computers with each other, then there's no need for a constantly active server. It should work with a Mac as a client or even as the server, but I don't own one so I can't help with that.

Downloads
You can get your unison downloads for Mac and Windows here and for Linux here (Ubuntu i386 binaries for 9.10 and 10.04, and more in this ppa). I recommend version 2.40.16; read below for why.

For Windows you also need Pageant (the PuTTY agent), PuTTYgen and Plink; get them at http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html. For Linux you also need openssh-server, ssh-agent and ssh-keygen (both in the openssh-client package) and keychain. You need to set up a working ssh connection with public key authentication for this to work in the background without your interaction. See here how to do this.
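
On the Linux side the key setup boils down to three steps: create a key pair, install the public key on the server, and load the private key into an agent. The following is only a minimal sketch, assuming OpenSSH and keychain are installed. The function names, the user name, the host name, the port and the key path are placeholders of mine. With DRY_RUN set, the commands are only printed instead of executed.

```shell
# Sketch: set up public key authentication for the sync server.
# Assumption: user, host, port and key file are placeholders you replace.

# Run a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_sync_key() {
    local user="$1" host="$2" port="${3:-22}"
    local key="${4:-$HOME/.ssh/id_rsa}"

    # 1. Create a key pair unless one exists (ssh-keygen asks for a passphrase).
    [ -f "$key" ] || run ssh-keygen -t rsa -b 4096 -f "$key"

    # 2. Append the public key to authorized_keys on the server.
    run ssh-copy-id -i "$key.pub" -p "$port" "$user@$host"

    # 3. Load the key into ssh-agent via keychain so background jobs can use it.
    run keychain "$key"
}
```

Calling DRY_RUN=1 setup_sync_key user syncserver 22 first shows what would be run; without DRY_RUN the same call performs the steps.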

Bugs in the past
When I first found unison, over a year ago, it was still pretty buggy. It couldn't handle unicode, special characters or umlauts properly. Especially between different systems you'd end up with two differently named versions of a file unless you restricted yourself to plain a-z characters - very annoying.

But recently a new version has come out which fixes the problem. Now unison is not only a tool that synchronizes flawlessly across different systems, it is also faster and has a prettier "GUI" than older versions, and it still synchronizes across encrypted ssh connections if you want it to. In my view this makes it the perfect tool for my needs. I've set it up to synchronize my laptop and my netbook with each other and with my server system over ssh.

How it works
It's blazingly fast because it only copies those parts of the files that have changed. And it doesn't transfer any files to check this. It runs both on the server and the client and only transfers the dates and hash codes of the files via a secure, compressed ssh connection. You can find out more about unison at LinuxJournal.
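
To get a feeling for why comparing hash codes is cheaper than comparing contents, here is a small illustration. It is not how unison is implemented, just a sketch of the idea using the standard cksum tool. The function names fingerprints and changed_files are made up for this example, and both trees are assumed to be local directories.

```shell
# List every regular file below a directory as "<crc> <size> <path>".
fingerprints() {
    ( cd "$1" && find . -type f -exec cksum {} + )
}

# Print the paths whose fingerprint differs between two trees, including
# files that exist on one side only.
changed_files() {
    # A fingerprint line that occurs only once belongs to a file that is
    # different or missing on the other side.
    { fingerprints "$1"; fingerprints "$2"; } |
        sort | uniq -u | cut -d ' ' -f 3- | sort -u
}
```

Only the short fingerprint lines need to be exchanged to find out which files have to be looked at more closely. Identical files never leave their machine.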

The easiest setup and the one I will show here is a star topology setup. This means there are several client systems synchronizing with one server. It works because unison knows which files are newer. The great side effect is that you automatically have a distributed backup on all your systems - the server and all clients of all your files.

The script runs automatically every X minutes in the background via cron on Linux (completely invisible) and is called by the Windows Task Scheduler on Windows (almost invisible).

Important to know
You need to use exactly the same version on all systems you deploy it on. There are easy to install packages for Debian and Ubuntu and precompiled versions for Windows. I haven't worked with the OS X version yet, but it should work as well. I use a Linux server. Windows on the server side should be possible to set up, but it's going to be much more difficult. Setting up the client on Windows is already non-trivial.

As long as you don't modify the same file on two systems before a server sync has happened in between, there won't be any problems. If you do, unison will not sync that file in batch mode. You need to call unison by itself and it will prompt you to choose how to deal with the situation.
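
The rule behind this can be written down in a few lines. Unison keeps a record of the last synchronized state, so for every file there are three things to compare: the state at the last sync and the current state on each side. The sketch below is a simplification for illustration only. The function name sync_decision and the use of plain strings as fingerprints are assumptions of mine.

```shell
# Decide what to do with one file, given its fingerprint at the last
# successful sync (base) and its current fingerprint on both sides.
sync_decision() {
    local base="$1" local_fp="$2" remote_fp="$3"
    if [ "$local_fp" = "$remote_fp" ]; then
        echo "nothing to do"            # both sides already agree
    elif [ "$local_fp" = "$base" ]; then
        echo "copy remote to local"     # only the remote side changed
    elif [ "$remote_fp" = "$base" ]; then
        echo "copy local to remote"     # only the local side changed
    else
        echo "conflict"                 # both changed, in different ways
    fi
}
```

Only the last case needs a human, which is why it is skipped when the sync runs unattended.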

The Linux version uses the symlink feature of Linux to let you configure which folders to sync on the go. In Windows this won't work, so all your folders you want to sync need to be inside one parent folder. I know no way around this, though the unison config file might have a solution somewhere...
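
On Linux, adding a folder to the sync set is then just a matter of linking it into the sync folder. A small helper could look like this. It is a sketch only: the name add_sync_folder is made up, the target should be given as an absolute path, and $HOME/sync is assumed to be the local root from the config shown further down.

```shell
# Link a folder into the sync root so that "follow = Path *" picks it up.
# Assumption: $1 is an absolute path; $2 defaults to $HOME/sync.
add_sync_folder() {
    local target="$1" syncroot="${2:-$HOME/sync}" name
    name=$(basename "$target")
    mkdir -p "$syncroot"
    # Refuse to overwrite an existing entry of the same name.
    if [ -e "$syncroot/$name" ] || [ -L "$syncroot/$name" ]; then
        echo "already present: $syncroot/$name" >&2
        return 1
    fi
    ln -s "$target" "$syncroot/$name"
}
```

For example, add_sync_folder "$HOME/scripts" makes the scripts directory part of the next sync. Removing the symlink takes it out again.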

Sync script:
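#!/bin/bash

# Append everything the sync prints to a log file.
LOG="tee -a $HOME/logs/mysync2.log"
# Load helper functions and variables.
source "$HOME/scripts/functions"
source "$HOME/scripts/variables"

main ()
{
    echo Starting at $(date)

    # exit if running on battery (grep fails if no adapter reports "on-line")
    if ! grep -q on-line /proc/acpi/ac_adapter/*/state 2>/dev/null
    then
        echo Running on battery. Leaving now...
        exit 0
    else
        echo Running on AC Adapter.
    fi

    # use the ssh-agent that keychain has started
    source "$HOME"/.keychain/*-sh 2>&1
    echo sock $SSH_AUTH_SOCK pid $SSH_AGENT_PID me $(whoami)
    # braces instead of parentheses, so that exit really ends the script
    ssh-add -l || { echo "SSH Key not active!"; ssh-add || exit 1; }

    cd "$HOME"

    # run the "sync" profile without prompts, passing on any extra options
    unison-2.40.16 sync -batch -maxbackups 2 "$@" 2>&1

    echo Exiting at $(date)
}

# hand the script's arguments to main, otherwise "$@" is empty in there
main "$@" | $LOG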

The script checks the power state and exits when the system is running on battery. To be able to work in the background, it uses ssh-agent, so you need to set up ssh key authentication.
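
The /proc/acpi/ac_adapter files used above are not available on newer kernels, which report the power state under /sys/class/power_supply instead. A variant of the battery check for such systems might look like the following sketch. The function name on_ac_power is my own, and the optional directory argument only exists to make the function easy to try out.

```shell
# Return 0 when a mains adapter reports that it is online.
# The optional argument overrides the sysfs directory (handy for testing).
on_ac_power() {
    local dir="${1:-/sys/class/power_supply}" supply
    for supply in "$dir"/*; do
        # Batteries have no "online" file, so they are skipped here.
        [ -r "$supply/type" ] && [ -r "$supply/online" ] || continue
        if [ "$(cat "$supply/type")" = "Mains" ] &&
           [ "$(cat "$supply/online")" = "1" ]; then
            return 0
        fi
    done
    return 1
}
```

In the sync script this would be used as: if ! on_ac_power; then echo Running on battery; exit 0; fi. Note that a machine that reports no mains adapter at all is treated as running on battery.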

Unison config in ~/.unison/sync.prf:
# Unison preferences file
root = /home/user/sync
root = ssh://user@syncserver:port/syncfolder
follow = Path *
ignore = Path {scripts/s3.sh}

This is a sample config file. It syncs the $HOME/sync folder, following all symlinks inside it, with syncfolder on syncserver via ssh. The ignore line shows you how to exclude certain files from synchronization.
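
If you set up several clients, it can be handy to create the profile from a script. The sketch below writes a profile like the one above and adds one more ignore rule that matches by file name instead of by path. The function name write_profile, the port 22 and the extra Name pattern are illustrative assumptions; user, syncserver and the folder names are the same placeholders as above.

```shell
# Write a Unison profile; the target file defaults to ~/.unison/sync.prf.
write_profile() {
    local profile="${1:-$HOME/.unison/sync.prf}"
    mkdir -p "$(dirname "$profile")"
    cat > "$profile" <<'EOF'
# Unison preferences file (placeholders: user, syncserver, port 22)
root = /home/user/sync
root = ssh://user@syncserver:22/syncfolder
# Treat every symlink directly inside the root as the folder it points to
follow = Path *
# Skip one specific file, given relative to the root
ignore = Path {scripts/s3.sh}
# Skip editor backups and temporary files anywhere in the tree
ignore = Name {*~,*.tmp}
EOF
}
```

Path patterns are matched against the whole path below the root, Name patterns only against the last component, so the second form is the one to use for things like temporary files.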

Windows Setup
This should already work just fine in Linux. Now let's turn to the Windows client.

The Windows setup, as you will see, is much more "fun". You need several pieces: Pageant, PuTTYgen and Plink (see above for links) and then a few batch scripts.

You need to copy your private ssh key from your Linux box and convert it to PuTTY's format with PuTTYgen. Then copy it into a safe folder. I have it in the same folder as the Unison.exe. Then set up an ssh connection to your server with PuTTY that uses the key and save it under the session name unisonssh. Copy the following batch files to your unison folder and adjust the path names accordingly.

Putty Agent.bat:
pageant.exe sshkey.ppk
This needs to be a script, otherwise Pageant is not started in the right folder and won't find the key. If it still can't find the key, try the full path of the key file. The script must be linked into Autostart and will prompt you for your ssh key password if your key is password protected.

SSH Connect.bat
@plink -C -ssh -load unisonssh -i "C:\unison\sshkey.ppk" -l djtm -P 22 unison -server -auto
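Change the path to where your ssh key lies, the user name after -l (djtm here) and the port to your server's port if necessary. The sync script below passes this file to unison as -sshcmd sshconnect.bat, so save it under that name.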

Unison Sync.bat
C:
cd "C:\unison\"
unison2.4.exe -sshcmd sshconnect.bat -backups -backupdir unisonbackups -backuplocation central -batch -confirmbigdeletes -contactquietly

Testing it
Now is a good time to check that everything works. Try creating a file on one system and see how after two syncs (one on each system) it magically appears on the other. Edit it there to see how the edits are transferred back to the first system.
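
Before involving the server, the same kind of check can be run locally between two throwaway directories. The following is only a sketch: the function name roundtrip_test is made up, and the command name given as the first argument must match your installation (for example unison-2.40.16). It relies on unison accepting two roots on the command line followed by -batch.

```shell
# Create a file in one directory, sync, and verify it arrives in the other.
# $1 is the sync command; it is called as: <command> <dirA> <dirB> -batch
roundtrip_test() {
    local sync_cmd="${1:-unison}" a b status=0
    a=$(mktemp -d) || return 1
    b=$(mktemp -d) || return 1
    echo "hello from A" > "$a/probe.txt"
    "$sync_cmd" "$a" "$b" -batch >/dev/null 2>&1
    if [ -f "$b/probe.txt" ] && cmp -s "$a/probe.txt" "$b/probe.txt"; then
        echo "sync OK"
    else
        echo "sync FAILED"
        status=1
    fi
    rm -rf "$a" "$b"
    return $status
}
```

Running roundtrip_test unison-2.40.16 prints "sync OK" if the file made it across. Unison remembers the two temporary roots in archive files under ~/.unison, which can be deleted afterwards.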

Scheduling it
Once everything works you can schedule it to run regularly without you needing to do anything. In Linux, you should install keychain so that the script can reach your ssh key, and then enter the following into your crontab. (To edit your crontab, type crontab -e)
*/15 * * * * $HOME/scripts/unisonsync
The */15 means to sync every 15 minutes. Don't worry, usually nothing will be done - I hardly ever notice anything happening. Of course the path should be where your unison sync script lies.
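
One thing to watch out for with a fixed interval is a sync that takes longer than 15 minutes, for instance the very first one, because cron will then start a second run on top of it. A simple guard is a lock directory. This is a sketch under my own assumptions: the function name run_exclusive and the lock path are hypothetical and not part of the script above.

```shell
# Run a command only if no other instance holds the lock directory.
# mkdir either creates the directory or fails, so only one run can win.
run_exclusive() {
    local lockdir="$1"
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "previous sync still running, skipping" >&2
        return 0
    fi
    "$@"
    local status=$?
    rmdir "$lockdir"
    return $status
}
```

Inside the sync script the unison call could then be wrapped as run_exclusive "$HOME/.unisonsync.lock" unison-2.40.16 sync -batch. If a run is ever killed half way, the lock directory stays behind and has to be removed by hand.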

In Windows it's a tad more difficult again. The most difficult part of it all is to make it run in the background without annoying you every 15 minutes or so. The following command looks a bit quirky, but that's the best I could do. Enter exactly this behind Execute: in the task scheduler, with the path and name of your own sync batch file.
cmd /C start /LOW /MIN "Unison Sync" "C:\unison\unison.bat"
Then enter "C:\unison" next to execute in. Also check the option to execute only when logged in and, under the settings tab, the option not to start the task if you're running on battery.

And now - finally - enjoy great, free, in sync folders!

Another older, less detailed guide is available here and here. Thanks to the authors of Unison.
