Back Up Files to OneDrive Using Rclone

04/08/2021

I have a couple of Linux servers that I'd like to back up daily. The OneDrive Business account from the school provides a whopping 5TB of space, which is ideal for backups. Of course I want the backup process to be a) automatic - this is easily done with a cron job, and b) efficient, meaning only new and changed files are transferred. This is where Rclone comes in.

I'm a long-time user of rsync, which synchronizes files between computers. Rclone is basically rsync for cloud services. As you can see on their website, they support 40+ cloud storage services like OneDrive, Google Drive, and Dropbox, plus many standard file transfer protocols like SFTP. Rclone has very good documentation, so it's easy to set up just by following their instructions step by step. I did run into two problems, but both are caused by OneDrive, not Rclone.
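
To make the cron-plus-Rclone idea concrete, here is a minimal sketch of a backup script. It assumes a remote named "onedrive" has already been created with rclone config; the remote name, the paths and the helper names are all hypothetical, not taken from my actual setup.

```python
import shlex
import subprocess


def build_sync_command(source, remote_path, log_file=None, dry_run=False):
    """Return the argument list for a one-way `rclone sync`."""
    cmd = ["rclone", "sync", source, remote_path]
    if log_file:
        # Write rclone's output to a file, since cron jobs have no terminal.
        cmd += ["--log-file", log_file, "--log-level", "INFO"]
    if dry_run:
        # Report what would be transferred without changing anything.
        cmd.append("--dry-run")
    return cmd


def run_backup(source, remote_path, log_file=None):
    """Run the sync and return rclone's exit code (0 means success)."""
    cmd = build_sync_command(source, remote_path, log_file)
    return subprocess.run(cmd).returncode


if __name__ == "__main__":
    # Hypothetical source folder and remote path; this only prints the
    # command instead of running it, so the sketch is safe to try.
    cmd = build_sync_command("/srv/data", "onedrive:backup/data", dry_run=True)
    print(shlex.join(cmd))
```

Keep in mind that rclone sync makes the destination match the source, which includes deleting remote files that no longer exist locally; rclone copy is the gentler choice if deleted files should stay in the backup. A crontab entry such as "0 2 * * * /usr/bin/python3 /path/to/backup.py" (path hypothetical) would run a script like this every night at 2am.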

1. Copying large numbers of files to OneDrive Business takes forever.

Initially I tried to use Rclone to copy the CSNS files to OneDrive. CSNS has over 760K files with a total size of over 330GB. The problem is that OneDrive Business constantly throttles the upload speed, so uploading all those files would probably take weeks, if not longer. I gave up after a few days and just put the backup of the old CSNS files on a school server. I then updated CSNS so that new files are stored in a separate folder, and the daily backup of this folder to OneDrive works fine.
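
A quick back-of-the-envelope calculation shows why the file count matters more than the total size once uploads are throttled. The rates below are purely hypothetical, since I never measured the actual throttled rate; only the 760K file count comes from CSNS.

```python
SECONDS_PER_DAY = 86_400


def estimated_days(file_count, files_per_second):
    """Days needed to upload file_count files at a steady rate."""
    return file_count / files_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical sustained upload rates, in files per second.
    for rate in (2.0, 1.0, 0.5):
        days = estimated_days(760_000, rate)
        print(f"{rate} files/s -> {days:.1f} days")
```

Even at a sustained one file per second, 760K files would need roughly nine days, and any slowdown below that quickly pushes the total into weeks.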

2. Access token cannot be refreshed automatically on a headless server.

OneDrive Business uses OAuth 2 to authenticate clients (an Rclone setup is a "client" of the OneDrive service), and the access token needs to be refreshed at some point. The problem is that OneDrive seems to be intended for end users, so it requires a browser to get a new token. The rclone config command gives good prompts, so it's not difficult to get a new token manually and then copy and paste it into the Rclone configuration on the server, but it's an annoyance nonetheless. Fortunately the token seems to have a reasonably long lifetime (at least a couple of weeks, I think), so I don't have to do this too frequently.
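
Since the token has to be renewed by hand, it helps to find out that it has stopped working before a week of backups has silently failed. Below is a minimal sketch of a check that could run from cron ahead of the backup. It assumes the remote is named "onedrive" (hypothetical), and it treats any failure of rclone lsd as "needs attention", which could just as well be a network problem as an expired token.

```python
import subprocess


def remote_is_reachable(remote, run=subprocess.run, timeout=120):
    """Return True if `rclone lsd <remote>` exits with status 0.

    `run` is injectable so the check can be tested without rclone installed.
    """
    try:
        result = run(
            ["rclone", "lsd", remote],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # rclone is missing, or the call hung for longer than the timeout.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # "onedrive:" is a hypothetical remote name.
    if not remote_is_reachable("onedrive:"):
        print("OneDrive remote is not reachable; the token may need renewing.")
```

Cron mails whatever a job prints to the account that owns the crontab (if mail is set up on the server), so the printed message is enough to act as a reminder to get a new token.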