yet another blog

Sat, 25 Mar 2006

Diseba: a distributed and secure backup (take II)

Continuing on the idea of a solution for distributed and secure backup, I have now a more clear view of the software. Compared to my first original idea, I changed some details. However the cryptographic details could be kept as is. And I have found a name for the project: diseba. :)

Overview

The general idea is the same as in my first note: save the content of your hard disk on your friends' hard-disks, in such a way that only you can read those copies (through strong encryption). Moreover, the network of friends is closed (i.e. you must agree with the others to enter the "club").

The main idea is to divide the backup into two main steps: storing the content to backup into a set of encrypted blocks and then distributing those encrypted blocks onto the hard-disks of your friends. More specifically, a backup phase is divided into four steps:

  1. Determine which files are new or updated;
  2. Cut them in fixed size and encrypted blocks, on local disk, along with a Master Index that contains all needed information to restore a backup;
  3. Save those blocks on the remote hard-disks, distributing enough copies to ensure long term availability. Two methods could be used: rsync or a bittorrent-like distribution;
  4. Send to remote nodes the list of blocks and master index we want to keep, all other blocks or master indexes for this node being erased.

The main disadvantage of this approach is that it might use, in the worse case, twice the disk space of the saved space. But its big advantage is it implementation simplicity and quite good security.

At any point in the above steps, the local software stores on disk its state so it can restart from where it was stopped.

In steps 1 and 2, all files, links and directories of the directories to backup are examined:

As result, a file is saved with following structure:

  +------------------+  +------------------+  +------------------+
  |IV| file content  |  |   file content   |  | file content |MAC|
  +------------------+  +------------------+  +------------------+
     encrypted block       encrypted block       encrypted block

Content of the Master Index

The Master Index is a password encrypted compressed file that contains all the information needed to restore a backup from a set of encrypted blocks. More specifically, it contains:

The Master Index is added a MAC, encrypted and stored on disk with salt and number of hash iterations for decryption.

On disk backup structure

The backup on the local disk has following hierarchy:

node0/master_index/1  <-- saved master index #1
node0/master_index/5
node0/backup_state    <-- local status of the software in case of interruption
node0/blocks/0/40
node0/blocks/0/80     <-- encrypted blocks as 1 KB files
 :
node0/blocks/f/67f
             |
             +---- the set of blocks are divided into 16 directories, 0 to f
node1/   <-- same directory structure as for node0
 :
node5/

Remote copy of blocks and master indexes

Once a new backup (blocks and master index) has been generated on local disk, one needs to save it on remote nodes. We propose to use two strategies:

  1. rsync synchronisation of the disk directory structure. This would be not efficient with several remote nodes but it could be quite useful on a local network (and simple to implement ;) ;
  2. a bittorrent like approach.

The bittorrent-like approach is quite simple: use the copy made on other nodes to reduce the amount of data that an initial node should send. More precisely, for each block that should be copied on N remote nodes:

  1. first copy: complete copy from node 0 to node 1;
  2. second copy: 1/2 copy from node 0 to node 2, 1/2 copy from node 1 to node 2;
  3. ...
  4. Nth copy: 1/N copy from node 0 to node N-1, ..., 1/N copy from node N-2 to node N-1.

Each node is in charge of making the copies of its own blocks and master index on other nodes. To implement above algorithm, one would need following network API:

trigger_block_copy([list of source nodes], [list of block ids])
 => returns success or failure for each block

trigger_master_index_copy([list of source nodes], master_index_id)
 => returns success or failure

pull_block_parts([list of (block id, offset, length)])
 => returns, for each requested part, failure or block content

pull_master_index_part(master index id, offset, length)
 => returns failure or master index content

erase_block_list([list of block id])
 => returns success or failure

erase_master_index(master index id)
 => returns success or failure

A source node would call trigger_blocks_copy to request to one or more nodes a copy of a list of blocks. Each one of the nodes receiving this call would in turn use pull_block_parts to get the needed parts from other nodes having a copy of the needed block. After all new blocks are copied on remote nodes, the source node can call erase_block_list to erase outdated blocks. The procedure is the same for master indexes.

In case a destination node crashes in the middle of a copy, the source node should consider that the copy has failed and it should elect a new destination node. In case a sending node crashes in the middle of a copy, the receiving node should erase the corresponding block and report a failure to the source node.

As the source node sends at backup step 4 the complete list of blocks and master index it wants to keep, we avoid keeping stale blocks or partially copied blocks in case of nodes' crashes.

It might appear that a source node cannot save enough copies onto remote nodes (because they are unavailable): it should attempt to retry those copies at a latter time.

Clique management

To create a new clique (group of friends making backups between each other), one should:

To join an existing clique, a new node should do:

In order to have the same node numbering on each node, the public keys of the clique nodes are sorted and each node is given a number equal to its public key rank. A local unencrypted configuration file contains public SSL keys of other nodes and local directories to backup.

Few notes on implementation

In order to answer remote node requests asynchronously, the diseba program should be multi-threaded: one thread to make local backup and trigger local to remote copies, one or more threads to fulfil remote nodes' copy requests.

I still intend to implement this program in OCaml. ;) RPC over SSL would be used for communication between nodes.

Any opinion on this design?

2006-03-25T14:27:40Z [] permanent link

Made with PyBlosxom