@@ -9,7 +9,7 @@ checkfmt:
	@gofmt -d *.go

test:
-	env TZ=Europe/Berlin go test
+	env TZ=Europe/Berlin go test -cover -race

install: ${BIN}
	install ${BIN} ${PREFIX}/bin
@@ -52,13 +52,16 @@ Run `go get github.com/sstark/snaprd`. The binary will be in
Installing
----------

-Snaprd does not daemonize, logs are printed to the standard output. Choose
-whatever you like for starting it at boot: rc.local, SysVinit, upstart,
-systemd, supervisord, BSD-init, launchd, daemontools, ...
+Snaprd does not daemonize; logs are printed to standard output, along with
+the stdout and stderr of the rsync command being run. Choose whatever you
+like for starting it at boot: rc.local, SysVinit, upstart, systemd,
+supervisord, BSD-init, launchd, daemontools, ...

If your repository resides on a separate file system, you may want to add a
mechanism before startup that makes sure this file system is mounted.

See below for an example of how to run snaprd using systemd.

Running
-------
@@ -79,24 +82,35 @@ Basic operation:
[...]
```
-```
-> snaprd list -repository /tmp/snaprd_dest
-### Repository: /tmp/snaprd_dest, Origin: /tmp/snaprd_test2, Schedule: shortterm
-### From past, 0/∞
-### From 866h0m0s ago, 0/4
-### From 194h0m0s ago, 0/7
-### From 26h0m0s ago, 2/12
-2016-09-14 Wednesday 12:14:31 (1s, 2h0m0s)
-2016-09-14 Wednesday 12:19:46 (2s, 2h0m0s)
-### From 2h0m0s ago, 5/12
-2016-09-14 Wednesday 19:51:07 (1s, 10m0s)
-2016-09-14 Wednesday 19:51:21 (1s, 10m0s)
-2016-09-14 Wednesday 19:51:26 (1s, 10m0s)
-2016-09-14 Wednesday 19:51:31 (1s, 10m0s)
-2016-09-14 Wednesday 20:32:29 (1s, 10m0s)
-```
-See a full list of options available to the run command:
+The above command will create a hard-linked copy (see the `--link-dest` option
+in *rsync(1)*) via ssh from the directory "some/dir" on "someserver" every 10
+minutes. The copy will be written into a directory within the `.data`
+sub-directory of the target directory "/target/dir" on the local system. The
+directory name for a snapshot consists of the start time and end time of the
+rsync run (in Unix time) and the state of the snapshot. While rsync is running,
+the name will be `<start>-0-incomplete`. Only after rsync is done will the
+directory be renamed to `<start>-<end>-complete`.
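
For illustration, such a name could be taken apart like this (a minimal Go
sketch; `parseSnapshotName` is a hypothetical helper, not snaprd's actual
code):

```
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// parseSnapshotName splits a snapshot directory name of the form
// "<start>-<end>-<state>" into its components. Hypothetical sketch.
func parseSnapshotName(name string) (start, end time.Time, state string, err error) {
	parts := strings.SplitN(name, "-", 3)
	if len(parts) != 3 {
		err = fmt.Errorf("malformed snapshot name: %q", name)
		return
	}
	s, err := strconv.ParseInt(parts[0], 10, 64)
	if err != nil {
		return
	}
	e, err := strconv.ParseInt(parts[1], 10, 64)
	if err != nil {
		return
	}
	return time.Unix(s, 0), time.Unix(e, 0), parts[2], nil
}

func main() {
	start, end, state, _ := parseSnapshotName("1473848071-1473848073-complete")
	fmt.Println(start.Unix(), end.Unix(), state)
	// 1473848071 1473848073 complete
}
```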
+After each snapshot, snaprd also creates user-friendly names as symlinks into
+the `.data` directory, so if you export the snapshot directory read-only,
+users have a reasonably convenient way to find their backups.
+Next, snaprd will *prune* the existing snapshots. That means it will check
+whether a snapshot is suitable for being advanced into the next level of the
+schedule (in this example, the "two-hourly" interval) or whether it should
+stay in the current interval. If the current interval is already "full" but no
+snapshot is suitable for being advanced, snaprd will *obsolete* as many
+snapshots as needed to match the schedule.
+Marking a snapshot "obsolete" simply means renaming it to
+`<start>-<end>-obsolete`. From then on it will no longer show up in normal
+listings and will not be considered as a target for `--link-dest`. By default,
+snaprd eventually marks those obsolete snapshots as `<start>-<end>-purging`
+and deletes them from disk. You can tweak this behaviour with the `-maxKeep`,
+`-noPurge`, `-minGbSpace` and `-minPercSpace` parameters.
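
Conceptually, each state change is just a rename; here is a minimal sketch of
the obsoleting step (`markObsolete` is a hypothetical helper, not snaprd's
actual code):

```
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// markObsolete renames a complete snapshot to its obsolete form, e.g.
// "1473848071-1473848073-complete" -> "1473848071-1473848073-obsolete".
// Purging would be a further rename to "...-purging", followed by
// os.RemoveAll. Hypothetical sketch.
func markObsolete(dataDir, name string) error {
	obsolete := strings.TrimSuffix(name, "-complete") + "-obsolete"
	return os.Rename(filepath.Join(dataDir, name), filepath.Join(dataDir, obsolete))
}

func main() {
	if err := markObsolete("/target/dir/.data", "1473848071-1473848073-complete"); err != nil {
		fmt.Println(err)
	}
}
```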
+To get a full list of options available to the run command, use `snaprd run -h`:

```
$ snaprd run -h
Usage of run:
@@ -130,6 +144,32 @@ See a full list of options available to the run command:
one of longterm,shortterm (default "longterm")
```
+```
+> snaprd list -repository /tmp/snaprd_dest
+### Repository: /tmp/snaprd_dest, Origin: /tmp/snaprd_test2, Schedule: shortterm
+### From past, 0/∞
+### From 866h0m0s ago, 0/4
+### From 194h0m0s ago, 0/7
+### From 26h0m0s ago, 2/12
+2016-09-14 Wednesday 12:14:31 (1s, 2h0m0s)
+2016-09-14 Wednesday 12:19:46 (2s, 2h0m0s)
+### From 2h0m0s ago, 5/12
+2016-09-14 Wednesday 19:51:07 (1s, 10m0s)
+2016-09-14 Wednesday 19:51:21 (1s, 10m0s)
+2016-09-14 Wednesday 19:51:26 (1s, 10m0s)
+2016-09-14 Wednesday 19:51:31 (1s, 10m0s)
+2016-09-14 Wednesday 20:32:29 (1s, 10m0s)
+```
+The above list command will output some information about the intervals of the
+given schedule and how many snapshots are in them.
+
+Obviously the list command needs to know which schedule was used for creating
+the snapshots, but in the above example no schedule was given on the command
+line. This works because snaprd writes all settings that were used for the
+last *run* command to the repository as `.snaprd.settings`.
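
The settings persistence could look roughly like this (a sketch assuming a
JSON encoding and an illustrative field set; the actual format of
`.snaprd.settings` may differ):

```
package main

import (
	"encoding/json"
	"fmt"
	"io/ioutil"
	"path/filepath"
)

// Config holds a subset of run settings for illustration; the real
// snaprd config has more fields.
type Config struct {
	Origin   string
	Schedule string
}

// saveSettings writes the settings used by the last run command into
// the repository, so later subcommands like list can pick them up.
func saveSettings(repository string, c Config) error {
	b, err := json.Marshal(c)
	if err != nil {
		return err
	}
	return ioutil.WriteFile(filepath.Join(repository, ".snaprd.settings"), b, 0644)
}

// loadSettings reads them back.
func loadSettings(repository string) (c Config, err error) {
	b, err := ioutil.ReadFile(filepath.Join(repository, ".snaprd.settings"))
	if err != nil {
		return
	}
	err = json.Unmarshal(b, &c)
	return
}

func main() {
	_ = saveSettings("/tmp/snaprd_dest", Config{Origin: "/tmp/snaprd_test2", Schedule: "shortterm"})
	c, _ := loadSettings("/tmp/snaprd_dest")
	fmt.Println(c.Origin, c.Schedule)
}
```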
E-Mail Notification
-------------------
@@ -147,6 +187,31 @@ Sending happens through use of the standard mail(1) command, make sure your
system is configured accordingly.
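
A minimal sketch of how such a notification could be sent (assuming mail(1)
is in the PATH; `notify` is a hypothetical helper, not snaprd's actual code):

```
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// notify pipes a message body to the local mail(1) command, addressed
// to the given recipient. Hypothetical sketch.
func notify(recipient, subject, body string) error {
	cmd := exec.Command("mail", "-s", subject, recipient)
	cmd.Stdin = strings.NewReader(body)
	return cmd.Run()
}

func main() {
	if err := notify("root", "snaprd failure", "rsync exited non-zero"); err != nil {
		fmt.Println(err)
	}
}
```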
System Prerequisites
--------------------
Obviously you need a file system with enough space to hold the dataset you are
backing up. It is not possible to predict how much space will be needed for a
given schedule and update pattern of the data, so create the snapshot file
system in a way that makes it easy to extend if needed. Starting with a factor
of 1.5 to 2 should be sufficient; for a 1 TB dataset, for example, plan for
1.5 to 2 TB of snapshot space.
If you are using mlocate or a similar mechanism to index your files, make sure
you exclude your snapshot file system from it, e.g. like this:
<pre>
$ cat /etc/updatedb.conf
PRUNE_BIND_MOUNTS="yes"
# PRUNENAMES=".git .bzr .hg .svn"
PRUNEPATHS="/tmp /var/spool /media /var/lib/os-prober /var/lib/ceph /home/.ecryptfs /var/lib/schroot <b>/snapshots</b>"
PRUNEFS="NFS nfs nfs4 rpc_pipefs afs binfmt_misc proc smbfs autofs iso9660 ncpfs coda devpts ftpfs devfs devtmpfs fuse.mfs shfs sysfs cifs lustre tmpfs usbfs udf fuse.glusterfs fuse.sshfs curlftpfs ceph fuse.ceph fuse.rozofs ecryptfs fusesmb"
</pre>
If you do not exclude your snapshots, you will get an enormously big
mlocate.db file with lots of redundant information.
Stopping
--------
@@ -173,7 +238,12 @@ with the -schedule switch to the run command:
The durations listed define how long a snapshot stays in that interval until
it is either promoted to the next higher interval or deleted.
-You can define your own schedules by editing a json-formatted file `/etc/snaprd.schedules` with entries like:
+Which schedule you choose is entirely up to you; just make sure the smallest
+(first) interval is large enough that the expected runtime of a single rsync
+snapshot fits into it with a good margin.
+
+You can define your own schedules by editing a json-formatted file
+`/etc/snaprd.schedules` with entries like:
```
{
@@ -208,7 +278,7 @@ Place in `/etc/systemd/system/snaprd-srv-home.service`
[Service]
User=root
StandardOutput=syslog
-ExecStart=/usr/local/bin/snaprd run -noLogDate -repository=/export/srv-home-snap -origin=srv:/export/homes
+ExecStart=/usr/local/bin/snaprd run -noLogDate -notify root -repository=/export/srv-home-snap -origin=srv:/export/homes
Restart=on-failure
[Install]
@@ -21,3 +21,6 @@
- support more than one directory to back up (avoid having to run many instances on a system)
- mail hook in case of failed/missed backup
- Test failure and non-failure rsync errors (e.g. 24)
- "snaprd log" subcmd to print log ring buffer
- extend sched subcmd to be more useful
- parse rsync output and fill some extra info struct that can be stored in the repository
/* See the file "LICENSE.txt" for the full license governing this code. */

package main

import (
	"testing"
	"time"
)

// TestSkewClock verifies that a skewed test clock starts at the given
// unix time and that forward() advances it by the expected amount.
func TestSkewClock(t *testing.T) {
	var st int64 = 18
	var inc int64 = 5
	clock := newSkewClock(st)
	t1 := clock.Now().Unix()
	clock.forward(time.Second * time.Duration(inc))
	t2 := clock.Now().Unix()
	if t1 != st {
		t.Errorf("wanted %d, but got %v", st, t1)
	}
	if t2 != st+inc {
		t.Errorf("wanted %d, but got %v", st+inc, t2)
	}
}
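
For context, the clock abstraction exercised above might look roughly like
this (a sketch inferred from the test; the actual snaprd implementation may
differ):

```
package main

import (
	"fmt"
	"time"
)

// skewClock reports wall-clock time shifted by a fixed, adjustable
// offset, so tests can start at a known unix time and advance it
// manually instead of sleeping. Hypothetical sketch.
type skewClock struct {
	skew time.Duration
}

func newSkewClock(i int64) *skewClock {
	return &skewClock{skew: time.Since(time.Unix(i, 0))}
}

// Now returns the skewed time.
func (cl *skewClock) Now() time.Time {
	return time.Now().Add(-cl.skew)
}

// forward advances the skewed clock by d.
func (cl *skewClock) forward(d time.Duration) {
	cl.skew -= d
}

func main() {
	cl := newSkewClock(18)
	cl.forward(5 * time.Second)
	fmt.Println(cl.Now().Unix()) // 23
}
```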
@@ -5,6 +5,7 @@
package main

import (
+	"fmt"
	"io/ioutil"
	"log"
	"os"
@@ -23,22 +24,23 @@ func newPidLocker(lockfile string) *pidLocker {
	}
}

-func (pl *pidLocker) Lock() {
+func (pl *pidLocker) Lock() error {
	_, err := os.Stat(pl.f)
	if err == nil {
-		log.Fatalf("pid file %s already exists. Is snaprd running already?", pl.f)
+		return fmt.Errorf("pid file %s already exists. Is snaprd running already?", pl.f)
	}
	debugf("write pid %d to pidfile %s", pl.pid, pl.f)
	err = ioutil.WriteFile(pl.f, []byte(strconv.Itoa(pl.pid)), 0666)
	if err != nil {
-		log.Fatalf("could not write pid file %s", pl.f)
+		return fmt.Errorf("could not write pid file %s: %s", pl.f, err)
	}
+	return nil
}

func (pl *pidLocker) Unlock() {
	debugf("delete pidfile %s", pl.f)
	err := os.Remove(pl.f)
	if err != nil {
-		log.Fatalf("could not remove pid file %s", pl.f)
+		log.Printf("could not remove pid file %s: %s", pl.f, err)
	}
}
@@ -67,7 +67,11 @@ func lastGoodTicker(in, out chan *snapshot, cl clock) {
// helper goroutines.
func subcmdRun() (ferr error) {
	pl := newPidLocker(filepath.Join(config.repository, ".pid"))
-	pl.Lock()
+	err := pl.Lock()
+	if err != nil {
+		ferr = err
+		return
+	}
	defer pl.Unlock()
	if !config.NoWait {
		sigc := make(chan os.Signal, 1)
@@ -23,3 +23,13 @@ func ExampleSubcmdList() {
	// 2014-05-17 Saturday 16:41:56 (1s, 5s)
	// 2014-05-17 Saturday 16:42:01 (1s, 5s)
}
+
+func ExampleScheds() {
+	schedules.list()
+	// Output:
+	// longterm: [6h0m0s 24h0m0s 168h0m0s 672h0m0s 876000h0m0s]
+	// shortterm: [10m0s 2h0m0s 24h0m0s 168h0m0s 672h0m0s 876000h0m0s]
+	// test1: [24h0m0s 168h0m0s 672h0m0s 876000h0m0s]
+	// testing: [5s 20s 2m20s 4m40s 876000h0m0s]
+	// testing2: [5s 20s 40s 1m20s 876000h0m0s]
+}
@@ -9,6 +9,7 @@ import (
"fmt"
"io/ioutil"
"os"
"sort"
"strings"
"time"
)
@@ -91,8 +92,13 @@ func (schl scheduleList) addFromFile(file string) {
// list prints the stored schedules in the list
func (schl scheduleList) list() {
-	for name, sched := range schl {
-		fmt.Printf("%s: %s\n", name, sched)
+	// Map iteration order is randomized in Go, so sort the schedule
+	// names first to get stable, testable output.
+	var sKeys []string
+	for k := range schl {
+		sKeys = append(sKeys, k)
+	}
+	sort.Strings(sKeys)
+	for _, name := range sKeys {
+		fmt.Printf("%s: %s\n", name, schl[name])
	}
}