fabsync

This is a file-syncing tool for Fabric. It’s almost as straightforward to use as rsync, but with some of the features that will be familiar from deployment automation tools like Ansible.

Source files are kept in a local file tree—presumably under source control—in the same shape as the destination. The root of your local tree doesn’t have to correspond to the root directory (/) on the server, although that’s the most straightforward.

In addition to simply uploading files and directories with Fabric, fabsync has a mechanism to configure metadata for the files, which comes in handy for specifying ownership and permissions. You can also override the remote name, for example to upload ‘dot-local’ as ‘.local’. And there’s a mechanism to pass a file’s content through a rendering function, giving you a hook to integrate a template system or any other kind of transformation you would like.

Locally, fabsync requires Python 3.8 or later. It’s designed to work with pretty much any POSIX remote host, using standard tools like sh, getent and openssl. If you find a BSD, GNU, or similar system that doesn’t have the tools we expect, reach out or send a patch.

The reference page has an API reference, including a fully annotated sample config. This page has a step-by-step guide to the available features.

Start

We’ll start with the simplest possible Fabric task using fabsync. Suppose your repo looks like this:

.
├── files
│   └── etc
│       └── lighttpd.conf
└── fabfile.py

fabfile.py
from fabric import task
import fabsync

@task
def sync(conn):
    root = fabsync.load('files', '/')

    for result in fabsync.isync(conn, root):
        print(f"{result.path}{' [modified]' if result.modified else ''}")

Running this task will be roughly equivalent to running:

rsync -rv files/ host:/

Metadata

This isn’t very useful yet, so let’s add some metadata.

.
├── files
│   ├── etc
│   │   └── lighttpd.conf
│   └── var
│       └── www
│           └── _sync.toml
└── fabfile.py

files/var/www/_sync.toml
user = 'root'
group = 'www'
perms = 0o750

_sync.toml has configuration for its parent directory and all of its files. For the moment, we’re only configuring /var/www with ownership and permissions.

user and group values can be either strings or numeric uid/gid values. perms is an integer, usually expressed in octal. All three can be -1 (the default), which means to leave it as is.
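
To double-check an octal perms value before committing it, Python’s standard stat module can render the mode the way ls -l would. This is only a local sanity check and not part of fabsync; the helper name describe_perms is made up for this sketch.

```python
import stat

def describe_perms(perms: int, is_dir: bool = False) -> str:
    """Render permission bits the way `ls -l` shows them."""
    # stat.filemode() needs the file-type bits as well as the permission bits.
    file_type = stat.S_IFDIR if is_dir else stat.S_IFREG
    return stat.filemode(file_type | perms)

print(describe_perms(0o750, is_dir=True))  # the /var/www directory above
print(describe_perms(0o640))               # a typical config file
```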

Files

Each _sync.toml is also used to configure the files in the same directory. A file’s config goes in files.<filename>, using the name in the local file tree, not the name of the remote file. Usually these are the same, but you can override the remote name, for example to make dot-files easier to manage.

.
├── files
│   └── home
│       └── hg
│           ├── _sync.toml
│           └── dot-hgrc
└── fabfile.py

files/home/hg/_sync.toml
# Options for /home/hg
user = 'hg'
group = 'hg'
perms = 0o755

# Options for /home/hg/.hgrc
[files.'dot-hgrc']
name = '.hgrc'
user = 'hg'
group = 'hg'
perms = 0o640

Defaults

The previous example has some duplication in it, which we can resolve. Every _sync.toml file can have a [defaults] section, which applies to the current directory and everything underneath it (recursively). As the name implies, these are just defaults, which can be overridden by options further down the hierarchy.

.
├── files
│   └── home
│       └── hg
│           ├── _sync.toml
│           └── dot-hgrc
└── fabfile.py

files/home/hg/_sync.toml
# Options for /home/hg
perms = 0o755

# Defaults for /home/hg and everything under it
[defaults]
user = 'hg'
group = 'hg'
dir_perms = 0o750
file_perms = 0o640

# Options for /home/hg/.hgrc
[files.'dot-hgrc']
name = '.hgrc'
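
One way to read the example above is as a lookup with fallbacks: an explicit option wins, otherwise the nearest [defaults] value applies, with dir_perms used for directories and file_perms for files. The sketch below models that reading in plain Python. The resolve function and the dict layout are invented for illustration and say nothing about how fabsync implements this internally.

```python
from collections import ChainMap

def resolve(explicit: dict, defaults: ChainMap, is_dir: bool) -> dict:
    """Hypothetical model: explicit options win, then inherited defaults, then -1."""
    perms_default = defaults.get("dir_perms" if is_dir else "file_perms", -1)
    return {
        "user": explicit.get("user", defaults.get("user", -1)),
        "group": explicit.get("group", defaults.get("group", -1)),
        "perms": explicit.get("perms", perms_default),
    }

# The [defaults] section from files/home/hg/_sync.toml, written out by hand.
# A nested directory could layer its own defaults with defaults.new_child({...}).
defaults = ChainMap({"user": "hg", "group": "hg", "dir_perms": 0o750, "file_perms": 0o640})

home_hg = resolve({"perms": 0o755}, defaults, is_dir=True)  # explicit perms win
hgrc = resolve({}, defaults, is_dir=False)                  # falls back to file_perms
```

Here home_hg keeps its explicit 0o755 while hgrc picks up 0o640, and both inherit hg:hg ownership.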

Selection

In many cases, it’s sensible and safe to just sync your entire tree to apply any changes. If you want to save a little time or just be absolutely sure that you’re only touching one thing, the isync() API takes an optional ItemSelector argument to filter the items.

ItemSelector can limit the sync to a specific subtree and/or to a set of tags. We’ll update our task to support these. While we’re at it, we’ll add support for isync()’s dry_run parameter.

fabfile.py
from fabric import task
import fabsync

@task(iterable=['tag'])
def sync(conn, subpath=None, tag=(), dry_run=False):
    root = fabsync.load('files', '/')
    selector = fabsync.ItemSelector.new(subpath, tag)

    for result in fabsync.isync(conn, root, selector, dry_run=dry_run):
        print(f"{result.path}{' [modified]' if result.modified else ''}")

Tags are assigned to directories and files in _sync.toml. They can also be added to [defaults] sections to apply to a whole subtree. Tags accumulate, so defining tags at a given level adds them to any existing tags that were inherited. A tag can be removed by prepending a hyphen. Adding '-' as a tag will reset, removing all inherited tags.
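
The accumulation rules can be modeled in a few lines of plain Python. This is a sketch of the behavior described above, not fabsync’s own code: merge_tags is a hypothetical name, and processing the declared tags in list order is an assumption.

```python
from typing import Iterable, Set

def merge_tags(inherited: Iterable[str], declared: Iterable[str]) -> Set[str]:
    """Hypothetical model of the tag rules: add, remove with '-name', reset with '-'."""
    tags = set(inherited)
    for tag in declared:
        if tag == "-":
            tags.clear()            # reset: drop everything inherited so far
        elif tag.startswith("-"):
            tags.discard(tag[1:])   # remove a single inherited tag
        else:
            tags.add(tag)           # accumulate
    return tags

print(sorted(merge_tags({"www"}, ["hg"])))
print(sorted(merge_tags({"www", "hg"}, ["-www"])))
print(sorted(merge_tags({"www", "hg"}, ["-", "db"])))
```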

.
├── files
│   ├── etc
│   │   ├── _sync.toml
│   │   └── lighttpd.conf
│   ├── var
│   │   └── www
│   │       └── _sync.toml
│   └── home
│       └── hg
│           ├── _sync.toml
│           └── dot-hgrc
└── fabfile.py

files/etc/_sync.toml
[files.'lighttpd.conf']
tags = ['www']

files/var/www/_sync.toml
[defaults]
user = 'root'
group = 'www'
file_perms = 0o640
dir_perms = 0o750
tags = ['www']

files/home/hg/_sync.toml
perms = 0o755

[defaults]
user = 'hg'
group = 'hg'
dir_perms = 0o750
file_perms = 0o640
tags = ['hg']

[files.'dot-hgrc']
name = '.hgrc'

To just sync /home/hg, you might run either:

fab sync --subpath home/hg

or:

fab sync --tag hg

By default, ItemSelector will automatically select the parents of all selected items. In this example, neither the subpath nor the tag matches /home, but because /home/hg is synced, /home is as well. You can disable this behavior with ItemSelector.new(..., with_parents=False).
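
To picture what selecting the parents means, here is a small stand-alone sketch that expands a set of selected paths with all of their ancestors. It only illustrates the idea with pathlib; add_parents is a hypothetical helper, not a fabsync function.

```python
from pathlib import PurePosixPath
from typing import Iterable, Set

def add_parents(selected: Iterable[str]) -> Set[PurePosixPath]:
    """Return the selected paths plus every ancestor directory."""
    result = set()
    for item in selected:
        path = PurePosixPath(item)
        result.add(path)
        # .parents yields every ancestor, ending with '.' for a relative path.
        result.update(p for p in path.parents if p != PurePosixPath("."))
    return result

print(sorted(str(p) for p in add_parents(["home/hg"])))
```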

Rendering

Some files need to be rendered at sync time, perhaps to customize them for a specific host or to embed secrets from outside source control. To this end, any file can be configured with a renderer, which is simply a function that takes the Path of the source file plus any configured render vars and returns a new str or bytes with the final contents. The default (trivial) renderer looks like this:

def renderer(path: Path, _vars: Mapping[str, Any], **kwargs) -> bytes:
    with path.open('rb') as f:
        return f.read()

Custom renderers can be hooked up to a template engine, Python string formatting, or any other transformation that you want. If a string is returned, it will be encoded as utf-8 for uploading.

Note that all renderers should include **kwargs in their argument list for forward compatibility. As of version 1.2, legacy renderers with just the two positional arguments are supported with a deprecation warning.
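
As a minimal sketch of a custom renderer, the function below fills $name placeholders using string.Template from the standard library. The name template_renderer and the sample file content are made up for this example; only the (path, vars, **kwargs) signature comes from the description above.

```python
import tempfile
from pathlib import Path
from string import Template
from typing import Any, Mapping

def template_renderer(path: Path, vars: Mapping[str, Any], **kwargs) -> str:
    """Fill $name placeholders in the source file from the render vars."""
    text = path.read_text(encoding="utf-8")
    # substitute() raises KeyError on a missing var rather than leaving the
    # placeholder in place.
    return Template(text).substitute(vars)

# Try it locally against a throwaway source file.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "aliases"
    source.write_text("postmaster: $postmaster\n", encoding="utf-8")
    rendered = template_renderer(source, {"postmaster": "alice@example.com"})

print(rendered, end="")
```

Because it returns a str, the result would be encoded as utf-8 on upload.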

In _sync.toml, the renderer is specified as an arbitrary string. At sync time, you need to provide a mapping from these strings to the functions that implement them. (Renderer names beginning with 'fabsync/' are reserved and cannot be registered.) The renderer key is valid for individual files and also in the [defaults] section. You can also supply a mapping of arbitrary values to parameterize the render function.

.
├── files
│   └── etc
│       ├── _sync.toml
│       └── aliases
└── fabfile.py

files/etc/_sync.toml
[files.'aliases']
renderer = 'mako'
vars = {'postmaster' = 'alice@example.com'}

Individual vars can be overridden at each configuration level. Values are not merged recursively.
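
In other words, the override is shallow. A rough model in plain Python, assuming each level simply replaces top-level keys (merge_vars and the sample values are hypothetical):

```python
def merge_vars(*levels: dict) -> dict:
    """Hypothetical model: later levels replace earlier keys wholesale (no deep merge)."""
    merged = {}
    for level in levels:
        merged.update(level)
    return merged

directory_defaults = {"postmaster": "root@example.com", "smtp": {"host": "mail", "port": 25}}
file_vars = {"smtp": {"host": "relay"}}

merged = merge_vars(directory_defaults, file_vars)
# 'smtp' now comes entirely from file_vars, so 'port' is gone rather than inherited.
print(merged)
```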

It’s likely that you’ll want to load a render context once at the beginning of the sync operation and reuse it for each file. Here’s an example of what this might look like:

fabfile.py
import io
from pathlib import Path
import tomli
from typing import Mapping, Any
from fabric import task
from mako.template import Template
import fabsync

def mako_renderer(conn):
    # Load some host-specific template context.
    result = conn.get('/usr/local/etc/fabsync.toml', io.BytesIO())
    host = tomli.loads(result.local.getvalue().decode())

    def render(path: Path, vars: Mapping[str, Any], **kwargs) -> str:
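        # Mako takes template variables as keyword arguments; per-file vars
        # override host values. The {**a, **b} merge also works on Python 3.8.
        return Template(filename=str(path)).render(**{**host, **vars})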

    return render

@task(iterable=['tag'])
def sync(conn, subpath=None, tag=(), dry_run=False):
    root = fabsync.load('files', '/')
    selector = fabsync.ItemSelector.new(subpath, tag)
    renderers = {'mako': mako_renderer(conn)}

    for result in fabsync.isync(conn, root, selector, renderers, dry_run=dry_run):
        print(f"{result.path}{' [modified]' if result.modified else ''}")

Advanced Rendering

New in version 1.2.

Render functions are given one additional keyword argument: get_content. This is a thunk (a zero-argument function) that will return the current (remote) content of the file as a bytes object. This can be used by renderers that wish to inspect and modify an existing file rather than simply create/overwrite it. In this case, the source file could contain information you wish to merge into the target or it might simply be an empty placeholder file to trigger the renderer.
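
As a sketch of the merge case, the renderer below appends any line from the source file that the existing content lacks and leaves everything else alone. The function name ensure_lines and the sample data are invented here, and the stub lambda stands in for the get_content thunk that would normally be supplied at sync time.

```python
import tempfile
from pathlib import Path
from typing import Any, Callable, Mapping

def ensure_lines(path: Path, _vars: Mapping[str, Any],
                 get_content: Callable[[], bytes], **kwargs) -> bytes:
    """Append each non-empty source line that the current content is missing."""
    current = get_content()
    existing = set(current.splitlines())
    missing = [line for line in path.read_bytes().splitlines()
               if line and line not in existing]
    if not missing:
        return current  # nothing to add, so the content is unchanged
    if current and not current.endswith(b"\n"):
        current += b"\n"
    return current + b"\n".join(missing) + b"\n"

# Exercise it locally with a throwaway source file and a stubbed thunk.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "hosts"
    source.write_bytes(b"10.0.0.5 db\n10.0.0.6 cache\n")
    merged = ensure_lines(
        source, {}, get_content=lambda: b"127.0.0.1 localhost\n10.0.0.5 db\n"
    )

print(merged.decode(), end="")
```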

As a convenience, there is a special builtin renderer called fabsync/py that will load a source file as a Python module, look for a function named render, and call it as the render function. For example:

_sync.toml
[files.'rc.conf.py']
name = 'rc.conf'
renderer = 'fabsync/py'

rc.conf.py
import re

def render(_src, _vars, get_content, **kwargs) -> bytes:
    content = get_content()

    content = re.sub(rb'^pf_enable=.*$', b'pf_enable="YES"', content, flags=re.M)
    content = re.sub(rb'^jail_enable=.*$', b'jail_enable="YES"', content, flags=re.M)
    content = re.sub(rb'^sendmail_enable=.*$', b'sendmail_enable="NO"', content, flags=re.M)

    return content

Diffs

By default, any time we decide to upload a file, we generate a diff of the original and uploaded content. This is included in the SyncResult object returned by isync(). This is particularly useful when combined with the dry_run parameter:

@task(iterable=['tag'], incrementable=['verbose'])
def sync(conn, subpath=None, tag=(), dry_run=False, verbose=0):
    root = fabsync.load('files', '/')
    selector = fabsync.ItemSelector.new(subpath, tag)

    for result in fabsync.isync(conn, root, selector, dry_run=dry_run):
        print(f"{result.path}{' [modified]' if result.modified else ''}")
        if verbose > 0 and result.diff:
            print(result.diff.decode())

Although the diff is provided as a bytes object, it is a standard unified diff similar to what you would get from /usr/bin/diff or a version control system. We use difflib.diff_bytes() to avoid any unnecessary assumptions about file encodings.
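
For reference, this is roughly how a bytes-level unified diff can be produced with the standard library. It is a sketch of the technique, not fabsync’s actual invocation; the helper name, the sample content and the "(remote)"/"(rendered)" labels are all assumptions.

```python
import difflib

def unified_diff_bytes(old: bytes, new: bytes, name: bytes) -> bytes:
    """Build a unified diff without decoding either side."""
    diff = difflib.diff_bytes(
        difflib.unified_diff,
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=name + b" (remote)",
        tofile=name + b" (rendered)",
    )
    return b"".join(diff)

diff = unified_diff_bytes(
    b'pf_enable="NO"\nsshd_enable="YES"\n',
    b'pf_enable="YES"\nsshd_enable="YES"\n',
    b"/etc/rc.conf",
)
print(diff.decode(), end="")
```

Identical inputs produce an empty diff.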

If you have any files that should not be diffed—perhaps because they are not text files—you can set diff = false as a file or default option in _sync.toml.

Inspection

fabsync.files has a few additional convenience functions for inspecting your configuration. The most important is fabsync.files.table(), which will generate human-readable table rows describing your file tree. Your corresponding invoke task might look something like this:

import fabsync
from invoke import task
from prettytable import PrettyTable

@task(iterable=['tag'])
def table(c, subpath=None, tag=None):
    root = fabsync.files.load('files', '/')
    selector = fabsync.ItemSelector.new(subpath, tag)
    items = fabsync.files.select(root, selector)
    rows = fabsync.files.table(items)

    table = PrettyTable()
    table.align = 'l'
    table.field_names = next(rows)
    table.add_rows(rows)

    print(table)

Of course, you could also dump it to JSON and inspect it with VisiData, or anything else you like.
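
A JSON dump might look like the sketch below. It relies only on what the task above already assumes, namely that the first row holds the field names. The rows_to_json helper and the sample rows are made up; the real columns come from fabsync.files.table().

```python
import json
from typing import Iterable, Sequence

def rows_to_json(rows: Iterable[Sequence]) -> str:
    """Turn header-first table rows into a JSON array of objects."""
    rows = iter(rows)
    header = next(rows)
    records = [dict(zip(header, row)) for row in rows]
    # default=str covers values json cannot serialize natively, such as paths.
    return json.dumps(records, indent=2, default=str)

# Made-up rows for demonstration only.
sample_rows = iter([
    ("path", "user", "group"),
    ("/home/hg", "hg", "hg"),
    ("/home/hg/.hgrc", "hg", "hg"),
])
dumped = rows_to_json(sample_rows)
print(dumped)
```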

The API reference below documents a few other functions for extracting information from your sources.

Tips and tricks

Shared templates

Combining a few of the above features, you can sync a single template to multiple paths with different render vars. This example has config files for two instances of an application. myapp.cfg is the shared template, which is not synced. The actual files are local symlinks, which get rendered with different parameters.

files
└── usr
    └── local
        └── etc
            └── myapp
                ├── _sync.toml
                ├── myapp.cfg
                ├── myapp1.cfg -> myapp.cfg
                └── myapp2.cfg -> myapp.cfg

usr/local/etc/myapp/_sync.toml
[defaults]
renderer = 'mako'

# This is just the template.
[files.'myapp.cfg']
ignore = true

[files.'myapp1.cfg'.vars]
db.name = 'myapp1'
http.port = 8000

[files.'myapp2.cfg'.vars]
db.name = 'myapp2'
http.port = 8001
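
To see the effect without a template engine, the sketch below renders one template twice with the two sets of vars from the config above. It uses str.format as a stand-in for Mako, and the template text itself is hypothetical. Note that TOML dotted keys such as db.name parse into nested tables, hence the nested dicts.

```python
# Hypothetical template content; the real myapp.cfg would use Mako syntax.
template = "[db]\nname = {db[name]}\n\n[http]\nport = {http[port]}\n"

# The vars tables from _sync.toml, written out as the nested dicts they parse to.
instances = {
    "myapp1.cfg": {"db": {"name": "myapp1"}, "http": {"port": 8000}},
    "myapp2.cfg": {"db": {"name": "myapp2"}, "http": {"port": 8001}},
}

# One template, one rendered result per symlinked file name.
rendered = {name: template.format(**vars_) for name, vars_ in instances.items()}

for name, content in rendered.items():
    print(f"# {name}\n{content}")
```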

Looping

Because items are mapped one to one from source to destination, there’s no direct way to loop over a data structure and use that to create a file list. If you’re doing this a lot, you might want to ask whether fabsync is the right tool for you.

That said, I’ll reiterate one of the fundamental principles of fabsync: it’s a library, not a framework. You can call isync() as many times as you want in any way you want to accomplish the task at hand. You can have multiple source trees, one of which you sync repeatedly with different destinations and/or render vars. You could generate a source tree into a temp directory and sync that.
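
The temp-directory approach could be sketched like this. The vhosts data, the file layout and the write_tree helper are all hypothetical; the fabsync calls are left as comments because they need a live connection, and they mirror the earlier examples.

```python
import tempfile
from pathlib import Path
from typing import Mapping

def write_tree(root: Path, files: Mapping[str, str]) -> None:
    """Write a {relative path: content} mapping under root."""
    for relative, content in files.items():
        target = root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")

# Hypothetical data structure to loop over.
vhosts = {"example.com": 8000, "example.org": 8001}
files = {
    f"etc/vhosts/{host}.conf": f"server {host} port {port}\n"
    for host, port in vhosts.items()
}

with tempfile.TemporaryDirectory() as tmp:
    write_tree(Path(tmp), files)
    generated = sorted(
        p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("*") if p.is_file()
    )
    # Inside a Fabric task, the generated tree would be synced here:
    #   root = fabsync.load(tmp, '/')
    #   for result in fabsync.isync(conn, root): ...

print(generated)
```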

And, of course, you can always just use your Fabric connection to manipulate the remote host directly. fabsync is a tool to be used where it’s appropriate or convenient, and ignored otherwise.

Local operation

Everything so far has been about syncing files to remote hosts, which is the primary intention of fabsync. However, if you pass an invoke.Context to the sync API instead of a fabric.Connection, it will manipulate the local filesystem.