# Understanding Shared Storage

Why shared storage matters, how it works, and the design trade-offs.
## The Problem: Binary Duplication
In a typical multi-worker Concourse CI deployment, each worker unit independently downloads and stores the Concourse binaries:
- concourse binary: ~100 MB
- gdn (Garden runc): ~15 MB
- Configuration files: TSA keys, environment configs
For a deployment with 1 web server + 5 workers:
- Without shared storage: 6 units × 115 MB = 690 MB total disk usage
- With shared storage: 115 MB (~83% disk savings!)
More critically, upgrades become slow and wasteful. Each worker must independently download the new version, multiplying network bandwidth usage by the number of workers.
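The arithmetic above can be sanity-checked in a few lines (a throwaway sketch; the sizes are the approximate figures quoted above):

```python
# Disk-usage math for 1 web + 5 workers, using the approximate sizes above.
CONCOURSE_MB = 100                    # concourse binary
GDN_MB = 15                           # gdn (Garden runc) binary
PER_UNIT_MB = CONCOURSE_MB + GDN_MB   # 115 MB per unit

units = 6                             # 1 web + 5 workers
without_shared = units * PER_UNIT_MB  # every unit keeps its own copy
with_shared = PER_UNIT_MB             # one copy on the host, mounted everywhere
savings = 1 - with_shared / without_shared

print(without_shared, with_shared, round(savings * 100))  # 690 115 83
```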
## The Solution: LXC Disk Device Passthrough
Shared storage leverages LXC's disk device feature to mount a single host directory into multiple containers simultaneously. This is not NFS, not Ceph, not a distributed filesystem; it's simpler and faster.
### How LXC Disk Devices Work
When you run:

```shell
lxc config device add juju-abc123-0 concourse-shared disk \
  source=/tmp/concourse-shared \
  path=/var/lib/concourse \
  shift=true
```
LXC performs these operations:
- Bind mount: The host directory `/tmp/concourse-shared` is bind-mounted into the container at `/var/lib/concourse`
- UID/GID mapping (`shift=true`): Container UIDs are automatically mapped to avoid permission conflicts
- Shared access: Multiple containers can mount the same source directory simultaneously
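For several containers, the same device is attached once per container; here is a sketch that only builds the command shown above (the container names are illustrative, and actually running the command requires the `lxc` CLI on an LXD host):

```python
# Build the `lxc config device add` invocation from the example above.
SOURCE = "/tmp/concourse-shared"   # single host directory
TARGET = "/var/lib/concourse"      # mount point inside every container

def add_device_cmd(container: str) -> list[str]:
    """One shared-disk device attachment for one container."""
    return [
        "lxc", "config", "device", "add",
        container, "concourse-shared", "disk",
        f"source={SOURCE}", f"path={TARGET}", "shift=true",
    ]

cmd = add_device_cmd("juju-abc123-0")
# subprocess.run(cmd, check=True)  # would attach the device on a real LXD host
```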
## Architecture: Leader-Writes, Workers-Read
The charm implements a simple coordination model:
| Role | Responsibilities | Writes To |
|---|---|---|
| Web/Leader | Downloads binaries, writes keys, manages version marker | `/var/lib/concourse/bin/`, `/var/lib/concourse/keys/`, `/var/lib/concourse/.installed_version` |
| Workers | Read binaries, create isolated work directories | /var/lib/concourse/worker/<unit-name>/ |
### Why This Model Works
- No distributed locking needed: Only the leader writes to `bin/` and `keys/`
- Workers are isolated: Each worker creates its own subdirectory under `worker/`
- Version marker acts as signal: Workers check `.installed_version` to know the binaries are ready
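The version-marker signal amounts to a simple polling check on the worker side. A minimal sketch (the function names, paths, and timeout are illustrative, not the charm's actual API):

```python
import time
from pathlib import Path

def binaries_ready(shared: Path) -> bool:
    """Workers treat the version marker as the signal that the leader finished."""
    marker = shared / ".installed_version"
    binary = shared / "bin" / "concourse"
    return marker.exists() and binary.exists()

def wait_for_binaries(shared: Path, timeout: float = 120.0, poll: float = 2.0) -> str:
    """Poll until the leader has published binaries; return the installed version."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if binaries_ready(shared):
            return (shared / ".installed_version").read_text().strip()
        time.sleep(poll)
    raise TimeoutError("leader never published binaries")
```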
## Data Flow: Initial Deployment
```
┌─────────────────┐
│ Web/Leader Unit │
└────────┬────────┘
         │
         ├─ 1. Mount shared storage
         ├─ 2. Acquire exclusive lock (.install.lock)
         ├─ 3. Download binaries to bin/
         ├─ 4. Write .installed_version marker
         └─ 5. Start concourse-server.service

┌──────────────┐
│ Worker Units │  (added later)
└──────┬───────┘
       │
       ├─ 1. Mount shared storage (same volume)
       ├─ 2. Check .installed_version (exists!)
       ├─ 3. Verify binaries in bin/
       ├─ 4. Create worker/{unit}/ directory
       └─ 5. Start concourse-worker.service
```
Key observation: Workers don't download anything. They immediately find ready-to-use binaries and start within seconds.
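The leader's install steps can be sketched with a POSIX advisory lock on `.install.lock`; writing the version marker last means workers never see it before the binaries exist. (An illustrative sketch; the actual download step is elided and the function name is hypothetical.)

```python
import fcntl
from pathlib import Path

def install_binaries(shared: Path, version: str) -> None:
    """Leader-only: download binaries under an exclusive lock, then publish the marker."""
    lock_path = shared / ".install.lock"
    with open(lock_path, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)   # block until we hold the lock
        try:
            bin_dir = shared / "bin"
            bin_dir.mkdir(exist_ok=True)
            # ... download concourse + gdn into bin_dir here (elided) ...
            # Publish the marker last, so workers only see it once bin/ is complete.
            (shared / ".installed_version").write_text(version + "\n")
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```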
## Upgrade Coordination
Upgrades introduce a challenge: the leader must download new binaries while workers are using the old ones. The charm solves this with a coordinated upgrade protocol via Juju peer relations:
```
┌─────────────────┐
│ Web/Leader Unit │
└────────┬────────┘
         │
         ├─ 1. Set upgrade-state=prepare in peer relation
         ├─ 2. Wait for worker acknowledgments (2min timeout)
         ├─ 3. Acquire exclusive lock
         ├─ 4. Download new binaries
         ├─ 5. Write new .installed_version
         ├─ 6. Restart concourse-server.service
         └─ 7. Set upgrade-state=complete in peer relation

┌──────────────┐
│ Worker Units │
└──────┬───────┘
       │
       ├─ 1. Detect upgrade-state=prepare
       ├─ 2. Stop concourse-worker.service
       ├─ 3. Set upgrade-ready=true in peer relation
       ├─ 4. Poll for upgrade-state=complete (5min timeout)
       └─ 5. Start concourse-worker.service
```
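The worker side of this protocol can be sketched with the peer relation modelled as a plain dict (a toy sketch: the real charm reads and writes Juju relation data bags, and `stop`/`start` stand in for systemd service calls):

```python
import time

def worker_upgrade_step(relation: dict, stop, start, timeout: float = 300.0) -> None:
    """Worker side of the upgrade protocol: ack, wait for the leader, restart."""
    if relation.get("upgrade-state") != "prepare":
        return                             # no upgrade in progress
    stop()                                 # stop concourse-worker.service
    relation["upgrade-ready"] = "true"     # acknowledge to the leader
    deadline = time.monotonic() + timeout
    while relation.get("upgrade-state") != "complete":
        if time.monotonic() > deadline:
            raise TimeoutError("leader never completed the upgrade")
        time.sleep(0.05)                   # poll the peer relation
    start()                                # restart against the new binaries
```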
## Trade-offs: When Shared Storage Makes Sense
### ✅ Advantages
- Massive disk savings: ~83% reduction in a 6-unit deployment
- Fast upgrades: Download once, not N times
- Fast worker scaling: New workers start in seconds (no download)
- Simplified management: Single source of truth for binaries
### ❌ Disadvantages
- LXC-only: Requires LXC containers on the same host. Not suitable for:
- Bare metal deployments (no container isolation)
- Multi-host deployments (no shared filesystem between hosts)
- Kubernetes (different architecture)
- Manual setup required: Admin must run the `setup-shared-storage.sh` script to configure LXC devices
- Less flexibility: All workers must use the same Concourse version (can't have mixed versions)
- Upgrade downtime: Workers stop during upgrades (though this is typically brief)
### When to Use Shared Storage
| Scenario | Recommendation |
|---|---|
| Multi-worker LXD deployment (same host) | ✅ Highly Recommended |
| Frequent Concourse version upgrades | ✅ Recommended |
| Bare metal deployment | ❌ Not Supported |
| Multi-host deployment | ❌ Not Supported |
| Mixed Concourse versions needed | ❌ Use `mode=worker` without shared storage |
## Why Not NFS or Ceph?
You might wonder: why not use a "real" distributed filesystem like NFS or Ceph? Several reasons:
- Overkill for this use case: Shared storage is only beneficial when all units are on the same host. If units are on different hosts, there's no point; they can't share local disk anyway.
- Added complexity: NFS/Ceph require separate infrastructure, configuration, and maintenance.
- Performance overhead: Network filesystems introduce latency. For frequently-accessed binaries, this adds up.
- Single-host optimization: LXC disk devices are optimized for the single-host case (which is exactly our scenario).
## Permission Handling: The `shift=true` Parameter
A subtle but critical detail: LXC containers use UID/GID namespacing. A process running as UID 1000 inside the container is actually UID 101000 on the host (with default LXC mapping).
Without `shift=true`, you'd see permission errors:

```shell
# Inside container (UID 1000):
$ touch /var/lib/concourse/test.txt

# On host:
$ ls -l /tmp/concourse-shared/test.txt
-rw-r--r-- 1 101000 101000 0 Feb 4 10:00 test.txt
# UID mismatch causes permission errors when web unit tries to read
```
The `shift=true` parameter tells LXC to automatically remap UIDs/GIDs:
- Transparent mapping: Container sees UID 1000, host sees UID 101000
- No manual chown needed: LXC handles translation automatically
- Works across containers: Multiple containers with different UID namespaces can share files
Always use `shift=true` when setting up shared storage with LXC disk devices. The `setup-shared-storage.sh` script does this automatically.
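The unshifted mapping is just an offset into the host's subordinate UID range. A small illustration (100000 is the common default base from `/etc/subuid`; actual ranges vary per host):

```python
# Default LXD idmap arithmetic: without shift=true, a container UID shows up
# on the host offset by the map's base (100000 is a common default, not universal).
SUBUID_BASE = 100000

def container_to_host_uid(container_uid: int, base: int = SUBUID_BASE) -> int:
    """Host-side UID that a container UID appears as in an unshifted mount."""
    return base + container_uid

print(container_to_host_uid(1000))  # 101000, matching the ls -l output above
```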
## Related Topics
- Tutorial: Setting Up Shared Storage - Step-by-step setup guide
- How-To: How to Setup Shared Storage - Quick setup instructions
- Reference: Configuration Options - See the `shared-storage` option