resf-testing-repo/docs/BOOTSTRAP-APPROACH.md
Stephen Simpson bb829c9b63 updates
2025-11-26 08:15:00 -06:00


# Bootstrap Approach for Sparrowdo
## Overview
This framework bootstraps Sparrowdo **ONCE** in the golden image, not on every test VM. This removes a 5-10 minute bootstrap step from every test run.
## How It Works
### Traditional Approach (Slow) ❌
```
For each test:
1. Provision VM from golden image
2. Bootstrap Sparrowdo (5-10 minutes)
3. Run test
4. Destroy VM
```
**Problem**: Running 70 tests means bootstrapping 70 times. At 5-10 minutes each, that is roughly 6-12 hours wasted!
### Our Approach (Fast) ✅
```
Once per golden image:
1. Create golden image with Raku/zef installed
2. Boot temporary VM from golden image
3. Bootstrap Sparrowdo (5-10 minutes)
4. Shut down VM (changes saved to golden image)
For each test:
1. Provision VM from bootstrapped golden image
2. Run test immediately (no bootstrap needed!)
3. Destroy VM
```
**Benefit**: Bootstrap once, test 70 times = 5-10 minutes of bootstrap time in total!
## Implementation Details
### Step 1: Golden Image Preparation (Offline)
The `docs/default-prep.sh` script runs inside `virt-customize` (offline mode):
```bash
# Install Raku and zef package manager
dnf install -y rakudo rakudo-zef
# Create rocky user with sudo access
useradd -m rocky
echo "rocky ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers.d/rocky
```
**Why offline?** This is fast and requires no network/VM boot.
### Step 2: Bootstrap Sparrowdo (Online - Once)
The `scripts/bootstrap_golden.sh` script:
1. **Boots temporary VM** from golden image
2. **Runs `sparrowdo --bootstrap`** via SSH
3. **Shuts down VM cleanly** (changes persist in golden image)
4. **Cleans up VM definition**
```bash
./scripts/bootstrap_golden.sh /path/to/golden.qcow2 ~/.ssh/id_rsa
```
**What bootstrap installs:**
- Sparrowdo Raku module
- All Sparrowdo dependencies
- Testing utilities
- Configuration files
### Step 3: Test Execution (Fast)
Each test VM is a **linked clone** of the bootstrapped golden image:
```bash
# Provision takes < 1 second (copy-on-write)
VM_IP=$(./scripts/provision_vm.sh test-vm golden.qcow2)
# Run test immediately (no bootstrap!)
sparrowdo --host "$VM_IP" --ssh_user rocky --no_sudo --sparrowfile test.raku
```
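For reference, a linked clone of this kind is usually a qcow2 overlay whose backing file is the golden image. The sketch below assumes `provision_vm.sh` does something along these lines. The function name `create_linked_clone` and the `QEMU_IMG` override are illustrative and not taken from the repository.

```bash
# Sketch: create a copy-on-write overlay backed by the golden image.
# QEMU_IMG can be overridden (e.g. QEMU_IMG=echo) to preview the command.
QEMU_IMG="${QEMU_IMG:-qemu-img}"

create_linked_clone() {
    local golden="$1" overlay="$2"
    # -b names the backing file, -F declares the backing file's format
    "$QEMU_IMG" create -f qcow2 -F qcow2 -b "$golden" "$overlay"
}
```

Only blocks that differ from the golden image are written to the overlay, which is why provisioning is near-instant and each test VM stays small.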
## Manual Workflow
### First Time Setup
```bash
# 1. Create base golden image
./scripts/setup_base.sh \
/var/lib/libvirt/images/Rocky-9-GenericCloud-Base.qcow2 \
docs/default-prep.sh \
/var/lib/libvirt/images/golden-rocky9.qcow2 \
~/.ssh/id_rsa.pub
# 2. Bootstrap the golden image (one time, 5-10 minutes)
./scripts/bootstrap_golden.sh \
/var/lib/libvirt/images/golden-rocky9.qcow2 \
~/.ssh/id_rsa
```
### Running Tests
```bash
# No bootstrap needed! Just run tests directly
VM_IP=$(./scripts/provision_vm.sh test-vm-1 /var/lib/libvirt/images/golden-rocky9.qcow2)
# ... run sparrowdo tests against "$VM_IP" ...
./scripts/cleanup_vm.sh test-vm-1
```
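When scripting this flow, it helps to clean up the VM even if the test fails. The following is a minimal sketch under two assumptions: that `provision_vm.sh` prints the VM's IP address on stdout (as the `VM_IP=$(...)` usage in Step 3 suggests), and that the `sparrowdo` flags match the ones shown there. The function name `run_one_test` and the `PROVISION`, `CLEANUP`, and `SPARROWDO` variables are hypothetical.

```bash
# Sketch: run one test against a fresh linked clone and always clean up.
# PROVISION, CLEANUP and SPARROWDO are overridable so the flow can be dry-run.
PROVISION="${PROVISION:-./scripts/provision_vm.sh}"
CLEANUP="${CLEANUP:-./scripts/cleanup_vm.sh}"
SPARROWDO="${SPARROWDO:-sparrowdo}"

run_one_test() {
    local vm="$1" golden="$2" sparrowfile="$3" ip status=0
    # Assumption: the provision script prints the VM's IP address on stdout
    ip="$("$PROVISION" "$vm" "$golden")" || return 1
    "$SPARROWDO" --host "$ip" --ssh_user rocky --no_sudo \
        --sparrowfile "$sparrowfile" || status=$?
    # Clean up even when the test failed, then report the test's own status
    "$CLEANUP" "$vm"
    return "$status"
}
```

A call such as `run_one_test test-vm-1 /var/lib/libvirt/images/golden-rocky9.qcow2 test.raku` returns the exit status of the test itself, so a caller can still tell passes from failures.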
## Jenkins Pipeline Flow
The Jenkinsfile automatically handles bootstrap:
```groovy
// Simplified outline of the stages, not literal pipeline syntax
stage('Prepare Golden Image') {
    // Creates golden image with Raku/zef
    setup_base.sh -> golden.qcow2 (with Raku)
}
stage('Bootstrap Golden Image') {
    // Bootstraps Sparrowdo ONCE
    bootstrap_golden.sh -> golden.qcow2 (with Sparrowdo)
}
stage('Run Tests') {
    parallel {
        test1: provision -> run test -> cleanup
        test2: provision -> run test -> cleanup
        test3: provision -> run test -> cleanup
        // No bootstrap in any test!
    }
}
```
## Time Savings Example
### 70 Tests with Bootstrap-Per-Test
```
Bootstrap time: 7 minutes per test
Total bootstrap time: 70 × 7 = 490 minutes (8.2 hours)
Test time: 70 × 2 minutes = 140 minutes (2.3 hours)
TOTAL: 630 minutes (10.5 hours)
```
### 70 Tests with Bootstrap-Once
```
Bootstrap time: 7 minutes once
Total bootstrap time: 1 × 7 = 7 minutes
Test time: 70 × 2 minutes = 140 minutes (2.3 hours)
TOTAL: 147 minutes (2.5 hours)
```
**Savings: about 8 hours!** (483 of 630 minutes, a 77% reduction in total time when tests run sequentially)
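The arithmetic above can be reproduced for other suite sizes with a small helper. This is only a sketch: the function name `total_minutes` and its mode names are made up for illustration, and it assumes tests run one after another.

```bash
# Sketch: total wall-clock minutes for a sequential run.
# Mode "per-test" bootstraps every VM; mode "once" bootstraps the golden image only.
total_minutes() {
    local tests="$1" bootstrap_min="$2" test_min="$3" mode="$4"
    case "$mode" in
        per-test) echo $(( tests * (bootstrap_min + test_min) )) ;;
        once)     echo $(( bootstrap_min + tests * test_min )) ;;
        *)        echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

per_test=$(total_minutes 70 7 2 per-test)   # 630
once=$(total_minutes 70 7 2 once)           # 147
saved=$(( per_test - once ))
# Integer percentage, rounded to the nearest whole number
echo "saved: $saved minutes ($(( (100 * saved + per_test / 2) / per_test ))%)"
# prints "saved: 483 minutes (77%)"
```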
## Disk Space Considerations
### Without Linked Clones
```
Base image: 2 GB
Golden image: 2 GB
Test VMs: 70 × 2 GB = 140 GB
TOTAL: 144 GB
```
### With Linked Clones (Our Approach)
```
Base image: 2 GB (cached, reused)
Golden image: 2.5 GB (with Sparrowdo)
Test VMs: 70 × ~100 MB = 7 GB (only diffs stored)
TOTAL: 11.5 GB
```
**Savings: 132 GB!** (92% reduction in disk usage)
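The same estimate can be scripted for a different number of VMs or a different per-VM diff size. In this sketch the helper name `disk_usage_mb` is hypothetical, and 1 GB is treated as 1000 MB to match the figures above.

```bash
# Sketch: estimated disk usage in MB (1 GB treated as 1000 MB).
disk_usage_mb() {
    local base_mb="$1" golden_mb="$2" vms="$3" per_vm_mb="$4"
    echo $(( base_mb + golden_mb + vms * per_vm_mb ))
}

full=$(disk_usage_mb 2000 2000 70 2000)    # full copies:   144000 MB
linked=$(disk_usage_mb 2000 2500 70 100)   # linked clones:  11500 MB
echo "saved: $(( full - linked )) MB"      # prints "saved: 132500 MB"
```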
## Updating the Golden Image
If you need to update Sparrowdo or dependencies:
```bash
# Option 1: Rebuild from scratch
rm -f /var/lib/libvirt/images/golden-rocky9.qcow2
./scripts/setup_base.sh ... # Create fresh
./scripts/bootstrap_golden.sh ... # Bootstrap fresh
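# Option 2: Update existing golden image
# Boot a VM from the golden image, update packages, shut down
VM_IP=$(./scripts/provision_vm.sh update-vm golden-rocky9.qcow2)
ssh rocky@"$VM_IP" 'zef upgrade Sparrowdo'
ssh rocky@"$VM_IP" 'sudo shutdown -h now'
# Changes persist only if the VM wrote to the golden image itself.
# If provision_vm.sh created a linked clone, commit the overlay back
# (for example with `qemu-img commit`) or rerun bootstrap_golden.sh instead.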
```
## Troubleshooting
### Bootstrap Script Hangs
```bash
# Check if VM is running
virsh -c qemu:///system list | grep bootstrap
# Connect to VM console
virsh -c qemu:///system console bootstrap-golden-XXXXX
# Check bootstrap logs on VM
ssh rocky@VM_IP 'cat ~/.zef/*log'
```
### Test Fails with "Sparrowdo not found"
```bash
# Verify Sparrowdo is in golden image
ssh rocky@GOLDEN_VM_IP 'which sparrowdo'
ssh rocky@GOLDEN_VM_IP 'sparrowdo --version'
# If missing, re-run bootstrap
./scripts/bootstrap_golden.sh /path/to/golden.qcow2 ~/.ssh/id_rsa
```
### Bootstrap Fails on First Try
```bash
# Common issue: Network not ready during bootstrap
# Solution: Increase wait time in bootstrap script
# Or manually retry bootstrap command
sparrowdo --host VM_IP --ssh_user rocky --bootstrap --color
```
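If the failure is transient, the manual retry can be wrapped in a small loop. This is a generic sketch and not part of the repository's scripts. The `retry` helper, the attempt count, and the delay are all illustrative.

```bash
# Sketch: retry a command a fixed number of times with a pause between attempts.
retry() {
    local attempts="$1" delay="$2" n=1
    shift 2
    until "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        n=$(( n + 1 ))
        sleep "$delay"
    done
}

# Example (VM_IP is a placeholder, as above):
# retry 5 30 sparrowdo --host VM_IP --ssh_user rocky --bootstrap --color
```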
## Best Practices
1. **Cache base images** - Reuse downloaded QCOW2 files
2. **Bootstrap once per Rocky version** - Create golden-rocky8.qcow2, golden-rocky9.qcow2, etc.
3. **Version your golden images** - Use timestamps: golden-rocky9-20250125.qcow2
4. **Test golden images** - Always verify bootstrap succeeded before running full test suite
5. **Update periodically** - Rebuild golden images monthly to get security updates
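For practice 3, a dated filename can be generated rather than typed by hand. The helper name `golden_image_name` is hypothetical. It follows the `golden-rocky9-20250125.qcow2` pattern shown above and defaults to today's date.

```bash
# Sketch: build a dated golden image filename such as golden-rocky9-20250125.qcow2.
golden_image_name() {
    local version="$1" stamp="${2:-$(date +%Y%m%d)}"
    echo "golden-rocky${version}-${stamp}.qcow2"
}

golden_image_name 9 20250125   # prints "golden-rocky9-20250125.qcow2"
```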
## Security Considerations
### SSH Keys
- The golden image contains the injected SSH public key
- Anyone with the matching private key can SSH into any VM created from this golden image
- **Recommendation**: Use dedicated testing SSH keys, not personal keys
### Passwords
- Rocky user password: `rockypass` (change in prep script if needed)
- Root password: `rockytesting`
- **Recommendation**: Disable password auth, use keys only
### Sudo Access
- Rocky user has NOPASSWD sudo (required for bootstrap)
- **Recommendation**: Only use these VMs in isolated test networks
## Advanced: Pre-cached Test Dependencies
You can extend the bootstrap to pre-install common test dependencies:
```bash
# In bootstrap_golden.sh, after sparrowdo --bootstrap:
ssh rocky@$VM_IP 'zef install Test::Class'
ssh rocky@$VM_IP 'zef install JSON::Fast'
ssh rocky@$VM_IP 'sudo dnf install -y postgresql-server'
```
This makes tests even faster by eliminating package install time during tests.
## Monitoring Bootstrap Success
The bootstrap script outputs:
```
[1/4] Provisioning temporary VM...
[2/4] Waiting for SSH to be ready...
[3/4] Running Sparrowdo bootstrap...
[4/4] Shutting down VM to save changes...
```
If any step fails, the golden image is NOT bootstrapped. Check logs and retry.
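When the output has been saved to a file, the last progress marker shows where a failed run stopped. The sketch below assumes the `[n/4]` markers appear at the start of a line, as shown above. The log file and the helper name `last_bootstrap_step` are hypothetical.

```bash
# Sketch: report the last "[n/4]" progress marker found in a saved bootstrap log.
last_bootstrap_step() {
    grep -Eo '^\[[0-9]+/4\]' "$1" | tail -n 1
}
```

Reaching `[4/4]` only means the shutdown step started, so the script's exit status remains the authoritative success signal.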
## Integration with CI/CD
### Nightly Golden Image Rebuild
```bash
# Cron job to rebuild golden images nightly
0 2 * * * cd /path/to/repo && ./scripts/rebuild-golden.sh
```
### Pre-commit Hook to Validate Tests
```bash
# .git/hooks/pre-commit
./scripts/validate-tests.sh
# Provisions temp VM, runs one test, destroys
```
### Jenkins Scheduled Build
```groovy
// Rebuild golden images weekly
cron('H 2 * * 0') // Sundays, at a hashed minute within the 2 AM hour
```
## Conclusion
Bootstrapping the golden image once provides:
- **Much faster test runs** (roughly 4x in the 70-test example above, since no test pays the bootstrap cost)
- **About 90% less disk usage** (linked clones vs full copies)
- **Simpler test scripts** (no bootstrap logic needed)
- **Better reliability** (a bootstrap failure surfaces once, when the golden image is built, instead of in individual tests)
This approach is essential for running large test suites efficiently!