
Setting Up CI/CD When You Have None: A Practical Guide

Arseniy Potapov · 10 min read

A practical guide to CI/CD from zero. Three steps from 'I deploy by SSHing into the server' to automated tests and deploys - each pays for itself immediately.

Friday, 4 PM. A data pipeline is dropping records in production. No CI/CD, no deployment automation - just me and SSH. I log into the server, find the config pointing at the wrong database replica, fix it, restart the workers. Everything recovers. I go home.

Monday morning, a teammate deploys their branch. They run git pull origin main and then the restart script. My fix is gone - it was never committed. The pipeline starts dropping records again. We spend two hours figuring out why a "fixed" bug is back before someone checks the server and sees the config file reverted.

This was 2022, on a system processing compliance data for 50+ jurisdictions. No CI. No CD. No staging environment. The deployment process was a wiki page titled "How to Deploy" with six steps, two of which were wrong.

If that sounds familiar, here's the good news: getting from there to "PRs run tests and deploy automatically" isn't a month-long project. It's three small steps that each take less than a day, and each one independently makes your life better.

Three Steps, Each Worth It Alone

Most CI/CD guides show you the end state - a sophisticated pipeline with Docker builds, Kubernetes rolling deploys, canary releases, and Slack notifications. That's the destination, not the starting point.

If you're deploying by SSH, you don't need Kubernetes. You need three things:

Step 1: Tests run on every pull request. 30 minutes to set up.

Step 2: Merging to main auto-deploys to staging. An afternoon.

Step 3: Production deploys are automated with rollback. A day.

Each step is independently valuable. I've worked on projects that stayed at Step 1 for months and that was fine. The team was already dramatically better off.

Step 1: Tests on Every PR

This is the single highest-ROI investment in your development workflow. Without it, code review is the only thing between a broken commit and production. And code review is fallible - people get tired, PRs pile up on Friday afternoons, someone approves a "small fix" without running it locally. Automated tests don't get tired.

If you have tests (and if you don't, start here), making them run automatically on every PR takes 30 minutes.

Here's a complete GitHub Actions workflow for a Python project:

name: Tests

on:
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'pip'
      - run: pip install -r requirements.txt
      - run: pytest tests/ -v --tb=short

That's 15 lines. Save it as .github/workflows/test.yml, push, and you have CI. Every pull request now runs your tests and shows a green check or red X before anyone reviews the code.

GitHub Actions is free for public repos and gives private repos 2,000 minutes per month on the free tier. For most small teams, that's more than enough. A typical test run on a Python project takes 1-3 minutes, so you'd need to open over 600 PRs a month to hit the limit.

The second half of Step 1: go to your repo's Settings, then Branches, then Branch protection rules. Require the status check to pass before merging. Now it's not just informational - it's a gate. Broken code physically can't reach your main branch.

For Node.js, swap the Python steps:

      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm test

What to add over the next few weeks, when you feel ready:

  • Linting. Add ruff check . (Python) or npx eslint . (JS) as another step. Catches style issues and common bugs without human review time.
  • Type checking. mypy src/ or npx tsc --noEmit. A different class of bugs than tests catch.
  • Coverage gate. pytest --cov=src --cov-fail-under=60. Not 80%. Not 90%. Whatever your current coverage is. The gate's job is to stop the number from going down. All three additions are sketched as workflow steps below.
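
Here's a sketch of how those three might slot into the workflow above, as extra steps after the pip install line. It assumes ruff, mypy, and pytest-cov are already in your requirements file; adjust the paths and the coverage number to your project.

      # Style and common-bug checks - no human review time spent
      - run: ruff check .
      # Type checking catches a different class of bugs than tests do
      - run: mypy src/
      # Replaces the plain pytest step; 60 is a placeholder - use your current number
      - run: pytest tests/ -v --tb=short --cov=src --cov-fail-under=60

Each check fails independently, so a red X tells you right away which kind of problem you introduced.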

The moment that sells CI to your whole team: the first PR where tests catch a real bug before merge. Someone pushes a change, the pipeline goes red, they look at the failure, and realize that would have broken production. After that happens once, nobody wants to merge without green checks.

This is also how you make refactoring safe at scale. When every change runs through automated tests, you can refactor aggressively without the fear that you'll break something nobody catches until production.

Step 2: Auto-Deploy to Staging

Once tests pass on main, the next step is making deploys automatic. Not to production - to a staging environment where someone can look at the result before it goes live.

The trigger: when code merges to main (or develop, depending on your branch strategy), build and deploy to staging automatically.

For a static site or SPA, this is trivially simple. My portfolio site deploys to Cloudflare Pages on every push - checkout, npm run build, push the output directory. Cloudflare Pages also gives you preview deployments on every PR for free. Every branch gets its own URL.

For a backend API, the typical pattern is: build a Docker image tagged with the commit SHA, push it to a container registry (GitHub Container Registry is free for public repos), and deploy to your staging environment.
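
A minimal sketch of that pattern, using plain docker commands and GitHub Container Registry. The image name and the last step are placeholders - how you actually tell the staging host to pull and restart depends on your setup (SSH, a webhook, a PaaS CLI):

name: Deploy to staging

on:
  push:
    branches: [main]

permissions:
  contents: read
  packages: write   # lets the built-in GITHUB_TOKEN push to ghcr.io

jobs:
  deploy-staging:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Log in to GitHub Container Registry with the built-in token
      - run: echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin
      # Tag the image with the commit SHA so every deploy is traceable
      - run: docker build -t ghcr.io/your-org/your-app:${{ github.sha }} .
      - run: docker push ghcr.io/your-org/your-app:${{ github.sha }}
      # Placeholder: replace with whatever tells your staging host
      # to pull this tag and restart
      - run: echo "deploy ghcr.io/your-org/your-app:${{ github.sha }} to staging"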

The staging environment doesn't need to be expensive. A $5/month VPS running Docker Compose works for most early-stage projects. Fly.io and Railway have free or cheap tiers. If you're on AWS, a single t3.micro runs for under $10/month.

One gotcha with database-backed services: your staging environment needs its own database. Sharing a database between staging and production is a disaster waiting to happen - I've seen a staging deploy run a migration that broke the production schema. A separate staging database with a recent copy of anonymized production data is the minimum. If your database is small enough, just dump and restore weekly.
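
For the dump-and-restore route, a scheduled workflow can handle it. The sketch below assumes Postgres, connection strings stored in two hypothetical secrets, and databases reachable from the runner - and it skips anonymization entirely, which you should not skip if the data contains anything personal:

name: Refresh staging DB

on:
  schedule:
    - cron: '0 3 * * 0'   # Sundays, 03:00 UTC
  workflow_dispatch:       # allow manual runs too

jobs:
  refresh:
    runs-on: ubuntu-latest
    steps:
      # pg_dump/psql ship with GitHub's Ubuntu runners; check the version
      # against your server. --clean drops and recreates objects in staging.
      - run: |
          pg_dump "$PROD_DB_URL" --clean --if-exists --no-owner --no-privileges -f dump.sql
          psql "$STAGING_DB_URL" -f dump.sql
        env:
          PROD_DB_URL: ${{ secrets.PROD_DB_URL }}         # hypothetical secret names
          STAGING_DB_URL: ${{ secrets.STAGING_DB_URL }}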

The architecture decision here is Docker vs. direct deploy. If your project is a Python or Node API, Docker is almost always the right call. You get reproducible builds, easy rollback (run the previous image), and identical behavior locally and in CI. The Dockerfile doesn't need to be fancy:

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY src/ ./src/
CMD ["uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]

For static sites, skip Docker entirely. Push build output to Cloudflare Pages, Netlify, or Vercel. These platforms handle SSL, CDN, and preview deploys out of the box.

Step 3: Production Deploy with Rollback

The final step: production deploys happen through the pipeline, not through SSH.

The trigger varies by team preference, and this is a genuine decision point. Some teams deploy on every merge to main - fast feedback, small increments, but requires confidence in your test suite. Others add a manual approval step where someone clicks "Deploy" in the GitHub Actions UI after reviewing staging. The right choice depends on your risk tolerance and how mature your testing is.

At my day job, production deploys trigger on merge to main, but database migrations are a separate story. They run through a dedicated workflow with a dry-run step that tests the migration against the dev database first. Only if the dry-run succeeds does the production migration execute. That dry-run has caught broken migrations twice in the last six months - column type mismatches and missing indices that would have caused downtime in production.
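
Our exact workflow isn't worth reproducing, but the shape is simple: two jobs, the second gated on the first. The sketch below assumes Alembic with an env.py that reads DATABASE_URL; the secret names are made up - substitute your migration tool and config:

name: Migrations

on: workflow_dispatch   # run manually, or call it from the deploy workflow

jobs:
  dry-run:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - run: pip install -r requirements.txt
      # Rehearse the migration against the dev database first
      - run: alembic upgrade head
        env:
          DATABASE_URL: ${{ secrets.DEV_DATABASE_URL }}

  migrate-production:
    needs: dry-run   # only runs if the dry-run succeeded
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - run: pip install -r requirements.txt
      - run: alembic upgrade head
        env:
          DATABASE_URL: ${{ secrets.PROD_DATABASE_URL }}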

The deploy mechanism depends on your infrastructure. For a PaaS like Fly.io or Railway, rollback is built in. For Docker on a VPS, pull the new image, restart the container, and keep the previous image tagged for rollback. For AWS ECS, update the task definition with the new image tag and ECS handles rolling deployment. For bare metal without Docker, use the symlink approach: deploy to /releases/v42/, point /current at it, and rollback means pointing the symlink back to /releases/v41/.

The rollback strategy matters more than the deploy strategy. Before you automate production deploys, answer one question: when the deploy breaks something, how do you go back? If the answer is "we'd figure it out," solve that first.

A word on manual approval gates: GitHub Actions supports environment protection rules that require a reviewer to approve before the job runs. This gives you an automated pipeline that still requires a human to say "yes, this looks good on staging, ship it." It's a good middle ground while you're building confidence in your pipeline.
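
Wiring that up takes one line in the job, plus a rule in the repo settings (Settings, then Environments, then production, then required reviewers):

jobs:
  deploy-production:
    runs-on: ubuntu-latest
    # The job pauses here until a configured reviewer approves it in the Actions UI
    environment: production
    steps:
      - run: echo "deploy steps go here"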

The minimum viable production pipeline adds a health check after deploy. Hit your app's /health endpoint. If it returns 200, the deploy succeeded. If it doesn't, roll back automatically. This single check catches the majority of failed deploys - broken imports, missing environment variables, bad database connections.
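
A sketch of that check, tacked onto the end of a production deploy job. The URL is a placeholder, and the rollback step calls a hypothetical one-command script on the server - substitute whatever your one-liner actually is (re-running the previous image, a PaaS rollback command, flipping the symlink):

      # Probe the app for up to a minute after the deploy step
      - name: Health check
        run: |
          for i in $(seq 1 12); do
            if curl -fsS https://your-app.example.com/health > /dev/null; then
              echo "deploy healthy"
              exit 0
            fi
            sleep 5
          done
          exit 1
      # Runs only when a previous step failed
      - name: Roll back
        if: failure()
        run: ssh deploy@your-server './rollback.sh'   # hypothetical one-command rollback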

Five Mistakes I've Seen (and Made)

The 30-minute pipeline. A team I worked with added linting, type checking, three security scanners, Docker build, integration tests, and end-to-end browser tests - all blocking merge. The pipeline took 34 minutes. Developers started pushing to main directly to skip it. Start with the fastest check that catches real bugs. Add complexity only when you feel specific pain.

Flaky tests. One intermittently failing test destroys trust in CI faster than anything else. When tests fail randomly, developers learn to ignore red pipelines. Then they ignore real failures too. Fix flaky tests the day you find them, or delete them. A reliable pipeline with fewer tests beats a comprehensive one nobody trusts. I wrote more about this in the testing article.

Skipping staging. Deploying straight to production from CI because "we have tests." Tests catch code bugs. They don't catch wrong environment variables, missing migrations, broken CSS, or features that technically work but look wrong. A human needs to look at staging before production. Every time.

Secrets in workflow files. I've reviewed repos with AWS access keys committed in YAML files. Use ${{ secrets.AWS_ACCESS_KEY_ID }}. Use environment variables. Never put credentials in code, not even CI configuration.

No rollback plan. The deploy worked, the app didn't. Now what? If the answer involves SSHing into a server and running commands from memory, you've automated the easy part and left the hard part manual. A good rollback is one command: docker run previous-image-tag or fly releases rollback or pointing a symlink back one version. Define your rollback procedure before you automate deploys, not after the first outage.

CI as the AI Quality Gate

If your team uses AI coding tools, CI is where you enforce quality standards automatically.

Your linter catches patterns AI gets wrong: inline imports, over-engineered abstractions, inconsistent style across files. A coverage gate prevents the fake test suites AI can generate - tests that hit 90% line coverage while asserting nothing meaningful. ruff with strict rules catches more AI code smells than an hour of manual review.

If you're using Claude Code or Cursor, you probably have a CLAUDE.md or rules file defining how AI should write code for your project. CI is where those rules get teeth. The rules file tells the AI what to do. The pipeline catches it when the AI doesn't listen.

The pipeline becomes the single source of truth for code quality. It doesn't matter if code was written by a human, by Claude, or by a junior developer following a ChatGPT suggestion. The same checks run, the same gates apply. I wrote about the specific patterns to watch for and how to configure quality rules that catch AI-generated problems before they merge.

Once the rules are in CI, they're enforced consistently. No one has to remember to check. No one argues about style in PR reviews. The pipeline passes or it doesn't.

Start Today

Step 1 takes 30 minutes. Copy the workflow file above, adjust the test command for your project, push it. The next pull request that breaks tests gets caught before it reaches production.

I've focused on GitHub Actions because that's where most teams start and it's free. If you're on GitLab, the concepts are identical - .gitlab-ci.yml instead of .github/workflows/, same structure. CircleCI, Buildkite, and others all follow the same pattern: define triggers, run commands, gate on results. The tool matters less than having something automated.
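
For reference, the Step 1 gate in GitLab's dialect - a sketch, untested against any particular project:

# .gitlab-ci.yml
test:
  image: python:3.11
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  script:
    - pip install -r requirements.txt
    - pytest tests/ -v --tb=short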

Steps 2 and 3 can wait until next week. Or next month. The testing gate alone is worth the effort.

The compliance data system from my opening story? We set up CI in an afternoon. Within a week, it caught three PRs that would have broken the pipeline. Within a month, the team stopped treating deploys as events that required everyone's attention. They became boring. That's exactly what you want deploys to be.
