CI/CD Git optimization

CI/CD Git optimization strategies: shallow clone, caching, partial clone, fetching only changes, and specific configurations for GitHub Actions and GitLab CI.

Who This Is For
  • Teams turning commands into repeatable routines
  • Readers who need sequencing, branch, and sync discipline
Prerequisites
  • Basic understanding of fetch, pull, push, and branches
  • A sense of how and why branches diverge
Common Risks
  • Copying a workflow without checking branch state
  • Choosing the wrong integration path on shared branches

The short version

[Figure: CI/CD Git optimization flow. A full clone downloads the full history (.git/), extracts all objects, and builds the workspace. The optimized path uses a --depth 1 shallow clone, a cached .git directory, on-demand partial clone, and pulls only changed files.]

CI/CD usually needs only the latest code, not the full history. A shallow clone can reduce download time by 80-95%.

In CI/CD pipelines, Git clone and checkout are often among the most time-consuming steps, especially for large repositories. Through shallow clone, caching, partial clone, and other strategies, you can reduce Git operation time from minutes to seconds.

Why Git operations are slow in CI/CD

Full clone overhead

# Full clone downloads all history
git clone https://github.com/large/repo.git
# For large repos:
# - May need to download several GB
# - Includes full history of all branches and tags
# - Includes all versions of binary files (if not using LFS)

CI environment specifics

  • Every run is a clean environment with no local cache
  • Often only need the latest code, not full history
  • Concurrent builds mean Git operations repeat frequently
  • Network latency (CI servers and Git repos may be in different regions)
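
Before choosing an optimization, it can help to check how much packed history a full clone of the repository actually carries. The sketch below is only an illustration: `pack_size_kib` is a hypothetical helper name, and it assumes the usual `git count-objects -v` report, where the `size-pack` line gives the pack size in KiB.

```shell
# Print the on-disk pack size (in KiB) from a `git count-objects -v` report.
# The report is read from stdin so the parser works without a repository.
pack_size_kib() {
    awk -F': ' '$1 == "size-pack" { print $2 }'
}

# Typical use inside an existing clone:
#   git count-objects -v | pack_size_kib
```

A large value here suggests that the history, rather than the current tree, dominates clone time, which is the case shallow and partial clones address.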

Optimization 1: Shallow clone

Basic usage

# Clone only the most recent commit
git clone --depth 1 https://github.com/user/repo.git

# Clone only the most recent N commits
git clone --depth 50 https://github.com/user/repo.git

# Clone only a specific branch
git clone --depth 1 --branch main https://github.com/user/repo.git

Performance comparison

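Clone type                     Download size   Time (10GB repo)
Full clone                     ~10GB           ~120s
--depth 1                      ~500MB          ~10s
--depth 1 + --single-branch    ~300MB          ~6s

Note: git clone --depth already implies --single-branch unless --no-single-branch is passed, so the last row mainly applies when that default has been overridden.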

Caveats

# Shallow clone limitations
# - Can't access full history
# - git blame attributes older lines to the oldest fetched commit
# - git log only shows the most recent N commits
# - Can't create branches from commits beyond the shallow boundary

# Un-shallow when you need full history
git fetch --unshallow
# or deepen the history to the most recent 1000 commits
git fetch --depth=1000
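
A job that sometimes needs full history can guard the un-shallow step, since `git fetch --unshallow` fails on a repository that is already complete. The following is a minimal sketch, assuming a standard layout where `.git` is a directory (not a linked worktree); `is_shallow_repo` and `ensure_full_history` are hypothetical helper names.

```shell
# A shallow repository records its cut-off commits in .git/shallow,
# so the presence of that file is a cheap shallow-clone check.
is_shallow_repo() {
    [ -f "${1:-.}/.git/shallow" ]
}

# Fetch the missing history only when the repository is actually shallow.
ensure_full_history() {
    repo_dir="${1:-.}"
    if is_shallow_repo "$repo_dir"; then
        git -C "$repo_dir" fetch --unshallow
    fi
}

# Example: call before a changelog step
#   ensure_full_history .
```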

GitHub Actions configuration

# .github/workflows/ci.yml
name: CI
on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
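      - uses: actions/checkout@v4
        with:
          fetch-depth: 1  # Shallow clone, only latest commit (the action's default)

      # Alternative: replace the step above with this one when full history
      # is needed (e.g., changelog generation)
      # - uses: actions/checkout@v4
      #   with:
      #     fetch-depth: 0  # Full history for all branches and tags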

GitLab CI configuration

# .gitlab-ci.yml
variables:
  GIT_DEPTH: 1  # Shallow clone

build:
  script:
    - echo "Building with shallow clone"

Optimization 2: Git caching

GitHub Actions caching

name: CI
on: [push]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
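      # Restore a previously saved .git directory before checkout (advanced).
      # actions/checkout reuses an existing repository in the workspace when
      # its remote URL matches, so the restore has to come first.
      - name: Cache Git
        uses: actions/cache@v4
        with:
          path: .git
          key: git-${{ runner.os }}-${{ github.sha }}
          restore-keys: |
            git-${{ runner.os }}-

      - uses: actions/checkout@v4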

Using ccache for compilation (indirect speedup)

- name: Cache build artifacts
  uses: actions/cache@v4
  with:
    path: |
      ~/.cache/ccache
      node_modules
    key: ${{ runner.os }}-build-${{ hashFiles('**/package-lock.json') }}
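
The `hashFiles(...)` expressions above derive the cache key from file contents, so the cache is invalidated only when those files change. On CI systems without such a helper, the same idea can be sketched in shell. `cache_key` is a hypothetical function name, and the sketch assumes either `sha256sum` or `shasum` is installed on the runner.

```shell
# Build a cache key from a prefix and the content hash of a lockfile.
# The key changes only when the lockfile content changes.
cache_key() {
    prefix="$1"
    lockfile="$2"
    if command -v sha256sum >/dev/null 2>&1; then
        digest=$(sha256sum "$lockfile" | cut -d' ' -f1)
    else
        digest=$(shasum -a 256 "$lockfile" | cut -d' ' -f1)
    fi
    printf '%s-%s\n' "$prefix" "$digest"
}

# Example:
#   cache_key "deps-$(uname -s)" package-lock.json
```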

GitLab CI caching

# .gitlab-ci.yml
cache:
  key: ${CI_COMMIT_REF_SLUG}
  paths:
    - node_modules/
    - .cache/

# Alternative (replaces the block above): cache .git/ per job.
# Sharing it across runners needs a distributed cache configured on the runner.
cache:
  key: "$CI_JOB_NAME"
  paths:
    - .git/
  policy: pull-push

Optimization 3: Partial clone

What is partial clone

Partial clone is a Git 2.19+ feature that allows deferred object fetching:

# Don't download blobs (file contents), only metadata
git clone --filter=blob:none https://github.com/user/repo.git

# Don't download large blobs
git clone --filter=blob:limit=1m https://github.com/user/repo.git

# On-demand download (auto-fetch when accessing files)
# Files are fetched from remote on first access
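
Because partial clone depends on the Git version available on the runner, a job can check the version before passing `--filter`. This is a rough sketch under stated assumptions: `version_at_least` and `git_version_number` are hypothetical helpers, only the first three numeric version components are compared, and `REPO_URL` is assumed to be provided by the CI environment.

```shell
# Succeed when dotted version $1 is at least dotted version $2.
# Missing or non-numeric components are treated as 0.
version_at_least() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        split(have, h, "."); split(want, w, ".")
        for (i = 1; i <= 3; i++) {
            if (h[i] + 0 > w[i] + 0) exit 0
            if (h[i] + 0 < w[i] + 0) exit 1
        }
        exit 0
    }'
}

# Extract "2.43.0" from "git version 2.43.0" (the `git --version` format).
git_version_number() {
    awk '{ print $3; exit }'
}

# Usage in a CI step:
#   if version_at_least "$(git --version | git_version_number)" 2.19; then
#       git clone --filter=blob:none "$REPO_URL" .
#   else
#       git clone --depth 1 "$REPO_URL" .
#   fi
```

Falling back to a shallow clone keeps the job working on older runners, at the cost of losing history instead of deferring blob downloads.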

Performance comparison

Clone type                Initial download   Subsequent access
Full clone                100%               Local read
--depth 1                 ~5%                No history
--filter=blob:none        ~2%                On-demand download
--filter=blob:limit=1m    ~3%                Large files on-demand

Using in GitHub Actions

- uses: actions/checkout@v4
  with:
    fetch-depth: 1
    filter: "blob:none"  # partial clone

Using in GitLab CI

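variables:
  GIT_STRATEGY: none  # Skip the runner's own clone so the manual one can run

before_script:
  # Manual partial clone (the target directory must be empty)
  - git clone --filter=blob:none --depth 1 --branch "$CI_COMMIT_REF_NAME" "$CI_REPOSITORY_URL" .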

Optimization 4: Fetch only changes

Use fetch instead of clone

If you've cached the .git directory, just fetch changes:

# In a cached repo, only fetch latest changes
git fetch origin main --depth 1
git checkout FETCH_HEAD

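# Or keep the current branch and move it to the fetched commit
# (--update-head-ok is meant for git pull's internal use, so avoid it here)
git fetch --depth 1 origin main
git reset --hard FETCH_HEAD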

Incremental update pattern in CI

#!/bin/bash
# ci-git-setup.sh
set -euo pipefail

branch="${CI_BRANCH:-main}"

if [ -d ".git" ]; then
    # Existing repo, only fetch changes
    echo "Updating existing repository..."
    git fetch --depth 1 origin "$branch"
    git checkout -B "$branch" FETCH_HEAD
else
    # First clone (REPO_URL must be set by the CI environment)
    echo "Cloning repository..."
    git clone --depth 1 --branch "$branch" "${REPO_URL:?REPO_URL is not set}" .
fi

Large repo specific optimizations

Use sparse checkout

# Only checkout needed directories
git clone --depth 1 --no-checkout https://github.com/user/repo.git
cd repo
git sparse-checkout init --cone
git sparse-checkout set src/ docs/
git checkout main
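
The steps above can be wrapped in a small function so that the same sequence is reused across jobs. The sketch below is an illustration, not a definitive implementation: `run` and `sparse_clone` are hypothetical names, `DRY_RUN=1` is an invented switch that prints the commands instead of running them, and the added `--filter=blob:none` is an assumption that goes beyond the original commands (without it, `--depth 1` still downloads every blob of the latest commit, including those outside the sparse paths).

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Clone without a working tree, restrict it to the given directories,
# then check out the branch. Usage: sparse_clone URL DIR BRANCH PATH...
sparse_clone() {
    url="$1" dir="$2" branch="$3"
    shift 3
    run git clone --depth 1 --filter=blob:none --no-checkout \
        --branch "$branch" "$url" "$dir" &&
    run git -C "$dir" sparse-checkout init --cone &&
    run git -C "$dir" sparse-checkout set "$@" &&
    run git -C "$dir" checkout "$branch"
}

# Example (prints the four git commands without touching the network):
#   DRY_RUN=1; sparse_clone https://github.com/user/repo.git repo main src docs
```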

Sparse checkout in GitHub Actions

- uses: actions/checkout@v4
  with:
    fetch-depth: 1
    sparse-checkout: |
      src/
      docs/
    sparse-checkout-cone-mode: true

LFS optimization

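# CI may not need LFS files: skip the LFS download during checkout
GIT_LFS_SKIP_SMUDGE=1 git clone --depth 1 https://github.com/user/repo.git
cd repo

# Skip LFS objects in later git lfs fetch/pull calls (if not needed)
git config lfs.fetchexclude "*"

# Or only fetch specific LFS file types, then download them
git config lfs.fetchinclude "*.psd,*.ai"
git lfs pull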
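
When LFS content is skipped, the tracked files remain small text pointers in the working tree, and a build step that expects the real asset can fail in confusing ways. A pre-build check along these lines may help catch that early. `is_lfs_pointer` is a hypothetical helper, and the sketch assumes the standard pointer format, whose first line names the LFS spec version.

```shell
# Succeed when FILE is a Git LFS pointer rather than the real content.
# Only the first 100 bytes are inspected, so large binaries stay cheap to check.
is_lfs_pointer() {
    [ -f "$1" ] || return 1
    head -c 100 "$1" | head -n 1 |
        grep -qx 'version https://git-lfs\.github\.com/spec/v1'
}

# Example: fail the job if a required asset was not downloaded
#   if is_lfs_pointer assets/logo.psd; then
#       echo "assets/logo.psd is still an LFS pointer" >&2
#       exit 1
#   fi
```

The path `assets/logo.psd` in the usage comment is a placeholder for illustration only.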

Complete CI configuration examples

GitHub Actions optimized

name: CI
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    timeout-minutes: 30

    steps:
      # 1. Shallow clone
      - uses: actions/checkout@v4
        with:
          fetch-depth: 1
          persist-credentials: false

      # 2. Cache dependencies
      - name: Cache node modules
        uses: actions/cache@v4
        with:
          path: ~/.npm
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
          restore-keys: |
            ${{ runner.os }}-node-

      # 3. Install dependencies
      - name: Install dependencies
        run: npm ci

      # 4. Build
      - name: Build
        run: npm run build

      # 5. Test
      - name: Test
        run: npm test

GitLab CI optimized

# .gitlab-ci.yml
variables:
  GIT_DEPTH: 1
  GIT_STRATEGY: clone  # or fetch, to reuse the runner's existing working copy

stages:
  - build
  - test

.build_template: &build_config
  before_script:
    - npm ci --cache .npm
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - .npm/

build:
  <<: *build_config
  stage: build
  script:
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 1 week

test:
  <<: *build_config
  stage: test
  script:
    - npm test

Performance benchmark data

Test results (5GB repo)

Configuration             Clone time   Total pipeline time
Default (full clone)      45s          180s
fetch-depth: 1            8s           143s
fetch-depth: 1 + cache    3s           138s
partial clone + cache     2s           137s

Very large repos (50GB+)

Configuration             Clone time
Default                   300s+
fetch-depth: 1            30s
sparse checkout           10s
partial clone + sparse    5s
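
Figures like these depend heavily on the repository, the runner, and the network, so it is worth measuring the candidates on your own setup. A minimal timing sketch follows; `time_cmd` is a hypothetical helper with one-second resolution, and `REPO_URL` and the target paths in the usage comment are placeholders.

```shell
# Print the wall-clock seconds a command takes.
# The command's own output is discarded; its exit status is returned.
time_cmd() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1 && status=0 || status=$?
    end=$(date +%s)
    echo $((end - start))
    return "$status"
}

# Example comparison:
#   time_cmd git clone "$REPO_URL" /tmp/full
#   time_cmd git clone --depth 1 "$REPO_URL" /tmp/shallow
#   time_cmd git clone --filter=blob:none "$REPO_URL" /tmp/partial
```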

Key takeaways

  1. Shallow clone doesn't suit scenarios needing full history: changelog generation, git blame analysis, etc.
  2. Cache key selection: Use lockfile hash instead of fixed keys to invalidate cache when dependencies update
  3. Partial clone requires Git 2.19+: Verify CI environment Git version
  4. Sparse checkout requires Git 2.25+: The git sparse-checkout command and its cone mode were both introduced in that release
  5. Don't over-optimize: Small repos see limited benefits; focus on build and test steps first

Summary

Optimization         Best for                                Time saved   Complexity
Shallow clone        Most CI scenarios                       70-90%       Low
Git caching          Frequently built projects               50-80%       Medium
Partial clone        Large repos                             80-95%       Medium
Sparse checkout      Very large repos, partial code needed   90-99%       High
Fetch only changes   Cached repo scenarios                   95%+         Medium

Recommended combination: Shallow clone plus dependency caching works for most projects and offers the best cost-benefit ratio.