CI/CD Git optimization
CI/CD Git optimization strategies: shallow clone, caching, partial clone, fetching only changes, and specific configurations for GitHub Actions and GitLab CI.
- Teams turning commands into repeatable routines
- Readers who need sequencing, branch, and sync discipline
- Basic understanding of fetch, pull, push, and branches
- A sense of how and why branches diverge
- Copying a workflow without checking branch state
- Choosing the wrong integration path on shared branches
The short version
Full clone: download full history (.git/), extract all objects, build workspace
Optimized: --depth 1 shallow clone, cache .git directory, partial clone on-demand, only pull changed files
CI/CD usually only needs latest code, not full history. Shallow clone can reduce download time by 80-95%.
In CI/CD pipelines, Git clone and checkout are often among the most time-consuming steps, especially for large repositories. Through shallow clone, caching, partial clone, and other strategies, you can reduce Git operation time from minutes to seconds.
Why Git operations are slow in CI/CD
Full clone overhead
# Full clone downloads all history
git clone https://github.com/large/repo.git
# For large repos:
# - May need to download several GB
# - Includes full history of all branches and tags
# - Includes all versions of binary files (if not using LFS)
CI environment specifics
- Every run is a clean environment with no local cache
- Often only need the latest code, not full history
- Concurrent builds mean Git operations repeat frequently
- Network latency (CI servers and Git repos may be in different regions)
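Before changing anything, it can help to measure how long the Git steps actually take in a given pipeline. The following is a minimal sketch rather than part of any CI system: `time_step` is a hypothetical helper name, and it assumes a Bash shell and accepts the whole-second resolution of `date +%s`.

```shell
#!/usr/bin/env bash
# time_step LABEL CMD...: run CMD, print its wall-clock duration to stderr,
# and return CMD's own exit status. (Hypothetical helper for illustration.)
time_step() {
  local label=$1
  shift
  local start end status=0
  start=$(date +%s)
  "$@" || status=$?          # keep the exit code instead of aborting under `set -e`
  end=$(date +%s)
  echo "${label}: $((end - start))s" >&2
  return "$status"
}

# Usage inside a CI script (REPO_URL is assumed to be provided by the job):
#   time_step "full clone"    git clone "$REPO_URL" full
#   time_step "shallow clone" git clone --depth 1 "$REPO_URL" shallow
```

Comparing the two numbers for your own repository shows whether the clone step is worth optimizing at all.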
Optimization 1: Shallow clone
Basic usage
# Clone only the most recent commit
git clone --depth 1 https://github.com/user/repo.git
# Clone only the most recent N commits
git clone --depth 50 https://github.com/user/repo.git
# Clone only a specific branch
git clone --depth 1 --branch main https://github.com/user/repo.git
Performance comparison
| Clone type | Download size | Time (10GB repo) |
|---|---|---|
| Full clone | ~10GB | ~120s |
| --depth 1 | ~500MB | ~10s |
| --depth 1 + --single-branch | ~300MB | ~6s |
Caveats
# Shallow clone limitations
# - Can't access full history
# - git blame may be incomplete
# - git log only shows the most recent N commits
# - Can't create branches from commits outside the fetched depth
# Un-shallow when you need full history
git fetch --unshallow
# or deepen the history to the most recent 1000 commits (the clone stays shallow)
git fetch --depth=1000
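A job that only occasionally needs history can check whether the clone is shallow and deepen it on demand. This is a sketch under stated assumptions: it relies on `git rev-parse --is-shallow-repository`, which needs Git 2.15 or newer, and `ensure_full_history` is a hypothetical helper name, not a Git or CI built-in.

```shell
#!/usr/bin/env bash
# ensure_full_history [DIR]: fetch the missing history only when DIR is a
# shallow clone. Running `git fetch --unshallow` on a complete repository
# is an error, so the check comes first. (Hypothetical helper.)
ensure_full_history() {
  local dir=${1:-.}
  if [ "$(git -C "$dir" rev-parse --is-shallow-repository)" = "true" ]; then
    git -C "$dir" fetch --unshallow --quiet
  fi
}

# Usage before a step that needs history, e.g. changelog generation:
#   ensure_full_history .
#   git log --oneline
```

Because the function does nothing on a complete repository, it is safe to call in every job that might need history.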
GitHub Actions configuration
# .github/workflows/ci.yml
name: CI
on: [push, pull_request]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 1 # Shallow clone, only latest commit (the action's default)
      # Alternative to the step above if full history is needed (e.g., changelog generation)
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Full clone
GitLab CI configuration
# .gitlab-ci.yml
variables:
  GIT_DEPTH: 1 # Shallow clone
build:
  script:
    - echo "Building with shallow clone"
Optimization 2: Git caching
GitHub Actions caching
name: CI
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Cache .git directory (advanced)
      - name: Cache Git
        uses: actions/cache@v4
        with:
          path: .git
          key: git-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
          restore-keys: |
            git-${{ runner.os }}-
Using ccache for compilation (indirect speedup)
- name: Cache build artifacts
  uses: actions/cache@v4
  with:
    path: |
      ~/.cache/ccache
      node_modules
    key: ${{ runner.os }}-${{ hashFiles('**/lockfile') }}
GitLab CI caching
# .gitlab-ci.yml
cache:
  key: ${CI_COMMIT_REF_SLUG}
  paths:
    - node_modules/
    - .cache/
# Or, instead of the block above (a file can define only one top-level cache),
# use distributed cache (Premium/Ultimate)
cache:
  key: "$CI_JOB_NAME"
  paths:
    - .git/
  policy: pull-push
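Content-derived cache keys, such as the `hashFiles(...)` expression used for GitHub Actions above, change only when the hashed file changes. The sketch below reproduces that idea locally so the behavior can be checked by hand. It assumes `sha256sum` from GNU coreutils is available; `cache_key` is a hypothetical helper, and the 16-character truncation is an arbitrary choice for readability, not what either CI system does internally.

```shell
#!/usr/bin/env bash
# cache_key PREFIX FILE: print "<PREFIX>-<first 16 hex chars of FILE's SHA-256>".
# The same file content always yields the same key; any edit to the file
# yields a different key and therefore a fresh cache. (Hypothetical helper.)
cache_key() {
  local prefix=$1 file=$2 digest
  digest=$(sha256sum "$file" | cut -c1-16)
  printf '%s-%s\n' "$prefix" "$digest"
}

# Usage:
#   cache_key "linux-node" package-lock.json
```

A key built this way stays stable across commits that do not touch the lockfile, which is what makes the cache reusable between runs.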
Optimization 3: Partial clone
What is partial clone
Partial clone is a Git 2.19+ feature that allows deferred object fetching:
# Don't download blobs (file contents), only metadata
git clone --filter=blob:none https://github.com/user/repo.git
# Don't download large blobs
git clone --filter=blob:limit=1m https://github.com/user/repo.git
# On-demand download (auto-fetch when accessing files)
# Files are fetched from remote on first access
Performance comparison
| Clone type | Initial download | Subsequent access |
|---|---|---|
| Full clone | 100% | Local read |
| --depth 1 | ~5% | No history |
| --filter=blob:none | ~2% | On-demand download |
| --filter=blob:limit=1m | ~3% | Large files on-demand |
Using in GitHub Actions
- uses: actions/checkout@v4
  with:
    fetch-depth: 1
    filter: "blob:none" # partial clone
Using in GitLab CI
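variables:
  GIT_STRATEGY: none # Skip the runner's own clone so the directory is free for ours
before_script:
  # Manual partial clone (fails if the working directory is not empty)
  - git clone --filter=blob:none --depth 1 "$CI_REPOSITORY_URL" .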
Optimization 4: Fetch only changes
Use fetch instead of clone
If you've cached the .git directory, just fetch changes:
# In a cached repo, only fetch latest changes
git fetch origin main --depth 1
git checkout FETCH_HEAD
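# Or fetch straight into the checked-out local branch. --update-head-ok lifts Git's
# refusal to do that, but the working tree stays stale until it is reset.
git fetch --update-head-ok origin main:main
git reset --hard main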
Incremental update pattern in CI
#!/bin/bash
# ci-git-setup.sh
set -euo pipefail
BRANCH="${CI_BRANCH:-main}"
if [ -d ".git" ]; then
  # Existing repo, only fetch changes
  echo "Updating existing repository..."
  git fetch origin "$BRANCH" --depth 1
  git checkout -B "$BRANCH" FETCH_HEAD
else
  # First clone (REPO_URL must be set by the CI environment)
  echo "Cloning repository..."
  git clone --depth 1 --branch "$BRANCH" "$REPO_URL" .
fi
Large repo specific optimizations
Use sparse checkout
# Only checkout needed directories
git clone --depth 1 --no-checkout https://github.com/user/repo.git
cd repo
git sparse-checkout init --cone
git sparse-checkout set src/ docs/
git checkout main
Sparse checkout in GitHub Actions
- uses: actions/checkout@v4
  with:
    fetch-depth: 1
    sparse-checkout: |
      src/
      docs/
    sparse-checkout-cone-mode: true
LFS optimization
# CI may not need LFS files: skip the LFS download during the clone itself
GIT_LFS_SKIP_SMUDGE=1 git clone --depth 1 https://github.com/user/repo.git
cd repo
# Exclude all LFS files from later fetches (if not needed)
git config lfs.fetchexclude "*"
# Or only fetch specific LFS file types
git config lfs.fetchinclude "*.psd,*.ai"
Complete CI configuration examples
GitHub Actions optimized
name: CI
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  build:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      # 1. Shallow clone
      - uses: actions/checkout@v4
        with:
          fetch-depth: 1
          persist-credentials: false
      # 2. Cache dependencies
      - name: Cache node modules
        uses: actions/cache@v4
        with:
          path: ~/.npm
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
          restore-keys: |
            ${{ runner.os }}-node-
      # 3. Install dependencies
      - name: Install dependencies
        run: npm ci
      # 4. Build
      - name: Build
        run: npm run build
      # 5. Test
      - name: Test
        run: npm test
GitLab CI optimized
# .gitlab-ci.yml
variables:
  GIT_DEPTH: 1
  GIT_STRATEGY: clone # or fetch (if caching)
stages:
  - build
  - test
.build_template: &build_config
  before_script:
    - npm ci --cache .npm
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - .npm/
build:
  <<: *build_config
  stage: build
  script:
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 1 week
test:
  <<: *build_config
  stage: test
  script:
    - npm test
Performance benchmark data
Test results (5GB repo)
| Configuration | Clone time | Total pipeline time |
|---|---|---|
| Default (full clone) | 45s | 180s |
| fetch-depth: 1 | 8s | 143s |
| fetch-depth: 1 + cache | 3s | 138s |
| partial clone + cache | 2s | 137s |
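The table's figures can be turned into percentages to show how much of the saving reaches the whole pipeline. This is only arithmetic on the numbers above; `percent_saved` is a hypothetical helper, and the sketch assumes an `awk` that prints a dot as the decimal separator.

```shell
#!/usr/bin/env bash
# percent_saved BEFORE AFTER: print the relative reduction as a percentage
# with one decimal place. (Hypothetical helper for illustration.)
percent_saved() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

percent_saved 45 8      # clone step, full clone vs fetch-depth: 1   -> 82.2
percent_saved 180 143   # total pipeline for the same two rows       -> 20.6
```

The clone step shrinks by about 82%, but the pipeline as a whole by only about 21%, because build and test time dominate once the clone is fast.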
Very large repos (50GB+)
| Configuration | Clone time |
|---|---|
| Default | 300s+ |
| fetch-depth: 1 | 30s |
| sparse checkout | 10s |
| partial clone + sparse | 5s |
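The "partial clone + sparse" row combines two techniques that can be issued in a single clone. The following sketch assumes Git 2.25 or newer (for `git clone --sparse` and `git sparse-checkout`) and a server that supports object filters; `sparse_partial_clone` is a hypothetical helper name, and `src` and `docs` in the usage line are example paths.

```shell
#!/usr/bin/env bash
# sparse_partial_clone URL DIR PATH...: shallow, blobless clone whose working
# tree contains only top-level files plus the listed directories.
# (Hypothetical helper; requires filter support on the server side.)
sparse_partial_clone() {
  local url=$1 dir=$2
  shift 2
  # --filter=blob:none defers file contents; --sparse starts with only
  # top-level files checked out.
  git clone --quiet --depth 1 --filter=blob:none --sparse "$url" "$dir"
  # Blobs for the selected directories are fetched on demand here.
  git -C "$dir" sparse-checkout set "$@"
}

# Usage:
#   sparse_partial_clone https://github.com/user/repo.git repo src docs
```

Only the blobs under the selected paths are downloaded, so the saving grows with the share of the repository the job does not need.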
Key takeaways
- Shallow clone doesn't suit scenarios needing full history: changelog generation, git blame analysis, etc.
- Cache key selection: Use lockfile hash instead of fixed keys to invalidate cache when dependencies update
- Partial clone requires Git 2.19+: Verify CI environment Git version
- Sparse checkout requires Git 2.25+: The git sparse-checkout command and its cone mode were both introduced in 2.25
- Don't over-optimize: Small repos see limited benefits; focus on build and test steps first
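The version requirements listed above can be verified at the start of a job. This is a minimal sketch, assuming a `sort` that supports `-V` (GNU coreutils and recent BSD versions do); `version_ge` and `git_supports` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# version_ge A B: succeed when dotted version A is greater than or equal to B.
# `sort -V` orders versions numerically, so 2.9 sorts before 2.19.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# git_supports MINIMUM: compare the installed Git against MINIMUM.
# `git --version` prints e.g. "git version 2.43.0"; the third field is used.
git_supports() {
  local installed
  installed=$(git --version | awk '{print $3}')
  version_ge "$installed" "$1"
}

# Usage (REPO_URL is assumed to be provided by the job):
#   if git_supports 2.19; then
#     git clone --filter=blob:none "$REPO_URL" repo
#   else
#     git clone --depth 1 "$REPO_URL" repo
#   fi
```

Falling back to a shallow clone keeps the job working on runners with an older Git instead of failing on an unknown option.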
Summary
| Optimization | Best for | Time saved | Complexity |
|---|---|---|---|
| Shallow clone | Most CI scenarios | 70-90% | Low |
| Git caching | Frequently built projects | 50-80% | Medium |
| Partial clone | Large repos | 80-95% | Medium |
| Sparse checkout | Very large repos, partial code needed | 90-99% | High |
| Fetch only changes | Cached repo scenarios | 95%+ | Medium |
Recommended combination: Shallow clone + dependency caching works for 90% of projects and offers the best cost-benefit ratio.