CI Optimization Strategy
Date: 2025-04-02
Status
Accepted
Context
Our CI/CD pipeline is critical for ensuring code quality and security in our monorepo. However, as the codebase grows, so does pipeline runtime. With a limited number of CI runners available, running full validation on every PR is becoming resource-intensive and time-consuming.
We need a strategy to optimize our CI/CD pipeline to:
- Reduce unnecessary builds
- Skip tests that aren't relevant to the changes
- Focus resources on the most critical checks
- Minimize wait times for developers
- Make efficient use of our limited CI runners
Decision
We will implement a comprehensive CI optimization strategy using skip-duplicate-actions and custom logic to intelligently determine which checks to run based on the changes in a PR or commit.
Key Components:
- Change Analysis: We'll analyze which files have changed to determine:
  - Which languages are affected (TypeScript, Python, etc.)
  - Which applications and libraries are affected
  - Whether configuration or Docker files have changed
  - If only tests or documentation have changed
- Dynamic Test Matrix: Based on the change analysis, we'll:
  - Create a dynamic build matrix that only includes affected applications
  - Skip language-specific checks if no relevant files changed
  - Skip Docker builds if no Docker files changed
  - Prioritize security checks for sensitive changes
- Duplicate Detection: We'll use fkirc/skip-duplicate-actions to:
  - Skip checks if they're identical to a previous successful run
  - Identify when checks can be safely skipped
  - Provide visibility into skipped checks
- Transparency: We'll provide clear visibility by:
  - Adding comments to PRs with optimization decisions
  - Showing estimated time/resource savings
  - Ensuring developers understand why checks were skipped
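
The change analysis and skip decisions described above could be sketched roughly as follows. This is a minimal illustration, not the actual pipeline logic: the extension sets, the test and documentation naming conventions, the `ChangeSummary` type, and the function names are all assumptions, and the list of changed files is assumed to come from something like `git diff --name-only`. Files that match no rule trigger every check, which is a deliberately conservative choice.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Hypothetical classification rules; adjust to the real repository layout.
TS_EXTS = {".ts", ".tsx", ".js", ".jsx"}
PY_EXTS = {".py"}
DOC_EXTS = {".md", ".rst"}


@dataclass(frozen=True)
class ChangeSummary:
    typescript: bool          # any JavaScript/TypeScript file changed
    python: bool              # any Python file changed
    docker: bool              # any Docker-related file changed
    only_docs_or_tests: bool  # every changed file is documentation or a test
    unclassified: bool        # at least one file matched none of the rules


def is_docker(path: PurePosixPath) -> bool:
    # Assumed naming: Dockerfile*, docker-compose*, .dockerignore
    return path.name.startswith(("Dockerfile", "docker-compose")) or path.name == ".dockerignore"


def is_doc(path: PurePosixPath) -> bool:
    # Assumed convention: Markdown/reST files, or anything under a docs/ directory
    return path.suffix in DOC_EXTS or "docs" in path.parts[:-1]


def is_test(path: PurePosixPath) -> bool:
    # Assumed conventions: tests/ or __tests__/ directories, test_*.py, *.test.*, *.spec.*
    name = path.name
    in_test_dir = any(part in {"tests", "__tests__"} for part in path.parts[:-1])
    return in_test_dir or name.startswith("test_") or ".test." in name or ".spec." in name


def summarize(changed_files: list[str]) -> ChangeSummary:
    paths = [PurePosixPath(f) for f in changed_files]
    known_code = TS_EXTS | PY_EXTS
    return ChangeSummary(
        typescript=any(p.suffix in TS_EXTS for p in paths),
        python=any(p.suffix in PY_EXTS for p in paths),
        docker=any(is_docker(p) for p in paths),
        # An empty change list is not treated as "docs only", so security still runs.
        only_docs_or_tests=bool(paths) and all(is_doc(p) or is_test(p) for p in paths),
        unclassified=any(
            not ((p.suffix in known_code) or is_docker(p) or is_doc(p) or is_test(p))
            for p in paths
        ),
    )


def checks_to_run(summary: ChangeSummary) -> dict[str, bool]:
    # Fall back to running everything when a file could not be classified.
    if summary.unclassified:
        return {"typescript": True, "python": True, "docker": True, "security": True}
    return {
        "typescript": summary.typescript,
        "python": summary.python,
        "docker": summary.docker,
        "security": not summary.only_docs_or_tests,
    }
```

For example, `checks_to_run(summarize(["apps/api/main.py"]))` enables only the Python and security checks, while a change to an unrecognized file such as `package.json` enables all of them.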
Implementation Details
- Change Detection:
  - Use Git to determine which files changed
  - Group changes by file type, language, and component
- Skip Criteria:
  - Skip JavaScript/TypeScript checks if only Python files changed
  - Skip Python checks if only JavaScript/TypeScript files changed
  - Skip security checks if only tests or documentation changed
  - Skip Docker builds if no Docker-related files changed
- Affected Component Detection:
  - Extract application names from file paths
  - Extract library names from file paths
  - Build dependency graph to determine indirect impacts
- Build Matrix Generation:
  - Create a matrix that only includes affected applications
  - Include language-specific information for each app
- Optimization Reporting:
  - Add PR comments showing which checks were skipped
  - Show estimated time and resource savings
  - Maintain a single, updated comment per PR
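
Affected component detection and build matrix generation might look something like the sketch below. The `apps/<name>/` and `libs/<name>/` layout, the `DEPENDS_ON` map, the `APP_LANGUAGE` table, and every component name in them are hypothetical placeholders; in practice the dependency graph would need to be derived from the repository's real build metadata. Indirect impacts are found by walking the graph in reverse, from each changed component to everything that depends on it.

```python
import json
from collections import deque
from typing import Optional

# Hypothetical dependency map: component -> components it depends on directly.
DEPENDS_ON = {
    "apps/web": {"libs/ui", "libs/api-client"},
    "apps/api": {"libs/db"},
    "libs/ui": set(),
    "libs/api-client": {"libs/schema"},
    "libs/db": {"libs/schema"},
    "libs/schema": set(),
}

# Hypothetical language tag for each application.
APP_LANGUAGE = {"apps/web": "typescript", "apps/api": "python"}


def component_of(path: str) -> Optional[str]:
    """Map a file path to 'apps/<name>' or 'libs/<name>' (assumed layout)."""
    parts = path.split("/")
    if len(parts) >= 3 and parts[0] in {"apps", "libs"}:
        return f"{parts[0]}/{parts[1]}"
    return None


def affected_components(changed_files: list[str]) -> set[str]:
    # Invert the graph so we can walk from a component to its dependents.
    dependents: dict[str, set[str]] = {}
    for component, deps in DEPENDS_ON.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(component)

    directly_changed = {c for c in map(component_of, changed_files) if c is not None}

    # Breadth-first search over dependents to collect indirect impacts.
    seen = set(directly_changed)
    queue = deque(directly_changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


def build_matrix(changed_files: list[str]) -> dict:
    """Return a matrix containing only affected applications, with their language."""
    apps = sorted(c for c in affected_components(changed_files) if c in APP_LANGUAGE)
    return {
        "include": [
            {"app": app.split("/", 1)[1], "language": APP_LANGUAGE[app]} for app in apps
        ]
    }


if __name__ == "__main__":
    # A change to a shared library pulls in every application that depends on it.
    print(json.dumps(build_matrix(["libs/schema/src/user.ts"])))
```

Note that a change touching no known component yields an empty `include` list; downstream matrix jobs would likely need a guard for that case, since CI systems often reject an empty matrix.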
Consequences
Positive
- Reduced CI runtime for most PRs
- More efficient use of CI/CD resources
- Faster feedback cycles for developers
- Ability to scale the monorepo without proportional increase in CI resources
- Clear visibility into optimization decisions
Negative
- Increased complexity in CI/CD configuration
- Risk of skipping relevant checks if dependencies aren't properly mapped
- Potential for different behavior between local and CI environments
Mitigations
- Comprehensive documentation of the optimization strategy
- Regular validation of the dependency mapping
- Clear PR comments explaining optimization decisions
- Option to force full CI checks when needed
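
One possible shape for the "force full CI" option is an override applied after the skip decisions are computed. This ADR does not specify the trigger, so the PR label `ci:full` and the commit-message marker `[full ci]` below are purely hypothetical, as are the function names.

```python
FORCE_LABEL = "ci:full"    # hypothetical PR label
FORCE_TOKEN = "[full ci]"  # hypothetical commit-message marker


def force_full_ci(labels: list[str], commit_message: str) -> bool:
    """Return True when the author explicitly asked for every check to run."""
    return FORCE_LABEL in labels or FORCE_TOKEN in commit_message.lower()


def apply_override(checks: dict[str, bool], forced: bool) -> dict[str, bool]:
    """Enable every check when forced; otherwise keep the computed decisions."""
    if forced:
        return {name: True for name in checks}
    return dict(checks)
```

Applying the override as a final step keeps the skip logic itself unchanged and makes the forced run easy to report in the PR comment.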
Usage Notes
For Developers
- Understanding Skipped Checks:
  - Look for the "CI Optimization Summary" comment on your PR
  - Review which checks were skipped and why
  - Note the estimated time and resource savings
- When to Force Full Checks:
  - For releases or significant changes
  - When dependencies have changed
  - When you're unsure about the impact of your changes
- Optimizing Your PRs:
  - Group related changes together
  - Separate documentation/test changes from code changes
  - Consider breaking large PRs into smaller, focused ones
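
To give a sense of what the "CI Optimization Summary" comment might contain, here is a rough sketch of how its body could be rendered and how a single comment per PR could be kept up to date. The hidden `MARKER` string, the per-check minute estimates, and the shape of the `comments` list (dicts with `id` and `body`) are assumptions for illustration, not measured values or a specific API.

```python
from typing import Optional

MARKER = "<!-- ci-optimization-summary -->"  # hypothetical hidden marker

# Hypothetical average runtimes in minutes; real values should come from CI history.
ESTIMATED_MINUTES = {"typescript": 12, "python": 8, "docker": 15, "security": 6}


def render_summary(checks: dict[str, bool], reasons: dict[str, str]) -> str:
    """Render the comment body listing each check and the estimated savings."""
    skipped = [name for name, run in checks.items() if not run]
    saved = sum(ESTIMATED_MINUTES.get(name, 0) for name in skipped)
    lines = [MARKER, "CI Optimization Summary", ""]
    for name, run in checks.items():
        status = "run" if run else f"skipped ({reasons.get(name, 'no reason recorded')})"
        lines.append(f"- {name}: {status}")
    lines += ["", f"Estimated time saved: ~{saved} min"]
    return "\n".join(lines)


def find_existing_comment(comments: list[dict]) -> Optional[int]:
    """Return the id of the earlier summary comment, if any, so it can be updated in place."""
    for comment in comments:
        if MARKER in comment.get("body", ""):
            return comment["id"]
    return None
```

Searching for the marker before posting lets the pipeline edit the existing comment instead of adding a new one on every push.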
For Maintainers
- Monitoring Effectiveness:
  - Track actual time savings vs. estimated
  - Monitor for cases where relevant checks were incorrectly skipped
  - Adjust skip criteria based on real-world effectiveness
- Maintaining the System:
  - Update dependency mappings as the codebase evolves
  - Adjust skip criteria as needed
  - Ensure documentation stays current