Commit e6879d8

TEP-0157 Retention Policy for Tekton Results
1 parent c4c98f9 commit e6879d8

2 files changed

+287
-0
lines changed

Lines changed: 286 additions & 0 deletions
@@ -0,0 +1,286 @@
1+
---
status: proposed
title: Retention Policy for Tekton Results
creation-date: '2024-07-17'
last-updated: '2024-07-17'
authors:
- '@khrm'
collaborators: []
---
10+
11+
# TEP-0157: Tekton Results: Retention Policy for older Results and Records
12+
13+
<!-- toc -->
14+
- [Summary](#summary)
- [Motivation](#motivation)
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
  - [Use Cases](#use-cases)
  - [Requirements](#requirements)
- [Proposal](#proposal)
  - [Notes and Caveats](#notes-and-caveats)
- [Design Details](#design-details)
- [Design Evaluation](#design-evaluation)
  - [Reusability](#reusability)
  - [Simplicity](#simplicity)
  - [Flexibility](#flexibility)
  - [User Experience](#user-experience)
  - [Performance](#performance)
  - [Risks and Mitigations](#risks-and-mitigations)
  - [Drawbacks](#drawbacks)
- [Alternatives](#alternatives)
- [Implementation Plan](#implementation-plan)
  - [Test Plan](#test-plan)
  - [Infrastructure Needed](#infrastructure-needed)
  - [Upgrade and Migration Strategy](#upgrade-and-migration-strategy)
  - [Implementation Pull Requests](#implementation-pull-requests)
- [References](#references)
38+
<!-- /toc -->
39+
40+
## Summary
41+
Tekton Results stores PipelineRuns, TaskRuns, Events, and Logs indefinitely.
This TEP proposes a retention policy feature that removes older Results and their associated Records,
along with a request to delete the corresponding logs.
44+
45+
## Motivation
46+
Storing older Results and Records indefinitely wastes storage resources
and degrades database performance. Sometimes, however, certain pipelines should not be deleted from
the archive.
49+
50+
### Goals
51+
- Ability to define a retention period for Results at the cluster level. All Records and Results past that period should be deleted.
- Ability to filter PipelineRuns when setting a retention policy.
- A way to also delete the associated logs from S3 buckets, GCS buckets, or PVCs.
54+
55+
### Non-Goals
56+
57+
<!--
58+
Listing non-goals helps to focus discussion and make progress.
59+
- What is out of scope for this TEP?
60+
-->
61+
62+
### Use Cases
63+
- A user can specify a global policy for all Results. All Records and logs that belong to Results satisfying the pruning condition will be deleted.
- A user can filter Results based on a CEL expression and a Result Summary expression. All the associated Records will be deleted.
65+
66+
### Requirements
67+
- For all Results that satisfy the delete conditions, the following must be deleted:
  * Results
  * Records for PipelineRun and TaskRun
  * Records for EventLog
  * Associated logs in the S3 bucket, GCS bucket, or PVC. EventLog Records should also be deleted.
72+
73+
## Proposal
74+
A pruner will spin up a job at a specified interval. The schedule, TTL, and CEL expressions are read from the `config-results-retention-policy` ConfigMap.
75+
76+
### Enhanced Policy-Based Retention
77+
78+
To provide more granular control over data retention, we propose enhancing the retention policy mechanism. This new approach will allow users to define specific retention rules based on `PipelineRun` metadata, including labels, annotations, and completion status, while maintaining backward compatibility with the existing global retention setting.
79+
80+
The core of this proposal is to introduce an optional list of ordered policies to the `config-results-retention-policy` ConfigMap. The retention job will evaluate these policies in order, and the first policy that matches a `PipelineRun` will determine its retention period. If no specific policy matches, a default retention period will be applied.
81+
82+
This policy-based approach gives users the flexibility to, for example, retain successful production deployment `PipelineRuns` for a long time, while quickly pruning ephemeral builds from pull requests.
83+
84+
### Notes and Caveats
85+
86+
87+
## Design Details
88+
89+
The `config-results-retention-policy` ConfigMap will be extended to support both the existing `defaultRetention` key for backward compatibility and a new `policies` key for granular control.
90+
91+
The `defaultRetention` field will serve as the **fallback** retention period for any `PipelineRun` that does not match a rule in the `policies` list. This value does **not** override the retention period of a matching policy; it only applies when no policies match a given Result.
92+
93+
The `policies` field will contain a YAML formatted string representing a list of rules. Each rule is evaluated in order, and the first match wins. A rule consists of:
94+
- `name`: A descriptive name for the policy.
- `selector`: Defines the criteria for matching Results. All conditions within a selector are combined with an **AND** logic: a Result must meet all specified criteria (`matchNamespaces`, `matchLabels`, `matchAnnotations`, `matchStatuses`) for the policy to apply. If a particular selector type (e.g., `matchLabels`) is omitted from a policy, it will match all Results for that criterion.
  - `matchNamespaces`: A list of namespaces to match against. A `PipelineRun` must be in one of the specified namespaces. An **OR** logic is applied to the values in the list.
  - `matchLabels`: A map where keys are label names and values are a list of strings. A `PipelineRun` must have all the specified label keys, and for each key, its value must be in the provided list. An **OR** logic is applied to the values within a single key's list.
  - `matchAnnotations`: A map where keys are annotation names and values are a list of strings. This works similarly to `matchLabels`.
  - `matchStatuses`: A list of completion statuses to match against. A `PipelineRun`'s status must be in the list. An **OR** logic is applied to the values in the list. The status is determined by the `reason` field of the primary `Succeeded` condition in the `PipelineRun` or `TaskRun` status. For a list of possible status reasons, refer to the [Tekton documentation on execution status](https://tekton.dev/docs/pipelines/pipelineruns/#monitoring-execution-status).
- `retention`: The retention period for matching `PipelineRuns`, specified as a duration string (e.g., "730d", "90d", "24h").
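The `retention` values use a day suffix such as `"730d"`. Go's `time.ParseDuration` only accepts units up to `h`, so the job would presumably need its own parser. The following is a minimal sketch in Python, not the proposed implementation. It assumes a single number followed by one of `d`, `h`, `m`, or `s`; the function name `parse_retention` is hypothetical.

```python
import re
from datetime import timedelta

# Assumed grammar: digits followed by exactly one unit letter.
_UNITS = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds"}
_PATTERN = re.compile(r"(\d+)([dhms])")


def parse_retention(value: str) -> timedelta:
    """Convert a retention string such as '730d' or '24h' to a timedelta."""
    match = _PATTERN.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unsupported retention value: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})
```

Rejecting anything outside the assumed grammar, rather than guessing, keeps a typo in the ConfigMap from silently shortening a retention period.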
101+
102+
#### Example Configuration:
103+
104+
```yaml
# config/base/config-results-retention-policy.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-results-retention-policy
data:
  # runAt determines when to run the pruning job.
  runAt: "5 5 * * 0"
  # defaultRetention is the fallback retention period.
  # This is used if no specific policy matches.
  defaultRetention: "30d"
  # policies is an optional list of retention policies, evaluated in order.
  policies: |
    - name: "prod-namespace-deployments"
      selector:
        matchNamespaces: ["prod", "staging"]
        matchStatuses: ["Succeeded"]
      retention: "365d"
    - name: "signed-prod-deployments"
      selector:
        matchNamespaces: ["prod"]
        matchLabels:
          'tekton.dev/pipeline': ['deploy-to-prod']
        matchAnnotations:
          'chains.tekton.dev/signed': ['true']
        matchStatuses: ["Succeeded"]
      retention: "730d" # 2 years
    - name: "all-terminated-runs"
      selector:
        matchStatuses: ["Failed", "Cancelled", "PipelineRunTimeout"]
      retention: "90d" # 90 days
    - name: "git-event-builds"
      selector:
        matchLabels:
          'tekton.dev/event-type': ["pull_request", "push"]
      retention: "14d" # 2 weeks
```
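To make the matching rules concrete, here is a rough Python sketch of how the example configuration above could be evaluated. It implements the semantics described in this section: AND across selector criteria, OR within each list, and first match wins. The flattened `run` dictionary and the function names `selector_matches` and `resolve_retention` are assumptions made for illustration, not part of the proposal.

```python
from typing import Optional, Tuple

# Policies transcribed from the example ConfigMap, in the same order.
POLICIES = [
    {"name": "prod-namespace-deployments",
     "selector": {"matchNamespaces": ["prod", "staging"],
                  "matchStatuses": ["Succeeded"]},
     "retention": "365d"},
    {"name": "signed-prod-deployments",
     "selector": {"matchNamespaces": ["prod"],
                  "matchLabels": {"tekton.dev/pipeline": ["deploy-to-prod"]},
                  "matchAnnotations": {"chains.tekton.dev/signed": ["true"]},
                  "matchStatuses": ["Succeeded"]},
     "retention": "730d"},
    {"name": "all-terminated-runs",
     "selector": {"matchStatuses": ["Failed", "Cancelled", "PipelineRunTimeout"]},
     "retention": "90d"},
    {"name": "git-event-builds",
     "selector": {"matchLabels": {"tekton.dev/event-type": ["pull_request", "push"]}},
     "retention": "14d"},
]
DEFAULT_RETENTION = "30d"


def selector_matches(selector: dict, run: dict) -> bool:
    """AND across criteria, OR within each list; omitted criteria match all."""
    namespaces = selector.get("matchNamespaces")
    if namespaces is not None and run["namespace"] not in namespaces:
        return False
    statuses = selector.get("matchStatuses")
    if statuses is not None and run["status"] not in statuses:
        return False
    for field, attr in (("matchLabels", "labels"),
                        ("matchAnnotations", "annotations")):
        for key, allowed in selector.get(field, {}).items():
            # A missing key yields None, which is never in the allowed list.
            if run.get(attr, {}).get(key) not in allowed:
                return False
    return True


def resolve_retention(policies: list, default: str,
                      run: dict) -> Tuple[Optional[str], str]:
    """Return (policy name, retention) for the first matching policy."""
    for policy in policies:
        if selector_matches(policy.get("selector", {}), run):
            return policy["name"], policy["retention"]
    return None, default


signed_prod_run = {
    "namespace": "prod",
    "status": "Succeeded",
    "labels": {"tekton.dev/pipeline": "deploy-to-prod"},
    "annotations": {"chains.tekton.dev/signed": "true"},
}
pr_build = {
    "namespace": "dev",
    "status": "Succeeded",
    "labels": {"tekton.dev/event-type": "pull_request"},
}
```

With the policies in the order shown, `resolve_retention(POLICIES, DEFAULT_RETENTION, signed_prod_run)` resolves to `prod-namespace-deployments` (365d) because that policy is listed before `signed-prod-deployments`. Under first-match semantics, the more specific policy would have to be listed first for its 730d period to apply.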
142+
143+
### Database Interaction
144+
145+
No database schema changes are required. The retention job will leverage the existing `records` table, which stores `PipelineRun` data in a `jsonb` column and the namespace in the `parent` column.
146+
147+
The job will dynamically construct a single SQL `DELETE` query with a `CASE` statement. This `CASE` statement will iterate through the configured policies and apply the appropriate retention period based on the first matching selector. The `jsonb` querying capabilities of PostgreSQL will be used to match the selectors against the `PipelineRun` metadata stored in the `data` column.
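As an illustration of how such a `CASE` expression might be assembled, the sketch below builds one from a list of policies. It is written in Python for brevity and is not the job's actual code. The JSON paths assume the standard `PipelineRun` layout (`metadata.labels`, `metadata.annotations`, and the first entry of `status.conditions`). The `%s` placeholders follow the psycopg convention, and the helper names are hypothetical. The surrounding `DELETE` statement is omitted because the timestamp column it would compare against is not specified here.

```python
_UNIT_NAMES = {"d": "days", "h": "hours"}

# Assumed JSON paths inside the `data` column.
_REASON = "data->'status'->'conditions'->0->>'reason'"
_MAP_PATHS = {
    "matchLabels": "data->'metadata'->'labels'",
    "matchAnnotations": "data->'metadata'->'annotations'",
}


def to_interval(retention: str) -> str:
    """Turn '365d' into '365 days', a form PostgreSQL accepts as an interval."""
    amount, unit = retention[:-1], retention[-1:]
    if not amount.isdigit() or unit not in _UNIT_NAMES:
        raise ValueError(f"unsupported retention value: {retention!r}")
    return f"{int(amount)} {_UNIT_NAMES[unit]}"


def build_retention_case(policies, default_retention):
    """Return (sql, params): a CASE expression yielding the retention interval.

    Branches keep the policy order, so the first matching WHEN wins,
    and the ELSE branch carries the default retention.
    """
    branches, params = [], []
    for policy in policies:
        selector = policy.get("selector", {})
        conditions = []
        if "matchNamespaces" in selector:
            conditions.append("parent = ANY(%s)")
            params.append(list(selector["matchNamespaces"]))
        if "matchStatuses" in selector:
            conditions.append(f"{_REASON} = ANY(%s)")
            params.append(list(selector["matchStatuses"]))
        for field, path in _MAP_PATHS.items():
            for key, values in selector.get(field, {}).items():
                conditions.append(f"{path}->>%s = ANY(%s)")
                params.extend([key, list(values)])
        # An empty selector matches every record.
        predicate = " AND ".join(conditions) or "TRUE"
        branches.append(f"WHEN {predicate} THEN %s::interval")
        params.append(to_interval(policy["retention"]))
    params.append(to_interval(default_retention))
    if not branches:
        return "%s::interval", params
    return "CASE " + " ".join(branches) + " ELSE %s::interval END", params
```

Passing values as parameters instead of interpolating them into the SQL string keeps label keys and values from the ConfigMap out of the query text.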
148+
149+
The `ON DELETE CASCADE` foreign key constraint between the `results` and `records` tables ensures that deleting a `Result` will automatically delete all associated `Records`, including `PipelineRun` and `TaskRun` data.
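The cascade behaviour can be demonstrated with a small, self-contained sketch. It uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL and a deliberately simplified two-table schema; the column names `id` and `result_id` are assumptions and do not reflect the real Tekton Results schema.

```python
import sqlite3
from typing import List


def remaining_records_after_delete() -> List[str]:
    """Delete one result and return the ids of the records that survive."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE results (id TEXT PRIMARY KEY);
        CREATE TABLE records (
            id TEXT PRIMARY KEY,
            result_id TEXT NOT NULL
                REFERENCES results(id) ON DELETE CASCADE
        );
        INSERT INTO results VALUES ('r1'), ('r2');
        INSERT INTO records VALUES
            ('pr-1', 'r1'), ('tr-1', 'r1'), ('pr-2', 'r2');
    """)
    # Deleting the parent row removes its child records automatically.
    conn.execute("DELETE FROM results WHERE id = 'r1'")
    rows = conn.execute("SELECT id FROM records ORDER BY id").fetchall()
    conn.close()
    return [row[0] for row in rows]
```

Here only the record attached to `r2` survives, which mirrors how a single delete on `results` is expected to clean up the `PipelineRun` and `TaskRun` records.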
150+
151+
152+
## Design Evaluation
153+
<!--
154+
How does this proposal affect the api conventions, reusability, simplicity, flexibility
155+
and conformance of Tekton, as described in [design principles](https://github.com/tektoncd/community/blob/master/design-principles.md)
156+
-->
157+
158+
### Reusability
159+
160+
<!--
161+
https://github.com/tektoncd/community/blob/main/design-principles.md#reusability
162+
163+
- Are there existing features related to the proposed features? Were the existing features reused?
164+
- Is the problem being solved an authoring-time or runtime-concern? Is the proposed feature at the appropriate level
165+
authoring or runtime?
166+
-->
167+
168+
### Simplicity
169+
170+
<!--
171+
https://github.com/tektoncd/community/blob/main/design-principles.md#simplicity
172+
173+
- How does this proposal affect the user experience?
174+
- What’s the current user experience without the feature and how challenging is it?
175+
- What will be the user experience with the feature? How would it have changed?
176+
- Does this proposal contain the bare minimum change needed to solve for the use cases?
177+
- Are there any implicit behaviors in the proposal? Would users expect these implicit behaviors or would they be
178+
surprising? Are there security implications for these implicit behaviors?
179+
-->
180+
181+
### Flexibility
182+
183+
<!--
184+
https://github.com/tektoncd/community/blob/main/design-principles.md#flexibility
185+
186+
- Are there dependencies that need to be pulled in for this proposal to work? What support or maintenance would be
187+
required for these dependencies?
188+
- Are we coupling two or more Tekton projects in this proposal (e.g. coupling Pipelines to Chains)?
189+
- Are we coupling Tekton and other projects (e.g. Knative, Sigstore) in this proposal?
190+
- What is the impact of the coupling to operators e.g. maintenance & end-to-end testing?
191+
- Are there opinionated choices being made in this proposal? If so, are they necessary and can users extend it with
192+
their own choices?
193+
-->
194+
195+
### Conformance
196+
197+
<!--
198+
https://github.com/tektoncd/community/blob/main/design-principles.md#conformance
199+
200+
- Does this proposal require the user to understand how the Tekton API is implemented?
201+
- Does this proposal introduce additional Kubernetes concepts into the API? If so, is this necessary?
202+
- If the API is changing as a result of this proposal, what updates are needed to the
203+
[API spec](https://github.com/tektoncd/pipeline/blob/main/docs/api-spec.md)?
204+
-->
205+
206+
### User Experience
207+
208+
<!--
209+
(optional)
210+
211+
Consideration about the user experience. Depending on the area of change,
212+
users may be Task and Pipeline editors, they may trigger TaskRuns and
213+
PipelineRuns or they may be responsible for monitoring the execution of runs,
214+
via CLI, dashboard or a monitoring system.
215+
216+
Consider including folks that also work on CLI and dashboard.
217+
-->
218+
219+
### Performance
220+
This improves database performance by deleting superfluous Results and their associated data.
221+
222+
### Risks and Mitigations
223+
224+
<!--
225+
What are the risks of this proposal and how do we mitigate? Think broadly.
226+
For example, consider both security and how this will impact the larger
227+
Tekton ecosystem. Consider including folks that also work outside the WGs
228+
or subproject.
229+
- How will security be reviewed and by whom?
230+
- How will UX be reviewed and by whom?
231+
-->
232+
233+
### Drawbacks
234+
235+
<!--
236+
Why should this TEP _not_ be implemented?
237+
-->
238+
239+
## Alternatives
240+
241+
242+
## Implementation Plan
243+
244+
<!--
245+
What are the implementation phases or milestones? Taking an incremental approach
246+
makes it easier to review and merge the implementation pull request.
247+
-->
248+
249+
250+
### Test Plan
251+
252+
- We will add integration tests, like the ones we have for logging to GCS storage and other scenarios.
253+
254+
### Infrastructure Needed
255+
256+
<!--
257+
(optional)
258+
259+
Use this section if you need things from the project or working group.
260+
Examples include a new subproject, repos requested, GitHub details.
261+
Listing these here allows a working group to get the process for these
262+
resources started right away.
263+
-->
264+
265+
### Upgrade and Migration Strategy
266+
267+
<!--
268+
(optional)
269+
270+
Use this section to detail whether this feature needs an upgrade or
271+
migration strategy. This is especially useful when we modify a
272+
behavior or add a feature that may replace and deprecate a current one.
273+
-->
274+
275+
### Implementation Pull Requests
276+
277+
278+
## References
279+
280+
<!--
281+
(optional)
282+
283+
Use this section to add links to GitHub issues, other TEPs, design docs in Tekton
284+
shared drive, examples, etc. This is useful to refer back to any other related links
285+
to get more details.
286+
-->

teps/README.md

Lines changed: 1 addition & 0 deletions
@@ -146,6 +146,7 @@ This is the complete list of Tekton TEPs:
 |[TEP-0154](0154-concise-remote-resolver-syntax.md) | Concise Remote Resolver Syntax | implementable | 2024-03-21 |
 |[TEP-0155](0155-store-pipeline-events-in-db.md) | Store Pipeline Events in Tekton Results | proposed | 2024-04-19 |
 |[TEP-0156](0156-whenexpressions-in-step.md) | WhenExpressions in Steps | implemented | 2024-07-22 |
+|[TEP-0157](0157-retention-policy-results.md) | Retention Policy for Tekton Results | proposed | 2024-07-17 |
 |[TEP-0160](0160-enhance-results-cli.md) | Enhance Tekton Results CLI | proposed | 2025-03-13 |
 |[TEP-0161](0161-resolver-caching.md) | Resolver Caching for Task and Pipeline Resolution | proposed | 2024-06-15 |
 |[TEP-0162](0162-event-based-pruning-of-tekton-resources.md) | event based pruning of tekton resources | proposed | 2025-06-18 |
