Merge pull request #15 from deepgram/brent-george/autoscaling
Autoscaling
Showing 33 changed files with 830 additions and 228 deletions.
@@ -0,0 +1,42 @@
# Changelog

All notable changes to this Helm chart will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

*Nothing at this time*

## [0.2.0-beta] - 2024-06-20

### Added
- Support for managing node autoscaling with [cluster-autoscaler](https://github.com/kubernetes/autoscaler).
- Support for pod autoscaling of Deepgram components.
- Support for keeping the upstream Deepgram License server as a backup even when the License Proxy is deployed. See `licenseProxy.keepUpstreamServerAsBackup` for details.

### Changed

- Initial installation replica count values moved from `scaling.static.{api,engine}.replicas` to `scaling.replicas.{api,engine}`.
- License Proxy is no longer manually scaled. Instead, scaling can be indirectly controlled via `licenseProxy.{enabled,deploySecondReplica}`.
- Labels for Deepgram dedicated nodes in the sample `cluster-config.yaml` for AWS, and the `nodeAffinity` sections of the sample `values.yaml` files. The key has been renamed from `deepgram/nodeType` to `k8s.deepgram.com/node-type`, and the values are no longer prepended with `deepgram`.
- AWS EFS model download job hook delete policy changed to `before-hook-creation`.
- Concurrency limit moved from API (`api.concurrencyLimit.activeRequests`) to Engine level (`engine.concurrencyLimit.activeRequests`).

## [0.1.1-alpha] - 2024-06-03

### Added

- Various documentation improvements

## [0.1.0-alpha] - 2024-05-31

### Added

- Initial implementation of the Helm chart.


[unreleased]: https://github.com/deepgram/self-hosted-resources/compare/deepgram-self-hosted-0.1.1-alpha...HEAD
[0.2.0-beta]: https://github.com/deepgram/self-hosted-resources/compare/deepgram-self-hosted-0.1.1-alpha...deepgram-self-hosted-0.2.0-beta
[0.1.1-alpha]: https://github.com/deepgram/self-hosted-resources/compare/deepgram-self-hosted-0.1.0-alpha...deepgram-self-hosted-0.1.1-alpha
[0.1.0-alpha]: https://github.com/deepgram/self-hosted-resources/releases/tag/deepgram-self-hosted-0.1.0-alpha
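
For anyone upgrading from 0.1.1-alpha, the keys named in the 0.2.0-beta changelog entries might be laid out roughly as follows in a values file. This is only a sketch: the key paths are taken from the changelog text, but every value shown (replica counts, the concurrency limit, the booleans) is a placeholder assumption and not a chart default.

```yaml
# Hypothetical values.yaml fragment for chart 0.2.0-beta.
# Key paths come from the changelog; all values are placeholders.
scaling:
  replicas:            # previously scaling.static.{api,engine}.replicas
    api: 1
    engine: 1

engine:
  concurrencyLimit:
    activeRequests: 10   # previously api.concurrencyLimit.activeRequests

licenseProxy:
  enabled: true                     # License Proxy is no longer given an explicit replica count
  deploySecondReplica: false        # indirect control over License Proxy scaling
  keepUpstreamServerAsBackup: true  # keep the upstream License server as a fallback
```

The chart's own `values.yaml` remains the authority on defaults and accepted types for each of these keys.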
@@ -1,12 +1,12 @@
 apiVersion: v2
 name: deepgram-self-hosted
 type: application
-version: 0.1.1-alpha
+version: 0.2.0-beta
 appVersion: "release-240528"
 description: A Helm chart for running Deepgram services in a self-hosted environment
 home: "https://developers.deepgram.com/docs/self-hosted-introduction"
 sources: ["https://github.com/deepgram/self-hosted-resources"]
-kubeVersion: ">=1.27.0-0"
+kubeVersion: ">=1.28.0-0"
 maintainers:
   - name: Deepgram Self-Hosted
     email: [email protected]
@@ -18,13 +18,25 @@ keywords:
   - aura
   - speech-to-text
   - stt
   - asr
   - nova
   - speech-to-speech
   - sts
   - voice agent
   - self-hosted

 dependencies:
   - name: gpu-operator
     version: "^24.3.0"
     repository: "https://helm.ngc.nvidia.com/nvidia"
     condition: gpu-operator.enabled
+  - name: cluster-autoscaler
+    version: "^9.37.0"
+    repository: "https://kubernetes.github.io/autoscaler"
+    condition: cluster-autoscaler.enabled
+  - name: kube-prometheus-stack
+    version: "^60.2.0"
+    repository: "https://prometheus-community.github.io/helm-charts"
+    condition: kube-prometheus-stack.includeDependency,scaling.auto.enabled
+  - name: prometheus-adapter
+    version: "^4.10.0"
+    repository: "https://prometheus-community.github.io/helm-charts"
+    condition: prometheus-adapter.includeDependency,scaling.auto.enabled
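
The comma-separated `condition` entries on the two Prometheus dependencies rely on Helm's rule that only the first path in the list that exists in the parent values and resolves to a boolean is evaluated. The fragment below is a sketch of how those toggles could interact. The key paths are the ones named in the `condition` fields above, while the values are illustrative assumptions.

```yaml
# Hypothetical values.yaml fragment; key paths come from the `condition`
# fields in Chart.yaml, values are illustrative only.
gpu-operator:
  enabled: true

cluster-autoscaler:
  enabled: true

scaling:
  auto:
    enabled: true   # fallback toggle for both Prometheus subcharts

kube-prometheus-stack:
  # Listed first in the condition, so when present it decides on its own.
  # Here the subchart is skipped even though scaling.auto.enabled is true,
  # e.g. because a Prometheus stack already runs in the cluster.
  includeDependency: false

# prometheus-adapter.includeDependency is left unset in this sketch, so
# Helm falls through to scaling.auto.enabled and installs the adapter.
```

Whether the chart ships defaults for the `includeDependency` keys is not visible in this diff, so the fall-through behavior described in the last comment holds only when that key is truly absent from the merged values.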
charts/deepgram-self-hosted/samples/01-basic-setup-aws.cluster-config.yaml
79 changes: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: deepgram-self-hosted-cluster
  region: us-west-2
  version: "1.30"

iam:
  withOIDC: true
  serviceAccounts:
    - metadata:
        name: cluster-autoscaler-sa
        namespace: dg-self-hosted
      wellKnownPolicies:
        autoScaler: true
      roleName: cluster-autoscaler-role
      roleOnly: true
    - metadata:
        name: efs-csi-controller-sa
        namespace: kube-system
      wellKnownPolicies:
        efsCSIController: true
      roleName: efs-csi-driver-role
      roleOnly: true

managedNodeGroups:
  - name: control-plane-node-group
    minSize: 1
    desiredCapacity: 1
    maxSize: 3
    instanceType: t3.large
    amiFamily: Ubuntu2204
    iam:
      withAddonPolicies:
        autoScaler: true
    propagateASGTags: true
  - name: engine-node-group
    minSize: 0
    desiredCapacity: 0
    maxSize: 8
    instanceType: g6.2xlarge
    amiFamily: Ubuntu2204
    labels:
      k8s.deepgram.com/node-type: engine
      k8s.amazonaws.com/accelerator: nvidia-l4
    iam:
      withAddonPolicies:
        efs: true
        autoScaler: true
    taints:
      - key: efs.csi.aws.com/agent-not-ready
        value: "true"
        effect: NoExecute
    propagateASGTags: true
  - name: api-node-group
    minSize: 0
    desiredCapacity: 0
    maxSize: 2
    instanceType: c5n.xlarge
    amiFamily: Ubuntu2204
    labels:
      k8s.deepgram.com/node-type: api
    iam:
      withAddonPolicies:
        autoScaler: true
    propagateASGTags: true
  - name: license-proxy-node-group
    minSize: 0
    desiredCapacity: 0
    maxSize: 2
    instanceType: t3.large
    amiFamily: Ubuntu2204
    labels:
      k8s.deepgram.com/node-type: license-proxy
    iam:
      withAddonPolicies:
        autoScaler: true
    propagateASGTags: true
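
The node group labels above are what the renamed `k8s.deepgram.com/node-type` key is meant to be matched against. As a rough illustration, a standard Kubernetes node affinity rule pinning pods to the engine node group could look like the fragment below. Where this block sits inside the chart's sample `values.yaml` files is not shown in this diff, so treat its placement as an assumption; the `affinity` structure itself is the stock Kubernetes pod-spec form.

```yaml
# Generic Kubernetes pod-spec affinity fragment (placement within the
# chart's values is an assumption; not copied from the sample files).
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: k8s.deepgram.com/node-type   # label key set on the node groups
              operator: In
              values:
                - engine                        # use "api" or "license-proxy" for the other groups
```

Because the engine, API, and License Proxy node groups start at `desiredCapacity: 0`, a required affinity like this leaves pods pending until cluster-autoscaler brings up a matching node.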