
[design] Cluster UX long term vision #22123

Open
chaas wants to merge 5 commits into main

Conversation

@chaas (Contributor) commented Oct 2, 2023

Design doc for cluster UX long term vision.
This design spells out the vision for the user experience of managing clusters in the long term, covering both the "end state" goal and the short- and medium-term states.

Co-authored with @antiguru and @benesch.

@chaas changed the title from "[design] Cluster ux long term vision document" to "[design] Cluster UX long term vision" on Oct 2, 2023
@benesch (Member) commented Oct 3, 2023

Cross-referencing the epic: MaterializeInc/database-issues#6666.

@antiguru (Member) left a comment

I like where this is going!

doc/developer/design/20231002_cluster_vision.md (outdated; resolved)
### Support & testing
Support is able to create unbilled or partially billed cluster resources for resolving customer issues. This is soon to be possible via unbilled replicas [#20317](https://github.com/MaterializeInc/materialize/issues/20317).

Engineering is also able to create additional unbilled shadow replicas for testing new features and query plan changes, which do not serve customers' production workflows.
Member:

I'm not yet fully convinced that shadow replicas are the mechanism we'd like to have for A/B testing. An alternative is shadow environments, where all parts are cloned and we don't risk taking down the environment through a misbehaving shadow replica. All I mean to say is that we might want to leave it outside of this design!

Contributor (author):

Sounds good, I can leave this out then if we're not sure yet

Contributor (author):

I can leave out the entire "Support & testing" section if the content there is too narrow of a view of how we can support customers in the long term

Member:

No, I think it's good! TBH, I'd bring back the bit about shadow replicas and just add a caveat like "if they can be made safe." But I think it's absolutely right that we want some way to test new releases/candidate changes on real production workloads, if we can find a way to do so without putting those environments at risk.

@chaas requested a review from @benesch on October 4, 2023 at 15:49
@benesch (Member) commented Oct 5, 2023

In my queue to review! Likely not until Friday or the weekend though.

@sthm (Contributor) commented Oct 6, 2023

Moving towards the declarative API for managing clusters seems like the right thing to do. It's much easier for customers to understand and simpler to handle. But I would be a bit careful about deprecating the imperative API for customers. It seems reasonable to (almost completely) hide it in the documentation, but I think there are legitimate use cases where customers can still benefit from it.

Here are two examples that require replicas with different sizes. First, it's an easy way to verify the effect of scaling a cluster up or down. Right now, a customer I'm working with is seeing hydration times around 30 min, but the replica can potentially be scaled down. Adding an additional smaller replica is a very seamless way to test what happens without actually scaling the cluster. Second, cost-conscious customers may be willing to run replicas pretty busy; if they want to verify upfront that a smaller replica won't fall over, they can provision the smaller replica in addition to the larger one and monitor what happens for some time (hours or a few days).

People may be able to do similar things with automatic, zero-downtime scaling. I'm not saying that we should not focus on the declarative approach. But keeping some basic support for the more manual imperative approach could be useful for some specific and less common use cases.
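
For concreteness, a rough sketch of the imperative workflow described above, using approximately the existing replica DDL (the cluster and replica names are made up):

```sql
-- Assume an unmanaged cluster `prod` whose replica `r1` is sized 'large'.
-- Temporarily attach a smaller replica to see whether it keeps up:
CREATE CLUSTER REPLICA prod.size_test SIZE = 'medium';

-- ...watch hydration time, memory, and lag on `size_test` for a few hours or days...

-- Remove the experiment without ever touching the production replica:
DROP CLUSTER REPLICA prod.size_test;
```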

@benesch (Member) left a comment

This is really great, @chaas. Left a few comments within, but so excited that you've managed to capture all of this in writing.

the "end state" goal as well as the short and medium states in order to:
* Ensure alignment in the future that we are working toward
* Make product prioritization decisions around cluster work
* Make folks more comfortable accepting intermediate states that aren't ideal in service of a greater goal
Member:

👍🏽 on this in particular

To be determined: whether replica sets fit into this model, either externally exposed or internal-only. Perhaps they are a way we could recover clusters with heterogeneous replicas while retaining a declarative API.

### Resource usage
The very long-term goal is clusterless Materialize, where Materialize does automatic workload scheduling for the customer.
@benesch (Member), Oct 10, 2023:

cc @frankmcsherry on this point in particular. We may want to try to clarify "long-term" (i.e., at least how many years away it is).

Contributor:

Eh, I'm ok with it being infinity years away. :D At least, CREATE CLUSTER does a very valuable thing at the moment, which is to allow users to express isolation. That has lots of value, and removing it seems like it removes value, rather than adds value. If "clusterless" just means "idk folks use default for work they don't care about and it autoscales" great, but I expect real production users to want to use clusters for the same reason that they continue to use VMs, containers, processes, threads, even though we could just delete all those concepts.

Member:

> Eh, I'm ok with it being infinity years away. :D

I'm personally fine with "infinity"! But @antiguru was excited about the prospect.

I think we should align across Materialize on whether clusterless Materialize is something we want to pursue soon-ish, eventually, or never. That will inform how seriously we need to consider the possibility of its existence in today's designs.

"clusterless" just means "idk folks use default for work they don't care about and it autoscales" great

I think @antiguru had something more elaborate in mind, where dataflows would move between clusters as necessary.

Member:

I'm fine with it staying as it currently is, i.e., we use clusters as a user-indicated boundary between resources. One problem that I'd eventually like to see vanish is how users determine the right cut in their dependency graph such that they use the least amount of resources while achieving their availability goals. From what I've observed, this is a recurring problem that takes some explaining for users to get right.

How we get there is a different question. One take could be that there's something that indicates a resource assignment, but I have no strong preference on whether this would be part of a component within Materialize or something on top that only gives recommendations. The latter seems more practical and potentially less dangerous, at least until we figure out how to write a controller for Materialize (which we currently don't know how to do).

TL;DR, happy to delay this infinitely, but we should be aware of the challenge users face.

### Resource usage
The very long-term goal is clusterless Materialize, where Materialize does automatic workload scheduling for the customer.

An intermediary solution, which is also far off is autoscaling of clusters, where Materialize automatically resizes clusters based on the observed workload.
Member:

I don't think this needs to be that far off! We could plausibly do this next year. Whereas I don't think clusterless Materialize is something we do in the next two years.

Contributor:

The "auto" part here is the scary part. Just about everyone gets it wrong, and the whole control theory part of whether you should/shouldn't scale is something MZ humans need to understand first, and I think that's still a ways off.

An intermediary solution, which is also far off is autoscaling of clusters, where Materialize automatically resizes clusters based on the observed workload.

A more achievable offering in the short-term is automatic shutdown of clusters, where Materialize can spin down a cluster to 0 replicas based on certain criteria, such as a scheduled time or amount of idle time. \
This would reduce resource waste for development clusters. The triggering mechanism from graceful rehydration is also a requirement here.
Member:

👍🏽

Contributor:

This relates to CREATE MATERIALIZED VIEW .. REFRESH <NEVER | PERIOD> that @ggevay is keen on. There's probably an issue to link, but tl;dr folks are interested in the economics of less frequent refreshes that we still manage for them.
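
For illustration, one possible shape of that syntax, purely hypothetical since it was still being designed at the time (`orders` is a placeholder table):

```sql
-- A view whose results are only recomputed once a day; between refreshes the
-- cluster maintaining it could, in principle, be spun down.
CREATE MATERIALIZED VIEW daily_order_counts
  WITH (REFRESH EVERY '1 day')
  AS SELECT order_date, count(*) AS orders FROM orders GROUP BY order_date;
```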

* Shadow replicas.

**Much Later**
* Autoscaling clusters / clusterless.
Member:

I think we can promote autoscaling clusters to "later"!

doc/developer/design/20231002_cluster_vision.md (outdated; resolved)
### Support & testing
Support is able to create create unbilled or partially billed cluster resources for resolving customer issues. This is soon to be possible via unbilled replicas [#20317](https://github.com/MaterializeInc/materialize/issues/20317).

Engineering is also able to create additional unbilled shadow replicas for testing new features and query plan changes, which do not serve customers' production workflows.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think it's good! TBH, I'd bring back the bit about shadow replicas and just add a caveat like "if they can be made safe." But I think it's absolutely right that we want some way to test new releases/candidate changes on real production workloads, if we can find a way to do so without putting those environments at risk.

This means deprecating manual cluster replica management. \
We believe this is easier to use and manage.

The primary work item for this is **graceful reconfiguration**. At the moment, a change in size causes downtime until the new replicas are hydrated. As such, customers still want the flexibility to create their own replicas for graceful resizing. We can avoid this by leaving a subset of the original replicas around until the new replicas are hydrated. \
Member:

I'm increasingly wondering if we don't need graceful reconfiguration. At least, if you buy into the theory that blue/green is how you do things gracefully ... then perhaps ALTER ... SET CLUSTER should be a dangerous forcing operation that puts things into action immediately, even if doing so causes downtime.

@chaas (author), Oct 10, 2023:

Ah, kind of like what we're thinking for indexes, where dropping an index in place incurs downtime, and if you want to drop one gracefully we suggest blue/green?

My only hesitation there is that forcing users to do blue/green is more work and conceptual overhead for the user than graceful reconfiguration. From a UX standpoint, graceful reconfiguration is effectively an abstraction on top of blue/green that manages spinning up the new resource and cutting over for them.
Even with blue/green, we'd need (1) to give users an easy way to detect that rehydration is complete so that they can execute the cutover themselves. We just wouldn't need (2), the triggering mechanism.

Also I was thinking the syntax here would be ALTER CLUSTER ... SET (SIZE = <>), which spins up different sized replicas and cuts over to those replicas. As it stands now, adding another replica to an existing cluster doesn't require users to drop and recreate all downstream dependencies. My understanding is that blue/green requires recreating a whole second version of the stack, including all objects and clusters, which is a much heavier lift.

I think we should strive for graceful reconfiguration as an end state for resizing, with blue/green as an intermediary state.
For moving objects between clusters (ALTER ... SET CLUSTER), I'm fine with that continuing to require blue/green.
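
For illustration, a sketch of the graceful-reconfiguration flow described above; the exact syntax and behavior are not settled:

```sql
-- Resize in place: Materialize would keep the existing replicas serving while
-- the new, larger replicas hydrate, then cut over and retire the old ones.
ALTER CLUSTER prod SET (SIZE = 'xlarge');

-- Contrast with blue/green, where the user stands up a parallel cluster,
-- watches for hydration to complete, and performs the cutover themselves.
```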

Member:

> Even with blue/green, we'd need (1) to give users an easy way to detect that rehydration is complete so that they can execute the cutover themselves.

Yep, definitely! But I think we're much closer to being able to surface that rehydration complete signal than we are to being able to take action off of it directly inside of Materialize.

It's a good point that moving objects between clusters and scaling up a cluster are meaningfully different operations. Moving an object between clusters is likely part of your development workflow—e.g., maybe a big refactoring of how you map objects to clusters that you'll test extensively with a blue/green setup. But scaling up/down a cluster is likely something that you do live on the system (e.g., in response to load); you don't want that to cause downtime, and it's overkill to reach for a dbt-managed blue/green deployment just to scale up/down.

Contributor (author):

I just updated the design to represent this distinction between development and production workflows, and call out blue/green as an intermediary but suboptimal state for resizing, with graceful reconfiguration as the ideal end state. LMK what you think

@benesch (Member) left a comment

I think this looks great, @chaas! Anything on your end that you're keeping it in draft for? When you're back, I think this is ready for wider circulation!

Comment on lines +66 to +70
If a user wants to do a development workflow on a production system, they must use **blue/green
deployments**. For example, if the user wants to move an object between clusters, they must use
blue/green to set up another version of the object/cluster and cutover the production system
to it once the object is rehydrated and ready.\
Again, for this workflow, exposing hydration status is the primary work item.
Member:

This is a great framing. Thank you!

doc/developer/design/20231002_cluster_vision.md (outdated; resolved)
@chaas (author) commented Oct 23, 2023

> Here are two examples that require replicas with different sizes. First, it's an easy way to verify the effect of scaling a cluster up or down. Right now, a customer I'm working with is seeing hydration times around 30 min, but the replica can potentially be scaled down. Adding an additional smaller replica is a very seamless way to test what happens without actually scaling the cluster. Second, cost-conscious customers may be willing to run replicas pretty busy; if they want to verify upfront that a smaller replica won't fall over, they can provision the smaller replica in addition to the larger one and monitor what happens for some time (hours or a few days).

> People may be able to do similar things with automatic, zero-downtime scaling. I'm not saying that we should not focus on the declarative approach. But keeping some basic support for the more manual imperative approach could be useful for some specific and less common use cases.

@sthm Thanks for sharing!
The way to do this with the declarative API would be blue/green deployments, where they set up a parallel version of the objects & cluster in whatever size they want to experiment with as much as they want, without risking impacting their production workflow. However, I see how that's gonna be heavyweight and complicated for users trying to do simple things.
I will brainstorm some alternative ideas here...if it's a common enough use case, perhaps we can add syntactic sugar that allows for very specific things like "test with another size", that abstracts away the underlying replica architecture.
Allowing users to directly manipulate the replicas to run that kind of experimentation on their production environment does feel a little "scary", since it e.g. increases the risk of user error like deleting the wrong replica.

@chaas marked this pull request as ready for review on October 23, 2023 at 16:11
@benesch (Member) commented Oct 23, 2023

> perhaps we can add syntactic sugar that allows for very specific things like "test with another size", that abstracts away the underlying replica architecture.

Just for posterity: the user-facing "replica sets" that Moritz proposed is another way out, in that we could tell users who want to test with multiple replica sizes on the same cluster to create multiple replica sets.

@frankmcsherry (Contributor) left a comment

I left a bunch of comments, but I didn't get a lot of clarity out of the document about the long term vision for clusters. I think part of this is that there is some background context you have that isn't written down here (what specifically you mean by blue/green, for example, which is meant to solve several problems but either does/doesn't depending on exactly what is meant by it), as well as some unexplored content (to what extent does affecting cluster definitions have deleterious consequences for downstream users).

I wrote some text that I thought was clarifying for me, and I'm curious if it is also clarifying for other readers of the doc. It doesn't have a prescriptive take, but I think it lays out the options in a way that is clearer for me. Specifically, it distinguishes between cluster responsibilities and cluster provisioning, and how each of them might evolve (potentially independently).


Clusters are definitions of bundles of related but otherwise isolated work a user has framed.
Clusters are initially empty, but evolve both in their responsibilities (work they must perform) and their provisioning (which resources have been allocated to the work).
They also evolve with time, in that cluster changes may put them in transient states where their provisioned resources are not all up to date with the work they must perform, and with time (ideally) they manage to catch up.

The commands on clusters are split between modifying cluster responsibilities, and modifying cluster provisioning.

There are two ways that we will recommend folks modify cluster responsibilities: in situ or blue/green.

  1. In situ modifications change an existing cluster definition, which can be prompt but is also very risky.
    If adding a resource brings the cluster over some limit, the cluster may collapse.
    Without bringing the cluster up from zero, we are less certain that the modified cluster responsibilities can be brought up at all.
    We believe this is a fine mode for development operation, where fast iteration is important and the user can tolerate downtime (e.g. hydration, failure).
  2. Blue/green modifications clone a cluster definition and modify the clone, which takes time to have effect but can expose problems before deploying them.
    Blue/green could be enforced by putting it in a state that disallows cluster modification.
    Ergonomically, it feels like we would want a COPY CLUSTER command, or something analogous that prepares a mutable cluster definition without forcing an undeploy of the existing cluster.
    We would also potentially want a SWAP CLUSTER command, which would have the effect of installing the new definition in place of the old definition.
    Such a SWAP command could incur downtime if the new cluster is not yet in a comparably healthy state.

There are multiple ways that we will recommend folks modify cluster provisioning:

  1. Imperative (manual) modification of replicas backing a cluster.
    This is a robust fall-back in cases where we have yet to invent a policy that implements user goals.
  2. Declarative definitions of replicas backing a cluster.
    Examples here include
    a. Automated shut-down during inactivity, or outside of business hours.
    b. Automated scaling if resource use is near maximum or minimum thresholds.
    c. Automated replication to multiple availability zones.
    The list seems plausibly long and nuanced, and probably indefinitely incomplete.
  3. Immutable cluster provisioning.
    Cluster provisioning is locked at cluster creation, and to change it requires a blue/green sort of redeployment.
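
A hypothetical sketch of the COPY CLUSTER / SWAP CLUSTER idea above; neither command exists, and the names are only illustrative:

```sql
-- Clone the cluster definition without undeploying the running cluster.
COPY CLUSTER prod TO prod_staging;

-- Modify the clone freely, e.g. its provisioning.
ALTER CLUSTER prod_staging SET (SIZE = 'xlarge');

-- Once prod_staging is hydrated and healthy, install it in place of the old
-- definition; swapping before it is healthy could still incur downtime.
SWAP CLUSTER prod WITH prod_staging;
```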

Comment on lines +66 to +69
If a user wants to do a development workflow on a production system, they must use **blue/green
deployments**. For example, if the user wants to move an object between clusters, they must use
blue/green to set up another version of the object/cluster and cutover the production system
to it once the object is rehydrated and ready.\
Contributor:

This sounds like a very prescriptive take that I could imagine folks chafing against. Unless I misunderstand, the position is "you cannot CREATE INDEX; you must always CREATE CLUSTER with your index added, and then cut over only once it is hydrated." But also, I'm maybe confused about what "doing a development workflow on a production system" means.

Contributor:

Causing work to happen on a cluster, for example in response to CREATE INDEX, isn't fundamentally different from what happens when you type SELECT .. FROM not_an_index; I'm unclear whether the blue/green discipline is actually a "no creating dataflows" discipline, one outcome of which is forced blue/green modification, or whether it derives from something more specific.

@ggnall (Contributor), Oct 24, 2023:

I think your text above is accurate here! The idea is that we don't advise users to modify cluster responsibilities on a cluster that's receiving queries from a production consumer.

> But also, I'm maybe confused about what "doing a development workflow on a production system" means.

One might YOLO CREATE INDEX an index on a production cluster, particularly to fight an emergency, but we might not want to build shortcuts for them to move indexes around production clusters outside of that flow.

Relatedly, I would argue that we need to be more rather than less prescriptive on these workflows! Regardless of whether the above is the perfect take or not.

> This sounds like a very prescriptive take that I could imagine folks chafing against.

Member:

> This sounds like a very prescriptive take

Yeah, as @ggnall mentioned, the prescriptivism is intentional. The intent here is to have an opinionated take on how users should use Materialize safely.

But, I think perhaps the issue is the use of "must"?

Suggested change
If a user wants to do a development workflow on a production system, they must use **blue/green
deployments**. For example, if the user wants to move an object between clusters, they must use
blue/green to set up another version of the object/cluster and cutover the production system
to it once the object is rehydrated and ready.\
If a user wants to do a development workflow on a production system, we strongly recommend that they use **blue/green
deployments**. For example, if the user wants to move an object between clusters, we recommend that they use
blue/green to set up another version of the object/cluster and cutover the production system
to it once the object is rehydrated and ready.
Users are free to use development workflows on production systems, but they run the risk of incurring downtime.

> Causing work to happen on a cluster, for example in response to CREATE INDEX, isn't fundamentally different from what happens when you type SELECT .. FROM not_an_index; I'm unclear whether the blue/green discipline is actually a "no creating dataflows" discipline, one outcome of which is forced blue/green modification, or whether it derives from something more specific.

I think it's less about creating dataflows and more about not introducing workloads that you haven't tested. Like, maybe you've explicitly tested that you can run a SELECT ... FROM some_specific_not_an_index on your production cluster once an hour to take a backup or whatever. That's fine, as long as you've tested it. What is a bad idea is showing up to your production cluster and running some ad hoc queries that you've not tested.

I think "query creates dataflow" is strongly correlated with "development query" and "query hits fast path in index" is strongly correlated with "production query", but I think the fundamental thing is whether you have tested the query and it's part of your regular workload, or whether it's a one off thing that you've not explicitly tested/provisioned for.

Comment on lines +41 to +42
This means deprecating manual cluster replica management. \
We believe this is easier to use and manage.
Contributor:

Strong disagree here. There's maybe a false dichotomy at play, as there is a middle ground between "deprecate manual cluster management" and "default to manual cluster management". As long as MZ has downtime on a thing that could have been done manually, it's a real hard sell that we should forbid doing the manual thing (e.g. resizing).

An alternative would be "teach people to type ALTER CLUSTER REPLICAS rather than CREATE CLUSTER REPLICA and DROP CLUSTER REPLICA", which is six of one, half a dozen of the other to me. Still mostly imperative (a human types a command, just about the goal state rather than the transition) but with less cognitive overhead. But it stops short of "no manual replica management".
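
For illustration, a rough sketch of that goal-state command next to today's transition-oriented ones; the ALTER CLUSTER REPLICAS form is hypothetical:

```sql
-- Hypothetical goal-state form: declare the full set of replicas you want.
ALTER CLUSTER prod REPLICAS (r1 (SIZE = 'large'), r2 (SIZE = 'large'));

-- Today's transition-oriented form: state each step of the change yourself.
CREATE CLUSTER REPLICA prod.r2 SIZE = 'large';
DROP CLUSTER REPLICA prod.r1;
```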

Contributor:

If nothing else, it would be helpful to unpack the intended "imperative" vs "declarative" distinction. SQL's command language, for example, is painfully imperative and not at all declarative. But it's hard for me to understand at this point what the distinction is other than removing a user's ability to control the assignment of their money (in the form of replicas) to their work.

Member:

Maybe declarative vs. imperative is the wrong framing. For me, the compelling reason to move away from CREATE CLUSTER REPLICA is about not having to immediately teach people about replicas. We've repeatedly seen replicas be a major source of confusion for those new to Materialize. Common questions:

  • "How can you have only one replica of something?"
  • "How can you have zero replicas of something?"
  • "What does it mean that a cluster is logical and a replica is physical?"

It is much easier to explain the new (what we've been calling "declarative") API:

  • The initial explanation of a cluster doesn't mention replicas: "To run your dataflows, you need to provision hardware. CREATE CLUSTER provisions such hardware with resources proportional to your desired SIZE."
  • Replicas only enter the conversation when fault tolerance does, and the explanation is very natural via the "replication factor": "If you want to increase fault tolerance, you can run multiple replicas of your cluster via CREATE CLUSTER ... REPLICATION FACTOR = 2."

This framing makes clear that ALTER CLUSTER REPLICAS would have the same issue as the current API: it requires that users think in terms of individual replicas, rather than a cluster with a replication factor.
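
To make the contrast concrete, a rough sketch using the syntax discussed in this thread (sizes and names are placeholders):

```sql
-- Replica-oriented ("imperative") API: users must name and size each replica.
CREATE CLUSTER prod REPLICAS (r1 (SIZE = 'large'), r2 (SIZE = 'large'));

-- Cluster-oriented ("declarative") API for the same thing: replicas only
-- surface through the replication factor.
CREATE CLUSTER prod2 (SIZE = 'large', REPLICATION FACTOR = 2);
```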

> As long as MZ has downtime on a thing that could have been done manually, it's a real hard sell that we should forbid doing the manual thing (e.g. resizing).

I think this is a fair take, as resizing a cluster is a "production workflow", and so we could make "Materialize supports graceful reconfiguration during resizing" a requirement for removing the manual cluster replica DDL statements.

Comment on lines +72 to +73
For production workflows, like resizing an active cluster, blue/green is an acceptable intermediate
solution, but is an overkill amount of work for such a simple action.
Contributor:

This text is confusing to me! What would it mean to blue/green a resized cluster? Like, the act of resizing would amount to creating a new cluster, with different resources behind it, and cutting over from one to the other? It is hard for me to understand this in the context of a blue/green implementation that e.g. rebuilds and renames things, where downstream dependents are left confused. Would we drop sinks when we do this, for example?

Member:

> What would it mean to blue/green a resized cluster? Like, the act of resizing would amount to creating a new cluster, with different resources behind it, and cutting over from one to the other?

Yes, exactly.

> It is hard for me to understand this in the context of a blue/green implementation that e.g. rebuilds and renames things, where downstream dependents are left confused. Would we drop sinks when we do this, for example?

Yes, we'd either drop the sinks or error on their existence.

A slightly more advanced version of blue/green would move through versions. Each deploy would leave behind the sinks with a version suffix (e.g., sink_v1, sink_v2, sink_v3), and give you the ability to remove old versions only once you've adjusted all downstream consumers to use the new version of the sink.
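
A hypothetical sketch of that versioned flow; the connection, source view, and the `_v1`/`_v2` naming convention are all illustrative:

```sql
-- Deploy the new version alongside the old one.
CREATE CLUSTER prod_v2 (SIZE = 'large');
CREATE SINK sink_v2
  IN CLUSTER prod_v2
  FROM my_view
  INTO KAFKA CONNECTION kafka_conn (TOPIC 'events-v2')
  FORMAT JSON ENVELOPE DEBEZIUM;

-- After every downstream consumer has switched to 'events-v2', retire the old version.
DROP SINK sink_v1;
DROP CLUSTER prod_v1 CASCADE;
```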

Member:

@chaas, I think we should consider updating this take to:

"For resizing an active cluster, blue/green is not an acceptable intermediate solution. Resizing a cluster is something that may need to be performed regularly in production in response to changes in workload, and doing a blue/green deployment for each cluster resizing would introduce to much friction.

Instead, we need to support a simple declarative interface for seamlessly resizing ... [existing text]

We cannot remove manual cluster replica management until we support such an interface."

Contributor (author):

Agreed - Frank and I discussed offline, and blue/green is too burdensome for scaling use cases, particularly with the versions of blue/green that we will realistically be building now, which will be manually controlled.
