slack-vitess-r15.0.5: backport required Transaction Throttler PRs #302

Closed
29 commits
22945c1
Add basic metrics to `vttablet` transaction throttler (#12418)
timvaillancourt Mar 13, 2023
9f9b0ba
Fix transaction throttler ignoring the initial rate (#12618)
ejortegau Mar 29, 2023
02af9bc
Cleanup panics in `txthrottler`, reorder for readability (#12901)
timvaillancourt May 2, 2023
7046003
Emit per workload labels for existing per table vttablet metrics (#12…
ejortegau Mar 14, 2023
4bfc74d
Fix test merge-conflict
timvaillancourt Apr 16, 2024
618423a
Fix dupe assert
timvaillancourt Apr 16, 2024
9e79931
Add priority support to transaction throttler (#12662)
ejortegau May 4, 2023
95d008f
Add flag to select tx throttler tablet type (#12174)
timvaillancourt May 16, 2023
8a04679
txthrottler: further code cleanup (#12902)
timvaillancourt May 17, 2023
e4834b5
make vtadmin_web_proto_types
timvaillancourt Apr 17, 2024
e23ef0f
Fix test
timvaillancourt Apr 17, 2024
fb4d7ea
Fix signature
timvaillancourt Apr 17, 2024
28dbd9e
TxThrottler support for transactions outside BEGIN/COMMIT (#13040)
ejortegau May 17, 2023
39e5ade
txthrottler: verify config at vttablet startup, consolidate funcs (#1…
timvaillancourt Jun 18, 2023
7a380b8
txthrottler: add metrics for topoWatcher and healthCheckStreamer (#13…
timvaillancourt Jul 25, 2023
343c024
tx throttler: healthcheck all cells if `--tx-throttler-healthcheck-ce…
timvaillancourt Jul 31, 2023
c4c6088
Per workload TxThrottler metrics (#13526)
ejortegau Jul 27, 2023
02d5c47
Add dry-run/monitoring-only mode for TxThrottler (#13604)
ejortegau Aug 2, 2023
3e57711
`txthrottler`: remove `txThrottlerConfig` struct, rely on `tabletenv`…
timvaillancourt Aug 3, 2023
e38ef73
Add nick-fields/retry@v2 to CI tests
timvaillancourt Apr 17, 2024
6d4cfae
try ubuntu-latest on flaky tests
timvaillancourt Apr 23, 2024
11b5e5c
try go1.20.14 for flaky tests
timvaillancourt Apr 23, 2024
02ade2c
Revert "try go1.20.14 for flaky tests"
timvaillancourt May 3, 2024
21e2206
Revert "try ubuntu-latest on flaky tests"
timvaillancourt May 3, 2024
b5a8b29
Revert "`slack-vitess-r15.0.5`: use go1.21 (#260)"
timvaillancourt May 3, 2024
4e01ea4
go mod tidy
timvaillancourt May 3, 2024
bec6d22
Merge branch 'slack-vitess-r15.0.5' into bp-txthrottler-pt1-slack-vit…
timvaillancourt May 6, 2024
641f2a9
`slack-vitess-r15.0.5`: use go1.21 (#260)
timvaillancourt Mar 26, 2024
3274c5b
go mod tidy
timvaillancourt May 6, 2024
1 change: 1 addition & 0 deletions config/tablet/default.yaml
Original file line number Diff line number Diff line change
@@ -117,6 +117,7 @@ cacheResultFields: true # enable-query-plan-field-caching
# enable-tx-throttler
# tx-throttler-config
# tx-throttler-healthcheck-cells
# tx-throttler-tablet-types
# enable_transaction_limit
# enable_transaction_limit_dry_run
# transaction_limit_per_user
9 changes: 7 additions & 2 deletions doc/ReplicationLagBasedThrottlingOfTransactions.md
@@ -30,7 +30,13 @@ If this is not specified a [default](https://github.com/vitessio/vitess/tree/mai
* *tx-throttler-healthcheck-cells*

A comma separated list of datacenter cells. The throttler will only monitor
the non-RDONLY replicas found in these cells for replication lag.
the replicas found in these cells for replication lag.

* *tx-throttler-tablet-types*

A comma separated list of tablet types. The throttler will only monitor tablets
with these types. Only `replica` and/or `rdonly` types are supported. The default
is `replica`.

# Caveats and Known Issues
* The throttler keeps trying to explore the maximum rate possible while keeping
@@ -39,4 +45,3 @@ lag limit may occasionally be slightly violated.

* Transactions are considered homogeneous. There is currently no support
for specifying how `expensive` a transaction is.
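
The `tx-throttler-tablet-types` description in this hunk amounts to parsing a comma-separated list and rejecting anything other than `replica` or `rdonly`. The following is a minimal standalone sketch of that rule, not the vttablet implementation: `parseThrottlerTabletTypes` is a hypothetical helper, and the case-insensitive matching, whitespace trimming, and de-duplication are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// parseThrottlerTabletTypes is a hypothetical helper (not part of Vitess).
// It splits a comma-separated flag value and accepts only "replica" and
// "rdonly". An empty value falls back to the documented default, "replica".
func parseThrottlerTabletTypes(value string) ([]string, error) {
	if strings.TrimSpace(value) == "" {
		return []string{"replica"}, nil
	}
	var types []string
	seen := map[string]bool{}
	for _, raw := range strings.Split(value, ",") {
		// Assumption: matching is case-insensitive and ignores surrounding spaces.
		t := strings.ToLower(strings.TrimSpace(raw))
		switch t {
		case "replica", "rdonly":
			if !seen[t] {
				seen[t] = true
				types = append(types, t)
			}
		default:
			return nil, fmt.Errorf("unsupported tablet type %q: only replica and rdonly are supported", raw)
		}
	}
	return types, nil
}

func main() {
	for _, v := range []string{"", "replica,rdonly", "primary"} {
		types, err := parseThrottlerTabletTypes(v)
		fmt.Printf("%q -> %v, err=%v\n", v, types, err)
	}
}
```

Rejecting unsupported types at parse time matches the spirit of the "verify config at vttablet startup" commit in this PR, though the real validation code may differ.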

9 changes: 7 additions & 2 deletions go/flags/endtoend/vttablet.txt
@@ -93,6 +93,7 @@ Usage of vttablet:
--enable-consolidator Synonym to -enable_consolidator (default true)
--enable-consolidator-replicas Synonym to -enable_consolidator_replicas
--enable-lag-throttler Synonym to -enable_lag_throttler
--enable-per-workload-table-metrics If true, query counts and query error metrics include a label that identifies the workload
--enable-tx-throttler Synonym to -enable_tx_throttler
--enable_consolidator This option enables the query consolidator. (default true)
--enable_consolidator_replicas This option enables the query consolidator only on replicas.
@@ -341,9 +342,13 @@ Usage of vttablet:
--twopc_abandon_age float time in seconds. Any unresolved transaction older than this time will be sent to the coordinator to be resolved.
--twopc_coordinator_address string address of the (VTGate) process(es) that will be used to notify of abandoned transactions.
--twopc_enable if the flag is on, 2pc is enabled. Other 2pc flags must be supplied.
--tx-throttler-config string Synonym to -tx_throttler_config (default "target_replication_lag_sec: 2\nmax_replication_lag_sec: 10\ninitial_rate: 100\nmax_increase: 1\nemergency_decrease: 0.5\nmin_duration_between_increases_sec: 40\nmax_duration_between_increases_sec: 62\nmin_duration_between_decreases_sec: 20\nspread_backlog_across_sec: 20\nage_bad_rate_after_sec: 180\nbad_rate_increase: 0.1\nmax_rate_approach_threshold: 0.9\n")
--tx-throttler-config string Synonym to -tx_throttler_config (default "target_replication_lag_sec:2 max_replication_lag_sec:10 initial_rate:100 max_increase:1 emergency_decrease:0.5 min_duration_between_increases_sec:40 max_duration_between_increases_sec:62 min_duration_between_decreases_sec:20 spread_backlog_across_sec:20 age_bad_rate_after_sec:180 bad_rate_increase:0.1 max_rate_approach_threshold:0.9")
--tx-throttler-default-priority int Default priority assigned to queries that lack priority information (default 100)
--tx-throttler-dry-run If present, the transaction throttler only records metrics about requests received and throttled, but does not actually throttle any requests.
--tx-throttler-healthcheck-cells strings Synonym to -tx_throttler_healthcheck_cells
--tx_throttler_config string The configuration of the transaction throttler as a text formatted throttlerdata.Configuration protocol buffer message (default "target_replication_lag_sec: 2\nmax_replication_lag_sec: 10\ninitial_rate: 100\nmax_increase: 1\nemergency_decrease: 0.5\nmin_duration_between_increases_sec: 40\nmax_duration_between_increases_sec: 62\nmin_duration_between_decreases_sec: 20\nspread_backlog_across_sec: 20\nage_bad_rate_after_sec: 180\nbad_rate_increase: 0.1\nmax_rate_approach_threshold: 0.9\n")
--tx-throttler-tablet-types strings A comma-separated list of tablet types. Only tablets of this type are monitored for replication lag by the transaction throttler. Supported types are replica and/or rdonly. (default replica)
--tx-throttler-topo-refresh-interval duration The rate that the transaction throttler will refresh the topology to find cells. (default 5m0s)
--tx_throttler_config string The configuration of the transaction throttler as a text-formatted throttlerdata.Configuration protocol buffer message. (default "target_replication_lag_sec:2 max_replication_lag_sec:10 initial_rate:100 max_increase:1 emergency_decrease:0.5 min_duration_between_increases_sec:40 max_duration_between_increases_sec:62 min_duration_between_decreases_sec:20 spread_backlog_across_sec:20 age_bad_rate_after_sec:180 bad_rate_increase:0.1 max_rate_approach_threshold:0.9")
--tx_throttler_healthcheck_cells strings A comma-separated list of cells. Only tabletservers running in these cells will be monitored for replication lag by the transaction throttler.
--unhealthy_threshold duration replication lag after which a replica is considered unhealthy (default 2h0m0s)
--use_super_read_only Set super_read_only flag when performing planned failover.
11 changes: 11 additions & 0 deletions go/flagutil/flagutil.go
@@ -193,6 +193,17 @@ func DualFormatBoolVar(fs *pflag.FlagSet, p *bool, name string, value bool, usag
	}
}

// DualFormatVar creates a flag which supports both dashes and underscores
func DualFormatVar(fs *pflag.FlagSet, val pflag.Value, name string, usage string) {
	dashes := strings.Replace(name, "_", "-", -1)
	underscores := strings.Replace(name, "-", "_", -1)

	fs.Var(val, underscores, usage)
	if dashes != underscores {
		fs.Var(val, dashes, fmt.Sprintf("Synonym to -%s", underscores))
	}
}

// DurationOrIntVar implements pflag.Value for flags that have historically been
// of type IntVar (and then converted to seconds or some other unit) but are
// now transitioning to a proper DurationVar type.
28 changes: 26 additions & 2 deletions go/vt/proto/query/query.pb.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

88 changes: 88 additions & 0 deletions go/vt/proto/query/query_vtproto.pb.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

53 changes: 53 additions & 0 deletions go/vt/sqlparser/comments.go
@@ -20,6 +20,9 @@ import (
	"strconv"
	"strings"
	"unicode"

	vtrpcpb "vitess.io/vitess/go/vt/proto/vtrpc"
	"vitess.io/vitess/go/vt/vterrors"
)

const (
@@ -44,8 +47,19 @@ const (
	DirectiveQueryPlanner = "PLANNER"
	// DirectiveVtexplainRunDMLQueries tells explain format = vtexplain that it is okay to also run the query.
	DirectiveVtexplainRunDMLQueries = "EXECUTE_DML_QUERIES"
	// DirectiveWorkloadName specifies the name of the client application workload issuing the query.
	DirectiveWorkloadName = "WORKLOAD_NAME"
	// DirectivePriority specifies the priority of a workload. It should be an integer between 0 and MaxPriorityValue,
	// where 0 is the highest priority, and MaxPriorityValue is the lowest one.
	DirectivePriority = "PRIORITY"

	// MaxPriorityValue specifies the maximum value allowed for the priority query directive. Valid priority values are
	// between zero and MaxPriorityValue.
	MaxPriorityValue = 100
)

var ErrInvalidPriority = vterrors.Errorf(vtrpcpb.Code_INVALID_ARGUMENT, "Invalid priority value specified in query")

func isNonSpace(r rune) bool {
	return !unicode.IsSpace(r)
}
@@ -378,3 +392,42 @@ func AllowScatterDirective(stmt Statement) bool {
	}
	return comments != nil && comments.Directives().IsSet(DirectiveAllowScatter)
}

// GetPriorityFromStatement gets the priority from the provided Statement, using DirectivePriority
func GetPriorityFromStatement(statement Statement) (string, error) {
	commentedStatement, ok := statement.(Commented)
	// This would mean that the statement lacks comments, so we can't obtain the priority from it. Hence default to
	// empty priority
	if !ok {
		return "", nil
	}

	directives := commentedStatement.GetParsedComments().Directives()
	priority, ok := directives.GetString(DirectivePriority, "")
	if !ok || priority == "" {
		return "", nil
	}

	intPriority, err := strconv.Atoi(priority)
	if err != nil || intPriority < 0 || intPriority > MaxPriorityValue {
		return "", ErrInvalidPriority
	}

	return priority, nil
}

// GetWorkloadNameFromStatement gets the workload name from the provided Statement, using DirectiveWorkloadName as
// the name of the query directive that specifies it.
func GetWorkloadNameFromStatement(statement Statement) string {
	commentedStatement, ok := statement.(Commented)
	// This would mean that the statement lacks comments, so we can't obtain the workload from it. Hence default to
	// empty workload name
	if !ok {
		return ""
	}

	directives := commentedStatement.GetParsedComments().Directives()
	workloadName, _ := directives.GetString(DirectiveWorkloadName, "")

	return workloadName
}
65 changes: 65 additions & 0 deletions go/vt/sqlparser/comments_test.go
@@ -468,3 +468,68 @@ func TestIgnoreMaxMaxMemoryRowsDirective(t *testing.T) {
		})
	}
}

func TestGetPriorityFromStatement(t *testing.T) {
	testCases := []struct {
		query            string
		expectedPriority string
		expectedError    error
	}{
		{
			query:            "select * from a_table",
			expectedPriority: "",
			expectedError:    nil,
		},
		{
			query:            "select /*vt+ ANOTHER_DIRECTIVE=324 */ * from another_table",
			expectedPriority: "",
			expectedError:    nil,
		},
		{
			query:            "select /*vt+ PRIORITY=33 */ * from another_table",
			expectedPriority: "33",
			expectedError:    nil,
		},
		{
			query:            "select /*vt+ PRIORITY=200 */ * from another_table",
			expectedPriority: "",
			expectedError:    ErrInvalidPriority,
		},
		{
			query:            "select /*vt+ PRIORITY=-1 */ * from another_table",
			expectedPriority: "",
			expectedError:    ErrInvalidPriority,
		},
		{
			query:            "select /*vt+ PRIORITY=some_text */ * from another_table",
			expectedPriority: "",
			expectedError:    ErrInvalidPriority,
		},
		{
			query:            "select /*vt+ PRIORITY=0 */ * from another_table",
			expectedPriority: "0",
			expectedError:    nil,
		},
		{
			query:            "select /*vt+ PRIORITY=100 */ * from another_table",
			expectedPriority: "100",
			expectedError:    nil,
		},
	}

	for _, testCase := range testCases {
		theTestCase := testCase
		t.Run(theTestCase.query, func(t *testing.T) {
			t.Parallel()
			stmt, err := Parse(theTestCase.query)
			assert.NoError(t, err)
			actualPriority, actualError := GetPriorityFromStatement(stmt)
			if theTestCase.expectedError != nil {
				assert.ErrorIs(t, actualError, theTestCase.expectedError)
			} else {
				// Check the error returned by GetPriorityFromStatement, not the parse error.
				assert.NoError(t, actualError)
				assert.Equal(t, theTestCase.expectedPriority, actualPriority)
			}
		})
	}
}