-
Notifications
You must be signed in to change notification settings - Fork 8.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ResponseOps][Alerting] Decouple feature IDs from consumers #183756
base: main
Are you sure you want to change the base?
[ResponseOps][Alerting] Decouple feature IDs from consumers #183756
Conversation
/ci |
06d8499
to
3b7c89f
Compare
f65ca2c
to
a2b73f9
Compare
/ci |
/ci |
rule_type_ids: schema.maybe(schema.arrayOf(schema.string())), | ||
consumers: schema.maybe(schema.arrayOf(schema.string())), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same schema as the public plus the rule_type_ids
and the consumers
.
/ci |
@@ -82,7 +81,7 @@ export type AlertFilterControlsProps = Omit< | |||
*/ | |||
export const AlertFilterControls = (props: AlertFilterControlsProps) => { | |||
const { | |||
featureIds = [AlertConsumers.STACK_ALERTS], | |||
ruleTypeIds = [], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we default it to stack rules?
ALERTS: 'alerts', | ||
DISCOVER: 'discover', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that the consumers are decoupled from the feature IDs the list should include all possible consumers so far. alerts
and discover
are valid ones. Ideally, we should not have a list of possible consumers. I hope in the subsequent PRs we will remove it.
/** | ||
* TODO: Abstract it and remove it | ||
*/ | ||
export const isSiemRuleType = (ruleTypeId: string) => ruleTypeId.startsWith('siem.'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the codebase, we have a lot of checks (hacks) related to security rule types. To reduce the scope of the PR as much as possible I choose to try to fix it slowly on subsequent PRs. At the moment is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a fan of having a centralized place for the rule type IDs. Ideally, consumers of the framework should specify keywords like observablility
(category or subcategory) or even apm.*
and the framework should know which rule type IDs to pick up. But again, I think it is out of the scope of the PR, and at the moment it seems the most straightforward way to move forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inspired by the cases utils for filtering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I refactored a bit the code to accommodate the async nature of getRulesClient
. It should work as before.
'apm', | ||
'slo', | ||
'uptime', | ||
'observability', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the observability
match to more rule type IDs I have put? Should I add all o11y rule type IDs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you create a rule from the main o11y page the consumer is set to alerts
to most of the rules. For that reason, I added the AlertsConsumer.ALERTS
to fetch alerts with kibana.alert.rule.consumer: alerts
. Also, I am not sure if we want to include stack rules which I included because some of the stack rules can be created with the logs
or infrastructure
consumer.
/ci |
/ci |
💔 Build Failed
Failed CI Steps
Test Failures
Metrics [docs]Module Count
Public APIs missing comments
Async chunks
Page load bundle
Unknown metric groupsAPI count
ESLint disabled line counts
Total ESLint disabled count
History
cc @cnasikas |
@@ -191,7 +223,15 @@ export const ruleRegistrySearchStrategyProvider = ( | |||
); | |||
} | |||
|
|||
throw err; | |||
if (Boom.isBoom(err)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Search strategy was always thrown a 500
error without respecting the error codes, like 403, throwing by the alerts authorization. I tried to fix that.
); | ||
}); | ||
|
||
test('requests the same number of rules as the number of ids provided', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I combined the three tests into one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I split up the PageContent
into PageContentWrapper
and PageContent
so the PageContent
renders after the loading of the available rule types.
@@ -579,7 +583,7 @@ const AlertsTableStateWithQueryProvider = memo( | |||
|
|||
return ( | |||
<AlertsTableContext.Provider value={alertsTableContext}> | |||
{!isLoading && alertsCount === 0 && ( | |||
{!isLoading && alertsCount <= 0 && ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If alertsCount
is undefined
is initialized above as -1
.
🤖 GitHub commentsExpand to view the GitHub comments
Just comment with:
|
Summary
This PR aims to decouple the feature IDs from the
consumer
attribute of rules and alerts.Fixes: #181559
Architecture
Alerting RBAC model
The Kibana security uses Elasticsearch's application privileges. This way Kibana can represent and store its privilege models within Elasticsearch roles. To do that, Kibana security creates actions that are granted by a specific privilege. Alerting uses its own RBAC model and is built on top of the existing Kibana security model. The Alerting RBAC uses the
rule_type_id
andconsumer
attribute to define who owns the rule and the alerts procured by the rule. To connect therule_type_id
andconsumer
with the Kibana security actions the Alerting RBAC registers its custom actions. They are constructed asalerting:<rule-type-id>/<feature-id>/<alerting-entity>/<operation>
. For examplealerting:siem.esqlRule/siem/rule/get
. This action means that a user with a role that grants this action can get a rule of typesiem.esqlRule
with consumersiem
.Problem statement
At the moment the
consumer
attribute should be a valid feature ID. Though this approach worked well so far it has its limitation. Specifically:consumer
attribute.Proposed solution
This PR aims to decouple the feature IDs from consumers. It achieves that a) by changing the way solutions configure the alerting privileges when registering a feature and b) by changing the alerting actions. The schema changes as:
The new actions are constructed as
alerting:<rule-type-id>/<consumer>/<alerting-entity>/<operation>
. For examplealerting:rule-type-id/my-consumer/rule/get
. The new action means that a user with a role that grants this action can get a rule of typerule-type
with consumermy-consumer
. Changing the action strings is not considered a breaking change as long as the user's permission works as before. In our case, this is true because the consumer will be the same as before (feature ID), and the alerting security actions will be the same. For example:Old formatting
Schema:
Action:
New formatting
Schema:
Action:
In both formating the actions are the same thus breaking changes are avoided.
Alerting authorization class
The alerting plugin uses and exports the alerting authorization class (
AlertingAuthorization
). The class is responsible to handle all authorization actions related to rules and alerts. The class changed to handle the new actions as described in the above sections. A lot of methods were renamed, removed, cleaned up, or have their argument or return types changed. These changed affected various piece of the code. The changes in this class are the most important in this PR especially the_getAuthorizedRuleTypesWithAuthorizedConsumers
method which is the cornerstone of the alerting RBAC. Please review carefully.Instatiation of the alerting authorization class
The
AlertingAuthorizationClientFactory
is used to create instances of theAlertingAuthorization
class. TheAlertingAuthorization
class needs to perform async operations upon instatiation. Because JS, at the moment, does not support async instantiation of classes theAlertingAuthorization
class was assigingPromise
objects to variables that could be resolved later in other phases of the lifecycle of the class. To improve redability and make it clearer the lifecycle of the class I seperated the construction of the class (initialization) from the bootstrap process. As a result getting theAlertingAuthorization
class or any client that depends on it (RulesClient
for example) is an async operation.Filtering
A lot of routes are using the authorization class to get the authorization filter (
getFindAuthorizationFilter
), a filter that, if applied, returns only the rule types and consumers the user is authorized to. The method that returns the filter was build in a way to also support filtering on top of the authorization filter thus coupling the authorized filter with route filtering. I believe these two operation should be decoupled and the filter method should return a filter that gives you all the authorized rule types. It is the responsibility of the consumer, route in our case, to apply extra filters on top of the authorization filter. For that reason, I did all necessary changes to decoupe them.Legacy consumers & producer
A lot of rules and alerts have been created and are still being created from observability with the
alerts
consumer. When the Alerting RBAC encounters a rule or alert withalerts
as a consumer it falls back to theproducer
of the rule type ID to construct the actions. For example if a rule withruleTypeId: .es-query
andconsumer: alerts
the alerting action will be constructed asalerting:.es-query/stackRules/rule/get
wherestackRules
is the producer of the.es-query
rule type. Theproducer
is used to be used in alerting authorization but due to its complexity, it was deprecated and only used as a fallback for thealerts
consumer. To avoid breaking changes all rule types will set thealerts
consumer as valid consumers when configuring the alerting privileges. By moving thealerts
consumer to the registration of the feature we can stop relying on theproducer
. In the next PRs theproducer
will removed entirely.Routes
The following changes were introduced to the alerting routes:
ruleTypeIds
and theconsumers
parameters for filtering./internal/rac/alerts/_feature_ids
route got deleted as it was not used anywhere in the codebase and it was internal.All the changes in the routes are related to internal routes and no breaking change is introduced.
Notable code changes
isSiemRuleType
: Temporary helper function. The plan is to be removed entirely in further iterations.kbn-rule-data-utils
.kbn-securitysolution-rules
.PluginSetupContract
andPluginStartContract
toAlertingServerSetup
andAlertingServerStart
.AlertingAuthorization
class is instantiated.getRulesClient
converted to anasync
function.AlertingAuthorization
methods and make its methods to take only an object as argument.AlertingAuthorization
class.filter_consumers
was mistakenly exposed to a public API. It was undocumented.getFindAuthorizationFilter
authorization helper function does not accept filters. It should return a filter for all authorized rule types regardless of the request. Filtering byruleTypeIds
moved to calls to ES or the SO client.list
method of theRuleTypeRegistry
fromSet<RegistryRuleType>
toMap<string, RegistryRuleType>
.KueryNode
in tests changed to assetion of KQL usingtoKqlExpression
.Testing
Caution
It is very important to test all the areas of the application where rules or alerst are being used directly or indirectly. Scenarios to consider:
Solutions
Please test all the rule types you own with all possible combinations of permissions.
ResponseOps
Please test all stack rules with all possible combinations of permissions.
Risk Matrix
FQA
filter
paramater where we can pass an arbitrary KQL filter. Why we do not use this to filter by the rule type IDs and the consumers and instead we introduce new dedicated paramaters?consumer
? Should not theruleTypeIds
be enough?