Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mon-osd separation fixes. making netplugin optional for service_worker #108

Merged
merged 3 commits into from
Mar 21, 2016
Merged

mon-osd separation fixes. making netplugin optional for service_worker #108

merged 3 commits into from
Mar 21, 2016

Conversation

vvb
Copy link
Contributor

@vvb vvb commented Mar 1, 2016

  1. optionally running only mons on the service-master cluster. the osds are run only on service-worker in this mode. The ceph-playbook changes in this PR, are to deal with ansible's behaviour of how it deals with conditionals at the role level. It runs every task in the role applying the conditional to it. In this specific case the "check" tasks were doing a stat and registering the output to a variable socket. When the task is skipped socket is not what it needs to be. There are handlers that kick-in at the end of service-master workflow that depend on socket to be correct. so, the correct solution is to do the stat when the handlers are fired, so that the true state is reflected.
  2. Making contiv_network optional for the service-worker hostgroup.
  3. Necessary defaults which do not change the current behaviour.


# OSDs could be run without netplugin on them
# osd_contiv_network_include controls this behaviour
osd_contiv_network_include: "true"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am presuming we will need multiple such variables based on host capabilities viz. compute only; storage only; with compute and storage) so instead of defining ceph specific boolean variables I feel it is better to introduce a single variable like say host-capability of type enum with possible values like compute-only, storage-only, compute-and-storage and use that in the conditionals below to include the roles in various host-group, something like when: host_capability | match(*storage*)

This offers the benefit of cutting down on the number of variables we need and also keeps it independent of ceph as we may support different storage types in future and same host-cpabilities will apply. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mapuri.. yes, consolidating under a common var would be good. But I think we should specify exclusion. So, everything is included by default, unless an exclusion is specified. for example the below would specify not to run mon and osd on the given host. It can be one large string of comma separated values.

host-capability => “not_mon, not_osd"
when: host-capability | match(*not_osd*)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well negation is kind of hard to read... also instead of mon and osd being capabilities I prefer more generic capabilities like compute and storage..mon and osd makes it specific to ceph and we will need something similar if we were to do NFS based storage on UCS B-series servers

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to add if we really need negation it might be better achieved using logical operator so a node with no storage capability can matched using this condition when: ! host-capability | match(*storage*)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mapuri yes, I agree that negation is hard to read, but the flip side is that the user needs to specify every capability he wants on the node.

on the topic of keeping it to storage and not getting down to mon/osd. Is there a case, a host might want to run both ceph and nfs on the same node. wondering to what level should the capability be defined - being able to do nfs based storage is a capability. being able to do ceph storage is another capability. so, do you think specifying "Storage-node" is good enough?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but the flip side is that the user needs to specify every capability he wants on the node.

hmm, yeah, it is a problem if we have too many capabilities. How about we set host-capability to default as compute,storage and then let user override it to specific values? Again this is only a problem if we have too many capabilities. Another way to deal with this (when we have a lot of properties) would be to give a short-hand/pre-defined variables for common capabilities combinations. Say, for UCS B-series we can define a common-b-series-cap variable and set it to storage,cap1,cap2.... and users can instead use the short-hand strings instead?

on the topic of keeping it to storage and not getting down to mon/osd. Is there a case, a host might want to run both ceph and nfs on the same node....
so, do you think specifying "Storage-node" is good enough?

Franky I don't know. But from what we have in contiv_network, I think in a contiv-cluster it will be either ceph or nfs but not a mix, so storage should good enough. And the differentiation of starting right set of storage daemons might be better made at the contiv_storage role (like we do for contiv_network to deploy netplugin in either aci, or standalone or bgp mode).

But to keep it extensible, may be storage-ceph is a better capability as you point out than just storage.

@vvb
Copy link
Contributor Author

vvb commented Mar 11, 2016

@mapuri PTAL

- { role: scheduler_stack, run_as: master }
- { role: contiv_network, run_as: master }
#- { role: contiv_storage, run_as: master }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's leave ceph* and contiv_storage roles commented out in these host-groups until #19 is fixed otherwise it breaks the commission/decommission workflows in cluster/master.

@mapuri
Copy link
Contributor

mapuri commented Mar 11, 2016

@vvb thanks for incorporating the host_capability comment. This LGTM other than one minor comment on the capabilities to start with.

@vvb
Copy link
Contributor Author

vvb commented Mar 21, 2016

@mapuri PTAL. the comments are taken care of. Is this good to merge now?

@mapuri
Copy link
Contributor

mapuri commented Mar 21, 2016

LGTM, thanks @vvb

mapuri added a commit that referenced this pull request Mar 21, 2016
mon-osd separation fixes. making netplugin optional for service_worker
@mapuri mapuri merged commit 9283a68 into contiv:master Mar 21, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants