Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add bootstrap kubernetes labels and validator management structs #9

Merged
merged 7 commits into from
Apr 23, 2024

Conversation

gregcusack
Copy link
Collaborator

@gregcusack gregcusack commented Apr 1, 2024

Summary of Changes

  1. create bootstrap validator kubernetes labels
  2. Address @yihau 's comments from PR add and deploy bootstrap secrets #8 , hardcode scripts into rust so we avoid having to clone this repo in order to run.
  3. Add library.rs and validator.rs to simplify the handling of various validators, their types, and their labels.
  4. Update to latest version of rustls
  5. Add three more spl args I had left out
  6. Make secret creation more flexible

@gregcusack gregcusack changed the title add bootstrap kubernetes labels add bootstrap kubernetes labels and validator management structs Apr 19, 2024
@gregcusack gregcusack requested review from yihau and joncinque and removed request for joncinque and yihau April 22, 2024 16:53
Copy link
Member

@yihau yihau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🪖

@gregcusack gregcusack merged commit 87712b9 into anza-xyz:main Apr 23, 2024
2 checks passed
Copy link

@joncinque joncinque left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some small comments, this is all going in the right direction!

Comment on lines +231 to +235
validators
.into_iter()
.collect::<Vec<_>>() // Collect into Vec and thread push
.par_iter()
.try_for_each(|validator| Self::push_image(validator.image()))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since push_image is using Command::spawn internally, it creates a new process anyway, so the par_iter is a little overkill. You could have push_image return the process::Child and then iterate through them here with wait_with_output, and completely remove rayon

Comment on lines +31 to +42
let mut data: BTreeMap<String, ByteString> = BTreeMap::new();
for (label, value) in secrets {
match value {
SecretType::Value { v } => {
data.insert(label, ByteString(v.into_bytes()));
}
SecretType::File { path } => {
let file_content = std::fs::read(&path)
.map_err(|err| format!("Failed to read file '{:?}': {}", path, err))?;
data.insert(label, ByteString(file_content));
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you might be able to do this in one statement by collecting into a BTreeMap, ie:

    let data = secrets.into_iter().map(|(label_value)| ... (label, ByteString(v.into_bytes()))).collect::<BTreeMap<_, _>>();

secret_name: &str,
key_files: &[(PathBuf, &str)], //[pathbuf, key type]
secrets: HashMap<String, SecretType>,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Since Secret takes a BTreeMap, you could just pass one in here directly

Comment on lines +61 to +86
#[derive(Debug)]
struct DockerPushThreadError {
message: String,
}

impl Display for DockerPushThreadError {
fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
write!(f, "{}", self.message)
}
}

impl Error for DockerPushThreadError {}

impl From<String> for DockerPushThreadError {
fn from(message: String) -> Self {
DockerPushThreadError { message }
}
}

impl From<&str> for DockerPushThreadError {
fn from(message: &str) -> Self {
DockerPushThreadError {
message: message.to_string(),
}
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: It might be easier to use thiserror https://docs.rs/thiserror/latest/thiserror/ to define this error. It could become as simple as:

#[derive(Debug, Error)]
#[error(transparent)]
struct DockerPushThreadError(#[from] String);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am learning so many things about rust! thanks Jon!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

turns out i can actually just fully remove this whole thing based on the suggested change to push_image()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even better, no code is the best code 🧘

@@ -140,12 +179,13 @@ async fn fetch_program(
) -> Result<(), Box<dyn std::error::Error>> {
let so_filename = format!("spl_{}-{}.so", name.replace('-', "_"), version);
let so_path = solana_root_path.join(&so_filename);
let so_name = format!("spl_{}.so", name.replace('-', "_"));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry this is all a bit confusing!

Copy link
Collaborator Author

@gregcusack gregcusack Apr 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ya it's a little weird since the url we are getting uses the GenesisProgram name with both hyphens and underscores

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant that more as "sorry that we made this confusing for you" 😅

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ohhhhh haha ok nvm lol

// 3) RPC Node -> One image for each RPC node (not implemented yet)
// 4) Clients -> Each client has its own image (not implemented yet)

pub struct Library {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why the name Library? Seems like this is just all the machines / nodes / validators, so maybe a clearer name would help?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ya def not the best name. i will change.

// 4) Clients -> Each client has its own image (not implemented yet)

pub struct Library {
validators: Vec<Option<Validator>>,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit strange... how about separating the bootstrap validator into an Option<Validator> and then having validators as a Vec<Validator>?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it is a little weird. but it's basically used to store images and associated data. bootstrap, standard, rpc validators all use their own image for all validator types deployed. So a bootstrap, standard validator, and RPC would each only appear once in this Library. But for a Client, each client uses a different image, so that is why there is a vector of clients. This was the desired state with everything implemented: https://github.com/anza-xyz/validator-lab/pull/36/files#diff-47af127d675e3dc21b43c34222f42b9290334e88eb5c7ae0fc8b7539e3ecb794. But it may not be super clear.

i can change this to something like:

pub struct ValidatorImages { //struct name tbd
    bootstrap: Option<Validator>,
    standard: Option<Validator>,
    rpc: Option<Validator>,
    client: Vec<Validator>,
}

Note: we don't need a vector for standard or rpc since we only need one image per set of validators. Although i could change this if we want to store refactor a little more.

Do something like:

pub struct Cluster { //struct name tbd
    bootstrap: Option<Validator>,
    standard: Vec<Validator>,
    rpc: Vec<Validator>,
    client: Vec<Validator>,
}

and then import the kubernetes.rs functionality here. so in main we could do something like:

cluster.create_and_deploy_validators()

^ and this would create all the labels, selectors, secrets, replicasets, etc for the non-bootstrap validators. and then would deploy all the validators.

And can do the same for bootstrap, rpc, clients.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that first option, it seems clearest to me at least.

And maybe we can split the difference on the name, and go with ClusterImages since the client isn't a validator? But I like both of the options 😄

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok cool sounds good! and yes true


pub struct Library {
validators: Vec<Option<Validator>>,
_clients: Option<Vec<Validator>>,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it's up to you, but the Option part might be confusing, since there are two ways to represent no clients. Meaning, what's the difference between None and Some(vec![])?

gregcusack added a commit to gregcusack/validator-lab that referenced this pull request Apr 25, 2024
gregcusack added a commit to gregcusack/validator-lab that referenced this pull request Apr 25, 2024
gregcusack added a commit to gregcusack/validator-lab that referenced this pull request Apr 25, 2024
gregcusack added a commit that referenced this pull request Apr 29, 2024
* wip. adding replicaset but need validator config flags

* add validator config

* add pod requests. need for scheduling

* create bootstrap validator replicaset

* update token-2022 to v1.0.0: anza-xyz/agave#995

* address jon nits in #9 part 1.

* jon comments from: #9 part 2. rename Library to ClusterImages and update struct

* address Jon nits
@gregcusack gregcusack deleted the bootstrap-labels branch May 24, 2024 19:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants