feat(provider): add gcp provider with data #372

jonperron · 2025-02-04T16:53:05Z

Changelog

Update providers.csv
Add data from gcp with documentation links

Fix #145

jonperron · 2025-02-04T16:54:29Z

Two remaining questions:

What "platform" means in this context ? If this is the CPU manufacturers, then Google is sharing a lot of info (cf https://cloud.google.com/compute/docs/cpu-platforms?hl=en)
GCP has a couple of GPU machine types (https://cloud.google.com/compute/docs/gpus?hl=en), with a range of vcpus and ram. How should they be taken into account ?

da-ekchajzer · 2025-02-04T17:18:41Z

What "platform" means in this context ?

The platform is the bare metal server that hosts the instance. Since we are calculating the impact of the instance as part of the bare metal server, we need to know its architecture. The platform field must match one of the id's of the available server archetypes. So you should add a platform name that matches a platform you are going to create in boaviztapi/data/archetypes/server.csv. This documentation may help you:

It is always difficult to get all the data, so sometimes you will have to guess. If so, you should mention it in the warning column. I recommend that you set the CPU name so that the other values are automatically filled in at runtime. If you do this, make sure the CPU is listed in the CPU data file. If it is not listed, you should add it so that everyone can benefit from it: https://doc.api.boavizta.org/contributing/cpu/.

The most important values are :

CPU.vcpu
case_type (always rack in cloud contexte)
CPU.units
SSD.units
SSD.capacity
HDD.units
HDD.capacity

For POWER_SUPPLY and USAGE feel free to use the values from the other providers.

GCP has a couple of GPU machine types (https://cloud.google.com/compute/docs/gpus?hl=en), with a range of vcpus and ram. How should they be taken into account ?

We believe that GPUs are never shared between instances, so there is no need to include the resources in the instance file. However, you can add the data to the server.csv. For know we are not calculating the impact of GPU but we will be soon so I recommend you to add it.

Hope this is clear, if not ping me on mattermost.

da-ekchajzer · 2025-02-09T17:40:14Z

I dont know if you are aware but we use the ";" as a separator in the CSV to distinguish between average;min;max value of a parameter. I just saw that you replaced them with "," in server.csv file. Just to confirm, it is normal to have ;. It can be useful if you don't know an exact value but prefer to give a range. The API will calculate the min and max values.

jonperron · 2025-02-10T07:08:04Z

Thanks for the quick review @da-ekchajzer. I am filling the csv file using open office, that's why it's filled with ",". I will make a fix once I'm done 👍

boaviztapi/data/archetypes/server.csv

jonperron · 2025-03-01T10:07:25Z

PR ready for review @da-ekchajzer 👍

da-ekchajzer · 2025-03-01T15:08:45Z

Well done! Were you able to automate it or did you do it manually?

At first glance, I see strange numbers:

302x4-megamem-1440-metal 1,4424,576000x4-megamem-1440-metalhttps://cloud.google.com/compute/docs/memory-optimized-machines?hl=en#x4_machine_types

I think the "," should be removed as it seems to be a thousand separator.

Other remark, are these only instances with SSD storage? No HDD?

da-ekchajzer · 2025-03-01T15:10:12Z

Ideally we should implement a test that computes all instances. But I should open a topic dedicated to this.

jonperron · 2025-03-02T14:22:21Z

Were you able to automate it or did you do it manually?

I did it manually, but I could add a script as it seems that most of the information are stored in HTML tables.

I think the "," should be removed as it seems to be a thousand separator.

Indeed, I have double-checked and removed the thousand separators.

are these only instances with SSD storage? No HDD?

As I understood the documentation, most of the machines does not come SSD nor HDD. One needs to rent a hard drive when creating a machine. In addition, some of the types such as n4 does not support persistent or local SSD, whereas others support a lot (see https://cloud.google.com/compute/docs/general-purpose-machines?hl=en#c3_disks for example). That's why I only added hard drive capabilities when it is included in a machine such as c4a machines.

Would you do it otherwise ?

Changelog * Update providers.csv * Add data from gcp with documentation links * Update cpu_specs.csv * Update tests

da-ekchajzer · 2025-03-04T08:21:38Z

As I understood the documentation, most of the machines does not come SSD nor HDD. One needs to rent a hard drive when creating a machine. In addition, some of the types such as n4 does not support persistent or local SSD, whereas others support a lot (see https://cloud.google.com/compute/docs/general-purpose-machines?hl=en#c3_disks for example). That's why I only added hard drive capabilities when it is included in a machine such as c4a machines.

So you include ssd when it is explicitly a local SSD ? As I understand not all C4a have SSD.

We are planning to add storage as a service to get the impacts of additional storage

da-ekchajzer · 2025-03-04T08:22:35Z

Have you been able to test all instances through a script for instance ?

jonperron · 2025-03-04T09:56:34Z

So you include ssd when it is explicitly a local SSD ? As I understand not all C4a have SSD.

Yes exactly 👍 That's why there are two archetypes for a "standard c4a" server: c4a-standard and c4a-standard-lssd. The latest corresponds to instances with a SSD included.

Have you been able to test all instances through a script for instance ?

I will add a test 👍 I only added a single instance in the tests/data/archetype at the moment.

jonperron · 2025-03-08T08:02:56Z

I thought about this

@pytest.fixture
def valid_cpu_specs():
    with open(cpu_specs, 'r') as f:
        reader = csv.DictReader(f)
        return {row['name'] for row in reader}


SERVER_ARCHETYPE_MANDATORY_FIELDS = [
    "id",
    "case_type",
    "CPU.units", 
    "RAM.units",
    "SSD.units",
    "HDD.units",
    "GPU.units", 
    "POWER_SUPPLY.units",
    "POWER_SUPPLY.unit_weight",
    "USAGE.time_workload",
    "USAGE.use_time_ratio",
    "USAGE.hours_life_time",
    "USAGE.other_consumption_ratio"
]

def test_archetypes_properly_filled(providers, valid_platforms, valid_cpu_specs):
    for provider_name in providers:
        provider_csv_path = f"{cloud_path}/{provider_name}.csv"

        try:
            with open(provider_csv_path, 'r') as f:
                reader = csv.DictReader(f)
                for row in reader:
                    for field in SERVER_ARCHETYPE_MANDATORY_FIELDS:
                        if not row.get(field):
                            pytest.fail(f"Empty value for field '{field}' in provider '{provider_name}'")
                    if cpu:= row.get('CPU.name'):
                        if cpu not in valid_cpu_specs:
                            pytest.fail(f"CPU '{cpu}' for provider '{provider_name}' not found in cpu_specs.csv")

        except FileNotFoundError:
            pytest.fail(f"CSV file for provider '{provider_name}' not found: {provider_csv_path}")

(It's basically an update of the test test_platform_exists_in_server_csv with explicits checks on mandatory fields)
Is it what you're thinking about ? If not, I'llt catch you on mattermost so we can exchange about what you have in mind.

jonperron force-pushed the feat/boaviztapi-145 branch from 771b7de to 3b6c2c8 Compare February 4, 2025 16:55

jonperron marked this pull request as draft February 5, 2025 08:03

jonperron force-pushed the feat/boaviztapi-145 branch 2 times, most recently from 4a066ef to 098e421 Compare February 9, 2025 16:30

jonperron commented Feb 10, 2025

View reviewed changes

boaviztapi/data/archetypes/server.csv Outdated Show resolved Hide resolved

jonperron force-pushed the feat/boaviztapi-145 branch from 098e421 to f001f9f Compare February 15, 2025 14:39

jonperron force-pushed the feat/boaviztapi-145 branch 7 times, most recently from 0165bc2 to ae90d6f Compare March 1, 2025 10:05

jonperron marked this pull request as ready for review March 1, 2025 10:06

jonperron force-pushed the feat/boaviztapi-145 branch from ae90d6f to fbbfca7 Compare March 1, 2025 12:38

feat(provider): add gcp provider with data

1673ddd

Changelog * Update providers.csv * Add data from gcp with documentation links * Update cpu_specs.csv * Update tests

jonperron force-pushed the feat/boaviztapi-145 branch from fbbfca7 to 1673ddd Compare March 2, 2025 14:26

da-ekchajzer merged commit 88aacba into Boavizta:main Mar 22, 2025
3 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(provider): add gcp provider with data #372

feat(provider): add gcp provider with data #372

jonperron commented Feb 4, 2025 •

edited

Loading

jonperron commented Feb 4, 2025

da-ekchajzer commented Feb 4, 2025

da-ekchajzer commented Feb 9, 2025

jonperron commented Feb 10, 2025

jonperron commented Mar 1, 2025

da-ekchajzer commented Mar 1, 2025

da-ekchajzer commented Mar 1, 2025

jonperron commented Mar 2, 2025 •

edited

Loading

da-ekchajzer commented Mar 4, 2025

da-ekchajzer commented Mar 4, 2025 •

edited

Loading

jonperron commented Mar 4, 2025 •

edited

Loading

jonperron commented Mar 8, 2025

feat(provider): add gcp provider with data #372

feat(provider): add gcp provider with data #372

Conversation

jonperron commented Feb 4, 2025 • edited Loading

jonperron commented Feb 4, 2025

da-ekchajzer commented Feb 4, 2025

da-ekchajzer commented Feb 9, 2025

jonperron commented Feb 10, 2025

jonperron commented Mar 1, 2025

da-ekchajzer commented Mar 1, 2025

da-ekchajzer commented Mar 1, 2025

jonperron commented Mar 2, 2025 • edited Loading

da-ekchajzer commented Mar 4, 2025

da-ekchajzer commented Mar 4, 2025 • edited Loading

jonperron commented Mar 4, 2025 • edited Loading

jonperron commented Mar 8, 2025

jonperron commented Feb 4, 2025 •

edited

Loading

jonperron commented Mar 2, 2025 •

edited

Loading

da-ekchajzer commented Mar 4, 2025 •

edited

Loading

jonperron commented Mar 4, 2025 •

edited

Loading