IMPORTANT: Wrong lat, long values. #43
Comments
Hi @JuanCalvoFerrandiz, thanks again for your bug report. Would you mind detailing which data source returns these values? Thanks! |
I hope this data helps :) |
This has already been reported to CoronaDataScraper: covidatlas/coronadatascraper#528. During the afternoon I will try to find more cases using the data you provided, and I will also send them the visualization you made in case it helps them. |
Hi guys, this is my exploration code for fixing the viz: aggregation, lat/long, adding ISO 3 codes, and adding an official name column. Hope that helps:
import task_geo.data_sources as ds
import pandas as pd


# Return the sorted unique values of a column of df as a one-column DataFrame
def series_unique(df, column):
    unique_values = df.loc[:, column].unique()
    return pd.DataFrame(data=unique_values,
                        columns=["unique_" + column]).sort_values("unique_" + column, ignore_index=True)


# Build a dict mapping each df_carto country name to its value in the given column
# (relies on the module-level df_carto and df_unique_country_cl defined further down)
def create_dict(column):
    mapping = {}
    for value in df_unique_country_cl.loc[:, "unique_country"]:
        mapping[value] = df_carto.loc[df_carto['country'] == value, column].iloc[0]
    return mapping


# 0_Correction of aggregate values in countries
data_cds = ds.cds()
# Rows with no state, county or city are country-level aggregates
country_rows = data_cds["state"].isnull() & data_cds["county"].isnull() & data_cds["city"].isnull()
data_cds.loc[country_rows, "aggregate"] = "country"
# Getting unique values from country column
data_cds_country_raw = data_cds.loc[(data_cds["aggregate"] == "country")]
df_unique_country = series_unique(data_cds_country_raw, "country")
# Getting df_carto (raw string so the backslashes in the path are not read as escapes)
df_carto = pd.read_csv(r"..\DATA\RAW\Countries data\world_borders.csv", sep=",")
df_carto.rename(columns={"name": "country"}, inplace=True)
# 1_Getting country_carto column
# Getting unique values from country column
df_unique_country_cl = series_unique(df_carto, "country")
# Getting values with no direct equivalence in df_carto
df_left = df_unique_country.merge(df_unique_country_cl, how='outer', indicator=True).loc[
    lambda x: x['_merge'] == 'left_only']
cds_names = df_left.loc[:, "unique_country"]
# Equivalent df_carto names, listed in the same order as cds_names
carto_names = ["Brunei Darussalam", "Congo", "Czech Republic", "Cote d'Ivoire", "Timor-Leste", "Swaziland",
               "Iran (Islamic Republic of)", "Kosovo", "Lao People's Democratic Republic", "Libyan Arab Jamahiriya",
               "Republic of Moldova", "Burma", "The former Yugoslav Republic of Macedonia", "Palestine",
               "Western Sahara", "Korea, Democratic People's Republic of", "South Sudan", "Syrian Arab Republic",
               "Sao Tome and Principe", "United Republic of Tanzania", "Bahamas", "Gambia", "Holy See (Vatican City)",
               "Viet Nam"]
# Pair the two sequences by position and build a dict (both must be in the same order);
# named country_map so the built-in dict and list are not shadowed
country_map = dict(zip(cds_names, carto_names))
data_cds.insert(4, "country_carto", data_cds.loc[:, "country"].map(country_map).fillna(data_cds.loc[:, "country"]))
# 2_Getting iso
dict_iso = create_dict("iso3")
dict_iso["Kosovo"] = "RKS"
dict_iso["South Sudan"] = "SSD"
data_cds.insert(5, "iso3", data_cds.loc[:, "country_carto"].map(dict_iso))
# Country-level rows only (copied so the new lat/long columns below do not trigger SettingWithCopyWarning)
data_cds_country = data_cds.loc[data_cds["aggregate"] == "country"].copy()
# 3_Getting lat just in countries
dict_lat = create_dict("lat")
dict_lat["Kosovo"] = 42.667542
dict_lat["South Sudan"] = 6.8769908
data_cds_country['lat'] = data_cds_country.loc[:, "country_carto"].map(dict_lat)
# 4_Getting long just in countries
dict_long = create_dict("lon")
dict_long["Kosovo"] = 21.166191
dict_long["South Sudan"] = 31.3069782
data_cds_country['long'] = data_cds_country["country_carto"].map(dict_long)
data_cds_country.to_csv(r"C:\Users\juanc\Google Drive\CORONAWHY\DATASETS\data_cds_countries.csv", encoding="UTF-8")
[world_borders.zip](https://github.com/CoronaWhy/task-geo/files/4456383/world_borders.zip) |
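The positional `zip` in the snippet above only works while the unmatched CDS names come out in exactly the same order as the hand-written list. A minimal sketch of a less order-dependent variant follows. It is an illustration, not the code used in task-geo: the two small frames are made-up stand-ins for `data_cds` and `df_carto`, and `name_map` is a hypothetical name.

```python
import pandas as pd

# Hypothetical sample frames standing in for data_cds and df_carto.
cds = pd.DataFrame({"country": ["Spain", "Vietnam", "Iran"]})
carto = pd.DataFrame({
    "country": ["Spain", "Viet Nam", "Iran (Islamic Republic of)"],
    "iso3": ["ESP", "VNM", "IRN"],
})

# Explicit mapping: each CDS name is paired with its carto name by key,
# not by its position in two parallel lists.
name_map = {"Vietnam": "Viet Nam", "Iran": "Iran (Islamic Republic of)"}
cds["country_carto"] = cds["country"].map(name_map).fillna(cds["country"])

# Left merge with indicator=True to check that every CDS row found a match.
merged = cds.merge(carto.rename(columns={"country": "country_carto"}),
                   on="country_carto", how="left", indicator=True)
unmatched = merged.loc[merged["_merge"] == "left_only", "country"].tolist()
```

If `unmatched` is not empty, the listed names still need an entry in `name_map`, which makes a newly added or renamed country visible instead of silently shifting the pairing.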
While reading the docs I came to the realization that the values of the field [...]. Will upload this along with the addition of the ISO codes. |
Update from CDS team:
|
Commit SHA:
commit c4af4d7 (HEAD -> master, origin/master, origin/HEAD)
Merge: f23824d 3de7d90
Author: Manuel Alvarez Campo
Date: Sat Apr 4 13:48:16 2020 +0200
Merge pull request #35 from shaikh-raj/master
Adding metadata for CDS datasource
Python version: 3.7
Operating System: Windows
Data source: cds
Description
Please review the lat and long values; some countries have wrong coordinates. I have made this viz to help visualize the situation. Click a dot to see its country, lat, and long values:
https://juancalvo.carto.com/builder/3ad41c17-bc07-4889-b047-5903300806c4/embed
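Besides checking the map by eye, a simple range check can narrow down which rows to look at first. The sketch below is only an illustration: it assumes rows with `country`, `lat` and `long` fields as in the CDS output, and the helper name `flag_suspicious` and the sample rows are hypothetical.

```python
def flag_suspicious(rows):
    """Return (country, reason) pairs for coordinates that cannot be valid."""
    flagged = []
    for row in rows:
        lat, lon = row["lat"], row["long"]
        if lat is None or lon is None:
            flagged.append((row["country"], "missing"))
        elif not -90 <= lat <= 90:
            # A latitude outside [-90, 90] often means lat and long were swapped.
            flagged.append((row["country"], "lat out of range"))
        elif not -180 <= lon <= 180:
            flagged.append((row["country"], "long out of range"))
    return flagged


# Made-up rows for illustration; these are not values from the CDS data.
sample = [
    {"country": "Spain", "lat": 40.4, "long": -3.7},
    {"country": "Fiji", "lat": 178.0, "long": -17.7},   # swapped pair
    {"country": "Nowhere", "lat": None, "long": 10.0},  # missing value
]
result = flag_suspicious(sample)
```

This only catches impossible values. A coordinate that is in range but points to the wrong place still needs a comparison against a reference such as the world borders file.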