Convert golden samples to arviz IData #225

Open
feynmanliang opened this issue Feb 12, 2021 · 3 comments

@feynmanliang

ArviZ provides nice visualization tools for posterior samples, but the current golden samples data format requires a pretty long dance to get it into an `InferenceData`. For example, I am currently doing this to convert the eight schools golden samples into a format ArviZ can visualize:

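import arviz as az
import numpy as np

# my_pdb is assumed to be an already-constructed PosteriorDatabase
posterior = my_pdb.posterior("eight_schools-eight_schools_centered")
gs = posterior.reference_draws()  # one dict per chain: name -> list of draws
data = posterior.data  # assumed source of the otherwise undefined `data`
gs_dict = {}
num_chains = len(gs)
num_samples = len(gs[0][next(iter(gs[0]))])

for i, chain in enumerate(gs):
    for var in chain:
        if '[' not in var:
            # scalar parameter, e.g. "mu"
            if var not in gs_dict:
                gs_dict[var] = np.zeros((num_chains, num_samples))
            gs_dict[var][i, :] = np.array(chain[var])
        else:
            # vector element, e.g. "theta[3]"; the index in the key is 1-based
            name = var.split('[')[0]
            idx = int(var.split('[')[1].split(']')[0]) - 1
            if name not in gs_dict:
                # match on name + '[' so a longer name sharing the prefix is not counted
                var_size = sum(key.startswith(name + '[') for key in chain)
                gs_dict[name] = np.zeros((num_chains, num_samples, var_size))
            gs_dict[name][i, :, idx] = np.array(chain[var])

gs_idata = az.convert_to_inference_data(
    gs_dict,
    coords={"school": np.arange(data.values()['J'])},
    dims={
        "theta": ["school"],
    },
)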

Is there an easier way to do this that I am missing? If not, would it be worthwhile to package something like this up as a library method (or could .reference_draws() return an InferenceData with the chain/draw/other dimensions set up)?
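
For concreteness, here is a rough sketch of what such a helper might look like. The function names `stack_reference_draws` and `reference_draws_to_idata` are made up for illustration and are not part of posteriordb or ArviZ. It assumes the draws arrive as a list of per-chain dicts with keys like `mu` or `theta[1]` (1-based, comma-separated for multiple dimensions), and that the installed ArviZ version accepts `az.from_dict(posterior=...)`.

```python
import re

import numpy as np

# Matches keys such as "theta[3]" or "Sigma[1,2]"; plain names like "mu" do not match.
_INDEXED = re.compile(r"^(?P<name>[^\[\]]+)\[(?P<idx>\d+(?:,\d+)*)\]$")


def stack_reference_draws(chains):
    """Hypothetical helper: list of per-chain dicts -> {name: array(chain, draw, *shape)}."""
    # First pass: infer each variable's shape from the largest 1-based index seen.
    shapes = {}
    for key in chains[0]:
        m = _INDEXED.match(key)
        if m is None:
            shapes.setdefault(key, ())
        else:
            idx = tuple(int(s) for s in m["idx"].split(","))
            old = shapes.get(m["name"], (0,) * len(idx))
            shapes[m["name"]] = tuple(max(a, b) for a, b in zip(old, idx))

    n_chains = len(chains)
    n_draws = len(next(iter(chains[0].values())))
    # NaN-filled so that any element missing from the input is easy to spot.
    out = {
        name: np.full((n_chains, n_draws) + shape, np.nan)
        for name, shape in shapes.items()
    }

    # Second pass: copy every column of draws into its slot.
    for c, chain in enumerate(chains):
        for key, draws in chain.items():
            m = _INDEXED.match(key)
            if m is None:
                out[key][c, :] = draws
            else:
                idx = tuple(int(s) - 1 for s in m["idx"].split(","))
                out[m["name"]][(c, slice(None)) + idx] = draws
    return out


def reference_draws_to_idata(chains, coords=None, dims=None):
    """Hypothetical wrapper that hands the stacked arrays to ArviZ."""
    import arviz as az  # imported lazily so the stacking step only needs NumPy

    return az.from_dict(
        posterior=stack_reference_draws(chains), coords=coords, dims=dims
    )
```

With something like this, the eight schools conversion above would reduce to `reference_draws_to_idata(gs, coords={"school": np.arange(8)}, dims={"theta": ["school"]})`.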

@ahartikainen
Collaborator

I did this some time ago (it uses from_dict)

https://gist.github.com/ahartikainen/ca4ec935c78c56e2d352b8d34a286fd0

Not sure if posteriordb will add this kind of functionality; ArviZ might be a better place for it.

@feynmanliang
Author

Thanks :) I'll go link this issue over there

@MansMeg
Collaborator

MansMeg commented Feb 13, 2021

Great, @ahartikainen. This structure is based on the structure used by the posterior R package, and I use it to read and write the JSON posteriors. So I guess the conversion would probably fit in the ArviZ package, although I think it would be good for reading in the gold standard to be part of the Python posteriordb library.
