
Roadmap for Chunked Inference Approach #74

kevinkhu opened this issue Mar 6, 2017 · 1 comment
kevinkhu commented Mar 6, 2017

This will serve as a roadmap for the implementation of the "chunking" approach, which will eventually become a major mode of operation for Starfish. The rationale for doing this is that currently, fitting of spectral model grids (for main sequence stars at high resolution, cool stars at high and low resolution, and exoplanet spectra in general) is generally a systematics-dominated problem. This means that there are wavelength regions of the data spectrum for which the model grids cannot produce an accurate model. This also implies that if our main goal is inference of accurate stellar parameters, then a main focus will need to be a "calibration" of these systematic effects. Note that in some sense this roadmap supersedes that of #58, although the ideas in that roadmap are mostly complementary to those presented here.

Instead of fitting the full spectrum and downweighting discrepant regions (as in "classic" Starfish), we propose to fit individual chunks of the spectrum at a time. Chunking allows us to compare smaller regions of spectra to models and identify more easily where models are inaccurate.

The chunking approach will function by segmenting the spectrum into independent regions, where spectral inference to determine the fundamental stellar properties (Teff, log g, [Fe/H], etc.) is done on each chunk independently. In an obvious sense, this violates much of what we know about stellar astrophysics: the emergent spectrum is the realization of complex stellar astrophysics, and each spectral line is by no means physically independent from the others. However, since we are dealing with strong model systematics (e.g., some spectral lines simply do not fit the data for any combination of Teff, log g, [Fe/H]), this approach allows us to get a better lay of the land and provides a groundwork for exploring which regions of the spectrum we can trust and which ones we should be skeptical of.

There are a few tasks that need to be addressed in order to implement this approach.

Setup and initialization

First, the user should be able to take a model grid, a data spectrum, and a list of chunk wavelength boundaries, and then run some scripts to segment the data up into individual chunks. The idea is that the inference on each chunk can be done completely independently from any other chunk, and so the scripts should be organized to run with that in mind. Once the posterior for each chunk is delivered, however, we will want tools that can pull the posteriors from each directory and plot them.

  • Given a user input of wavelength chunks, automatically create output subdirectories, labeled by chunk ID and wavelength boundaries
  • Segment the data and the model grids to a reasonable wavelength range (maybe +10% extra on either side of the edges) and effective temperature range (e.g., 2000K - 4000K), and copy them to each chunk directory in an appropriate HDF5 format.
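The padding and segmentation step above could be sketched as follows. This is a minimal illustration, not Starfish's actual interface: the function names `pad_bounds` and `segment` are hypothetical, and the +10% padding fraction is the value suggested above.

```python
import numpy as np

def pad_bounds(wl_start, wl_end, frac=0.10):
    """Widen a chunk's wavelength bounds by `frac` of the chunk span
    on each side (the roadmap suggests ~+10% extra on either edge)."""
    span = wl_end - wl_start
    return wl_start - frac * span, wl_end + frac * span

def segment(wl, fl, wl_start, wl_end, frac=0.10):
    """Return the portion of a (wavelength, flux) spectrum that falls
    inside the padded chunk bounds, ready to be written out to HDF5."""
    lo, hi = pad_bounds(wl_start, wl_end, frac)
    mask = (wl >= lo) & (wl <= hi)
    return wl[mask], fl[mask]
```

The same masking logic would apply to trimming the model grid in wavelength, with an analogous cut on the grid's effective temperature axis.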

There are a few considerations to take care of here. The individual chunk directories will be labeled as chunkID_wlstart_wlend, appropriately zero-padded so that there are no naming conflicts when using typically sized chunks from optical to infrared wavelengths.
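A possible naming helper is sketched below; the field widths are illustrative choices sized for Angstrom-scale wavelengths from the optical through the infrared, not a settled convention.

```python
def chunk_dirname(chunk_id, wl_start, wl_end):
    """Build a zero-padded 'chunkID_wlstart_wlend' directory label.
    Widths (3 digits for the ID, 6 for wavelengths in Angstroms) are
    hypothetical but cover optical-to-infrared chunks without conflicts."""
    return "{:03d}_{:06d}_{:06d}".format(chunk_id, round(wl_start), round(wl_end))
```

For example, chunk 7 spanning 6900-7000 AA would land in `007_006900_007000`, which sorts lexicographically in chunk order.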

Also, we need an easy way to regenerate the sub-directories if the chunk wavelength boundaries change. There should also be a way to select an individual sub-directory and regenerate just that one. For these reasons, we are thinking that an individual Makefile within each subdirectory might be the best option.
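A per-chunk Makefile might look like the following sketch. The target names and the `segment_chunk.py` script are hypothetical placeholders (only `star_chunk.py` is named in this roadmap); the point is that `make` tracks the dependency from segmented data to posterior samples, so rerunning after a boundary change only rebuilds what is stale.

```make
# Hypothetical per-chunk Makefile; script and file names are illustrative.
all: samples.npy

# Re-segment the data and model grid for this chunk's boundaries.
data.hdf5:
	python segment_chunk.py --config chunk.yaml

# Sample the mini-posterior for this chunk (depends on the segmented data).
samples.npy: data.hdf5
	python star_chunk.py --config chunk.yaml

clean:
	rm -f data.hdf5 samples.npy
```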

Tasks within each sub-directory

  • Set up an emulator to run on this chunk of the model (e.g., 2000K - 4000K, 6900AA - 7000AA)
  • Launch star_chunk.py to sample on this individual chunk, creating samples of a mini-posterior.

Are there any necessary changes that need to be made for the emulator? Currently nothing major comes to mind, but I could be forgetting something.

Necessary improvements to star_chunk.py

Note that these mini-tasks can also be launched en masse by a top-level bash script.

Inference

  • Read in "mini-posteriors" for individual chunks and then plot them.
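Gathering the mini-posteriors could be as simple as the sketch below. It assumes each chunk directory holds its samples as a `samples.npy` array of shape (n_samples, n_params); that file name and layout are hypothetical, not Starfish's actual output format.

```python
from pathlib import Path
import numpy as np

def gather_mini_posteriors(root, fname="samples.npy"):
    """Collect per-chunk posterior samples into a dict keyed by the
    chunk directory name (e.g., '007_006900_007000')."""
    posteriors = {}
    for sub in sorted(Path(root).iterdir()):
        f = sub / fname
        if sub.is_dir() and f.exists():
            posteriors[sub.name] = np.load(f)
    return posteriors
```

With the samples in hand, overlaid kernel density estimates of, say, Teff and log g per chunk (as in the figure below from gully) would make the chunk-to-chunk scatter immediately visible.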
@iancze iancze changed the title Spectrum chunking Roadmap for Chunked Inference Approach Mar 6, 2017
gully commented Mar 7, 2017

Below is an illustrative figure of the types of inferences we will be able to achieve in the spectral chunking strategy.

These are posterior samples of T_hot and log g for LkCa 4 (Gully-Santiago et al. 2017), from Starfish fits to IGRINS spectral orders m = 101 (blue kernel density estimate), m = 114 (red KDE), and m = 117 (green KDE).

[Figure: temp_logg_example — overlaid KDEs of the T_hot and log g posteriors for the three orders]
