Two requests about eazypy architecture for cloud computing and scalability #39

I am a principal scientist at a Korean astronomy institute, with a particular interest in applying big data technologies to astronomical problems.

I ran into two issues when trying to run eazypy on my Spark cluster.

[1] Local file access for filters and parameters
When running programs in the cloud, we do not have a local file system; instead we have a "bucket", i.e. cloud storage.
Hence, all filters and SED parameters need to be "in-memory" objects or "cloud-storable" objects.

Your approach of using symbolic links is not friendly for running eazypy in the cloud or on a big data platform.
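To illustrate what I mean by "in-memory" objects, here is a minimal sketch that builds filter curves from bytes already held in memory (for example, bytes fetched from a bucket), without touching a local path or a symbolic link. Everything in it is hypothetical: `FilterCurve`, `parse_filters`, and the simple text layout (a `# name` header followed by `wavelength throughput` rows) are my own assumptions for illustration, not eazypy's API or its actual filter file format.

```python
import io
from dataclasses import dataclass, field


@dataclass
class FilterCurve:
    """Hypothetical in-memory filter curve (not an eazypy class)."""
    name: str
    wavelength: list = field(default_factory=list)
    throughput: list = field(default_factory=list)


def parse_filters(raw: bytes) -> list:
    """Parse filter curves from bytes, with no local file access.

    Assumed layout (illustrative only): a '# name' line starts a new
    filter, and each following line holds 'wavelength throughput'.
    """
    filters = []
    for line in io.StringIO(raw.decode("utf-8")):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("#"):
            filters.append(FilterCurve(name=line[1:].strip()))
            continue
        if not filters:
            raise ValueError("data row found before any filter header")
        wave, thru = line.split()
        filters[-1].wavelength.append(float(wave))
        filters[-1].throughput.append(float(thru))
    return filters


# Stand-in for bytes read from cloud storage by whatever client is in use.
EXAMPLE_BYTES = b"# filter_a\n4000 0.1\n5000 0.9\n# filter_b\n6000 0.5\n"

filters = parse_filters(EXAMPLE_BYTES)
```

Because the parser only needs bytes, the same object can be created on any worker node and serialized along with the job, which is the property I am asking for.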


[2] Your hard-wired single-node + multi-thread optimization
Unfortunately, I have found that many astronomical tools are hard-optimized for "single node" + "multi-thread" execution.

This specific optimization is not good for writing "scalable" code.

A simple single-thread, one-by-one SED-fitting architecture, rather than loading thousands of objects and running them on multiple threads,
could be enough to massively parallelize the code across thousands or millions of threads simultaneously on a cluster of hundreds of nodes using a big data platform.
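As a rough sketch of the architecture I have in mind, the example below fits one object at a time with a pure, single-threaded function and leaves the parallelism to whatever maps that function over the catalog. The function `fit_one`, the toy templates, and the scaled-template chi-square are hypothetical stand-ins of my own, not eazypy's fitting algorithm.

```python
from functools import partial

# Toy template fluxes in three bands (made-up numbers for illustration).
TEMPLATES = {
    "flat": [1.0, 1.0, 1.0],
    "red": [1.0, 2.0, 3.0],
}

# Toy catalog: each object is an independent record.
OBJECTS = [
    {"id": 1, "flux": [2.0, 4.0, 6.0], "err": [1.0, 1.0, 1.0]},
    {"id": 2, "flux": [3.0, 3.0, 3.0], "err": [1.0, 1.0, 1.0]},
]


def fit_one(obj, templates):
    """Fit a single object: pick the template with the lowest chi-square.

    For each template the best amplitude is solved analytically, so the
    function is stateless, single-threaded, and depends only on its inputs.
    """
    flux, err = obj["flux"], obj["err"]
    best = None
    for name, model in templates.items():
        num = sum(f * m / e**2 for f, m, e in zip(flux, model, err))
        den = sum(m * m / e**2 for m, e in zip(model, err))
        amp = num / den
        chi2 = sum(((f - amp * m) / e) ** 2 for f, m, e in zip(flux, model, err))
        if best is None or chi2 < best["chi2"]:
            best = {"id": obj["id"], "template": name, "amp": amp, "chi2": chi2}
    return best


# Local run: a plain sequential map over the objects, one by one.
results = list(map(partial(fit_one, templates=TEMPLATES), OBJECTS))
```

Since `fit_one` has no shared state, the same function could in principle be handed to a distributed map (for instance an RDD `map` in Spark) instead of the built-in `map`, and the platform would decide how many nodes and threads to use.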

=== 
I do not know whether this can be applied or not, but single-node + multi-thread optimization is not good for either a simple single-thread run or a massive multi-node run.
