How to generate Data

galabc edited this page Aug 6, 2018 · 5 revisions

Generating data consists of running HPX for_each loops and measuring the execution times for many candidate parameter values. Currently the only parameter supported is the chunk size.

Static feature extraction

To extract static features, the Clang tool LoopConvert is run on the files located in algorithms/static/. The files are named Loop_Level_N.cpp, where N is the deepest loop level of the implemented algorithm. To implement a new algorithm, add the following template to the appropriate .cpp file:

void Name(int iterations, std::vector<double> chunk_candidates) {

    <Insert your variables here>

    auto f = [&](int i){

        <Insert your function here>

    };
    // feature extraction Name
    // hpx::parallel::for_each(hpx::parallel::execution::par.with(hpx::parallel::adaptive_chunk_size()), time_range.begin(), time_range.end(), f);
}
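For example, a simple vector-add kernel could be dropped into the template like this (the function name, array names, and size are illustrative; the for_each stays commented out, as in the template, since this file is only parsed for static features):

```cpp
#include <vector>

void VectorAdd(int iterations, std::vector<double> chunk_candidates) {

    // <Insert your variables here>
    int vector_size = 1000;  // illustrative problem size
    std::vector<double> a(vector_size, 1.0);
    std::vector<double> b(vector_size, 2.0);
    std::vector<double> c(vector_size, 0.0);

    // <Insert your function here>
    auto f = [&](int i){
        c[i] = a[i] + b[i];
    };

    // feature extraction VectorAdd
    // hpx::parallel::for_each(hpx::parallel::execution::par.with(hpx::parallel::adaptive_chunk_size()), time_range.begin(), time_range.end(), f);
}
```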

Dynamic feature extraction and execution time measurement

To extract the dynamic features of the loops, compile main.cpp. This file runs the functions located in the algorithms/dynamic_and_execution_times folder. The files are named Loop_Level_N.cpp, where N is the deepest loop level of the implemented algorithm. To implement a new algorithm, add the following template to the appropriate .h file:

void Name(int iterations, std::vector<double> chunk_candidates) {

    <Insert your variables here>

    auto f = [&](int i){

        <Insert your function here>

    };

    std::cout << vector_size << " " << hpx::get_os_thread_count() << " ";
    auto time_range = boost::irange(0, vector_size);
    double t_chunk = 0.0;
    int Nrep = 10;
    double mean_time;
    double elapsed_time;
    for (std::size_t i = 0; i < chunk_candidates.size(); i++) {
        mean_time = 0;
        // Nrep + 1 runs: the first run (j == 0) warms up the caches
        // and is excluded from the mean.
        for (int j = 0; j < Nrep + 1; j++) {
            if (chunk_candidates[i] * vector_size > 1) {
                t_chunk = mysecond();
                hpx::parallel::for_each(hpx::parallel::execution::par.with(hpx::parallel::execution::dynamic_chunk_size(vector_size * chunk_candidates[i])), time_range.begin(), time_range.end(), f);
                elapsed_time = mysecond() - t_chunk;
            }
            else {
                t_chunk = mysecond();
                hpx::parallel::for_each(hpx::parallel::execution::par.with(hpx::parallel::execution::dynamic_chunk_size(1)), time_range.begin(), time_range.end(), f);
                elapsed_time = mysecond() - t_chunk;
            }
            if (j != 0) {
                mean_time += elapsed_time;
            }
        }
        std::cout << mean_time / Nrep << " ";
    }
    std::cout << std::endl;
}

The vector chunk_candidates is defined in main.cpp.
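For reference, that definition might look like the following. The specific candidate values here are illustrative; check main.cpp for the actual list used to generate training data:

```cpp
#include <vector>

// Candidate chunk sizes, expressed as fractions of the loop's
// iteration count. Illustrative values only; see main.cpp.
std::vector<double> chunk_candidates = {0.0001, 0.001, 0.01, 0.05, 0.1, 0.25, 0.5};
```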

Automatically generate data

Data can be generated with the sbatch script train.sbatch and a text file that lists the experiments you want to run. This is the template for the text files containing experiments:

<Function Name>,<Header File Number>,<Number of iterations>,<Number of threads>
...
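For instance, a hypothetical entry for a function Name, assuming the header file number refers to the N in Loop_Level_N.h, run for 1000 iterations on 16 threads, would read:

```
Name,1,1000,16
```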

To launch the data generation process on the experiments listed in a file called experiments.txt, simply run:

sbatch train.sbatch training_list_experiments/experiments.txt