
Commit

Updated tables
antongiacomo committed May 7, 2024
1 parent 3cd1541 commit 4d8605b
Showing 3 changed files with 5 additions and 3 deletions.
2 changes: 1 addition & 1 deletion experiment.tex
@@ -14,7 +14,7 @@ \subsection{Testing Infrastructure and Experimental Settings}\label{subsec:exper

The simulator then starts the instantiation process as shown in Figure~\ref{fig:execution_example}. At each step $i$, it selects the subset \{\vi{i},$\ldots$,$v_{\windowsize+i-1}$\} of vertices with their corresponding candidate services, and generates all possible service combinations. For each combination, the simulator calculates a given metric $M$ and selects the service that instantiates \vi{i} from the optimal combination according to $M$. The window is then shifted by one step (i.e., $i$=$i$+1) and the instantiation process restarts. When the sliding window reaches the end of the pipeline template, that is, $v_{\windowsize+i-1}$$=$$\vi{l}$, the simulator computes the optimal service combination and instantiates the remaining vertices with the corresponding services.
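The sliding-window step above can be sketched as follows. This is an illustrative Python sketch, not the simulator's actual implementation; the names `candidates`, `window_size`, and `metric` are assumptions, and `metric` is taken to score a tuple of services (higher is better).

```python
from itertools import product

def instantiate_pipeline(candidates, window_size, metric):
    """Sliding-window instantiation sketch.
    candidates: one list of candidate services per vertex v_1..v_l.
    metric: scores a tuple of services; the best-scoring combination wins.
    Returns one chosen service per vertex."""
    chosen = []
    i = 0
    l = len(candidates)
    # Shift the window one step at a time until it reaches the last vertex.
    while i + window_size < l:
        # Evaluate every service combination inside the current window...
        best = max(product(*candidates[i:i + window_size]), key=metric)
        # ...but commit only the service instantiating the first vertex v_i.
        chosen.append(best[0])
        i += 1
    # Final window (ends at v_l): instantiate all remaining vertices at once.
    best = max(product(*candidates[i:]), key=metric)
    chosen.extend(best)
    return chosen
```

Note that only the first vertex of each window is committed per step, so each service choice is informed by a lookahead of `window_size` vertices without enumerating the full exponential combination space.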

We note that a hash function randomly simulates the natural interdependence between services, modeling how a data removal on one service might impact another. \hl{SHOULD WE SPECIFY THIS A BIT BETTER?} %By assigning weights to the services using this function, the system aims to reflect the interconnected dynamics among the services.
It is reasonable to assume that, within a service pipeline, any data modification made at an earlier stage could affect the performance of the services at subsequent steps, making the services interdependent. Consider, for example, the removal of data from the ``name'' feature; a service that relies on that column is more significantly affected by its removal than a service that does not use it. During its execution, the simulator employs a specific combination of services as the seed to assign weights to the services. This reflects how changes in one service might influence others, as previously described. By assigning weights to the services in this way, the system aims to reflect the interconnected dynamics among the services.
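The seeded weighting could be realized as in the following sketch. This is a hypothetical illustration of the idea (the paper does not give the exact scheme): a cryptographic hash of the seed combination plus the target service deterministically yields a weight, so the same seed always produces the same interdependence pattern, while a different seed perturbs every downstream weight. The names `seed_combination` and `service_id` are assumptions.

```python
import hashlib

def service_weight(seed_combination, service_id):
    """Derive a deterministic pseudo-random weight in [0, 1) for service_id,
    seeded by the service combination chosen earlier in the pipeline.
    Models how an upstream change (e.g., a data removal) propagates."""
    key = "|".join(seed_combination) + "|" + service_id
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    # Map the first 32 bits of the digest to [0, 1).
    return int(digest[:8], 16) / 0x100000000
```

Because the weight depends on the whole seed combination, replacing any upstream service changes the weights of the services that follow, which is the interdependence effect described above.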
%The simulator is used to assess the performance and quality of our sliding window heuristic in Section \ref{sec:heuristics} for the generation of the best pipeline instance (Section \ref{sec:instance}).
% Performance measures the heuristics execution time in different settings, while quality compares the results provided by our heuristics in terms of selected services with the optimal solution retrieved using the exhaustive approach.
4 changes: 3 additions & 1 deletion macro.tex
@@ -63,7 +63,9 @@
\newcommand{\pipeline}{Pipeline\xspace}
\newcommand{\pipelineTemplate}{Pipeline Template\xspace}
\newcommand{\pipelineInstance}{Pipeline Instance\xspace}

\newcommand{\quality}{quality\xspace}
\newcommand{\Quality}{Quality\xspace}
\newcommand{\q}{$q$\xspace}
\newcommand{\pone}{$(service\_owner=dataset\_owner)$}
\newcommand{\ptwo}{$(service\_owner=partner(dataset\_owner))$}
\newcommand{\pthree}{$(service\_owner \neq dataset\_owner \wedge service\_owner \neq partner(dataset\_owner))$}
2 changes: 1 addition & 1 deletion metrics.tex
@@ -1,7 +1,7 @@
\section{Maximizing the Pipeline Instance Quality}\label{sec:heuristics}
%
% %Ovviamente non è sufficiente scegliere il best service per ogni vertice, ma diventa un problema complesso dove si devono calcolare/valutare tutte le possibili combinazioni dei servizi disponibili, tra le quali scegliere la migliore.
Our goal is to generate a pipeline instance with maximum quality, addressing data protection requirements while minimizing information loss \textit{dloss} throughout the pipeline execution. To this aim, we first discuss the quality metrics used to measure and monitor data quality, which guide the generation of the pipeline instance. Then, we prove that the problem of generating a pipeline instance with maximum quality is NP-hard (\cref{sec:nphard}). Finally, we present a parametric heuristic (\cref{subsec:heuristics}) tailored to address the computational complexity associated with enumerating all possible combinations within a given set. The primary aim of the heuristic is to approximate the optimal path for service interactions and transformations, particularly within the landscape of more complex pipelines composed of numerous vertices and candidate services. Our focus extends beyond identifying optimal combinations to encompass an understanding of the quality changes introduced during the transformation processes.
Our goal is to generate a pipeline instance with maximum quality, addressing data protection requirements while maximizing \textit{\quality (\q)} throughout the pipeline execution. To this aim, we first discuss the quality metrics used to measure and monitor data quality, which guide the generation of the pipeline instance. Then, we prove that the problem of generating a pipeline instance with maximum quality is NP-hard (\cref{sec:nphard}). Finally, we present a parametric heuristic (\cref{subsec:heuristics}) tailored to address the computational complexity of enumerating all possible combinations within a given set. The primary aim of the heuristic is to approximate the optimal path for service interactions and transformations, particularly for complex pipelines composed of numerous vertices and candidate services. Our focus extends beyond identifying optimal combinations to understanding the quality changes introduced during the transformation processes.

%Inspired by existing literature, these metrics, categorized as quantitative and statistical, play a pivotal role in quantifying the impact of policy-driven transformations on the original dataset.

