-
Notifications
You must be signed in to change notification settings - Fork 88
Improve join performance #3300
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Labels
flambda2
Prerequisite for, or part of, flambda2
Comments
bclement-ocp
added a commit
to bclement-ocp/flambda-backend
that referenced
this issue
Feb 4, 2025
The existing join algorithm suffers from several drawbacks: - It can be slow due to the use of a quadratic algorithm, taking up to 60% of the total compilation time in -O3 mode in pathological cases (lambda_to_flambda_primitives.ml). See also ocaml-flambda#3300. - It is inefficient as it computes the join of all types appearing in *any* joined environment prior to filtering out the types that are not needed, instead of first computing the types whose join will be needed. - It is sensitive to the names of local variables that only exist in some of the joined environments but not in the target environment. - It relies on a global binding time of variables across all joined environments and the target environment that does not exist, as figured in ocaml-flambda#3278. Subsequently, it can lose aliasing information, and breaks typing env invariants by recording the same variable as defined multiple times (with dubious semantics). This patch implements a new join algorithm, based on a n-way join of types. The new algorithm is: - Faster, as it avoids quadratic complexity (outside of complex nesting of env extensions). Compared to the existing join algorithm (with advanced meet), on my machine, the new join algorithm is 30x faster on the pathological lambda_to_flambda_primitives.ml, taking only around 10% of the total compilation time and speeding up the compilation of the file by 3.5x. On camlinternalFormat.ml, the new join is about 2.5-3x faster, reducing the time spent in the join from 20% to less than 10% and speeding up the total compilation time by about 20%. - More efficient, as it only computes a join if it can possibly result in a more precise type, i.e. if the variable has been assigned a new type in all joined environments (otherwise the existing type in the target environment is already the most precise). - Independent of the names of local variables. - Only depends on a consistent binding time *order* of the shared variables (defined in both the target environment and all joined environments), which is respected. Since the result is independent of the binding times of local / existential variables, the typing env invariants are respected.
bclement-ocp
added a commit
to bclement-ocp/flambda-backend
that referenced
this issue
Feb 14, 2025
The existing join algorithm suffers from several drawbacks: - It can be slow due to the use of a quadratic algorithm, taking up to 60% of the total compilation time in -O3 mode in pathological cases (lambda_to_flambda_primitives.ml). See also ocaml-flambda#3300. - It is inefficient as it computes the join of all types appearing in *any* joined environment prior to filtering out the types that are not needed, instead of first computing the types whose join will be needed. - It is sensitive to the names of local variables that only exist in some of the joined environments but not in the target environment. - It relies on a global binding time of variables across all joined environments and the target environment that does not exist, as figured in ocaml-flambda#3278. Subsequently, it can lose aliasing information, and breaks typing env invariants by recording the same variable as defined multiple times (with dubious semantics). This patch implements a new join algorithm, based on a n-way join of types. The new algorithm is: - Faster, as it avoids quadratic complexity (outside of complex nesting of env extensions). Compared to the existing join algorithm (with advanced meet), on my machine, the new join algorithm is 30x faster on the pathological lambda_to_flambda_primitives.ml, taking only around 10% of the total compilation time and speeding up the compilation of the file by 3.5x. On camlinternalFormat.ml, the new join is about 2.5-3x faster, reducing the time spent in the join from 20% to less than 10% and speeding up the total compilation time by about 20%. - More efficient, as it only computes a join if it can possibly result in a more precise type, i.e. if the variable has been assigned a new type in all joined environments (otherwise the existing type in the target environment is already the most precise). - Independent of the names of local variables. - Only depends on a consistent binding time *order* of the shared variables (defined in both the target environment and all joined environments), which is respected. Since the result is independent of the binding times of local / existential variables, the typing env invariants are respected.
bclement-ocp
added a commit
to bclement-ocp/flambda-backend
that referenced
this issue
Feb 17, 2025
The existing join algorithm suffers from several drawbacks: - It can be slow due to the use of a quadratic algorithm, taking up to 60% of the total compilation time in -O3 mode in pathological cases (lambda_to_flambda_primitives.ml). See also ocaml-flambda#3300. - It is inefficient as it computes the join of all types appearing in *any* joined environment prior to filtering out the types that are not needed, instead of first computing the types whose join will be needed. - It is sensitive to the names of local variables that only exist in some of the joined environments but not in the target environment. - It relies on a global binding time of variables across all joined environments and the target environment that does not exist, as figured in ocaml-flambda#3278. Subsequently, it can lose aliasing information, and breaks typing env invariants by recording the same variable as defined multiple times (with dubious semantics). This patch implements a new join algorithm, based on a n-way join of types. The new algorithm is: - Faster, as it avoids quadratic complexity (outside of complex nesting of env extensions). Compared to the existing join algorithm (with advanced meet), on my machine, the new join algorithm is 30x faster on the pathological lambda_to_flambda_primitives.ml, taking only around 10% of the total compilation time and speeding up the compilation of the file by 3.5x. On camlinternalFormat.ml, the new join is about 2.5-3x faster, reducing the time spent in the join from 20% to less than 10% and speeding up the total compilation time by about 20%. - More efficient, as it only computes a join if it can possibly result in a more precise type, i.e. if the variable has been assigned a new type in all joined environments (otherwise the existing type in the target environment is already the most precise). - Independent of the names of local variables. - Only depends on a consistent binding time *order* of the shared variables (defined in both the target environment and all joined environments), which is respected. Since the result is independent of the binding times of local / existential variables, the typing env invariants are respected.
bclement-ocp
added a commit
to bclement-ocp/flambda-backend
that referenced
this issue
Mar 14, 2025
The existing join algorithm suffers from several drawbacks: - It can be slow due to the use of a quadratic algorithm, taking up to 60% of the total compilation time in -O3 mode in pathological cases (lambda_to_flambda_primitives.ml). See also ocaml-flambda#3300. - It is inefficient as it computes the join of all types appearing in *any* joined environment prior to filtering out the types that are not needed, instead of first computing the types whose join will be needed. - It is sensitive to the names of local variables that only exist in some of the joined environments but not in the target environment. - It relies on a global binding time of variables across all joined environments and the target environment that does not exist, as figured in ocaml-flambda#3278. Subsequently, it can lose aliasing information, and breaks typing env invariants by recording the same variable as defined multiple times (with dubious semantics). This patch implements a new join algorithm, based on a n-way join of types. The new algorithm is: - Faster, as it avoids quadratic complexity (outside of complex nesting of env extensions). Compared to the existing join algorithm (with advanced meet), on my machine, the new join algorithm is 30x faster on the pathological lambda_to_flambda_primitives.ml, taking only around 10% of the total compilation time and speeding up the compilation of the file by 3.5x. On camlinternalFormat.ml, the new join is about 2.5-3x faster, reducing the time spent in the join from 20% to less than 10% and speeding up the total compilation time by about 20%. - More efficient, as it only computes a join if it can possibly result in a more precise type, i.e. if the variable has been assigned a new type in all joined environments (otherwise the existing type in the target environment is already the most precise). - Independent of the names of local variables. - Only depends on a consistent binding time *order* of the shared variables (defined in both the target environment and all joined environments), which is respected. Since the result is independent of the binding times of local / existential variables, the typing env invariants are respected.
bclement-ocp
added a commit
to bclement-ocp/flambda-backend
that referenced
this issue
Mar 14, 2025
The existing join algorithm suffers from several drawbacks: - It can be slow due to the use of a quadratic algorithm, taking up to 60% of the total compilation time in -O3 mode in pathological cases (lambda_to_flambda_primitives.ml). See also ocaml-flambda#3300. - It is inefficient as it computes the join of all types appearing in *any* joined environment prior to filtering out the types that are not needed, instead of first computing the types whose join will be needed. - It is sensitive to the names of local variables that only exist in some of the joined environments but not in the target environment. - It relies on a global binding time of variables across all joined environments and the target environment that does not exist, as figured in ocaml-flambda#3278. Subsequently, it can lose aliasing information, and breaks typing env invariants by recording the same variable as defined multiple times (with dubious semantics). This patch implements a new join algorithm, based on a n-way join of types. The new algorithm is: - Faster, as it avoids quadratic complexity (outside of complex nesting of env extensions). Compared to the existing join algorithm (with advanced meet), on my machine, the new join algorithm is 30x faster on the pathological lambda_to_flambda_primitives.ml, taking only around 10% of the total compilation time and speeding up the compilation of the file by 3.5x. On camlinternalFormat.ml, the new join is about 2.5-3x faster, reducing the time spent in the join from 20% to less than 10% and speeding up the total compilation time by about 20%. - More efficient, as it only computes a join if it can possibly result in a more precise type, i.e. if the variable has been assigned a new type in all joined environments (otherwise the existing type in the target environment is already the most precise). - Independent of the names of local variables. - Only depends on a consistent binding time *order* of the shared variables (defined in both the target environment and all joined environments), which is respected. Since the result is independent of the binding times of local / existential variables, the typing env invariants are respected.
bclement-ocp
added a commit
to bclement-ocp/flambda-backend
that referenced
this issue
Mar 18, 2025
The existing join algorithm suffers from several drawbacks: - It can be slow due to the use of a quadratic algorithm, taking up to 60% of the total compilation time in -O3 mode in pathological cases (lambda_to_flambda_primitives.ml). See also ocaml-flambda#3300. - It is inefficient as it computes the join of all types appearing in *any* joined environment prior to filtering out the types that are not needed, instead of first computing the types whose join will be needed. - It is sensitive to the names of local variables that only exist in some of the joined environments but not in the target environment. - It relies on a global binding time of variables across all joined environments and the target environment that does not exist, as figured in ocaml-flambda#3278. Subsequently, it can lose aliasing information, and breaks typing env invariants by recording the same variable as defined multiple times (with dubious semantics). This patch implements a new join algorithm, based on a n-way join of types. The new algorithm is: - Faster, as it avoids quadratic complexity (outside of complex nesting of env extensions). Compared to the existing join algorithm (with advanced meet), on my machine, the new join algorithm is 30x faster on the pathological lambda_to_flambda_primitives.ml, taking only around 10% of the total compilation time and speeding up the compilation of the file by 3.5x. On camlinternalFormat.ml, the new join is about 2.5-3x faster, reducing the time spent in the join from 20% to less than 10% and speeding up the total compilation time by about 20%. - More efficient, as it only computes a join if it can possibly result in a more precise type, i.e. if the variable has been assigned a new type in all joined environments (otherwise the existing type in the target environment is already the most precise). - Independent of the names of local variables. - Only depends on a consistent binding time *order* of the shared variables (defined in both the target environment and all joined environments), which is respected. Since the result is independent of the binding times of local / existential variables, the typing env invariants are respected.
bclement-ocp
added a commit
to bclement-ocp/flambda-backend
that referenced
this issue
Mar 24, 2025
The existing join algorithm suffers from several drawbacks: - It can be slow due to the use of a quadratic algorithm, taking up to 60% of the total compilation time in -O3 mode in pathological cases (lambda_to_flambda_primitives.ml). See also ocaml-flambda#3300. - It is inefficient as it computes the join of all types appearing in *any* joined environment prior to filtering out the types that are not needed, instead of first computing the types whose join will be needed. - It is sensitive to the names of local variables that only exist in some of the joined environments but not in the target environment. - It relies on a global binding time of variables across all joined environments and the target environment that does not exist, as figured in ocaml-flambda#3278. Subsequently, it can lose aliasing information, and breaks typing env invariants by recording the same variable as defined multiple times (with dubious semantics). This patch implements a new join algorithm, based on a n-way join of types. The new algorithm is: - Faster, as it avoids quadratic complexity (outside of complex nesting of env extensions). Compared to the existing join algorithm (with advanced meet), on my machine, the new join algorithm is 30x faster on the pathological lambda_to_flambda_primitives.ml, taking only around 10% of the total compilation time and speeding up the compilation of the file by 3.5x. On camlinternalFormat.ml, the new join is about 2.5-3x faster, reducing the time spent in the join from 20% to less than 10% and speeding up the total compilation time by about 20%. - More efficient, as it only computes a join if it can possibly result in a more precise type, i.e. if the variable has been assigned a new type in all joined environments (otherwise the existing type in the target environment is already the most precise). - Independent of the names of local variables. - Only depends on a consistent binding time *order* of the shared variables (defined in both the target environment and all joined environments), which is respected. Since the result is independent of the binding times of local / existential variables, the typing env invariants are respected.
bclement-ocp
added a commit
to bclement-ocp/flambda-backend
that referenced
this issue
Mar 24, 2025
The existing join algorithm suffers from several drawbacks: - It can be slow due to the use of a quadratic algorithm, taking up to 60% of the total compilation time in -O3 mode in pathological cases (lambda_to_flambda_primitives.ml). See also ocaml-flambda#3300. - It is inefficient as it computes the join of all types appearing in *any* joined environment prior to filtering out the types that are not needed, instead of first computing the types whose join will be needed. - It is sensitive to the names of local variables that only exist in some of the joined environments but not in the target environment. - It relies on a global binding time of variables across all joined environments and the target environment that does not exist, as figured in ocaml-flambda#3278. Subsequently, it can lose aliasing information, and breaks typing env invariants by recording the same variable as defined multiple times (with dubious semantics). This patch implements a new join algorithm, based on a n-way join of types. The new algorithm is: - Faster, as it avoids quadratic complexity (outside of complex nesting of env extensions). Compared to the existing join algorithm (with advanced meet), on my machine, the new join algorithm is 30x faster on the pathological lambda_to_flambda_primitives.ml, taking only around 10% of the total compilation time and speeding up the compilation of the file by 3.5x. On camlinternalFormat.ml, the new join is about 2.5-3x faster, reducing the time spent in the join from 20% to less than 10% and speeding up the total compilation time by about 20%. - More efficient, as it only computes a join if it can possibly result in a more precise type, i.e. if the variable has been assigned a new type in all joined environments (otherwise the existing type in the target environment is already the most precise). - Independent of the names of local variables. - Only depends on a consistent binding time *order* of the shared variables (defined in both the target environment and all joined environments), which is respected. Since the result is independent of the binding times of local / existential variables, the typing env invariants are respected.
bclement-ocp
added a commit
to bclement-ocp/flambda-backend
that referenced
this issue
Mar 24, 2025
The existing join algorithm suffers from several drawbacks: - It can be slow due to the use of a quadratic algorithm, taking up to 60% of the total compilation time in -O3 mode in pathological cases (lambda_to_flambda_primitives.ml). See also ocaml-flambda#3300. - It is inefficient as it computes the join of all types appearing in *any* joined environment prior to filtering out the types that are not needed, instead of first computing the types whose join will be needed. - It is sensitive to the names of local variables that only exist in some of the joined environments but not in the target environment. - It relies on a global binding time of variables across all joined environments and the target environment that does not exist, as figured in ocaml-flambda#3278. Subsequently, it can lose aliasing information, and breaks typing env invariants by recording the same variable as defined multiple times (with dubious semantics). This patch implements a new join algorithm, based on a n-way join of types. The new algorithm is: - Faster, as it avoids quadratic complexity (outside of complex nesting of env extensions). Compared to the existing join algorithm (with advanced meet), on my machine, the new join algorithm is 30x faster on the pathological lambda_to_flambda_primitives.ml, taking only around 10% of the total compilation time and speeding up the compilation of the file by 3.5x. On camlinternalFormat.ml, the new join is about 2.5-3x faster, reducing the time spent in the join from 20% to less than 10% and speeding up the total compilation time by about 20%. - More efficient, as it only computes a join if it can possibly result in a more precise type, i.e. if the variable has been assigned a new type in all joined environments (otherwise the existing type in the target environment is already the most precise). - Independent of the names of local variables. - Only depends on a consistent binding time *order* of the shared variables (defined in both the target environment and all joined environments), which is respected. Since the result is independent of the binding times of local / existential variables, the typing env invariants are respected.
bclement-ocp
added a commit
to bclement-ocp/flambda-backend
that referenced
this issue
Apr 10, 2025
The existing join algorithm suffers from several drawbacks: - It can be slow due to the use of a quadratic algorithm, taking up to 60% of the total compilation time in -O3 mode in pathological cases (lambda_to_flambda_primitives.ml). See also ocaml-flambda#3300. - It is inefficient as it computes the join of all types appearing in *any* joined environment prior to filtering out the types that are not needed, instead of first computing the types whose join will be needed. - It is sensitive to the names of local variables that only exist in some of the joined environments but not in the target environment. - It relies on a global binding time of variables across all joined environments and the target environment that does not exist, as figured in ocaml-flambda#3278. Subsequently, it can lose aliasing information, and breaks typing env invariants by recording the same variable as defined multiple times (with dubious semantics). This patch implements a new join algorithm, based on a n-way join of types. The new algorithm is: - Faster, as it avoids quadratic complexity (outside of complex nesting of env extensions). Compared to the existing join algorithm (with advanced meet), on my machine, the new join algorithm is 30x faster on the pathological lambda_to_flambda_primitives.ml, taking only around 10% of the total compilation time and speeding up the compilation of the file by 3.5x. On camlinternalFormat.ml, the new join is about 2.5-3x faster, reducing the time spent in the join from 20% to less than 10% and speeding up the total compilation time by about 20%. - More efficient, as it only computes a join if it can possibly result in a more precise type, i.e. if the variable has been assigned a new type in all joined environments (otherwise the existing type in the target environment is already the most precise). - Independent of the names of local variables. - Only depends on a consistent binding time *order* of the shared variables (defined in both the target environment and all joined environments), which is respected. Since the result is independent of the binding times of local / existential variables, the typing env invariants are respected.
lthls
pushed a commit
that referenced
this issue
Apr 11, 2025
The existing join algorithm suffers from several drawbacks: - It can be slow due to the use of a quadratic algorithm, taking up to 60% of the total compilation time in -O3 mode in pathological cases (lambda_to_flambda_primitives.ml). See also #3300. - It is inefficient as it computes the join of all types appearing in *any* joined environment prior to filtering out the types that are not needed, instead of first computing the types whose join will be needed. - It is sensitive to the names of local variables that only exist in some of the joined environments but not in the target environment. - It relies on a global binding time of variables across all joined environments and the target environment that does not exist, as figured in #3278. Subsequently, it can lose aliasing information, and breaks typing env invariants by recording the same variable as defined multiple times (with dubious semantics). This patch implements a new join algorithm, based on a n-way join of types. The new algorithm is: - Faster, as it avoids quadratic complexity (outside of complex nesting of env extensions). Compared to the existing join algorithm (with advanced meet), on my machine, the new join algorithm is 30x faster on the pathological lambda_to_flambda_primitives.ml, taking only around 10% of the total compilation time and speeding up the compilation of the file by 3.5x. On camlinternalFormat.ml, the new join is about 2.5-3x faster, reducing the time spent in the join from 20% to less than 10% and speeding up the total compilation time by about 20%. - More efficient, as it only computes a join if it can possibly result in a more precise type, i.e. if the variable has been assigned a new type in all joined environments (otherwise the existing type in the target environment is already the most precise). - Independent of the names of local variables. - Only depends on a consistent binding time *order* of the shared variables (defined in both the target environment and all joined environments), which is respected. Since the result is independent of the binding times of local / existential variables, the typing env invariants are respected.
gretay-js
pushed a commit
to gretay-js/flambda-backend
that referenced
this issue
Apr 29, 2025
The existing join algorithm suffers from several drawbacks: - It can be slow due to the use of a quadratic algorithm, taking up to 60% of the total compilation time in -O3 mode in pathological cases (lambda_to_flambda_primitives.ml). See also ocaml-flambda#3300. - It is inefficient as it computes the join of all types appearing in *any* joined environment prior to filtering out the types that are not needed, instead of first computing the types whose join will be needed. - It is sensitive to the names of local variables that only exist in some of the joined environments but not in the target environment. - It relies on a global binding time of variables across all joined environments and the target environment that does not exist, as figured in ocaml-flambda#3278. Subsequently, it can lose aliasing information, and breaks typing env invariants by recording the same variable as defined multiple times (with dubious semantics). This patch implements a new join algorithm, based on a n-way join of types. The new algorithm is: - Faster, as it avoids quadratic complexity (outside of complex nesting of env extensions). Compared to the existing join algorithm (with advanced meet), on my machine, the new join algorithm is 30x faster on the pathological lambda_to_flambda_primitives.ml, taking only around 10% of the total compilation time and speeding up the compilation of the file by 3.5x. On camlinternalFormat.ml, the new join is about 2.5-3x faster, reducing the time spent in the join from 20% to less than 10% and speeding up the total compilation time by about 20%. - More efficient, as it only computes a join if it can possibly result in a more precise type, i.e. if the variable has been assigned a new type in all joined environments (otherwise the existing type in the target environment is already the most precise). - Independent of the names of local variables. - Only depends on a consistent binding time *order* of the shared variables (defined in both the target environment and all joined environments), which is respected. Since the result is independent of the binding times of local / existential variables, the typing env invariants are respected.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Turns out we spent quite some time joining environments (when that is enabled, to
-O2
and up), up to 30% of compilation times (even 60% on rare outliers), which is a lot. It is expected that joining would be expensive, but this sounds like a bit much. #3299 should be able to give us more data.The text was updated successfully, but these errors were encountered: