Common table expression (CTE) optimizations in CBO #5154
Commits on Apr 5, 2024
- Prototype CteRewriteRule and implement conversion back to AST using WITH clauses (3e9a908)
  Change-Id: I442c179183ce66f2e9ecc5ee7a5891863beb584f
- Copy TableFunctionScan to new cluster (9337785)
  Change-Id: Id1ee6ceabb090b2622c3227dab693593f58dc1b3
- Add configuration property controlling CTE rewrite via CBO (9a2fb94)
  Do not set the materialization threshold programmatically. If the initial query has CTEs and the property is set to true, then it's fine to materialize them.
  Change-Id: I54c7935ae408bdd87cf6eae5ae3f96951f9799c4
- Add first self-contained end-to-end test using CTE rewrite based on EMPS/DEPT schema (e59a382)
  Change-Id: I1625953fa8bc73c3e89d9e73ebfab388021e3973
- Use locally built workload-insights dependency (1.0.1.2024.0.18.0-12) (ac74eac)
  Change-Id: I2b659ba518033553c5230aa47c50245f5ae0db56
- Replace TemporaryRelOptTable with RelOptTableImpl (07ec2b7)
  Drop some boilerplate code that doesn't do much at the moment anyway.
  Change-Id: Ic8b614bd97375b870772f7f88d9bb0890f557ac1
- Refactor to separate package and extract rules (Buggy) (1005049)
  There is a bug somewhere and the TableSpool is not introduced correctly; the patch needs to be revisited.
  Change-Id: I705d71bb5e6dc63f5e1440d49f9bac7fac5f68d8
- Disable DAG mode for HepPlanner to avoid introducing spools for every single TableScan (254998b)
  Change-Id: I28a7756eb6841656880084d209f10d362d812044
- Small refactoring for RelCteTransformer (ad527ea)
  Change-Id: Idf099037e049ca9c145471f4f6ce71f543bab357
- Use general copy for arbitrary nodes in HiveRelCopier (914c154)
  We haven't hit a problem yet, but it's good to have it in place.
  Change-Id: I9d36e0355e788f01c207bd71f0457b3a4dbfd7af
- Use Hive cost model for doing the CTE/MV rewrite + general refactoring (3ad26bc)
  TODO: Need to pass the RelOptMaterialization to the HepPlanner.
  Change-Id: I76175a38045fd4816d3311aa9ba93a32627e520e
- Add .q file to facilitate debugging of a single query (413d833)
  Change-Id: I49da052b09e03dffeaecc64070181d2c042df044
- Refactor RelCteTransformer to allow registering CTEs in both Heuristic & Cost-based planner (4716ddf)
  Apart from improving readability, the refactoring fixes the problem in TablescanToSpoolRule, which couldn't see the materializations. Changes were needed in trait handling to ensure that HiveRelCopier works as expected. The ed_cte_0_debug.q test now passes, returning the same result as before.
  Change-Id: I86a3755a60b0bd8c582f969cd5574756afd590b6
- Move CTE rewriting logic in CalcitePlanner before join ordering transformations (1143973)
  The refactoring is desired/necessary for the following reasons:
  * Spool sub-plans take advantage of all standard optimization rules, leading to more efficient plans. The Spool sub-plans may come from external tools, so we have no guarantees about their shape; as can be seen in existing plans before this change, the plans below a spool operator were far from optimal (presence of cartesian products, missed filter pushdown opportunities, etc.).
  * Moving the logic inside CalcitePlanner allows reusing existing code for MV rewriting and individual rule application (using executeProgram).
  * The ASTConverter, along with other code, relies on the fact that the plan has a certain shape, which is obtained by applying all rules. If the Spool sub-plan (or other parts of the query) does not adhere to the desired structure this can lead to failures. For instance, cbo_query64.q was failing with the following stacktrace due to the unexpected structure under the Spool:
  mvn test -Dtest=TestTezTPCDS30TBPerfCliDriver -Dqfile=cbo_query64.q -Dtest.output.overwrite
  java.lang.IndexOutOfBoundsException: Index: 92, Size: 7
    at java.util.ArrayList.rangeCheck(ArrayList.java:659)
    at java.util.ArrayList.get(ArrayList.java:435)
    at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter$RexVisitor.visitInputRef(ASTConverter.java:709)
    at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter$RexVisitor.visitInputRef(ASTConverter.java:664)
    at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112)
    at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:354)
    at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convertSource(ASTConverter.java:569)
    at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:261)
    at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convertSource(ASTConverter.java:580)
    at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:261)
    at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convertSource(ASTConverter.java:580)
    at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convertSource(ASTConverter.java:531)
    at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:261)
    at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convertSource(ASTConverter.java:580)
    at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:261)
    at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:122)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1458)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:627)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:13450)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:479)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:319)
    at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:184)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:319)
    at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:227)
    at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:108)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:202)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:656)
    at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:602)
    at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:596)
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:267)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:210)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:136)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:436)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367)
    at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:887)
    at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:857)
    at org.apache.hadoop.hive.cli.control.CorePerfCliDriver.runTest(CorePerfCliDriver.java:108)
    at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:173)
    at org.apache.hadoop.hive.cli.TestTezTPCDS30TBPerfCliDriver.testCliDriver(TestTezTPCDS30TBPerfCliDriver.java:79)
  (Frames in hive-exec-3.1.3000.2024.0.18.0-12.jar; repeated jar annotations trimmed.)
  Change-Id: I231bd303611f891b1502760fefcb91dab798e916
- NPE in HiveRelMdRowCount and ASTConverter when running cbo_query64 (f3af2d9)
  1. Add distinctRowCount handler for the Spool operator to avoid the NPE below.
  java.lang.NullPointerException: null
    at org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdRowCount.analyzeJoinForPKFK(HiveRelMdRowCount.java:312)
    at org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdRowCount.getRowCount(HiveRelMdRowCount.java:101)
    at GeneratedMetadataHandler_RowCount.getRowCount_$(janino2882819400139487130.java:117)
    at GeneratedMetadataHandler_RowCount.getRowCount(janino2882819400139487130.java:31)
    at org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:212)
    at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.swapInputs(LoptOptimizeJoinRule.java:1882)
    at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.createJoinSubtree(LoptOptimizeJoinRule.java:1756)
    at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.pushDownFactor(LoptOptimizeJoinRule.java:1153)
    at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.addFactorToTree(LoptOptimizeJoinRule.java:937)
    at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.createOrdering(LoptOptimizeJoinRule.java:728)
    at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.findBestOrderings(LoptOptimizeJoinRule.java:459)
    at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.onMatch(LoptOptimizeJoinRule.java:128)
    at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:333)
    at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:542)
    at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:407)
    at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:243)
    at org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127)
    at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:202)
    at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:189)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2826)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2770)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyJoinOrderingTransform(CalcitePlanner.java:2455)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1878)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1730)
    at org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:131)
    at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:914)
    at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:180)
    at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:126)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.plan(CalcitePlanner.java:1389)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:600)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:13450)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:486)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:319)
    at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:184)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:319)
    at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:227)
    at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:108)
  (Frames in hive-exec-3.1.3000.2024.0.18.0-12.jar; repeated jar annotations trimmed.)
  2. Modify PlanModifierForASTConv to ensure that there is a Project below every Spool operator to avoid problems when converting back to AST.
  Change-Id: I15ead58cd12a3bd7ec2fbd3300e6b2111ce44c6c
- Cancel CTE transformation effects when plan doesn't contain any Spool (5036602)
  If there are no Spool operators at the end of the CTE transformation, then cancel any potential side effects by returning the base plan.
  Change-Id: Ia71084f7825e696e64ab83a0a3079c6a246ef915
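The "return the base plan when no Spool survived" check above can be sketched with plain Java in place of a Calcite RelVisitor; the Node class and operator names here are hypothetical stand-ins for RelNode subtrees, not Hive's actual API.

```java
import java.util.Arrays;
import java.util.List;

public class SpoolCheckDemo {
    // Minimal stand-in for a RelNode tree; in Hive this would be a
    // RelVisitor/RelShuttle walking the optimized plan.
    static final class Node {
        final String kind;
        final List<Node> inputs;
        Node(String kind, Node... inputs) {
            this.kind = kind;
            this.inputs = Arrays.asList(inputs);
        }
    }

    // Returns true if any operator in the tree is a Spool.
    static boolean containsSpool(Node node) {
        if (node.kind.equals("Spool")) {
            return true;
        }
        for (Node input : node.inputs) {
            if (containsSpool(input)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Node base = new Node("Project",
            new Node("Join", new Node("Scan"), new Node("Scan")));
        Node rewritten = new Node("Project",
            new Node("Spool", new Node("Scan")));
        // If the rewrite introduced no Spool, fall back to the base plan so
        // the CTE transformation leaves no side effects behind.
        Node chosen = containsSpool(rewritten) ? rewritten : base;
        System.out.println(containsSpool(base) + " " + (chosen == rewritten));
    }
}
```

The recursive scan is cheap relative to planning, so running it unconditionally after the CTE rewrite is a reasonable guard.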
- Remove TODO fixed by previous commits (08d4e0d)
  Change-Id: Iea5bb407f07b9b2b8cf6b6d77baa587a27a27429
- Remove now unused RelCteTransformer class (e9c6ed9)
  Change-Id: I98410984ec33500f9fcb259442a921abf825e5a7
- Move HiveRelCopier to more general package and improve Javadoc (1171423)
  Change-Id: If00652c7e4c579f8997e51b5f2a522677ea6cc6b
- 24081fe
  Change-Id: Ief7cbfa05a3d16687cab29b08e46285d0d57ed35
- 6f0dc7e
  Change-Id: I8a090788598dce63bb6a5347ece3a30d18d12fa5
- Add HiveTableSpool for consistency with other RelNodes (fd55af9)
  The spool specialization does not bring anything new to the table, but it follows the general design pattern in Hive where all operators have their Hive equivalent.
  Change-Id: I19812c574536acd942c522f4ed88345236c8a70b
- Prototype explicit RelNode to Operator transformation for CTEs (6c9443b)
  Completely untested. At the moment it just compiles and drafts the idea.
  Change-Id: I2ce978c3d8052b8469c3ec2c8c3012f7d7e037e9
- Rename ed_cte_0 to cte_cbo_rewrite_0 (508624b)
  Change-Id: I4f2e567fb72c645358f3167b0b154f3811cd6a57
- Update cte_cbo_rewrite_0.q.out after introducing HiveTableSpool (d196900)
  Change-Id: I3a0b42c3bbfbc2047f2558ce6ca73ce64e5d2438
- Enhance HiveTableScan operators over CTE tables with ColumnInfo and mutable caching data structures (6fe18a2)
  The presence of ColumnInfo in RelOptHiveTable is important when going directly from RelNode to Operator tree, in particular for creating the mapping from HiveTableScan to TableScanOperator (HiveTableScanVisitor). The caching data structures in RelOptHiveTable must be mutable because they are enriched gradually during the optimization process. The Collections.empty* collections are immutable, and using them leads to exceptions during query compilation.
  Change-Id: I127bdd4156e8652bda69561402c246f39909c72e
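The failure mode described above, an immutable empty collection used where a gradually enriched cache is needed, can be reproduced with the JDK alone; the cache names below are illustrative, not Hive's actual fields.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class MutableCacheDemo {
    public static void main(String[] args) {
        // Collections.emptyList() returns an immutable shared singleton:
        // any attempt to enrich it later fails at runtime, which is the
        // exception seen during query compilation.
        List<String> immutableCache = Collections.emptyList();
        boolean threw = false;
        try {
            immutableCache.add("columnStats");
        } catch (UnsupportedOperationException e) {
            threw = true;
        }

        // A plain mutable collection supports the gradual enrichment that
        // the caching structures in the table wrapper rely on.
        List<String> mutableCache = new ArrayList<>();
        mutableCache.add("columnStats");

        System.out.println(threw + " " + mutableCache.size());
    }
}
```

The fix is simply to initialize such caches with `new ArrayList<>()` / `new HashMap<>()` rather than the `Collections.empty*` singletons.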
- Circumvent HMS stat retrieval logic for virtual CTE tables by creating empty column stats (7b41ff1)
  Change-Id: I4f3a33c033e0d68caceb4724e1f8be9278561109
- Missing connections between CTE producers/consumers in HiveOpConverter (feeca2b)
  1. Use ForwardWalker (instead of DefaultGraphWalker) to connect CTE producers & consumers, since we are starting the traversal from the "sink" operator. DefaultGraphWalker is appropriate only when the traversal starts from the scan operators.
  2. TableScanOperators that represent CTEs must not be part of topOps, since these operators only appear temporarily in the plan.
  Change-Id: I852a1058eeb124507ca53b00ec9b512bb11b33b4
- Add test with CTEs and hive.cbo.returnpath.hiveop enabled (e89d950)
  The Operator DAG is created successfully, but the plan is not executable because it contains parallel edges: there are two edges from Reducer 2 to Reducer 3.
  Change-Id: Ibac55afefe105e3096d63655af7d9e66c63a211c
- Compilation failures in hive-exec module (a30aa79)
  1. Drop the dependency on workload-insights since the compile classpath is polluted with transitive hive dependencies coming from a downstream fork.
  2. Create a trivial interface/implementation for CTE suggestions to make the code compile (it may run as well, but this hasn't been tried yet).
  3. Add a RelOptHiveTable.Type enumeration to be able to distinguish tables that correspond to transient CTEs, and update references. Maybe there is a better way to achieve this without introducing a new attribute to an already heavy implementation. There is a TableType enumeration in the metastore module, but it doesn't seem appropriate to add new fields there for something that should never reach the metastore.
- NPE when estimating rowCount for CTE table (280604e)
  java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.stats.BasicStats$DataSizeEstimator.getFileSizeForPath(BasicStats.java:220)
    at org.apache.hadoop.hive.ql.stats.BasicStats$DataSizeEstimator.apply(BasicStats.java:207)
    at org.apache.hadoop.hive.ql.stats.BasicStats.apply(BasicStats.java:305)
    at org.apache.hadoop.hive.ql.stats.BasicStats$Factory.build(BasicStats.java:70)
    at org.apache.hadoop.hive.ql.stats.BasicStats$Factory.buildAll(BasicStats.java:81)
    at org.apache.hadoop.hive.ql.stats.StatsUtils.getNumRows(StatsUtils.java:231)
    at org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getRowCount(RelOptHiveTable.java:454)
    at org.apache.calcite.rel.core.TableScan.computeSelfCost(TableScan.java:100)
    at org.apache.calcite.rel.metadata.RelMdPercentageOriginalRows.getNonCumulativeCost(RelMdPercentageOriginalRows.java:174)
    at GeneratedMetadataHandler_NonCumulativeCost.getNonCumulativeCost_$(Unknown Source)
    at GeneratedMetadataHandler_NonCumulativeCost.getNonCumulativeCost(Unknown Source)
    at org.apache.calcite.rel.metadata.RelMetadataQuery.getNonCumulativeCost(RelMetadataQuery.java:288)
    at org.apache.hadoop.hive.ql.optimizer.calcite.cost.HiveVolcanoPlanner.getCost(HiveVolcanoPlanner.java:113)
    at org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements0(RelSubset.java:415)
    at org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements(RelSubset.java:398)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.addRelToSet(VolcanoPlanner.java:1268)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1227)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
    at org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:148)
    at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:268)
    at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:283)
    at org.apache.calcite.rel.rules.materialize.MaterializedViewRule.perform(MaterializedViewRule.java:474)
    at org.apache.calcite.rel.rules.materialize.MaterializedViewProjectJoinRule.onMatch(MaterializedViewProjectJoinRule.java:50)
    at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:229)
    at org.apache.calcite.plan.volcano.IterativeRuleDriver.drive(IterativeRuleDriver.java:58)
    at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:510)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.rewriteUsingViews(CalcitePlanner.java:2089)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyCteRewriting(CalcitePlanner.java:2110)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1713)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1575)
    at org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:131)
    at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:914)
    at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:180)
    at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:126)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1327)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:579)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:13148)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:474)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
    at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:180)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
    at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:224)
    at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:107)
  Since the table does not correspond to an actual path in the FS, it is normal to get an NPE when trying to find the path. Explicitly set a rowCount to avoid going down the NPE path.
- Run cte_cbo_rewrite_0.q and update plan (b6fdec3)
  The query runs fine and goes through the CTE code path, but there is no spool operator because the trivial suggester does not find a meaningful CTE.
- Add suggester for join CTEs and fix problems to make cte_cbo_rewrite_0 pass (53ed412)
  1. Add CommonTableExpressionJoinSuggester with very simplistic logic for detecting join CTEs in the plan.
  2. Propagate the noDag option when executing rules with HepPlanner in CalcitePlanner.
  3. Use the last element of the qualified name in ASTConverter to create the reference to the spool/CTE.
  4. Update cte_cbo_rewrite_0.q.out based on the updates.
- 0e2612b
- 5d4c989
- Unify SpoolRemoveRule with TableScanToSpoolRule and extract table counts from metadata (c181b50)
- d156c0c
- Add new CTE suggester based on centralized registry populated during planning (e903f0a)
  1. Generalize the CTE registry idea outside the Join Suggester and incorporate it in planning. The global registry may be too expensive (CPU & memory) to keep always on; need to revisit this option.
  2. Add utilities for stripping HepVertices and counting nodes, used by the new suggester.
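The registry idea above, recording sub-plans seen during planning and flagging the repeated ones as CTE candidates, can be sketched with plain Java; the digest strings stand in for RelNode digests and are hypothetical, not the suggester's actual data structures.

```java
import java.util.HashMap;
import java.util.Map;

public class CteRegistryDemo {
    public static void main(String[] args) {
        // Centralized registry: how many times each sub-plan digest was
        // observed while planning the query.
        Map<String, Integer> registry = new HashMap<>();

        // Digests observed during planning; the join sub-plan appears twice,
        // i.e. the same expression is computed in two places.
        String[] digestsSeen = {
            "Join(Scan(emps),Scan(depts))",
            "Scan(emps)",
            "Join(Scan(emps),Scan(depts))"
        };
        for (String digest : digestsSeen) {
            registry.merge(digest, 1, Integer::sum);
        }

        // Any digest registered more than once is a candidate CTE whose
        // result could be spooled and reused.
        long candidates = registry.values().stream()
            .filter(count -> count > 1)
            .count();
        System.out.println(candidates);
    }
}
```

The commit's caveat applies directly to this design: a registry keyed on every subtree digest grows with plan size, which is why keeping it always on may be too expensive.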
Invalid table alias or column reference exception in SemanticAnalyzer.genOPTree
When introducing the spool, the type names of the input operator match those of the table, and this is guaranteed by the respective rule. However, other optimization rules may change the names. If the names do not match, the ASTConverter will create invalid named column references for those expressions over the CTE table, which leads to compilation failures similar to the one below.
```
org.apache.hadoop.hive.ql.parse.SemanticException: Line 30:9 Invalid table alias or column reference 'i_item_id': (possible column names are: i_item_desc, i_category, i_class, i_current_price, itemrevenue, revenueratio)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:13584)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:13526)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:13494)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:13488)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genReduceSinkPlan(SemanticAnalyzer.java:9407)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:11592)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:11483)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12419)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12285)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:13036)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:13148)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12663)
  at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
  at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:180)
  at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
  at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:224)
  at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:107)
  at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:519)
  at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:4
```
(Commit 0037f87)
Decouple RelOptMaterialization from CTESuggester interface
The materialization is tightly coupled to Hive, since we need to create a HiveTableScan and a HiveRelOptTable, so it does not make much sense to put the responsibility of creating the MV object in the Suggester. It is highly unlikely that a consumer unfamiliar with the internals of Hive would be able to create such objects correctly.
(Commit 6ec5f90)
(Commit 4d7e128)
(Commit 2fb6481)
(Commit d0c9afe)
(Commit 263c7b1)
(Commit c1dec82)
(Commit 208e9ca)
Update TPCDS query plans (unexpected changes)
Not sure why we had changes, especially in the non-CBO plans.
(Commit ed9751c)
Update TPCDS query plans (confirms that there is flakiness)
Rerunning the same tests without any changes again leads to plan changes, so there is some kind of flakiness in the CTE logic.
(Commit 684c052)
(Commit 70ff113)
AssertionError when registering HiveIntersect to VolcanoPlanner
```
java.lang.AssertionError: Relational expression rel#115688:HiveIntersect.HIVE.[].any(input#0=HiveProject#115671,input#1=HiveProject#115686,all=false) has calling-convention HIVE but does not implement the required interface 'interface org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveRelNode' of that convention
  at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1123)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84)
  at org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84)
  at org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
  at org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:148)
  at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:268)
  at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:283)
  at org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveMaterializedViewBoxing$HiveMaterializedViewUnboxingRule.onMatch(HiveMaterializedViewBoxing.java:206)
  at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:229)
  at org.apache.calcite.plan.volcano.IterativeRuleDriver.drive(IterativeRuleDriver.java:58)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:510)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.rewriteUsingViews(CalcitePlanner.java:2081)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyCteRewriting(CalcitePlanner.java:2103)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1689)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1571)
  at org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:131)
  at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:914)
  at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:180)
  at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:126)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1323)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:575)
```
(Commit 8bcd8fb)
AssertionError when registering HiveExcept to VolcanoPlanner
```
java.lang.AssertionError: Relational expression rel#274505:HiveExcept.HIVE.[].any(input#0=HiveExcept#274490,input#1=HiveProject#274503,all=false) has calling-convention HIVE but does not implement the required interface 'interface org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveRelNode' of that convention
  at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1123)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84)
  at org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84)
  at org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84)
  at org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84)
  at org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.setRoot(VolcanoPlanner.java:265)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.rewriteUsingViews(CalcitePlanner.java:2080)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyCteRewriting(CalcitePlanner.java:2103)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1689)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1571)
  at org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:131)
  at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:914)
```
(Commit 26bc808)
(Commit 2963ecd)
(Commit dee7682)
(Commit aceae7c)
(Commit af86494)
Small improvements & documentation for CommonTableExpressionRegistry
1. Ensure that we don't modify the RelNode when stripping the HepVertices.
2. Use ArrayList instead of HashSet, hoping for less flakiness and more stability in the plans.
(Commit 5a23913)
Update TPC-DS plans (still flaky)
In cbo_query9.q.out we can observe that the spool operator changed places again. Why is that?
(Commit 8364d48)
(Commit 21e2743)
Use rowCount and rowSize to break ties across maximal CTEs, mainly for plan stability purposes
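The tie-breaking idea above can be sketched as follows. This is a hypothetical toy model, not Hive's implementation: the class, field names, and the preference direction (larger estimates win) are all assumptions; the point is only that ordering candidates by (rowCount, rowSize) plus a stable textual key makes the chosen CTE deterministic across runs.

```python
# Toy sketch: deterministic selection among otherwise-equivalent CTE
# candidates. All names here are invented for illustration.
from dataclasses import dataclass

@dataclass(frozen=True)
class CteCandidate:
    digest: str       # stable textual form of the plan subtree
    row_count: float  # estimated rows produced by the CTE
    row_size: float   # estimated average row width

def pick_cte(candidates):
    # Order by estimated row count, then row size; the digest is a
    # last-resort tie-breaker so the result never depends on hash or
    # iteration order (the source of the observed plan flakiness).
    return max(candidates,
               key=lambda c: (c.row_count, c.row_size, c.digest))
```

With two candidates sharing identical estimates, the digest alone decides, so repeated runs always pick the same CTE.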
(Commit ad8db62)
(Commit 9d621ff)
(Commit 7440f3b)
Avoid putting trivial CTEs in the registry
1. Do not add plain table scans, since tables appearing more than once in a query is pretty common.
2. Do not add simple Project + TableScan combinations, since they are hardly ever useful.
3. Add only CTEs rooted at a Join, Aggregate, Filter, or Project, since anything else cannot be exploited at the moment (view-based rewriting limitations).
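The three rules above amount to a small admission predicate over the candidate subtree. The sketch below is a hedged illustration on a toy plan tree (the `Node` model and function name are invented; Hive applies this inside its registration rule over Calcite RelNodes):

```python
# Toy plan-tree model; operator names mirror the commit message.
class Node:
    def __init__(self, kind, *inputs):
        self.kind = kind          # e.g. "TableScan", "Project", "Join"
        self.inputs = list(inputs)

ALLOWED_ROOTS = {"Join", "Aggregate", "Filter", "Project"}

def is_worth_registering(root):
    # 1. Plain table scans are too common to make useful CTEs.
    if root.kind == "TableScan":
        return False
    # 2. A bare Project over a TableScan is hardly ever useful either.
    if (root.kind == "Project" and len(root.inputs) == 1
            and root.inputs[0].kind == "TableScan"):
        return False
    # 3. Only roots that view-based rewriting can currently exploit.
    return root.kind in ALLOWED_ROOTS
```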
(Commit 7d44cd3)
Update TPC-DS plans after changes
Not sure why query5.q.out had changes, since CBO was left intact.
(Commit 785b803)
(Commit 38617ef)
(Commit 14e0fb3)
(Commit 2e2b0fc)
(Commit 84b9dc3)
Consider hive.optimize.cte.materialize.full.aggregate.only in CBO CTE selection
1. Add new CBO metadata classes for deriving whether a CTE expression is a fully aggregate query.
2. Prune non-fully-aggregate CTE suggestions when the respective conf is enabled.
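The "fully aggregate" property can be derived bottom-up over the plan: an Aggregate root qualifies, and row-preserving operators inherit the property from their input. The sketch below is an assumption-laden toy (the `Node` model, the traversal, and which operators are treated as property-preserving are illustrative; Hive implements this as CBO metadata classes, not shown here):

```python
# Toy plan-tree model for illustrating the derivation.
class Node:
    def __init__(self, kind, *inputs):
        self.kind = kind
        self.inputs = list(inputs)

def is_fully_aggregated(node):
    # An Aggregate root means the result is an aggregated query.
    if node.kind == "Aggregate":
        return True
    # Project/Filter are assumed to preserve the property of their input.
    if node.kind in {"Project", "Filter"} and node.inputs:
        return is_fully_aggregated(node.inputs[0])
    # Anything else (scans, joins, ...) is conservatively rejected.
    return False

def prune_suggestions(suggestions, full_aggregate_only):
    # Mimics dropping non-fully-aggregate CTE suggestions when the
    # hive.optimize.cte.materialize.full.aggregate.only conf is on.
    if not full_aggregate_only:
        return list(suggestions)
    return [s for s in suggestions if is_fully_aggregated(s)]
```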
(Commit 75b079e)
Refactor and document metadata classes for inferring if an expression is fully aggregated
(Commit 3af800c)
(Commit 196a965)
Remove unused CTE scans from the plan
The CTE rewriting logic may add CTE table scans to the plan, but these should always be accompanied by a Spool operator. If there is no Spool operator, there is no way to populate the content of the CTE suggestion, so it must be removed. The TableScanToSpoolRule alone is not enough to guarantee that there will be no orphan CTEs in the plan, so we need a rule to remove (or rather expand) those CTE scans without a corresponding Spool operator.
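The expansion described above can be sketched on a toy plan tree. Everything here is illustrative (the `Node` model and function names are invented): first collect which CTEs some Spool actually materializes, then replace any CTE scan without a matching Spool by the CTE's defining expression.

```python
# Toy plan-tree model: a node has a kind, an optional CTE name, inputs.
class Node:
    def __init__(self, kind, name=None, *inputs):
        self.kind, self.name, self.inputs = kind, name, list(inputs)

def spooled_ctes(root):
    # Collect names of CTEs that some Spool operator materializes.
    names, stack = set(), [root]
    while stack:
        n = stack.pop()
        if n.kind == "Spool":
            names.add(n.name)
        stack.extend(n.inputs)
    return names

def expand_orphans(root, definitions):
    # Replace CteScan nodes lacking a matching Spool with their
    # defining expression (taken from the suggestion registry).
    spooled = spooled_ctes(root)
    def rewrite(n):
        if n.kind == "CteScan" and n.name not in spooled:
            return rewrite(definitions[n.name])
        n.inputs = [rewrite(i) for i in n.inputs]
        return n
    return rewrite(root)
```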
(Commit ec1c2ce)
Table reference count must be constant for the spool rules to work correctly
The table reference count cannot rely on planner.getRoot(), since rule applications affect the metadata and thus make the rules fire incorrectly.
(Commit c5a8b88)
Add suggester creating scans with disjunctive predicates
This is tailored to queries such as TPC-DS query9. It is a quick-and-dirty implementation that probably will not make it into the final cut, but it is useful for testing and experimentation.
(Commit 3ae07ca)
(Commit 21c0d75)
Remove redundant cte_cbo_rewrite_1 test case
There is nothing fancy or new in here.
(Commit ceaebca)
(Commit e1c79c9)
(Commit c915219)
(Commit 1287c2f)
(Commit 07e87f9)
Add Hook for discovering and materializing CTEs in queries without WITH clause
The Hook aggressively sets the CTE materialization properties for all queries without an explicit WITH clause. It is mostly used for testing purposes, to measure the impact of the new CTE suggestion/materialization logic without relying on the explicit presence of a WITH clause (which also suffers from a few bugs, e.g., HIVE-24167).
(Commit 687c1b3)
Add unit tests for CommonTableExpressionIdentitySuggester using approvals framework & plus module
(Commit 0ed2f5c)
(Commit 81b2a5d)
(Commit 79a3e4f)
Update .q.out files after applying the CommonTableExpressionAutoTuningHook and keep notes about changes

mapjoin_hint.q.out, constant_prop_3.q.out, notInTest.q.out:
1. Duplication is not obviously present in the initial SQL.
2. Physical plan not better than SWO, but could possibly replace the latter at the CBO level.

correlationoptimizer3.q.out:
1. Materialization and reuse of join result (not exploited by SWO).

masking_2.q.out, masking_12.q.out:
* Materializes scan + filter.
* The new plan looks reasonable, but not sure why SWO was not kicking in in the initial plan.

masking_10.q.out:
* Materializes scan + filter.
* New plan has a cartesian product, which is strange.

dynamic_partition_pruning.q.out:
1. Materializes scan + aggregate, but this nukes out DPP pruning (not really sure if that is helpful in this case).

dynamic_semijoin_reduction_2.q.out:
1. Materializes scan + aggregate and leads to a different SJ than the original one.

explainuser_2.q.out, explainanalyze_2.q.out:
* Materializes join between two tables.

filter_aggr.q.out:
1. OPTIMIZED SQL is not shown, probably because we don't handle the spool operator.
2. Seems to cancel some optimization with UNION ALL and identical parts (NOT GOOD).

groupby_sort_1_23.q.out, groupby_sort_skew_1_23.q.out:
1. Materializes scan + aggregate.
2. OPTIMIZED SQL does not show.

intersect_all.q.out, intersect_distinct.q.out:
* Materializes join over two tables.
* Check if there are blocked optimizations due to INTERSECT with identical parts.

offset_limit_ppd_optimizer.q.out, limit_pushdown.q.out:
* Materializes scan + aggregate but interferes with the limit pushdown optimization as it is right now.

mrr.q.out:
* Materializes scan + aggregate (CTE referenced 3 times).

sharedwork.q.out:
* Materializes scan + very simple filter (IS NOT NULL).
* In such simple cases the materialization is probably useless. Do we gain anything from this?
* Moreover, the materialized filter seems to remain in the final plan.

sharedworkext.q.out, vectorized_multi_output_select.q.out:
* Materializes a join (the case here is very similar to the main e2e test motivating this work).
* The multiple reducers in the SWO plan are probably due to the parallel-edges problem; the temporary table materialization does not need the workaround, although if we were going directly to the Operator tree we would need to do something similar.

skewjoin_mapjoin7.q.out:
* Materializes join (CTE referenced twice by UNION ALL); we have seen the same pattern multiple times.

smb_mapjoin_14.q.out:
* Materializes scan + filter (seen this before).

subquery_ALL.q.out, subquery_ANY.q.out:
* Duplication is not obviously present in the initial SQL.
* Materializes scan + aggregate (expected and seen).

subquery_multi.q.out:
* The CBO plan shows materialization of a semijoin but the physical plan does not have this; definitely needs further investigation.

subquery*:
* Most materializations are of the form scan + aggregate, usually with an IS NOT NULL filter.

union_remove*:
* As noted earlier, there is an interference of the CTE materialization logic with the UNION remove logic, which operates at the physical level.
* If we go from RelNode to Operator then maybe this becomes less of a problem, but as it is the plans seem less efficient.

clientnegative:
Nothing worrisome; the failing vertex changes since CTE materialization adds additional operators to the plan.

General notes:
1. Most queries with union or intersection have identical branches; in this case the CTE detection logic kicks in and generates scan+?filter+aggregate.
2. Lineage shows the temporary table (e.g., union28.q.out).
3. I have to add some tests with subqueries since they exhibit implicit sharing.
4. In general there seem to be some optimizations missing in terms of UNION/INTERSECT ALL with identical branches.
5. Since we are introducing a tmp table, it is very likely that we are changing the data format. If the initial table is ORC we may materialize to TEXT and various other combinations, which may not be performant. The latter may not be that relevant because all operators writing to files use a specific format:
```
File Output Operator
  compressed: false
  Statistics: Num rows: 493 Data size: 42891 Basic stats: COMPLETE Column stats: COMPLETE
  table:
    input format: org.apache.hadoop.mapred.SequenceFileInputFormat
    output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
    serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
```
(Commit ad2ef62)
SemanticException: View definition references temporary table
```
org.apache.hadoop.hive.ql.parse.SemanticException: View definition references temporary table default@cte_suggestion_0
  at org.apache.hadoop.hive.ql.ddl.view.create.CreateViewAnalyzer.validateCreateView(CreateViewAnalyzer.java:211)
  at org.apache.hadoop.hive.ql.ddl.view.create.CreateViewAnalyzer.analyzeInternal(CreateViewAnalyzer.java:99)
  at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
  at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:224)
  at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:107)
  at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:519)
  at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:471)
  at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:436)
  at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:430)
```
Reproducible using subquery_views.q and view_cast.q.
(Commit 0d97e9e)
AssertionError: Type mismatch when using MaterializedViewProjectFilterRule
```
java.lang.AssertionError: Type mismatch: rel rowtype: RecordType(NULL int_col) NOT NULL equivRel rowtype: RecordType(BOOLEAN NOT NULL boolean_col, BOOLEAN NOT NULL literalTrue) NOT NULL
  at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
  at org.apache.calcite.plan.RelOptUtil.equal(RelOptUtil.java:2193)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:580)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
  at org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:148)
  at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:268)
  at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:283)
  at org.apache.calcite.rel.rules.materialize.MaterializedViewRule.perform(MaterializedViewRule.java:454)
  at org.apache.calcite.rel.rules.materialize.MaterializedViewProjectFilterRule.onMatch(MaterializedViewProjectFilterRule.java:50)
  at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:229)
  at org.apache.calcite.plan.volcano.IterativeRuleDriver.drive(IterativeRuleDriver.java:58)
  at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:510)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.rewriteUsingViews(CalcitePlanner.java:2113)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyCteRewriting(CalcitePlanner.java:2147)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1708)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1579)
  at org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:131)
  at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:914)
  at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:180)
  at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:126)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1331)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:580)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:13177)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:473)
  at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
  at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:180)
  at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
  at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:224)
  at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:107)
```
The AssertionError can be reproduced by running subquery_null_agg.q. The problem happens due to two things:
* the rule matches a plan with a filter condition that is simplified to false during the rewriting;
* there is a view (CTE suggestion) that is basically a trivial project on top of the table.
CTE suggestions with just project+scan do not make much sense, so we can drop them by tuning the CommonRelSubExprRegisterRule and work around the problem for now. Depending on the bandwidth we may want to attack the bug in the MaterializedViewProjectFilterRule and make the latter more robust; that would be the actual fix.
(Commit 8967501)
SemanticException: CREATE-TABLE-AS-SELECT creates a VOID type when CTE suggestion contains untyped NULLs
```
org.apache.hadoop.hive.ql.parse.SemanticException: CREATE-TABLE-AS-SELECT creates a VOID type, please use CAST to specify the type, near field: int_col
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.deriveFileSinkColTypes(SemanticAnalyzer.java:8391) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.deriveFileSinkColTypes(SemanticAnalyzer.java:8350) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:7901) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:11645) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:11508) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12444) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12310) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:645) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:13177) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:473) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner.materializeCTE(CalcitePlanner.java:1069) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2389) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2337) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2500) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2337) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2500) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2337) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2500) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2322) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:642) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:13177) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:473) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:180) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:224) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:107) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:519) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:471) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:436) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:430) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:121) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:229) ~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
  at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:257) ~[hive-cli-4.1.0-SNAPSHOT.jar:?]
  at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:201) ~[hive-cli-4.1.0-SNAPSHOT.jar:?]
  at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:127) ~[hive-cli-4.1.0-SNAPSHOT.jar:?]
  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:425) ~[hive-cli-4.1.0-SNAPSHOT.jar:?]
```
The problem can be reproduced using subquery_null_agg.q when CTE suggestions are used, but can also be seen for any CTAS query with untyped NULLs.
```
create table testctas1 (id int);
create table testctas3 as select 1, 2, NULL, 4 as ncol from testctas1;
```
Since this is a limitation of CTAS, we have to filter out CTE suggestions that contain untyped NULLs in the result type.
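The guard this commit describes reduces to a check over a suggestion's result row type. The sketch below is a toy illustration (the tuple-based row-type model and function names are invented; in Hive this would inspect the Calcite RelDataType for VOID/NULL fields):

```python
# Toy model: a row type is a list of (column_name, type_name) pairs.
UNTYPED = {"NULL", "VOID"}

def has_untyped_null(row_type):
    # True if any column of the result type is an untyped NULL, which
    # CTAS cannot materialize without an explicit CAST.
    return any(t.upper() in UNTYPED for _, t in row_type)

def usable_suggestions(suggestions):
    # suggestions: list of (name, row_type); drop the unusable ones.
    return [name for name, row_type in suggestions
            if not has_untyped_null(row_type)]
```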
(Commit 11e5b2b)
Update internal_interval in query32,92 outputs
This is probably caused by the rebase on master and changes affecting the parser.
(Commit 9107806)
SemanticException: Ambiguous table alias since references to CTE (WITH clause) have the same alias
The problem can be reproduced using join0.q; the full stack trace is shown below.
```
org.apache.hadoop.hive.ql.parse.SemanticException: Line 0:-1 Ambiguous table alias 'cte_suggestion_0'
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processTable(SemanticAnalyzer.java:1167)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processJoin(SemanticAnalyzer.java:1679)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.doPhase1(SemanticAnalyzer.java:1899)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.doPhase1(SemanticAnalyzer.java:2113)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.doPhase1(SemanticAnalyzer.java:1754)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:636)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:13177)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:474)
  at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
  at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:180)
  at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
  at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:224)
  at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:107)
  at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:519)
```
Commit 02cd53e
UnsupportedOperationException when serializing Spool to JSON

The problem can be reproduced by running join0.q.

```
java.lang.UnsupportedOperationException: type not serializable: LAZY (type org.apache.calcite.rel.core.Spool.Type)
	at org.apache.calcite.rel.externalize.RelJson.toJson(RelJson.java:319)
	at org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelJson.toJson(HiveRelJson.java:46)
	at org.apache.calcite.rel.externalize.RelJsonWriter.put(RelJsonWriter.java:83)
	at org.apache.calcite.rel.externalize.RelJsonWriter.explain_(RelJsonWriter.java:66)
	at org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelJsonImpl.explain_(HiveRelJsonImpl.java:59)
	at org.apache.calcite.rel.externalize.RelJsonWriter.done(RelJsonWriter.java:128)
	at org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:246)
	at org.apache.calcite.rel.externalize.RelJsonWriter.explainInputs(RelJsonWriter.java:91)
	at org.apache.calcite.rel.externalize.RelJsonWriter.explain_(RelJsonWriter.java:69)
	at org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelJsonImpl.explain_(HiveRelJsonImpl.java:59)
	at org.apache.calcite.rel.externalize.RelJsonWriter.done(RelJsonWriter.java:128)
	at org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:246)
	at org.apache.calcite.rel.externalize.RelJsonWriter.explainInputs(RelJsonWriter.java:91)
	at org.apache.calcite.rel.externalize.RelJsonWriter.explain_(RelJsonWriter.java:69)
	at org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelJsonImpl.explain_(HiveRelJsonImpl.java:59)
	at org.apache.calcite.rel.externalize.RelJsonWriter.done(RelJsonWriter.java:128)
	at org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:246)
	at org.apache.calcite.rel.externalize.RelJsonWriter.explainInputs(RelJsonWriter.java:91)
	at org.apache.calcite.rel.externalize.RelJsonWriter.explain_(RelJsonWriter.java:69)
	at org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelJsonImpl.explain_(HiveRelJsonImpl.java:59)
	at org.apache.calcite.rel.externalize.RelJsonWriter.done(RelJsonWriter.java:128)
	at org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:246)
	at org.apache.calcite.rel.externalize.RelJsonWriter.explainInputs(RelJsonWriter.java:91)
	at org.apache.calcite.rel.externalize.RelJsonWriter.explain_(RelJsonWriter.java:69)
	at org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelJsonImpl.explain_(HiveRelJsonImpl.java:59)
	at org.apache.calcite.rel.externalize.RelJsonWriter.done(RelJsonWriter.java:128)
	at org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:246)
	at org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelOptUtil.toJsonString(HiveRelOptUtil.java:1073)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:669)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:13177)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:474)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
	at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:180)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
	at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:224)
	at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:107)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:519)
```
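A common way to fix this class of error is to serialize enum values by their constant name. The sketch below is a stand-alone illustration of that idea; `EnumJson` and its nested `SpoolType` are hypothetical names used for demonstration, not the actual HiveRelJson fix.

```java
// Illustrative sketch: instead of rejecting unknown value types with
// UnsupportedOperationException, fall back to the enum constant's name.
public class EnumJson {
    // Models org.apache.calcite.rel.core.Spool.Type (EAGER/LAZY).
    enum SpoolType { EAGER, LAZY }

    static Object toJson(Object value) {
        if (value instanceof Enum) {
            return ((Enum<?>) value).name();  // serialize enums by name
        }
        if (value instanceof String || value instanceof Number) {
            return value;  // already JSON-friendly
        }
        throw new UnsupportedOperationException("type not serializable: " + value);
    }

    public static void main(String[] args) {
        System.out.println(toJson(SpoolType.LAZY));  // LAZY
    }
}
```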
Commit 58bff69
Update q.out files after fixing Spool serialization and alias generation during AST conversion
Commit cab67ca
Commits on Apr 9, 2024
Commit b39878a
Commit b8fbd71
Commit e39bd25
Commit 23e9f4f
Commit 93e5f50
Commit 8d8a2be
Commits on Apr 17, 2024
Commit acae14f
IndexOutOfBoundsException in MaterializedViewAggregateRule due to unexpected output from union rewriting program

The cte_cbo_iobe_mv_union_rewrite file contains a repro of the problem:

```
java.lang.IndexOutOfBoundsException: Index: 0
	at java.util.Collections$EmptyList.get(Collections.java:4456)
	at org.apache.calcite.rel.AbstractRelNode.getInput(AbstractRelNode.java:143)
	at org.apache.calcite.rel.rules.materialize.MaterializedViewAggregateRule.rewriteQuery(MaterializedViewAggregateRule.java:250)
	at org.apache.calcite.rel.rules.materialize.MaterializedViewRule.perform(MaterializedViewRule.java:374)
	at org.apache.calcite.rel.rules.materialize.MaterializedViewOnlyAggregateRule.onMatch(MaterializedViewOnlyAggregateRule.java:68)
	at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:229)
	at org.apache.calcite.plan.volcano.IterativeRuleDriver.drive(IterativeRuleDriver.java:58)
	at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:510)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.rewriteUsingViews(CalcitePlanner.java:2114)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyCteRewriting(CalcitePlanner.java:2152)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1750)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1580)
	at org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:131)
	at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:914)
	at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:180)
	at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:126)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1332)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:581)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:13177)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:474)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
	at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:180)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
	at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:224)
	at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:107)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:519)
```

If the input to the MV rule is a combination of Aggregate + Scan and there is a registered MV that qualifies for union rewriting, then an IOBE is thrown: the result of the union rewriting program is a Scan operator that does not have any inputs. The IOBE is triggered only during the CTE rewrite phase, in cases where HiveAggregateProjectMergeRule has fired before. In normal MV rewrite this cannot happen, since HiveAggregateProjectMergeRule is applied after the MV rewrite.
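The failure mode can be modeled as an unchecked `getInput(0)` call on a node that has no inputs. The sketch below is a self-contained illustration of a defensive guard under that assumption; `Node` and `safeFirstInput` are hypothetical names, not the Calcite API or the actual fix.

```java
import java.util.Collections;
import java.util.List;

// Stand-alone model of the crash: rewriteQuery assumed the rewritten plan
// root always has an input, but the union-rewriting program can return a
// bare scan with an empty input list.
public class UnionRewriteGuard {
    static final class Node {
        final List<Node> inputs;
        Node(List<Node> inputs) { this.inputs = inputs; }
    }

    // Defensive variant of "rewritten.getInput(0)": bail out (return null)
    // instead of throwing IndexOutOfBoundsException when there is no input.
    static Node safeFirstInput(Node rewritten) {
        return rewritten.inputs.isEmpty() ? null : rewritten.inputs.get(0);
    }

    public static void main(String[] args) {
        Node bareScan = new Node(Collections.emptyList());
        System.out.println(safeFirstInput(bareScan));  // null, no exception
    }
}
```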
Commit 476301a
Commits on Apr 18, 2024
SemanticException: Line 0:-1 Ambiguous table alias when there are self-joins of CTE/MV/Table

When CTEs/MVs are in use and the plan contains self-joins on the same table/CTE/MV, the resulting AST does not have the expected shape, so we end up with ambiguity when creating the AST from the RelNode. The auto_smb_mapjoin_14.q and other tests are failing with errors similar to the one below:

```
org.apache.hadoop.hive.ql.parse.SemanticException: Line 0:-1 Ambiguous table alias 'cte_suggestion_0'
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processTable(SemanticAnalyzer.java:1167)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processJoin(SemanticAnalyzer.java:1679)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.doPhase1(SemanticAnalyzer.java:1899)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.doPhase1(SemanticAnalyzer.java:2113)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.doPhase1(SemanticAnalyzer.java:1754)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:636)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:13177)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:474)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
	at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:180)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
	at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:224)
	at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:107)
```
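One way to avoid such collisions is to hand out a distinct alias for every repeated reference to the same table/CTE/MV when the AST is generated from the plan. The sketch below is illustrative only; `AliasGenerator` and its suffix scheme are hypothetical, not the actual Hive implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical alias uniquifier: the first reference to a name keeps the base
// alias, later references to the same name get a numeric suffix, so two
// occurrences of "cte_suggestion_0" in a self-join never share an alias.
public class AliasGenerator {
    private final Map<String, Integer> seen = new HashMap<>();

    String uniqueAlias(String base) {
        int n = seen.merge(base, 1, Integer::sum);  // count occurrences
        return n == 1 ? base : base + "_" + (n - 1);
    }

    public static void main(String[] args) {
        AliasGenerator g = new AliasGenerator();
        System.out.println(g.uniqueAlias("cte_suggestion_0"));  // cte_suggestion_0
        System.out.println(g.uniqueAlias("cte_suggestion_0"));  // cte_suggestion_0_1
    }
}
```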
Commit 6450967