Fix make space failed when space is available #304

Open
wants to merge 59 commits into
base: main
8206619
Folder: scripts, third_party, website, .github
zhejiangxiaomai May 8, 2023
9b0fd0a
Folder: common, connectors
zhejiangxiaomai May 8, 2023
09533ac
Folder: core, duckdb
zhejiangxiaomai May 8, 2023
f9ed9a3
Folder: dwio
zhejiangxiaomai May 8, 2023
f567ffd
Folder: exec
zhejiangxiaomai May 8, 2023
9a66846
Folder: expression
zhejiangxiaomai May 8, 2023
d4b81ae
Folder: functions
zhejiangxiaomai May 8, 2023
197d44a
Folder: row
zhejiangxiaomai May 8, 2023
8f8906f
Folder: substrait
zhejiangxiaomai May 9, 2023
9ae2cdb
Folder: type
zhejiangxiaomai May 9, 2023
06754fd
Folder: vector
zhejiangxiaomai May 9, 2023
b03f9ec
Add back not node (#228)
rui-mo May 10, 2023
fce80c4
enable all tests (#247)
rui-mo May 10, 2023
70b9cfb
comments unstable customPlanNodeWithExchangeClient (#248)
zhejiangxiaomai May 10, 2023
0b50cfd
Update build dependencies (#185)
ccat3z May 11, 2023
217a93b
fix code style. (#252)
Yohahaha May 12, 2023
c22fa76
Add mapping for bit_or and bit_and (#251)
Yohahaha May 15, 2023
b055ddd
Avoid include Abi.h twice (#253)
zhejiangxiaomai May 15, 2023
6a43f9a
add decimal column reader support (#254)
zuochunwei May 15, 2023
61ec655
Support timestamp reader (#205)
rui-mo May 16, 2023
e26f9ef
Fix the intermediate type of First/Last, and support decimal (#245)
Yohahaha May 16, 2023
b711c8e
Removed duplicated memory copy in "upcastScalarValues" (#256)
yimin-yang May 16, 2023
ff91ff0
Added RleEncoderV2 (#240)
yimin-yang May 17, 2023
bde7b6a
[GLUTEN-1434] Serialize and deserialize RowVector (#250)
jinchengchenghh May 17, 2023
812dbd5
Expand timestamps in page reader (#260)
rui-mo May 17, 2023
12be4e3
[GLUTEN-1638] Add Hdfs support in parquet write (#255)
JkSelf May 19, 2023
a159948
Fix the array out of bounds while getting offsets (#257)
jackylee-ch May 22, 2023
9817ce5
whitelist approx_distinct (#270)
zhli1142015 May 22, 2023
3f33535
Support hash for timestamp type (#269)
liujiayi771 May 23, 2023
8f969eb
Add long decimal type support for ORC (#271)
yimin-yang May 23, 2023
c8a6d55
Add processedStrides and processedSplits metrics (#264)
rui-mo May 24, 2023
2f954b1
Refine make decimal to align with spark sql (#272)
JkSelf May 24, 2023
5b1806e
Add hash seed parameter to sparksql hash functions (#275)
May 24, 2023
62570a9
Fix type check in MapFunction (#273)
rui-mo May 24, 2023
57ec320
Support spark asinh, acosh, atanh, sec, csc math functions (#274)
Yohahaha May 24, 2023
3e2b6f5
Create folder if it does not exist on HDFS write
JkSelf May 24, 2023
98d0451
Implement Spark's version of log2, log10 (#266)
zhztheplayer May 25, 2023
82ddf50
Implement Spark's version of atan2 (#263)
zhztheplayer May 25, 2023
0de5f0f
Fix replace SparkSQL function (#277)
izchen May 25, 2023
70898af
Align the implementation for ascii function with spark sql (#268)
PHILO-HE May 25, 2023
23c0569
Fix chr SparkSQL function (#278)
izchen May 25, 2023
97d2829
Fix semantic issues in cast function (#280)
PHILO-HE May 26, 2023
40fcf8f
Fix casting from string to decimal (#281)
rui-mo May 26, 2023
f258c6f
Fix casting from decimal to bool (#283)
rui-mo May 30, 2023
a808b04
remove log (#286)
Yohahaha May 31, 2023
d3da837
Support kPreceeding & kFollowing for window range frame type (#287)
PHILO-HE May 31, 2023
dba81cb
Fix the bug of orc reader test (#288)
zuochunwei May 31, 2023
551e1cd
Enable date type for kPreceeding & kFollowing window range bound (#291)
PHILO-HE Jun 1, 2023
f3d5e0f
Add spark comparison functions (#276)
yma11 Jun 2, 2023
b4f9103
remove unused DoubleValues (#292)
zhejiangxiaomai Jun 5, 2023
ff03bd6
Fix use pre-build arrow (#289)
Yohahaha Jun 6, 2023
5f450bd
Fallback timestamp sort (#295)
rui-mo Jun 6, 2023
2d0dd93
Use correct name in struct type (#297)
rui-mo Jun 8, 2023
e01c54f
[DWIO] refactor the reader of dwrf/orc (#261)
zuochunwei Jun 8, 2023
e3ec2b9
Fallback murmur3hash on complex types (#299)
rui-mo Jun 8, 2023
cd897b1
Optimize the search for bound index in window range frame (#300)
PHILO-HE Jun 9, 2023
7e73041
update dnf cache on centos (#302)
zhouyuan Jun 9, 2023
e595f78
fix make space failed when space is available
jackylee-ch Jun 9, 2023
0a72af2
refresh code
jackylee-ch Jun 9, 2023
Folder: exec
Main changes:
1. Use companion functions to support mixed aggregation steps.
2. Add the expand operator.
3. Mark partial aggregation as full when cardinality is high or the memory limit is exceeded.
4. Fix the config option "join_spill_memory_threshold" (kJoinSpillMemoryThreshold) not taking effect.
5. Fix a hash join runtime issue.
zhejiangxiaomai committed May 9, 2023
commit f567ffdda9117f36239834289cc6b2d1740a36b2
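
For context on item 1, here is a minimal sketch of what "mixed aggregation steps" can look like, assuming an average whose intermediate state is a (sum, count) pair. The names `AvgState`, `partialStep`, `mergeStep`, and `extractStep` are hypothetical and are not Velox APIs. The point is only that a merge step consumes and produces intermediate state, so it can sit between a partial and a final step.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical intermediate state of avg(): a running sum and a row count.
struct AvgState {
  double sum{0};
  int64_t count{0};
};

// Partial step: raw input -> intermediate state.
AvgState partialStep(const std::vector<double>& rawInput) {
  AvgState state;
  for (double value : rawInput) {
    state.sum += value;
    ++state.count;
  }
  return state;
}

// Merge step: intermediate states -> intermediate state.
AvgState mergeStep(const std::vector<AvgState>& states) {
  AvgState merged;
  for (const auto& state : states) {
    merged.sum += state.sum;
    merged.count += state.count;
  }
  return merged;
}

// Extract step: intermediate state -> final result.
// An empty group yields 0.0 here for simplicity (SQL would return NULL).
double extractStep(const AvgState& state) {
  return state.count == 0 ? 0.0 : state.sum / state.count;
}
```

For example, merging `partialStep({1.0, 2.0, 3.0})` with `partialStep({5.0})` and passing the result to `extractStep` yields 2.75.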
19 changes: 14 additions & 5 deletions velox/exec/Aggregate.cpp
@@ -15,6 +15,7 @@
  */

 #include "velox/exec/Aggregate.h"
+#include "velox/exec/AggregateFunctionAdapter.h"
 #include "velox/exec/AggregateWindow.h"
 #include "velox/expression/FunctionSignature.h"
 #include "velox/expression/SignatureBinder.h"
@@ -36,7 +37,6 @@ AggregateFunctionMap& aggregateFunctions() {
   return functions;
 }

-namespace {
 std::optional<const AggregateFunctionEntry*> getAggregateFunctionEntry(
     const std::string& name) {
   auto sanitizedName = sanitizeName(name);
@@ -49,19 +49,28 @@ std::optional<const AggregateFunctionEntry*> getAggregateFunctionEntry(

   return std::nullopt;
 }
-} // namespace

 bool registerAggregateFunction(
     const std::string& name,
     std::vector<std::shared_ptr<AggregateFunctionSignature>> signatures,
-    AggregateFunctionFactory factory) {
+    AggregateFunctionFactory factory,
+    bool registerCompanionFunctions) {
   auto sanitizedName = sanitizeName(name);

-  aggregateFunctions()[sanitizedName] = {
-      std::move(signatures), std::move(factory)};
+  aggregateFunctions()[sanitizedName] = {signatures, std::move(factory)};

   // Register the aggregate as a window function also.
   registerAggregateWindowFunction(sanitizedName);
+
+  // Register companion function if needed.
+  if (registerCompanionFunctions) {
+    // RegisterAdapter::registerPartialFunction(name, signatures);
+    RegisterAdapter::registerMergeFunction(name, signatures);
+    // RegisterAdapter::registerExtractFunction(name, signatures);
+    // TODO: register retract function only when the original UDAF supports
+    // retracting. RegisterAdapter::registerRetractFunction(name, signatures);
+  }

   return true;
 }
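
The registration pattern in this hunk can be sketched with a toy registry. This is an illustration under stated assumptions, not the Velox implementation: `ToyFactory`, `toyRegistry`, `registerToyAggregate`, and the `_merge` naming suffix are all hypothetical. Note that the diff stops moving `signatures` into the map entry, presumably because the vector is still needed by `registerMergeFunction` afterwards; the sketch copies the factory for the same reason.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Toy stand-in for an aggregate factory; the real one builds an Aggregate.
using ToyFactory = std::function<std::string()>;

// Toy registry keyed by function name.
std::unordered_map<std::string, ToyFactory>& toyRegistry() {
  static std::unordered_map<std::string, ToyFactory> registry;
  return registry;
}

// Registers `name`; when requested, also registers a merge companion.
// The "_merge" suffix is an assumption made for this sketch.
bool registerToyAggregate(
    const std::string& name,
    ToyFactory factory,
    bool registerCompanionFunctions = false) {
  if (registerCompanionFunctions) {
    // Copy the factory before it is moved into the main entry below.
    ToyFactory original = factory;
    toyRegistry()[name + "_merge"] = [original]() {
      return original() + " (merge)";
    };
  }
  toyRegistry()[name] = std::move(factory);
  return true;
}
```

Because the flag defaults to `false`, existing call sites that pass only a name and a factory keep their behavior; only callers that opt in get the extra entry.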

28 changes: 25 additions & 3 deletions velox/exec/Aggregate.h
@@ -87,7 +87,7 @@ class Aggregate {
   // the row. Only applies to accumulators that store variable size data out of
   // line. Fixed length accumulators do not use this. 0 if the row does not have
   // a size field.
-  void setOffsets(
+  virtual void setOffsets(
       int32_t offset,
       int32_t nullByte,
       uint8_t nullMask,
@@ -149,6 +149,22 @@ class Aggregate {
       const std::vector<VectorPtr>& args,
       bool mayPushdown) = 0;

+  virtual void retractIntermediateResults(
+      char** group,
+      const SelectivityVector& rows,
+      const std::vector<VectorPtr>& args,
+      bool mayPushdown) {
+    VELOX_NYI();
+  }
+
+  virtual void retractRawInput(
+      char** group,
+      const SelectivityVector& rows,
+      const std::vector<VectorPtr>& args,
+      bool mayPushdown) {
+    VELOX_NYI();
+  }
+
   // Updates the single partial accumulator from raw input data for global
   // aggregation.
   // @param group Pointer to the start of the group row.
@@ -324,11 +340,14 @@ using AggregateFunctionFactory = std::function<std::unique_ptr<Aggregate>(
     const std::vector<TypePtr>& argTypes,
     const TypePtr& resultType)>;

-/// Register an aggregate function with the specified name and signatures.
+/// Register an aggregate function with the specified name and signatures. If
+/// registerCompanionFunctions is true, also register companion aggregate and
+/// scalar functions with it.
 bool registerAggregateFunction(
     const std::string& name,
     std::vector<std::shared_ptr<AggregateFunctionSignature>> signatures,
-    AggregateFunctionFactory factory);
+    AggregateFunctionFactory factory,
+    bool registerCompanionFunctions = false);

 /// Returns signatures of the aggregate function with the specified name.
 /// Returns empty std::optional if function with that name is not found.
@@ -348,6 +367,9 @@ struct AggregateFunctionEntry {
   AggregateFunctionFactory factory;
 };

+std::optional<const AggregateFunctionEntry*> getAggregateFunctionEntry(
+    const std::string& name);
+
 using AggregateFunctionMap =
     std::unordered_map<std::string, AggregateFunctionEntry>;
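
The new `retractIntermediateResults` and `retractRawInput` hooks default to `VELOX_NYI()`, so an aggregate supports retraction only if it overrides them. A minimal sketch of that opt-in pattern, assuming toy classes (`ToyAggregate`, `ToySum`, `ToyMax`) that are not part of Velox and a plain `std::logic_error` in place of `VELOX_NYI()`:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Toy base class: retraction is optional and throws unless overridden,
// similar in spirit to the VELOX_NYI() defaults in the diff.
class ToyAggregate {
 public:
  virtual ~ToyAggregate() = default;
  virtual void addRawInput(const std::vector<int64_t>& values) = 0;
  virtual void retractRawInput(const std::vector<int64_t>& /*values*/) {
    throw std::logic_error("retractRawInput is not implemented");
  }
  virtual int64_t result() const = 0;
};

// Sum can undo an update by subtracting the retracted values.
class ToySum : public ToyAggregate {
 public:
  void addRawInput(const std::vector<int64_t>& values) override {
    for (int64_t value : values) {
      sum_ += value;
    }
  }
  void retractRawInput(const std::vector<int64_t>& values) override {
    for (int64_t value : values) {
      sum_ -= value;
    }
  }
  int64_t result() const override { return sum_; }

 private:
  int64_t sum_{0};
};

// Max keeps a single value, so it cannot undo an update and inherits
// the throwing default.
class ToyMax : public ToyAggregate {
 public:
  void addRawInput(const std::vector<int64_t>& values) override {
    for (int64_t value : values) {
      max_ = std::max(max_, value);
    }
  }
  int64_t result() const override { return max_; }

 private:
  int64_t max_{std::numeric_limits<int64_t>::min()};
};
```

This matches the TODO in Aggregate.cpp, which notes that a retract companion should be registered only when the original aggregate supports retracting.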

44 changes: 44 additions & 0 deletions velox/exec/AggregateFunctionAdapter.cpp
@@ -0,0 +1,44 @@
+/*
+ * Copyright (c) Facebook, Inc. and its affiliates.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include "velox/expression/FunctionSignature.h"
+
+namespace facebook::velox::exec {
+
+void addVariablesInTypeToList(
+    const TypeSignature& type,
+    const std::unordered_map<std::string, SignatureVariable>& allVariables,
+    std::unordered_map<std::string, SignatureVariable>& usedVariables) {
+  auto iter = allVariables.find(type.baseName());
+  if (iter != allVariables.end()) {
+    usedVariables.emplace(iter->first, iter->second);
+  }
+  for (const auto& parameter : type.parameters()) {
+    addVariablesInTypeToList(parameter, allVariables, usedVariables);
+  }
+}
+
+std::unordered_map<std::string, SignatureVariable> getUsedTypeVariables(
+    const std::vector<TypeSignature>& types,
+    const std::unordered_map<std::string, SignatureVariable>& allVariables) {
+  std::unordered_map<std::string, SignatureVariable> usedVariables;
+  for (const auto& type : types) {
+    addVariablesInTypeToList(type, allVariables, usedVariables);
+  }
+  return usedVariables;
+}
+
+} // namespace facebook::velox::exec
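
To see what the recursive walk in `addVariablesInTypeToList` collects, here is a self-contained sketch with simplified stand-ins. `ToyTypeSignature`, `ToyVariable`, `VariableMap`, and `collectUsedVariables` are hypothetical types and names introduced for illustration; the real `TypeSignature` and `SignatureVariable` classes carry more information.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Simplified stand-in for a type signature such as map(K, array(V)).
struct ToyTypeSignature {
  std::string baseName;
  std::vector<ToyTypeSignature> parameters;
};

// Stand-in for a signature variable; only the name matters here.
struct ToyVariable {
  std::string name;
};

using VariableMap = std::unordered_map<std::string, ToyVariable>;

// Walks the type tree and records every declared variable that the type
// mentions, either as its base name or inside any nested parameter.
void collectUsedVariables(
    const ToyTypeSignature& type,
    const VariableMap& allVariables,
    VariableMap& usedVariables) {
  auto iter = allVariables.find(type.baseName);
  if (iter != allVariables.end()) {
    usedVariables.emplace(iter->first, iter->second);
  }
  for (const auto& parameter : type.parameters) {
    collectUsedVariables(parameter, allVariables, usedVariables);
  }
}
```

With variables `K`, `V`, and `T` declared and a type shaped like `map(K, array(V))`, the walk collects `K` and `V` and leaves out `T`, which the type never mentions.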