
Commit 819788f

[MINOR] Remove repetitive words in docs (#10844)
Signed-off-by: studystill <[email protected]>
1 parent 3698d49 commit 819788f

File tree

6 files changed: +6 -6 lines changed

hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/HoodieCatalystPlansUtils.scala

Lines changed: 1 addition & 1 deletion
@@ -144,7 +144,7 @@ trait HoodieCatalystPlansUtils {
   def createMITJoin(left: LogicalPlan, right: LogicalPlan, joinType: JoinType, condition: Option[Expression], hint: String): LogicalPlan
 
   /**
-   * true if both plans produce the same attributes in the the same order
+   * true if both plans produce the same attributes in the same order
    */
   def produceSameOutput(a: LogicalPlan, b: LogicalPlan): Boolean
 }
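
For context, the contract documented in this hunk can be sketched in a few lines of Catalyst-facing Scala. This is an illustrative sketch under one assumption (not Hudi's actual implementation): that "same attributes in the same order" means the two plans' output attribute lists match pairwise by expression id.

    // Illustrative sketch, not Hudi code: compares the two plans' output
    // attribute lists pairwise, in order, by expression id.
    import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan

    def produceSameOutputSketch(a: LogicalPlan, b: LogicalPlan): Boolean = {
      val (outA, outB) = (a.output, b.output)
      outA.length == outB.length &&
        outA.zip(outB).forall { case (x, y) => x.exprId == y.exprId }
    }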

hudi-common/src/main/java/org/apache/hudi/common/bloom/InternalBloomFilter.java

Lines changed: 1 addition & 1 deletion
@@ -199,7 +199,7 @@ public String toString() {
   }
 
   /**
-   * @return size of the the bloomfilter
+   * @return size of the bloomfilter
    */
   public int getVectorSize() {
     return this.vectorSize;
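
As an aside, the vector size m returned by this getter is the parameter in the textbook bloom filter false-positive estimate. A minimal sketch using the standard formula (not Hudi code; k and n are the caller's hash-function count and inserted-key count):

    // Standard estimate: p ≈ (1 - e^(-k*n/m))^k, where m is the bit-vector
    // size (as returned by getVectorSize()), k the number of hash functions,
    // and n the number of keys inserted so far.
    def falsePositiveRate(m: Int, k: Int, n: Long): Double =
      math.pow(1 - math.exp(-k * n.toDouble / m), k)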

hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/bulk/sort/SortOperator.java

Lines changed: 1 addition & 1 deletion
@@ -100,7 +100,7 @@ public void open() throws Exception {
 
     collector = new StreamRecordCollector<>(output);
 
-    // register the the metrics.
+    // register the metrics.
     getMetricGroup().gauge("memoryUsedSizeInBytes", (Gauge<Long>) sorter::getUsedMemoryInBytes);
     getMetricGroup().gauge("numSpillFiles", (Gauge<Long>) sorter::getNumSpillFiles);
     getMetricGroup().gauge("spillInBytes", (Gauge<Long>) sorter::getSpillInBytes);

hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieFileIndex.scala

Lines changed: 1 addition & 1 deletion
@@ -63,7 +63,7 @@ import scala.util.{Failure, Success, Try}
  * who's directory level is 3).We can still read it as a partitioned table. We will mapping the
  * partition path (e.g. 2021/03/10) to the only partition column (e.g. "dt").
  *
- * 3、Else the the partition columns size is not equal to the partition directory level and the
+ * 3、Else the partition columns size is not equal to the partition directory level and the
  * size is great than "1" (e.g. partition column is "dt,hh", the partition path is "2021/03/10/12")
  * , we read it as a Non-Partitioned table because we cannot know how to mapping the partition
  * path with the partition columns in this case.
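
The rule this hunk touches is easiest to see with concrete values. A hypothetical helper illustrating the comment's cases (name and shape invented for this sketch, not HoodieFileIndex code):

    // Hypothetical sketch, not HoodieFileIndex code. Mapping succeeds when
    // the column count equals the directory depth, or when a single column
    // absorbs the whole path (e.g. "2021/03/10" -> dt). Otherwise the split
    // is ambiguous (e.g. Seq("dt", "hh") vs "2021/03/10/12") and the table
    // is read as non-partitioned.
    def canMapPartitionColumns(partitionColumns: Seq[String], partitionPath: String): Boolean = {
      val depth = partitionPath.split("/").length
      partitionColumns.length == depth || partitionColumns.length == 1
    }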

rfc/rfc-76/rfc-76.md

Lines changed: 1 addition & 1 deletion
@@ -61,7 +61,7 @@ Let's consider following scenario: while persisting the dataset, writing one of
 To provide for aforementioned requirement of the records obtaining globally unique synthetic keys either of the 2 following properties have to hold true:
 Key generation has to be deterministic and reproducible (so that upon Spark retries we could be certain same records will be obtaining the identity value they did during previous pass)
 Records have to be getting globally unique identity value every time (such that key collisions are simply impossible)
-Note that, deterministic and reproducible identity value association is is only feasible for the incoming datasets represented as "determinate" RDDs. However, It's worth pointing out that other RDD classes (such as "unordered", "indeterminate") are very rare occurrences involving some inherent non-determinism (varying content, order, etc), and pose challenges in terms of their respective handling by Hudi even w/o auto-generation (for ex, for such RDDs Hudi can't provide for uniqueness guarantee even for "insert" operation in the presence of failures).
+Note that, deterministic and reproducible identity value association is only feasible for the incoming datasets represented as "determinate" RDDs. However, It's worth pointing out that other RDD classes (such as "unordered", "indeterminate") are very rare occurrences involving some inherent non-determinism (varying content, order, etc), and pose challenges in terms of their respective handling by Hudi even w/o auto-generation (for ex, for such RDDs Hudi can't provide for uniqueness guarantee even for "insert" operation in the presence of failures).
 For achieving our goal of providing globally unique keys we're planning on relying on the following synthetic key format comprised of 2 components
 (Reserved) Commit timestamp: Use reserved commit timestamp as prefix (to provide for global uniqueness of rows)
 Row id: unique identifier of the row (record) w/in the provided batch
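
Putting the two components together: a hedged sketch of what such a synthetic key could look like. The format and names here are illustrative, not the RFC's final encoding; partition id plus row offset within the partition is one row id that stays stable across Spark retries for "determinate" RDDs.

    // Illustrative sketch only; the concrete encoding is up to the RFC.
    // commitTimestamp: reserved commit timestamp, the global-uniqueness prefix.
    // partitionId / rowOffset: per-batch row id, deterministic for
    // "determinate" RDDs because partition contents and ordering are
    // reproducible on retry.
    def syntheticKey(commitTimestamp: String, partitionId: Int, rowOffset: Long): String =
      s"${commitTimestamp}_${partitionId}-${rowOffset}"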

scripts/pr_compliance.py

Lines changed: 1 addition & 1 deletion
@@ -108,7 +108,7 @@ def test_title():
 # #
 # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
 
-#Enums for the the outcome of parsing a single line
+#Enums for the outcome of parsing a single line
 class Outcomes:
     #error was found so we should stop parsing and exit with error
     ERROR = 0
