translation: update searching_algorithm_revisited.md #1559
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,48 +1,48 @@ | ||
# Search algorithms revisited | ||
|
||
<u>Searching algorithms (searching algorithm)</u> are used to search for one or several elements that meet specific criteria in data structures such as arrays, linked lists, trees, or graphs. | ||
<u>Searching algorithms (search algorithms)</u> are used to retrieve one or more elements that meet specific criteria within data structures such as arrays, linked lists, trees, or graphs. | ||
|
||
Searching algorithms can be divided into the following two categories based on their implementation approaches. | ||
Searching algorithms can be divided into the following two categories based on their approach. | ||
|
||
'Implementation approaches' is redundant (it says the same thing twice). We could also say 'based on their implementation'. |
||
- **Locating the target element by traversing the data structure**, such as traversals of arrays, linked lists, trees, and graphs, etc. | ||
- **Using the organizational structure of the data or the prior information contained in the data to achieve efficient element search**, such as binary search, hash search, and binary search tree search, etc. | ||
- **Using the organizational structure of the data or existing data to achieve efficient element searches**, such as binary search, hash search, binary search tree search, etc. | ||
|
||
I would propose this version:
The benefits are:
Since the context is about the implementation, I think using a gerund here is better. This version may not be consistent with the former clause (Locating...). @thomasq0 |
||
It is not difficult to notice that these topics have been introduced in previous chapters, so searching algorithms are not unfamiliar to us. In this section, we will revisit searching algorithms from a more systematic perspective. | ||
These topics were introduced in previous chapters, so they are not unfamiliar to us. In this section, we will revisit searching algorithms from a more systematic perspective. | ||
I would use "have been discussed" instead of "introduced", which may be paired with "idea". |
||
|
||
|
||
## Brute-force search | ||
|
||
Brute-force search locates the target element by traversing every element of the data structure. | ||
A brute-force search locates the target element by traversing every element of the data structure. | ||
|
||
- "Linear search" is suitable for linear data structures such as arrays and linked lists. It starts from one end of the data structure, accesses each element one by one, until the target element is found or the other end is reached without finding the target element. | ||
- "Breadth-first search" and "Depth-first search" are two traversal strategies for graphs and trees. Breadth-first search starts from the initial node and searches layer by layer, accessing nodes from near to far. Depth-first search starts from the initial node, follows a path until the end, then backtracks and tries other paths until the entire data structure is traversed. | ||
- "Linear search" is suitable for linear data structures such as arrays and linked lists. It starts from one end of the data structure and accesses each element one by one until the target element is found or the other end is reached without finding the target element. | ||
- "Breadth-first search" and "Depth-first search" are two traversal strategies for graphs and trees. Breadth-first search starts from the initial node and searches layer by layer (left to right), accessing nodes from near to far. Depth-first search starts from the initial node, follows a path until the end (top to bottom), then backtracks and tries other paths until the entire data structure is traversed. | ||
|
||
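The two brute-force strategies described in the bullets above can be sketched as follows. This is a minimal illustration, not code from the PR: the function names `linear_search` and `bfs_order`, and the adjacency-dictionary representation of the graph, are assumptions made for the example.

```python
from collections import deque
from typing import Dict, List, Sequence


def linear_search(nums: Sequence[int], target: int) -> int:
    """Scan from one end; return the index of target, or -1 if it is absent."""
    for i, value in enumerate(nums):
        if value == target:
            return i
    return -1


def bfs_order(graph: Dict[int, List[int]], start: int) -> List[int]:
    """Visit nodes layer by layer, from near to far, starting at `start`.

    `graph` is assumed to map each node to a list of its neighbours.
    """
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return order
```

Neither function needs the data to be preprocessed, which matches the "simplicity and versatility" point; both touch every element in the worst case.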
The advantage of brute-force search is its simplicity and versatility, **no need for data preprocessing and the help of additional data structures**. | ||
The advantage of brute-force search is its simplicity and versatility, **no need for data preprocessing or the help of additional data structures**. | ||
|
||
However, **the time complexity of this type of algorithm is $O(n)$**, where $n$ is the number of elements, so the performance is poor in cases of large data volumes. | ||
However, **the time complexity of this type of algorithm is $O(n)$**, where $n$ is the number of elements, so the performance is poor with large data sets. | ||
It is not related to the translation. The performance is not BTW, it should be
There is no need to update here since the markdown has not been converted to HTML and rendered by mathjax.js yet. |
||
|
||
## Adaptive search | ||
|
||
Adaptive search uses the unique properties of data (such as order) to optimize the search process, thereby locating the target element more efficiently. | ||
An adaptive search uses the unique properties of data (such as order) to optimize the search process, thereby locating the target element more efficiently. | ||
|
||
- "Binary search" uses the orderliness of data to achieve efficient searching, only suitable for arrays. | ||
We may use "elements" instead of "data" here according to the context to avoid ambiguity. This may apply to the rest of the paragraphs. |
||
- "Hash search" uses a hash table to establish a key-value mapping between search data and target data, thus implementing the query operation. | ||
we may use "key" instead of "data" here. For example, "between search key and target entry" |
||
- "Tree search" in a specific tree structure (such as a binary search tree), quickly eliminates nodes based on node value comparisons, thus locating the target element. | ||
I think the last part "which leads to better performance" is unnecessary and the two sentences before it (line "Adaptive search uses the unique properties of data (such as order) to optimize the search process, thereby locating the target element more efficiently." "'Binary search' uses the orderliness of data to achieve efficient searching, only suitable for arrays." "The advantage of these algorithms is high efficiency" ...... You can see that the explanation focuses on how different search methods locate the target element, rather than on their performance. Therefore, performance is not the focus, and the other two sentences do not mention it either. |
||
|
||
The advantage of these algorithms is high efficiency, **with time complexities reaching $O(\log n)$ or even $O(1)$**. | ||
|
||
However, **using these algorithms often requires data preprocessing**. For example, binary search requires sorting the array in advance, and hash search and tree search both require the help of additional data structures, maintaining these structures also requires extra time and space overhead. | ||
However, **using these algorithms often requires data preprocessing**. For example, binary search requires sorting the array in advance, and hash search and tree search both require the help of additional data structures. Maintaining these structures also requires more overhead in terms of time and space. | ||
both hash search and tree search require ...
Both are correct. |
||
|
||
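The preprocessing trade-off described above can be made concrete with a short sketch. It assumes Python's standard `bisect` module for the binary search and a plain `dict` as the hash table; the helper names `binary_search` and `build_index` are hypothetical and chosen only for this example.

```python
from bisect import bisect_left
from typing import Dict, List, Sequence


def binary_search(sorted_nums: Sequence[int], target: int) -> int:
    """Return the index of target in an already sorted sequence, or -1."""
    i = bisect_left(sorted_nums, target)  # O(log n) on a sorted sequence
    if i < len(sorted_nums) and sorted_nums[i] == target:
        return i
    return -1


def build_index(nums: List[int]) -> Dict[int, int]:
    """Build a value -> position hash table in O(n) time and O(n) extra space.

    If a value occurs more than once, the last position wins.
    """
    return {value: i for i, value in enumerate(nums)}


nums = [7, 3, 9, 1, 5]
sorted_nums = sorted(nums)   # preprocessing for binary search: O(n log n)
index_of = build_index(nums)  # preprocessing for hash search: O(n)
```

After the one-time preprocessing, `binary_search(sorted_nums, 5)` runs in logarithmic time and `index_of.get(5)` in constant time on average. Both structures have to be kept up to date whenever `nums` changes, which is the maintenance overhead the paragraph refers to.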
!!! tip | ||
|
||
Adaptive search algorithms are often referred to as search algorithms, **mainly used for quickly retrieving target elements in specific data structures**. | ||
referred to as "lookup algorithm", otherwise, it is confusing. |
||
|
||
## Choosing a search method | ||
|
||
Given a set of data of size $n$, we can use linear search, binary search, tree search, hash search, and other methods to search for the target element from it. The working principles of these methods are shown in the figure below. | ||
Given a set of data of size $n$, we can use a linear search, binary search, tree search, hash search, or other methods to retrieve the target element. The working principles of these methods are shown in the figure below. | ||
|
||
![Various search strategies](searching_algorithm_revisited.assets/searching_algorithms.png) | ||
|
||
The operation efficiency and characteristics of the aforementioned methods are shown in the following table. | ||
The characteristics and operational efficiency of the aforementioned methods are shown in the following table. | ||
|
||
<p align="center"> Table <id> Comparison of search algorithm efficiency </p> | ||
|
||
|
@@ -55,23 +55,23 @@ The operation efficiency and characteristics of the aforementioned methods are s | |
| Data preprocessing | / | Sorting $O(n \log n)$ | Building tree $O(n \log n)$ | Building hash table $O(n)$ | | ||
| Data orderliness | Unordered | Ordered | Ordered | Unordered | | ||
|
||
The choice of search algorithm also depends on the volume of data, search performance requirements, data query and update frequency, etc. | ||
The choice of search algorithm also depends on the volume of data, search performance requirements, frequency of data queries and updates, etc. | ||
|
||
**Linear search** | ||
|
||
- Good versatility, no need for any data preprocessing operations. If we only need to query the data once, then the time for data preprocessing in the other three methods would be longer than the time for linear search. | ||
- Good versatility, no need for any data preprocessing operations. If we only need to query the data once, then the time for data preprocessing in the other three methods would be longer than the time for a linear search. | ||
- Suitable for small volumes of data, where time complexity has a smaller impact on efficiency. | ||
- Suitable for scenarios with high data update frequency, because this method does not require any additional maintenance of the data. | ||
- Suitable for scenarios with very frequent data updates, because this method does not require any additional maintenance of the data. | ||
|
||
**Binary search** | ||
|
||
- Suitable for large data volumes, with stable efficiency performance, the worst time complexity being $O(\log n)$. | ||
- The data volume cannot be too large, because storing arrays requires contiguous memory space. | ||
- Not suitable for scenarios with frequent additions and deletions, because maintaining an ordered array incurs high overhead. | ||
- Suitable for larger data volumes, with stable performance and a worst-case time complexity of $O(\log n)$. | ||
- However, the data volume cannot be too large, because storing arrays requires contiguous memory space. | ||
- Not suitable for scenarios with frequent additions and deletions, because maintaining an ordered array incurs a lot of overhead. | ||
|
||
The first bullet point ('Suitable for large data volumes') seems to directly contradict the second bullet point ('The data volume cannot be too large') so it helps to clarify by adding 'However'. (I imagine taking notes and seeing 'Suitable for large data volumes' so I write that down -> 'Good for large data'. Then I see 'The data volume cannot be too large' so I write that down -> 'Not good for large data'. With 'However' we clarify that there is a balance we need to be aware of.) |
||
**Hash search** | ||
|
||
- Suitable for scenarios with high query performance requirements, with an average time complexity of $O(1)$. | ||
- Suitable for scenarios where fast query performance is essential, with an average time complexity of $O(1)$. | ||
- Not suitable for scenarios needing ordered data or range searches, because hash tables cannot maintain data orderliness. | ||
- High dependency on hash functions and hash collision handling strategies, with significant performance degradation risks. | ||
- Not suitable for overly large data volumes, because hash tables need extra space to minimize collisions and provide good query performance. | ||
|
@@ -80,5 +80,5 @@ The choice of search algorithm also depends on the volume of data, search perfor | |
|
||
- Suitable for massive data, because tree nodes are stored scattered in memory. | ||
- Suitable for maintaining ordered data or range searches. | ||
- In the continuous addition and deletion of nodes, the binary search tree may become skewed, degrading the time complexity to $O(n)$. | ||
- With the continuous addition and deletion of nodes, the binary search tree may become skewed, degrading the time complexity to $O(n)$. | ||
- If using AVL trees or red-black trees, operations can run stably at $O(\log n)$ efficiency, but the operation to maintain tree balance adds extra overhead. |
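
The skew mentioned in the tree-search bullets can be reproduced with a small experiment. This is a sketch under stated assumptions: a plain, unbalanced binary search tree with hypothetical helpers `insert`, `build`, and `height`, none of which come from the PR.

```python
from typing import Iterable, Optional


class TreeNode:
    def __init__(self, val: int):
        self.val = val
        self.left: Optional["TreeNode"] = None
        self.right: Optional["TreeNode"] = None


def insert(root: Optional[TreeNode], val: int) -> TreeNode:
    """Insert val into an unbalanced BST (iteratively); duplicates are ignored."""
    if root is None:
        return TreeNode(val)
    cur = root
    while True:
        if val < cur.val:
            if cur.left is None:
                cur.left = TreeNode(val)
                return root
            cur = cur.left
        elif val > cur.val:
            if cur.right is None:
                cur.right = TreeNode(val)
                return root
            cur = cur.right
        else:
            return root


def build(keys: Iterable[int]) -> Optional[TreeNode]:
    """Insert the keys one by one, in the given order."""
    root = None
    for key in keys:
        root = insert(root, key)
    return root


def height(root: Optional[TreeNode]) -> int:
    """Count the levels of the tree with a level-order sweep."""
    levels = 0
    level = [root] if root is not None else []
    while level:
        levels += 1
        level = [c for n in level for c in (n.left, n.right) if c is not None]
    return levels
```

Inserting the keys 1 to 7 in ascending order yields a chain of height 7, so a lookup degrades to a linear scan, while inserting the same keys in the order 4, 2, 6, 1, 3, 5, 7 yields height 3. Self-balancing trees such as AVL or red-black trees keep the height logarithmic regardless of insertion order, at the cost of rebalancing work on updates.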