Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: String2int error #58

Open
wants to merge 1 commit into
base: unstable
Choose a base branch
from

Conversation

hahahashen
Copy link
Collaborator

@hahahashen hahahashen commented Oct 16, 2024

解决issue #55
原来的String2int判断逻辑是通过错误码来进行判断,但是根据函数解释,"11 "会被认为转换成功,于是修改判断逻辑,改为是否都匹配上了,即ptr与s+slen的比较
image

修改后测试通过
image

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features
    • Introduced stricter string-to-integer conversion methods, enhancing validation for parsing.
    • Added new methods for scanning and renaming hash keys in the Redis implementation.
  • Bug Fixes
    • Improved error handling for hash operations, providing more specific error messages related to stale values and type mismatches.
  • Chores
    • General code cleanup for improved readability and maintainability.

Copy link

coderabbitai bot commented Oct 16, 2024

Walkthrough

The changes introduce new functions for stricter string-to-integer conversions in the pstd and storage namespaces. The String2intStrict function ensures that the entire string is consumed during parsing, while StrToInt64Strict provides similar functionality for 64-bit integers. Additionally, several enhancements are made to the Redis class in redis_hashes.cc, including new methods for scanning and renaming hash keys, improved error handling, and general code cleanup.

Changes

File Change Summary
src/pstd/pstd_string.h Added int String2intStrict(const char* s, size_t slen, T* val) for stricter string-to-integer conversion.
src/storage/include/storage/util.h Added int StrToInt64Strict(const char* s, size_t slen, int64_t* value) for stricter string-to-64-bit-integer conversion.
src/storage/src/redis_hashes.cc Enhanced Redis class with new methods: HScan, HScanx, HashesRename, HashesRenamenx. Updated HIncrby to use StrToInt64Strict. Improved error handling and code readability.
src/storage/src/util.cc Added int StrToInt64Strict(const char* s, size_t slen, int64_t* value) for stricter string-to-long-long conversion.

Poem

In the land of code, where strings do play,
A rabbit hops forth, with changes today.
From errors to endings, the logic now flows,
Parsing with ease, as the integer grows.
Hooray for the tweaks, let the functions align,
With each little hop, our code will shine! 🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 00ca779 and 4079e97.

📒 Files selected for processing (4)
  • src/pstd/pstd_string.h (1 hunks)
  • src/storage/include/storage/util.h (1 hunks)
  • src/storage/src/redis_hashes.cc (1 hunks)
  • src/storage/src/util.cc (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/pstd/pstd_string.h
  • src/storage/include/storage/util.h
  • src/storage/src/util.cc
🧰 Additional context used
🔇 Additional comments (1)
src/storage/src/redis_hashes.cc (1)

293-293: 🛠️ Refactor suggestion

Ensure consistent usage of StrToInt64Strict across the codebase

The function StrToInt64Strict is now used for strict integer parsing. Verify that other instances requiring strict parsing have been updated accordingly to prevent inconsistent behavior.

Run the following script to identify any remaining uses of StrToInt64:

✅ Verification successful

Consistent usage of StrToInt64Strict confirmed across the codebase.

All instances requiring strict integer parsing have been appropriately updated to use StrToInt64Strict, ensuring consistent behavior.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find occurrences of StrToInt64 in the codebase.

# Test: Search for usage of StrToInt64. Expect: Only valid uses remain.
rg 'StrToInt64\('

Length of output: 242


Script:

#!/bin/bash
# Description: Find all invocations of StrToInt64 in the codebase, excluding its definitions.

ast-grep --language cpp --pattern 'StrToInt64($_)' --type function_call

Length of output: 262


Script:

#!/bin/bash
# Description: Find all invocations of StrToInt64 in the codebase, excluding its definitions.

ast-grep --lang cpp --pattern 'StrToInt64($_, $_, $_)' --type function_call

Length of output: 271


Script:

#!/bin/bash
# Description: Find all invocations of StrToInt64 in the codebase, excluding its definitions.

ast-grep --lang cpp --pattern 'StrToInt64($_, $_, $_)'

Length of output: 56


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
src/pstd/pstd_string.h (1)

63-67: Approve changes with a minor suggestion for improvement

The modifications to the String2int function effectively address the issue mentioned in the PR objectives. The new implementation correctly ensures that the entire string is parsed, fixing the problem where strings like "11 " were incorrectly considered valid conversions.

The logic is now more accurate and aligns with the expected behavior of string-to-integer conversion.

Consider simplifying the return statement for better readability:

- if (ptr==s+slen) {
-   return 1;
- } else {
-   return 0;
- }
+ return (ptr == s + slen) ? 1 : 0;

This change maintains the same logic while making the code more concise.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between d65be5c and 26a9280.

📒 Files selected for processing (1)
  • src/pstd/pstd_string.h (1 hunks)
🧰 Additional context used
🔇 Additional comments (1)
src/pstd/pstd_string.h (1)

63-67: Verify impact on existing code

The changes to the String2int function improve its accuracy but may affect existing code that relies on the previous behavior.

To ensure a smooth transition, please run the following script to identify potential areas that might be affected by this change:

Review the output to identify any code that might need adjustment due to the changed behavior of String2int.

if (ec != std::errc()) {
return 0;
} else {
if (ptr == s + slen) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

特殊 case:
"123 "
期望:false
实际:true

Copy link
Collaborator

@luky116 luky116 Oct 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string aa = "123 "
if(aa.trim().length != aa.length) {
return false;
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

template <std::integral T>
int String2intStrict(const char* s, size_t slen, T* val) {
    auto [ptr, ec] = std::from_chars(s, s + slen, *val);
    if (ec != std::errc()) {
        return 0;
    }
    if (ptr != s + slen) {
        return 0;
    }
    return 1;
}

如果期望 "123 " 这样的字符串是非法的, 可以加一个严格模式下的 string 2 int 函数

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

好的理解了

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done
修改结果如下所示
image

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (9)
src/storage/include/storage/util.h (1)

20-20: LGTM! Consider adding a brief comment for clarity.

The addition of StrToInt64Strict is well-placed and aligns with the PR objectives. It provides a stricter alternative to StrToInt64, which should address the issue with partial string matches.

Consider adding a brief inline comment to clarify the difference between this function and StrToInt64. For example:

// Converts string to int64 with strict validation (entire string must match)
int StrToInt64Strict(const char* s, size_t slen, int64_t* value);
src/pstd/pstd_string.h (1)

70-83: Excellent addition of String2intStrict function.

The new String2intStrict function addresses the issue raised in previous comments and provides a stricter string-to-integer conversion. The implementation is correct and efficient, using std::from_chars and checking for complete string consumption.

Consider adding a brief example in the function documentation to illustrate the difference between String2int and String2intStrict:

/* Convert a string into an integral. Returns 1 if the entire string could be parsed
 * into an integer, 0 otherwise. The value will be set to
 * the parsed value when appropriate.
 * 
 * Example:
 *   String2int("123", 3, &val)     returns 1 (val = 123)
 *   String2int("123 ", 4, &val)    returns 1 (val = 123)
 *   String2intStrict("123", 3, &val)     returns 1 (val = 123)
 *   String2intStrict("123 ", 4, &val)    returns 0 (invalid)
 */

This addition would help developers quickly understand the difference between the two functions.

src/storage/src/util.cc (1)

40-43: LGTM! Consider clarifying the comment.

The new StrToInt64Strict function looks good and addresses the issue mentioned in the PR objectives. It provides a stricter conversion method, which should prevent incorrect conversions like "11 " that were problematic before.

Consider updating the comment to clarify the difference between this function and StrToInt64:

-/* Convert a string into a long long. Returns 1 if all char of string could be parsed
+/* Convert a string into a long long. Returns 1 if and only if all characters of the string could be parsed
  * into a (non-overflowing) long long, 0 otherwise. The value will be set to
- * the parsed value when appropriate. */
+ * the parsed value when successful. This function is stricter than StrToInt64. */
src/storage/src/redis_hashes.cc (6)

Line range hint 311-313: Ensure proper overflow checks when incrementing integers

The current overflow check in the condition:

if ((value >= 0 && LLONG_MAX - value < ival) || (value < 0 && LLONG_MIN - value > ival)) {
  return Status::InvalidArgument("Overflow");
}

may lead to incorrect results due to potential integer overflows during the calculation LLONG_MAX - value and LLONG_MIN - value. Consider rearranging the overflow checks to prevent this issue.

Apply this diff to correct the overflow checks:

-            if ((value >= 0 && LLONG_MAX - value < ival) || (value < 0 && LLONG_MIN - value > ival)) {
+            if ((value > 0 && ival > LLONG_MAX - value) || (value < 0 && ival < LLONG_MIN - value)) {

This adjustment ensures that the subtraction does not cause an overflow itself.


Line range hint 327-329: Return appropriate status for invalid float input

When StrToLongDouble fails to parse by as a valid floating-point number (returns -1), the function returns Status::Corruption("value is not a valid float"). It's more appropriate to use Status::InvalidArgument("value is not a valid float") to indicate invalid input from the user.

Apply this diff to correct the error status:

-      return Status::Corruption("value is not a valid float");
+      return Status::InvalidArgument("value is not a valid float");

Line range hint 345-347: Handle invalid stored float values appropriately

If the existing hash field's value cannot be parsed as a float by StrToLongDouble (returns -1), the code returns Status::Corruption("value is not a valid float"). Since this indicates invalid data rather than data corruption, consider returning Status::InvalidArgument("hash value is not a valid float") instead.

Apply this diff to adjust the error status:

-            return Status::Corruption("value is not a valid float");
+            return Status::InvalidArgument("hash value is not a valid float");

Line range hint 690-695: Replace rand() with C++11 <random> facilities for better randomness

The use of rand() for generating random indices may result in poor randomness and is not thread-safe. It's recommended to use the C++11 <random> library for improved randomness and thread safety.

Apply this diff to use std::mt19937 and std::uniform_int_distribution:

-        idxs.push_back(rand() % hlen);
+        {
+          static thread_local std::mt19937 generator(std::random_device{}());
+          std::uniform_int_distribution<uint32_t> distribution(0, hlen - 1);
+          idxs.push_back(distribution(generator));
+        }

Repeat this change in the loop below:

-        while (idxs.size() < -count) {
-          idxs.push_back(rand() % hlen);
+        while (idxs.size() < static_cast<size_t>(-count)) {
+          static thread_local std::mt19937 generator(std::random_device{}());
+          std::uniform_int_distribution<uint32_t> distribution(0, hlen - 1);
+          idxs.push_back(distribution(generator));
+        }

Line range hint 741-743: Check for division by zero in HRandField

In the calculation of distribution(0, hlen - 1), if hlen is zero, this will result in an invalid distribution range. Ensure that hlen is greater than zero before using it to prevent potential runtime errors.

Apply this diff to add a check for hlen == 0:

+    if (hlen == 0) {
+      return Status::NotFound("Hash is empty");
+    }

Line range hint 212-295: Consider code reuse between HIncrby and HIncrbyfloat

Both HIncrby and HIncrbyfloat functions have similar logic for handling increments. Refactoring common code into a shared helper function can improve maintainability and reduce code duplication.

🛑 Comments failed to post (1)
src/storage/src/redis_hashes.cc (1)

293-295: ⚠️ Potential issue

Use appropriate error status for invalid integer conversion

When StrToInt64Strict fails to convert old_value to an integer (returns 0), the function currently returns Status::Corruption("hash value is not an integer"). For invalid input errors, it's more suitable to return Status::InvalidArgument("hash value is not an integer") to indicate that the provided data is invalid rather than corrupted.

Apply this diff to correct the error status:

-            return Status::Corruption("hash value is not an integer");
+            return Status::InvalidArgument("hash value is not an integer");
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

        if (StrToInt64Strict(old_value.data(), old_value.size(), &ival) == 0) {
          return Status::InvalidArgument("hash value is not an integer");
        }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants