-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Port-StringTools #56
Port-StringTools #56
Conversation
include/StringTools.h
Outdated
/// @brief Checks if a character is a space. | ||
/// @param ToCheck The character to be checked. | ||
/// @return Returns true if the character is a space, false otherwise. | ||
Boole MEZZ_LIB IsSpace(const Char8 ToCheck); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to have signed and unsigned versions of this? I get that Cha8 is our type and we will set it appropriately, but someone using and external type could benefit.
EDIT - Same question applies to any function accepting a char.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be an enormous amount of extra boilerplate and I'm not sure what the use case for it would be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind. I suggested it for the cornercase where the game dev cared about signedness of their chars.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made all those functions templates with a type trait for sanity.
include/StringTools.h
Outdated
Boole MEZZ_LIB IsTab(const Char8 ToCheck); | ||
/// @brief Checks if a character is a newline. | ||
/// @param ToCheck The character to be checked. | ||
/// @return Returns true if the character is a newline, false otherwise. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this do with a \r? It is part of a newline on some platforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the source, the function checks for both \r and \n.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was implying that should be in the docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
include/StringTools.h
Outdated
/// @brief Gets a copy of a String with the first letter of each word upper cased and other letters lower cased. | ||
/// @param Source The String that will be the basis for the modified String. | ||
/// @return Returns a copy of the String that has been modified to be in Camel Case. | ||
String MEZZ_LIB CamelCaseCopy(String Source); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be PascalCase or DragonCase. See #57
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed.
include/StringTools.h
Outdated
/// @param Source The String that will be the basis for the modified String. | ||
/// @param Left Whether or not to trim the left side of the String. | ||
/// @param Right Whether or not to trim the right side of the String. | ||
String MEZZ_LIB TrimCopy(String Source, const Boole Left = true, const Boole Right = true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am used to seeing this as 3 different functions instead of parameters on one. LeftTrim, RightTrim and Trim.
Also a version of this could operate on iterator ranges like std::erase and return an iterator to the new end of the range or the end of the erased range,.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have modified it to have 3 versions. The Non-copy version operates like std::erase as you suggested.
include/StringTools.h
Outdated
void MEZZ_LIB RemoveDuplicateWhitespaces(String& Source); | ||
/// @brief Gets a copy of a string with all duplicate whitespaces removed. | ||
/// @param Source The String that will be the basis for the modified String. | ||
String MEZZ_LIB RemoveDuplicateWhitespacesCopy(String Source); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again there could be an iterator range version like std::erase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modified.
include/StringTools.h
Outdated
/// @param Delims The characters to look for and use as split points in the source String. | ||
/// @param MaxSplits The maximum number of splits to perform on this String. Value of zero means unlimited splits. | ||
/// @return Returns a vector containing all the substrings generated from the source String. | ||
StringVector MEZZ_LIB Split(const String& Source, const String& Delims = " \t\n", const Whole MaxSplits = 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a matching: String Join(StringVector ToJoin, const String& Delim=" ");
Js, Ruby, Python, and PHP all have some function to match their split/explode functions that join groups into a single string following a simple pattern of ToJoin[0] + Delim +ToJoin[1] + Delim ... ToJoin[size-1];
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do now!
/////////////////////////////////////////////////////////////////////////////// | ||
// Convert-From-String Functions | ||
|
||
/// @brief Converts a String to any. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this needs the phrase "Lexicographical Conversion"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you highlighted multiple parts here....do you mean the section header or the method brief doc?
StringStream Converter; | ||
Converter << ToConvert; | ||
return Converter.str(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should This delegate to std::to_string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. std::to_string is intended for numeric conversions. Whereas this will work with any type that has a streaming operator.
src/StringTools.cpp
Outdated
const String Delims = " \t\r"; | ||
if( Left ) { | ||
size_t Forward = 0; | ||
while( Forward < Source.size() ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem like a reasonable implementation. Why not use std::find_first_of or std::find_if_not to find the range for std::erase.
Also, this is just conjecture, but I cannot imagine that the branching helps with optimization, if this were just a call RightTrim and LeftTrim then the game dev could decide which of the 3 to call and just never branch.
EDIT - Strings have a find_first_not_of on them. This could make ltrim super easy to implement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation has been updated as per a previous review comment.
src/StringTools.cpp
Outdated
CurrIndex = Source.find_first_of(Delims,CurrIndex,2) ) | ||
{ | ||
size_t EndIndex = CurrIndex; | ||
Char8 CurrWhite = Source[EndIndex]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this work on a string with several alternating tabs and spaces in between words?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, nor was it meant to when it was written. It could be updated to handle such cases...although I'm curious what the use case for that would be and which of the alternating whitespace characters should be prioritized when replacing the sequence with one character.
Real ConvertHexToColourChannel(const StringView Hex) | ||
{ | ||
if( Hex.size() != 2 ) { | ||
String ExceptionString("Hex code requires 2 characters to express a Colour channel."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CSS has color hex code triplets. It just omits the lower precision number. If a single digit is passed we could simply duplicate that digit before resuming.
return Ret; | ||
} | ||
|
||
Ret.reserve( MaxSplits ? MaxSplits + 1 : 10 ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the use case for limiting the amount of splits? I have not seen this before.
EDIT - I get this is just reserving and the capacity, but it I clicked add comment before I finished thinking about where to put this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default value is no maximum. I want to say I have used the maximum somewhere in the Mezzanine, but I can't recall where. I think this is a relic from Ogre, but unlike other Ogre relics this is doing no harm.
Boole ConvertToBool(const StringView ToConvert, const Boole Default) | ||
{ | ||
auto Checker = [ToConvert](const StringView Str) { | ||
if( Str.size() == ToConvert.size() ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be better as a call to std::equal which likely does a range length check on its own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given it's parameters, I don't see how it could:
https://en.cppreference.com/w/cpp/algorithm/equal
} | ||
|
||
template<> | ||
UInt8 ConvertFromString<UInt8>(const StringView ToConvert) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this specialized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implicit conversions turn it into an int, which produces unexpected results when interacting with streams. The specializations work around that.
test/StringToolsTests.h
Outdated
TEST_EQUAL("IsLowerHexLetter(const_Char8)-Fail",false,StringTools::IsLowerHexLetter('g')); | ||
TEST_EQUAL("IsUpperHexLetter(const_Char8)-Pass",true,StringTools::IsUpperHexLetter('F')); | ||
TEST_EQUAL("IsUpperHexLetter(const_Char8)-Fail",false,StringTools::IsUpperHexLetter('1')); | ||
TEST_EQUAL("IsHexLetter(const_Char8)-Pass",true,StringTools::IsHexLetter('B')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should add a lowercase 'b' test expecting a true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
test/StringToolsTests.h
Outdated
|
||
{//String Manipulation | ||
String ToUpperString("This is typed in a reasonable, soft-spoken tone."); | ||
String ToUpperResult("THIS IS TYPED IN A REASONABLE, SOFT-SPOKEN TONE."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/Nit Results for test should be const, the strings to be copied could be too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I got them all but I made a large number of the source strings const.
TEST_EQUAL_EPSILON("ConvertHexToColourChannel(const_StringView)-Fourth", | ||
Real(6.0 / 255.0),StringTools::ConvertHexToColourChannel(HexStr4)); | ||
|
||
Real ChannelVal1 = 0.8675f;//221.2125 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/Nit - The fractions above are more readable in my opinion.
StringVector ISplitResult = StringTools::Split(SplitSourceString,"i",5);// Should normally split 7 times. | ||
TEST_EQUAL("Split(const_String&,const_String&,const_Whole)-DefaultCount", | ||
size_t(11),DefaultSplitResult.size()); | ||
TEST_EQUAL("Split(const_String&,const_String&,const_Whole)-DefaultElement1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mixes two different tests. It is a minor nit, but 2 sets of tests, one testing the split char and another testing split limit would better follow best practices. That said, this is simple enough its not a real issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two sets of tests for Split as is. One uses the default value for splits and the other doesn't. Although the one that doesn't (the ISplit tests) also happens to use a different delimiter. I guess it's still not ideal, but we do have one default value test and one non-default value test.
test/StringToolsTests.h
Outdated
String ThirdSource("Test\\Data\\Results.txt"); | ||
String FourthSource("*Star*"); | ||
|
||
String FirstPattern("Test/Data/*.txt"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pattern names that indicate what they match would improve readabilty. This could "PatternMatchesFirst" or "PatternMatchesUnixTextFile" for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed.
TEST_EQUAL("ConvertToBool(const_StringView,const_Boole)-Default", | ||
true,StringTools::ConvertToBool("Yes")); | ||
TEST_EQUAL("ConvertToBool(const_StringView,const_Boole)-Gibberish", | ||
false,StringTools::ConvertToBool("Gibberish")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a ("gibberish", true) test should be added to specifically test the defaulting behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
Replaced all Size Modifying String Manipulations that accepted a String with versions that accept an iterator range. Renamed CamelCase to Pascal Case. Added minor documentation. Changed String parameters in Split to StringViews.
Latest round of tests only failed because of CountedPtr TestWarns. |
More TestWarn "failures". |
A port of the StringTools from the monolithic repo.