-
I have started prototyping a feature that parses natural time such as "tomorrow at 3pm". Thanks to petit parser, this feature is simpler than it would have been otherwise :) Now I'm looking to add timezone/location support such as "tomorrow at 3pm california time". Are there any performance considerations if we are looking at a list of ~1000-2000 locations? I'm guessing would have something like: Are parsers optimized for this sort of matching? There'll be some sort of compact representation? We will be running this parser on every keystroke in a TextField but the text field content we parse won't get that large (probably no more than 500 characters). Forgive my ignorance on how parsers work because petit parser is my first time using a parser in production! Side note: It seems like I'll need to sort the words by length descending to ensure something like "india" doesn't prematurely match on "indiana"? |
Beta Was this translation helpful? Give feedback.
Replies: 8 comments
-
The code is not automatically optimized to match a large number of choices as described in the example. PetitParser will literally do a case-insensitive prefix match for every single location in your list. Though I don't think this is necessarily a problem and I wouldn't worry about optimizing at this point. Get it working correctly first, and then — if necessary — profile and improve the performance as needed. I see various ways of effectively optimizing:
|
Beta Was this translation helpful? Give feedback.
-
Thanks for the ideas. I might later give a combination of 2+3 a try even if just as an exercise. Would that be a good example parser subclass with similar goals? And how do
But then why is |
Beta Was this translation helpful? Give feedback.
-
Yes, the |
Beta Was this translation helpful? Give feedback.
-
Ok great! Sounds like a plan of action (if I need to optimize) would be something like:
Then in
|
Beta Was this translation helpful? Give feedback.
-
Yes, that sounds like a good plan. |
Beta Was this translation helpful? Give feedback.
-
I implemented an It's using a naive algorithm of iterating word by word. But if I want to optimize further, I could replace the import 'package:petitparser/petitparser.dart';
import 'package:collection/collection.dart' show equalsIgnoreAsciiCase;
Parser<String> anyStringIgnoringCase(Iterable<String> strings) =>
AnyStringParser(
strings,
equalsIgnoreAsciiCase,
'words expected',
);
Parser<String> anyString(Iterable<String> strings) =>
AnyStringParser(strings, (a, b) => a == b, 'words expected');
typedef StringEquals = bool Function(String a, String b);
class AnyStringParser extends Parser<String> {
AnyStringParser(
Iterable<String> words,
this.equals,
this.message,
) : assert(predicate != words, 'words must not be null'),
assert(equals != null, 'equals must not be null'),
assert(message != null, 'message must not be null'),
words = _sortedByLengthDescThenName(words);
final List<String> words;
final String message;
final StringEquals equals;
@override
Result<String> parseOn(Context context) {
final start = context.position;
final buffer = context.buffer;
final bufferLength = context.buffer.length;
for (final w in words) {
if (start + w.length > bufferLength) continue;
if (equals(buffer.substring(start, start + w.length), w)) {
final result = buffer.substring(start, start + w.length);
return context.success(result, start + result.length);
}
}
return context.failure(message);
}
@override
String toString() => '${super.toString()}[$message]';
@override
AnyStringParser copy() => AnyStringParser(words, equals, message);
@override
bool hasEqualProperties(AnyStringParser other) =>
super.hasEqualProperties(other) &&
words == other.words &&
equals == other.equals &&
message == other.message;
}
List<String> _sortedByLengthDescThenName(Iterable<String> strings) {
return strings.toList()
..sort((a, b) {
return a.length == b.length
? a.compareTo(b)
: b.length.compareTo(a.length);
});
} Feedback appreciated :) |
Beta Was this translation helpful? Give feedback.
-
Looks great! Probably using |
Beta Was this translation helpful? Give feedback.
-
Good point so we don't keep allocating strings. Looks like Comment from the code: /// If two strings differ only on the case of ASCII letters, the one with the
/// capital letter at the first difference will compare as less than the other
/// string. This tie-breaking ensures that the comparison is a total ordering
/// on strings and is compatible with equality. But I found there is an Thanks! |
Beta Was this translation helpful? Give feedback.
The code is not automatically optimized to match a large number of choices as described in the example. PetitParser will literally do a case-insensitive prefix match for every single location in your list. Though I don't think this is necessarily a problem and I wouldn't worry about optimizing at this point. Get it working correctly first, and then — if necessary — profile and improve the performance as needed. I see various ways of effectively optimizing: