Performances and Multithreading #174

emabiz · 2023-04-11T15:20:28Z

Hi,
first of all thank you for your job.

I tried the library parsing nlohmannjson objects.
Performances are good parsing a lot of objects using one pre-created validator.
What is the expected behavior using the same validator by multiple threads that work concurrently?
Must be the access to validator locked?

Thank you

tristanpenman · 2023-04-11T22:23:09Z

Hi @emabiz, glad to hear you're finding the library useful.

I haven't done any testing with regard to concurrency. The Validator is more-or-less stateless. While it is validating a particular document, pretty much everything relevant to that document lives on the stack, or in short-lived heap allocations. However, there are two risks that come to mind:

The Validator does maintain a regex cache, which is not protected by any locking. While this may not cause issues in practice, this cannot be guaranteed.
Individual Parsers may behave unexpectedly if you were to validate the same document concurrently. i.e. Two threads using the same Validator, and same Parser Adapter.

In the case of 1), I believe that would be ease enough to address using std::mutex.
In the case of 2), that's already a bit of a degenerate case, so is probably not worth worrying about.

Hope that helps!

emabiz · 2023-04-12T07:40:59Z

Hi,
for 1. yes my idea is to protect the validator with a mutex. Can you confirm that there is no cache sharing among validators, then it is possible to create different validators and using them concurrently without protecting them with a global mutex?

Not clear the case 2.
My idea is to use the validator to check json coming from the network. Then each document is a different object.
I'd like to create valijson::Schema ,valijson::SchemaParser and valijson::Validator once and then use them for all received objects. Each object can be validated by a different thread.
Something like this:

MMValidator::MMValidator():
	_validator(valijson::Validator::kStrongTypes)
{
	json schemaDocument = json::object({
		{"$schema","https://json-schema.org/draft/2020-12/schema"},
		{"type" , "object" },
		...
		});

	// Parse the json schema into an internal schema format
	
	valijson::adapters::NlohmannJsonAdapter schemaDocumentAdapter(schemaDocument);
	try {
		_parser.populateSchema(schemaDocumentAdapter, _schema);
	}
	catch (std::exception& e) {
		wm::LogManager::error(_log_target, fmt::format("Failed to parse schema: {}", e.what()));		
	}
	
}

bool MMValidator::validate(json targetDocument)
{
	bool valid = false;
	valijson::ValidationResults results;
	valijson::adapters::NlohmannJsonAdapter targetDocumentAdapter(targetDocument);
	{
		std::lock_guard<std::mutex> lock(_mutex);		
		valid = _validator.validate(_schema, targetDocumentAdapter, &results);
	}
	
	if (!valid) {
		valijson::ValidationResults::Error error;
		while (results.popError(error)) {
			std::string context;
			std::vector<std::string>::iterator itr = error.context.begin();
			for (; itr != error.context.end(); itr++) {
				context += *itr;
			}
			wm::LogManager::error(_log_target, fmt::format("invalid mm {}, context={}, error={}", targetDocument.dump(), context, error.description));
		}
	}
	return valid;
}

Also not sure if the mutex can be released immediately after the validate, or if it must be locked also during results error context manipulation.

Thank you

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performances and Multithreading #174

Performances and Multithreading #174

emabiz commented Apr 11, 2023

tristanpenman commented Apr 11, 2023

emabiz commented Apr 12, 2023 •

edited

Loading

Performances and Multithreading #174

Performances and Multithreading #174

Comments

emabiz commented Apr 11, 2023

tristanpenman commented Apr 11, 2023

emabiz commented Apr 12, 2023 • edited Loading

emabiz commented Apr 12, 2023 •

edited

Loading