The engine should not attempt to force synchronization between relations and corresponding foreign keys #19788

Open
@PowerGamer1

Since PHP 8.2 has deprecated dynamic properties, I had to switch to storing relation data in ActiveRecord models using the ActiveRecord::populateRelation() method. This surfaced surprising problems in my code, the root cause of which was traced to a feature introduced in #13618. While attempting a fix in #19785 I studied and tested the implementation of #13618 extensively; my findings and conclusions are below.

The concept.

Yii2 lets you define foreign key dependencies between database tables and conveniently fetch data from related tables using lazy or eager loading.

Before #13618, once a relation was loaded the engine never attempted to synchronize the foreign key values in the ActiveRecord model with the data of the corresponding relation, EXCEPT in one isolated case that provided rudimentary synchronization: calling ActiveRecord::refresh(). That call simply unset all loaded relations (if used again, the relations would be lazy loaded from the DB, so the user would get updated data if they had changed). This mechanic already had negative consequences in some cases, for example:

$transaction = app()->getDb()->beginTransaction();
$project = Project::find()->where(['id' => 123])->with('manager')->one();
$project->name = 'New name';
$project->updated_at = new \yii\db\Expression('CURRENT_TIMESTAMP');
$project->update(false, ['name', 'updated_at']);
$project->refresh(); // Fetch the value of $project->updated_at generated by DB.
$transaction->commit();
// Now let's notify $project->manager that the project name has changed.
var_dump($project->isRelationPopulated('manager')); // false - the relation is unset ???
// Why is that - relation data didn't change! The engine FORCES me to run ANOTHER query into DB to fetch the SAME data AGAIN!
// And this time OUTSIDE the carefully placed transaction!
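
Until the engine behaviour changes, user code can work around this by snapshotting the populated relations before refresh() and putting them back afterwards. The following is only a sketch: `refreshKeepingRelations()` is a hypothetical helper, not part of Yii, and `StubRecord` is a stand-in that merely imitates the refresh(), getRelatedRecords() and populateRelation() methods of ActiveRecord so the example runs without the framework.

```php
<?php
// Hypothetical helper (not part of Yii). It relies only on three methods
// that ActiveRecord provides: getRelatedRecords(), refresh(), populateRelation().
function refreshKeepingRelations(object $model): bool
{
    $related = $model->getRelatedRecords(); // snapshot of populated relations
    $refreshed = $model->refresh();         // reloads attributes, drops relations
    foreach ($related as $name => $records) {
        $model->populateRelation($name, $records); // restore the snapshot
    }
    return $refreshed;
}

// Stand-in that only imitates the methods used above (assumption for this sketch).
final class StubRecord
{
    private array $related = [];

    public function populateRelation(string $name, $records): void
    {
        $this->related[$name] = $records;
    }

    public function getRelatedRecords(): array
    {
        return $this->related;
    }

    public function isRelationPopulated(string $name): bool
    {
        return array_key_exists($name, $this->related);
    }

    public function refresh(): bool
    {
        $this->related = []; // mimics refresh() unsetting every relation
        return true;
    }
}

$project = new StubRecord();
$project->populateRelation('manager', (object)['id' => 5]);
refreshKeepingRelations($project);
var_dump($project->isRelationPopulated('manager')); // bool(true)
```

The restored relations may of course be stale if the related rows really did change; whether that matters is left to the caller, which is the point of this issue.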

Later, the functionality of #13618 (referred to as "magic" below) attempted to introduce proper automatic synchronization between the foreign key values of a model and corresponding relations in all other cases.

$project = Project::findOne(123);
$project->manager; // Load and use relation.
$project->manager_id = $newManagerId; // Change the value of ID to a different one.

// BEFORE the "magic":
var_dump($project->manager->id == $project->manager_id); // false - relation still has its originally loaded data.

// AFTER the "magic":
var_dump($project->manager->id == $project->manager_id); // true - relation has new data - the "magic" works!
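
To make the rest of the discussion easier to follow, here is a rough sketch of the kind of mechanism involved, reconstructed from the behaviour shown in the examples in this issue. It is a simplified stand-in, NOT Yii's actual code: `SketchRecord`, `$links`, `$loader` and `$dependencies` are hypothetical names. The idea is that a lazy load records which foreign key attribute the relation depends on, and the attribute setter resets the relation when the new value is not strictly identical to the old one.

```php
<?php
// Simplified stand-in for the mechanism under discussion; all names are hypothetical.
final class SketchRecord
{
    private array $attributes;
    private array $links;              // relation name => foreign key attribute
    private array $related = [];       // relation name => loaded record
    private array $dependencies = [];  // attribute name => [relation name => true]
    private $loader;                   // callable(string $relation, mixed $key): mixed

    public function __construct(array $attributes, array $links, callable $loader)
    {
        $this->attributes = $attributes;
        $this->links = $links;
        $this->loader = $loader;
    }

    public function __get(string $name)
    {
        if (array_key_exists($name, $this->attributes)) {
            return $this->attributes[$name];
        }
        if (isset($this->links[$name])) {
            if (!array_key_exists($name, $this->related)) {
                $fk = $this->links[$name];
                $this->related[$name] = ($this->loader)($name, $this->attributes[$fk]);
                $this->dependencies[$fk][$name] = true; // bookkeeping on lazy load only
            }
            return $this->related[$name];
        }
        throw new InvalidArgumentException("Unknown property: $name");
    }

    public function __set(string $name, $value): void
    {
        // Strict comparison: 5 and "5" count as a change.
        if (!empty($this->dependencies[$name]) && $this->attributes[$name] !== $value) {
            foreach (array_keys($this->dependencies[$name]) as $relation) {
                unset($this->related[$relation]);
            }
            unset($this->dependencies[$name]);
        }
        $this->attributes[$name] = $value;
    }

    public function populateRelation(string $name, $record): void
    {
        $this->related[$name] = $record; // no bookkeeping, so no later reset
    }

    public function isRelationPopulated(string $name): bool
    {
        return array_key_exists($name, $this->related);
    }
}

$loader = fn(string $relation, $key) => (object)['id' => $key];
$links = ['manager' => 'manager_id'];

$a = new SketchRecord(['manager_id' => 5], $links, $loader);
$a->manager;          // lazy load registers the dependency
$a->manager_id = 7;
var_dump($a->isRelationPopulated('manager')); // bool(false): relation was reset

$b = new SketchRecord(['manager_id' => 5], $links, $loader);
$b->populateRelation('manager', (object)['id' => 5]);
$b->manager_id = 7;
var_dump($b->isRelationPopulated('manager')); // bool(true): stale relation kept
```

In this sketch every attribute assignment pays for the dependency check, and every lazily loaded relation adds an entry to the bookkeeping array, which is where the costs discussed later come from.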

The analysis.

Unfortunately, the implementation of the "magic" has failed miserably at its goal, and the few cases where it actually works come at a significant performance and memory cost to ActiveRecord. The following examples illustrate cases where the "magic" currently fails to work correctly, has negative consequences, or cannot be implemented properly at all.

Currently, the "magic" simply does not work in many cases where it should:

Example 1. Relations loaded through with() aren't "magically" synced.

$project = Project::find()->where(['id' => 123])->with(['manager'])->one();
$project->manager; // Use relation.
$project->manager_id = $newManagerId; // Change the value of ID to a different one.
var_dump($project->manager->id == $project->manager_id); // false - surprise, surprise - the "magic" no longer works.

Example 2. Relations set manually through populateRelation() aren't "magically" synced.

$project = Project::findOne(123);
$manager = Manager::findOne(456);
$project->manager_id = $manager->id;
$project->populateRelation('manager', $manager); // Set and use relation.
$project->manager_id = $newManagerId; // Change the value of ID to a different one.
var_dump($project->manager->id == $project->manager_id); // false - surprise, surprise - the "magic" no longer works.

or "works" where it shouldn't (by unsetting synchronized relations):

Example 3. If the foreign key value remains the same but changes its type from int to string (for ex., submitted form data comes as string) the "magic" considers relation to be de-synced (leading to extra query into DB to fetch the same data).

$project = Project::findOne(123); // manager_id value comes from DB as int.
$project->manager; // Load and use relation.
$project->manager_id = (string)$project->manager_id; // Convert to string with the same value (happens, for ex., when loading submitted form data).
var_dump($project->isRelationPopulated('manager')); // false - the relation is unset. If relation is used again the same data will have to be fetched from DB with extra query.
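
As long as the comparison is type-sensitive, one way for user code to sidestep this particular case is to normalize the submitted value to the type the DB returns before assigning it. A minimal sketch, assuming the key is an integer column; `normalizeIntKey()` is a hypothetical helper, not part of Yii:

```php
<?php
// Hypothetical helper (not part of Yii): converts a submitted key to int or null
// so that assigning an unchanged value does not look like a change.
function normalizeIntKey($value): ?int
{
    if ($value === null || $value === '') {
        return null; // treat an empty form field as "no key"
    }
    if (is_int($value)) {
        return $value;
    }
    if (is_string($value) && preg_match('/^-?\d+$/', $value) === 1) {
        return (int)$value;
    }
    throw new InvalidArgumentException('Not an integer key: ' . var_export($value, true));
}

var_dump(normalizeIntKey('5') === 5); // bool(true)
var_dump(normalizeIntKey(''));        // NULL
```

With such a helper the assignment in the example would read `$project->manager_id = normalizeIntKey($submittedValue);`, where `$submittedValue` stands for whatever the form delivered.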

Example 4. On ActiveRecord::refresh() all relations are unset regardless of synchronization state (see #19785).

or has other issues due to poor implementation quality:

Example 5. In some cases the housekeeping data of the "magic" lingers even when relation is no longer loaded.

// Let's say we have a primary model and a related model with a composite FK:
//  hasOne(RelatedModel::class, ['related_model.a' => 'primary_model.a', 'related_model.b' => 'primary_model.b']).
$primaryModel = PrimaryModel::findOne(123);

$primaryModel->relatedModel; // Incurring a "housekeeping" memory cost here (in _relationsDependencies).
unset($primaryModel->relatedModel); // OK - the relation is gone and housekeeping data is gone too.

$primaryModel->relatedModel;
$primaryModel->b = null; // NOT OK - the relation is gone but a part of housekeeping data remained (in _relationsDependencies).

Upon deeper consideration it becomes obvious that the concept of "forced synchronization" is flawed and unachievable, because some desynchronized states are perfectly fine and unavoidable.

Example 6. The enforcement of the rule (unsetting relation when foreign key value changes) upon which the "magic" implementation is based interferes with user intentions.

Setting the relation manually (which is often very useful) in the general case involves two separate steps:

  • setting the value of the relation itself using populateRelation();
  • setting the value of the corresponding foreign key attributes.

These steps can be performed in any order and each of them may be optional (why perform both steps if only the relation OR the foreign key attribute is used later in the code?).

// DISCLAIMER: This example assumes a potential bug-free implementation of "magic" from scratch was attempted!
$project = Project::findOne(123);
$project->manager; // Load and use relation.
// Now let's manually change the manager to a different one ...
$newManager = Manager::findOne(456);
$project->populateRelation('manager', $newManager); // ... by manually setting relation first ...
$project->manager_id = $newManager->id; // ... followed by manually setting the value of foreign key.
// So what should happen here? The value of manager_id has changed so "magic" should go and frigging nuke perfectly valid data in $project->manager the user just assigned a moment ago ??? Come on ...  
var_dump($project->isRelationPopulated('manager')); // false - the relation is unset - what the hell ???

Example 7. User prepares new data of primary model and its relations for insertion into the database.

Another case where a "desynchronized" state is inherent and perfectly fine.

// DISCLAIMER: This example assumes a potential bug-free implementation of "magic" from scratch was attempted!
$project = new Project();
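$jobs = [];
for ($i = 0; $i < 3; $i++) {
    $jobs[] = new Job();
}
// Build the array first: $project->jobs[] = ... would modify a copy, since relations are read through __get().
$project->populateRelation('jobs', $jobs);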
$project->load(Yii::$app->getRequest()->post());
Job::loadMultiple($project->jobs, Yii::$app->getRequest()->post());
if($project->validate() && Model::validateMultiple($project->jobs))
{
    // Up until now the relations can be considered synchronized, i.e. $project->jobs[0]->project_id == null and $project->id == null.
    $project->insert(false);
    // Notice how related jobs are inherently desynchronized after the call to `insert()` because $project just now acquired its primary key value.
    // I.e. $project->jobs[0]->project_id is null but $project->id is no longer null.
    
    $project->refresh(); // Let's fetch the CURRENT_TIMESTAMP values first and deal with inserting jobs afterwards.
    
    // So what should happen here? The call to refresh() (or maybe the call to insert()?) is supposed to unset all currently desynchronized relations, yes?
    // So let's go and frigging nuke perfectly valid data in $project->jobs the user just carefully prepared a moment ago ??? Come on ...
    var_dump($project->isRelationPopulated('jobs')); // false - the relation is unset - what the hell ???
    
    // Also, notice how THIS time, the nuked relation data cannot be silently lazy-loaded from DB again, like in other examples, because it never even made it into the database in the first place! 
}

Example 8. The synchronization the "magic" currently tries to achieve is very one-sided.

$project = Project::findOne(123);
$project->manager; // Load and use relation.

$project->manager_id = $newManagerId; // Change the value of ID to a different one.
var_dump($project->manager->id == $project->manager_id); // true - synchronization is "magically" enforced!

$project->manager->id = $anotherManagerId;  // The other side.
var_dump($project->manager->id == $project->manager_id); // false - ??? Desynchronization here is suddenly OK ??? 

Example 9. How should synchronization look when used together with other existing features of ActiveRecord - for ex. markAttributeDirty()?

$project->markAttributeDirty('manager_id');
// Now what? The attribute value didn't really change, should we unset the relation or not?

Example 10. How should synchronization look when used together with other existing features of ActiveRecord - for ex. fetching a PARTIAL set of attributes from DB (when a COMPOSITE foreign key is fetched only partially)?

I am sure there are many more examples like these, and this is just the tip of the iceberg.

The price.

Another side of the story that the examples above do not show is the PRICE paid to achieve the "magic" (obvious from implementation code):

  1. The "magic" increases memory usage of ActiveRecord instances that have relations loaded (by adding yet another array, _relationsDependencies). Anyone who has tried to handle a few thousand ARs would NOT welcome another increase in the memory footprint of AR that could easily have been avoided.
  2. The "magic" comes at additional performance cost present even when NO relations are used at all! For ex., when doing something as SIMPLE as assigning a value to ActiveRecord attribute (like $project->name = 'New name';).
  3. The "magic" further increases complexity of already complex, hard to understand and support machinery of ActiveRecord and related classes.

Summary and solution.

So, let's see what we currently have:

  1. The "magic" is not even documented anywhere other than a mention in CHANGELOG.md and UPGRADE.md.
  2. The "magic" is not working at all as it was supposed to in a lot of cases (see examples above). This, along with absent documentation, leads to a situation where it is completely unclear what engine behaviour Yii2 users are supposed to rely upon. The "relations are automatically synced" guarantee the "magic" was supposed to give does not hold.
  3. In cases where the "magic" does work it can lead to the same data being fetched from DB again, or, at worst, to losing the data before it was inserted into DB (see examples above).
  4. The "magic" comes at very undesirable performance and memory costs (see "The price" above).
  5. There are situations where the "magic" becomes unachievable or interferes (in a bad way) with other perfectly valid uses of ActiveRecord (see the last four examples above).

Now, let's take a look at the situation before enforced synchronization was attempted. Was it really so bad?

# THE SOLUTION. Making "magic" work without "magic".
$project = Project::findOne(123);
$project->manager; // Load and use relation.
$project->manager_id = $newManagerId; // Change the value of ID to a different one.
unset($project->manager); // <-- The TRUE "magic".
var_dump($project->manager->id == $project->manager_id); // true - the whole "magic" is achieved in single line with ZERO performance/memory costs !!!

The WHOLE "magic" above could have been easily achieved by something as simple as a single line of unset() in user code, placed where the user expects ActiveRecord's foreign key attributes to change !!!

The conclusion.

The only sensible solution is to give up on enforcing automatic synchronization between a relation and its corresponding foreign key attributes. The only guarantee the engine should provide is this: once set, ActiveRecord relations are never automatically or silently changed or unset by the engine. It is then the user's responsibility to keep their data synchronized to whatever extent they deem necessary, and to explicitly call unset() on relations that are expected to change and need to be reloaded.
