[ENG-10028] SHARE is not consistently indexing OSF content #11671

Open
mkovalua wants to merge 1 commit into CenterForOpenScience:feature/pbs-26-6 from mkovalua:fix/ENG-10028-pbs-26-6

@mkovalua
Contributor

Ticket

Purpose

Content on the OSF is not being consistently indexed in SHARE. Newly created content does not seem to be indexed automatically in every case: some content is indexed and other content is not, and there does not seem to be any way to discern which content has been indexed. Some content also cannot be re-indexed in admin. This is causing significant issues for OSF users, because their content is not discoverable on the OSF.

Changes

Take the code from PR #11631 to avoid having to resolve so many merge conflicts against the new pbs-26-6 target branch.

Side Effects

QE Notes

CE Notes

Documentation

Collaborator

@adlius left a comment

One comment and some questions.

recatalog(queryset, start_id, chunk_count, chunk_size)


def get_not_indexed_guids_for_resource_with_no_indexed_guid(resource_type: str, first_guid: bool = True):

The naming of the argument first_guid could be a bit more descriptive. How about something like only_oldest_guid?

def mark_indexing_failed(self):
    self.has_been_indexed = False
    from addons.osfstorage.models import OsfStorageFile
    if isinstance(self, OsfStorageFile):

Why do we need special casing for OsfStorageFile?

Comment on lines +2570 to +2571
has_been_indexed = models.BooleanField(default=None, null=True, blank=True, db_index=True)
date_last_indexed = models.DateTimeField(null=True, blank=True)

Does this mean these two fields would only be populated for objects that are indexed after this PR is released? What happens to the objects that were indexed before this PR is merged/released?
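
To make the question concrete: with default=None and null=True, has_been_indexed has three possible values, and rows that already exist when the column is added would hold NULL unless a backfill is run. The sketch below is plain Python rather than Django, and only illustrates that distinction. The names IndexState, classify, and needs_reindex are hypothetical and do not come from the PR, and treating NULL the same as a failure is an assumed policy, not something the diff confirms.

```python
from enum import Enum
from typing import Optional


class IndexState(Enum):
    """Hypothetical labels for the three values of has_been_indexed."""
    UNKNOWN = "unknown"   # NULL: row predates the field, or was never attempted
    FAILED = "failed"     # False: an indexing attempt was recorded as failed
    INDEXED = "indexed"   # True: an indexing attempt was recorded as successful


def classify(has_been_indexed: Optional[bool]) -> IndexState:
    """Map the nullable boolean onto an explicit state."""
    if has_been_indexed is None:
        return IndexState.UNKNOWN
    return IndexState.INDEXED if has_been_indexed else IndexState.FAILED


def needs_reindex(has_been_indexed: Optional[bool]) -> bool:
    """Assumed policy: anything not known to be indexed is a reindex candidate."""
    return classify(has_been_indexed) is not IndexState.INDEXED


if __name__ == "__main__":
    for value in (None, False, True):
        print(value, classify(value).value, needs_reindex(value))
```

Under that assumed policy, objects indexed before the release would be picked up as reindex candidates, because their NULL value cannot be told apart from "never indexed".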
