Skip to content

Conversation

@zzacharo
Copy link
Member

@zzacharo zzacharo commented Nov 25, 2025

closes #490
addresses inveniosoftware/rfcs#107

@zzacharo zzacharo force-pushed the comments-threading-join branch 3 times, most recently from 416f07a to 8db5551 Compare November 26, 2025 07:58
indexer_queue_name = "events"

# ResultItem configurations
links_item = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to add replies to the links? This would make the code on the frontend more consistent

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding these. Sorry to ask about this again, would it be possible to avoid showing the replies and reply links when the RequestEvent in question is a reply? Maybe using the when argument

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The link works also for children as it is always point to the parent document. I did that so you dont have to check conditionally for the links. But I am happy to discuss why do you think it is better to remove it.

@zzacharo zzacharo force-pushed the comments-threading-join branch 3 times, most recently from 1771f8f to 8d82923 Compare November 26, 2025 15:39
@zzacharo zzacharo force-pushed the comments-threading-join branch 7 times, most recently from c8b7f5c to 3e745ac Compare November 27, 2025 16:40
@zzacharo zzacharo marked this pull request as ready for review November 27, 2025 16:42
Copy link
Member

@palkerecsenyi palkerecsenyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good and it makes sense! It's definitely a lot simpler and cleaner

"parent_child": {
"type": "join",
"relations": {
"parent": "child"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also consider using less abstract names as in the OpenSearch example, maybe {"comment": "reply"}. Just to make it a bit more readable. But I think {"parent": "child"} is pretty good too so it's up to you!

Comment on lines +20 to +23
if event.parent_id is not None:
# Check if event type supports children
if hasattr(event, "type") and event.type:
if not getattr(event.type, "allow_children", False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could refactor this to use guard clauses instead for slightly better readability, just to avoid nesting the if statements. But this is also not too big of an issue

Comment on lines 110 to 114
request_type = current_request_type_registry.lookup(vars["request_type"])
vars.update({"id": obj.id, "request_id": obj.request_id})
parent_id = obj.parent_id if obj.parent_id else obj.id
vars.update({"parent_id": parent_id})
vars.update(request_type._update_link_config(**vars))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
request_type = current_request_type_registry.lookup(vars["request_type"])
vars.update({"id": obj.id, "request_id": obj.request_id})
parent_id = obj.parent_id if obj.parent_id else obj.id
vars.update({"parent_id": parent_id})
vars.update(request_type._update_link_config(**vars))
request_type = current_request_type_registry.lookup(vars["request_type"])
parent_id = obj.parent_id if obj.parent_id else obj.id
vars.update(
{"id": obj.id, "request_id": obj.request_id, "parent_id": parent_id}
)
vars.update(request_type._update_link_config(**vars))

Comment on lines +514 to +520
links_tpl=LinksTemplate(
self.config.links_replies,
context={
"parent_id": parent_id,
"request_id": str(request.id),
"args": params,
},
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could also be expressed as a call to self.links_tpl_factory

"""Log event type."""

type_id = "L"
"""Allow log events to preserve parent-child structure (e.g., deleted comments)."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this comment doesn't match the variable 🙈

Comment on lines +327 to +352
# Query structure:
# - must: filter to this request
# - must_not: exclude children (events with parent_id)
# - should: optionally add children preview via has_child + inner_hits
search = search.query(
"bool",
must=[
dsl.Q("term", request_id=str(request.id)),
],
must_not=[
dsl.Q("exists", field="parent_id"), # Exclude replies
],
should=[
dsl.Q(
"has_child",
type="child",
query=dsl.Q("match_all"),
score_mode="none",
inner_hits={
"name": "replies_preview",
"size": preview_size,
"sort": [{"created": "desc"}],
},
),
],
minimum_should_match=0, # Make should clause optional
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I get what you meant by this being less code, even though we're using a relatively "complex" feature like join types.

My only concern is actually performance, i.e. is the overhead of two queries (one for fetching the first 10 parents + one agg query to fetch the last 5 comments for each parent), that much bigger? Insert/update performance is ok to be worse IMHO, since the UI anyways can handle this in an optimistic way.

I read the tests in the CodiMD, but wasn't convinced of a clear answer, so maybe that's the only thing to discuss IRL.

@zzacharo zzacharo force-pushed the comments-threading-join branch from 3e745ac to 05d8491 Compare November 28, 2025 11:03
Comment on lines 28 to 29
if hasattr(self._record, 'parent_id') and self._record.parent_id:
arguments['routing'] = str(self._record.parent_id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice handling of this here, I thought it would have caused more problems in general.

One more consideration is to take into account bulk-indexing, which would now need to add the routing=<parent_id> param for all child comments/replies...

Would it make sense to bake this whole logic instead into the indexer? I think we have some functions there for pre/post-processing indexing args.

@zzacharo zzacharo force-pushed the comments-threading-join branch 3 times, most recently from ac48770 to 5c57b81 Compare November 28, 2025 13:26
@zzacharo zzacharo force-pushed the comments-threading-join branch from 5c57b81 to 88c4bb6 Compare November 28, 2025 13:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Enable single-threaded discussion per comment

3 participants