Add block #436

JohnCoene · 2024-10-15T14:03:19Z

We have an issue in the add block, where it does not the right blocks or does not show any block available when it should.

See {blockr.composer} register: https://github.com/BristolMyersSquibb/blockr.composer/blob/master/R/register.R

I would expect to be able to add a new_freq_block after a new_create_table_block since their input and output match but when I have a new_create_table_block I'm unable to add any block after it.

The text was updated successfully, but these errors were encountered:

DivadNojnarg · 2024-10-15T16:06:36Z

Thanks. I can reproduce it.

DivadNojnarg · 2024-10-16T06:55:55Z

Could you add the table_block class to new_create_table_block in the register function? https://github.com/BristolMyersSquibb/blockr.composer/blob/master/R/register.R#L9.

JohnCoene · 2024-10-16T08:56:01Z

No, it does not use that class. the table_block is used for the stubs that come after it, namely to control the rendering of the table in the block which cannot be done in new_create_table_block.

See here, it does not use that class.

DivadNojnarg · 2024-10-16T12:44:54Z

The issue is at this line of get_compatible_blocks():

last_blk_output <- registry[grep(cls, registry$classes), "output"]

There isn't any match between cls (that are the classes of the block) and registry$classes that are the class written during the block registration. A short term workaround: you can add the transform_block class to new_create_table_block.

blockr::register_block(
  new_create_table_block,
  "Clinical table",
  "Create a clinical table",
  input = "parent_data",
  output = "clinical_table",
  package = "blockr.composer",
  classes = c("create_table_block", "transform_block")
)

In the long run, the registry has to be slightly adjusted. For instance, I am not convinced that the registry classes are useful, given that this information can be extracted after constructing the block with the class attribute. This can even lead to more troubles: imagine a select block (select_block, transform_block) where you can pass whatever class in the registry (like data_block ... by mistake), that does not add any information and we can't rely on that.

JohnCoene · 2024-10-16T13:01:50Z

Why are we comparing on classes? I thought this is why we defined input and output.

Can we not just remove this check on classes instead? I'm not sure why it's there.

Wasn't the logic:

No block on stack: show blocks where input NA
Show blocks that have input that matches last block in stack output

nbenn · 2024-11-04T12:29:00Z

@JohnCoene the problem here is the following I believe: In order to check whether a new block can be inserted in a (tail) position we have to check whether the input type of the new block matches the output type of the previous.

In order to do so, we have to be able to look up the registry entry of the previous block. And this is what I am attempting in #445, essentially by using a "registry ID" and saving that with the block.

One issue wich such an approach (cc @DivadNojnarg) is that we introduce a dependency (for a serialized block) on the runtime (registry) state from when/where the block was created.

This will cause issues if for example a block was added to the registry in one session, but is not available in the session where the block is being restored.

nbenn · 2024-11-05T10:35:40Z

Another issue: blocks that are not created in-app, but eg via set-up as

new_stack(
  new_dataset_block("iris")
)

are missing the required "registry link" as well. In light of these issues, it might make more sense to store the block attributes we require over the entire block-lifetime directly with the block and set this up during object construction.

We keep all registry-concerned metadata with the registry (such as name, description, category), but move input/output to the block definition. This will involve a breaking change to how blocks are defined. I see two ways of doing this:

This can happen directly as part of the block ctor definition, by modifying calls to new_block(), something like

new_some_block <- function() {
  ...
  new_block(
    ...,
    input = "data.frame",
    output = "data.frame"
  )
}

We can offer a utility, something like block_constructor() as

new_some_block <- block_constructor(
  constructor = function() {
    ...
    new_block(
      ...
    )
  },
  input = "data.frame",
  output = "data.frame"
)

Another approach could be via a new set of generics ofc. We could have something like
```
block_input.some_block <- function() "data.frame"
block_output.some_block <- function() "data.frame" 
```
This would have the advantage that we can leverage inheritance directly, e.g. have methods for transform_block and that way it would not have to be explicit for any of the existing blocks.

As for the input/output specification, we could improve this as well. We could do something like

input is a predicate function, e.g. function(x) inherits(x, "data.frame")
output is a ptype, something like data.frame()

That way the "backend" just has to output of the new block into input of the previous and that's all.

@DivadNojnarg any thoughts? I'm kind of leaning towards the generic approach as we're using this pattern in many places already and this seem to be the least "intrusive" change.

DivadNojnarg · 2024-11-05T11:00:48Z

@nbenn

From where would you call block_input.some_block in 3?
I don't see how 2 can be used. Seems like a major breaking change.

Overall, I'd prefer if we could use generics to have more customisation options. Could we have a mix of 1 and 3 like:

new_block <- function(fields, expr, name = rand_names(),
                      submit = FALSE, ...,
                      class = character()) {
  stopifnot(
    is.list(fields),
    all(lgl_ply(fields, is_field)),
    is.language(expr),
    is_string(name)
  )

  if (is.na(submit)) {
    submit <- 0
  } else {
    submit <- if (submit) 1 else -1
  }

  blk <- structure(fields,
    name = name, expr = expr, result = NULL, submit = submit, ...,
    class = c(class, "block")
  )
  attr(blk, "input") <- block_input(blk)
  attr(blk, "output") <- block_output(blk)
  blk
}

nbenn · 2024-11-05T12:00:31Z

From where would you call block_input.some_block in 3?

Anywhere we need to. For example in get_compatible_blocks(), we could iterate through the registry

for (x in get_registry()) {
  identical(block_input(x()), block_output(stack[[length(stack)]]))
}

to check which new blocks are compatible with the current last stack block.

I don't see how 2 can be used. Seems like a major breaking change.

Whenever a block constructor is defined. And yes that'd be a major breaking change. But the same holds true for option (1). That's why I am leaning towards (3).

DivadNojnarg · 2024-11-05T13:28:28Z

Ok, 3 seems fine to me, as we expose a better API around inputs/outputs. With such approach we also don't need all the work around block id, and we can probably drop the get_registry() function?

JohnCoene added the critical label Oct 15, 2024

nbenn mentioned this issue Nov 7, 2024

Persist registry ID in block #445

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add block #436

Add block #436

JohnCoene commented Oct 15, 2024

DivadNojnarg commented Oct 15, 2024

DivadNojnarg commented Oct 16, 2024

JohnCoene commented Oct 16, 2024

DivadNojnarg commented Oct 16, 2024

JohnCoene commented Oct 16, 2024

nbenn commented Nov 4, 2024

nbenn commented Nov 5, 2024 •

edited

Loading

DivadNojnarg commented Nov 5, 2024

nbenn commented Nov 5, 2024

DivadNojnarg commented Nov 5, 2024

Add block #436

Add block #436

Comments

JohnCoene commented Oct 15, 2024

DivadNojnarg commented Oct 15, 2024

DivadNojnarg commented Oct 16, 2024

JohnCoene commented Oct 16, 2024

DivadNojnarg commented Oct 16, 2024

JohnCoene commented Oct 16, 2024

nbenn commented Nov 4, 2024

nbenn commented Nov 5, 2024 • edited Loading

DivadNojnarg commented Nov 5, 2024

nbenn commented Nov 5, 2024

DivadNojnarg commented Nov 5, 2024

nbenn commented Nov 5, 2024 •

edited

Loading