
[Feat] Apply Visual Vocabulary to vizro-ai #1059

Open · wants to merge 21 commits into base: main

Conversation

@lingyielia (Contributor) commented Mar 10, 2025

Description

How does vivivo become available in vizro-ai?

With the current approach, whenever we make a vizro-ai release, run

# under vizro-core
hatch run examples:gen-vivivo

to generate and copy the latest visual_vocabulary.json into the vizro-ai directory.

How is vizro-ai leveraging vivivo now?

  1. A list of example chart types is populated from vivivo and added to the chart_type field of BaseChartPlan.
  2. A dictionary of {chart_type: example_code} is populated from vivivo to provide code suggestions/augmentation to vizro-ai.
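
The two steps above could be sketched roughly as follows. This is only a sketch: the actual schema of visual_vocabulary.json isn't shown in this PR, so the "code" key and the entries below are illustrative.

```python
import json

# Illustrative stand-in for visual_vocabulary.json; the actual schema
# generated by `hatch run examples:gen-vivivo` may differ.
VIVIVO_JSON = """
{
    "bar": {"code": "def custom_chart(data_frame):\\n    ..."},
    "waterfall": {"code": "def custom_chart(data_frame):\\n    ..."}
}
"""

vocab = json.loads(VIVIVO_JSON)

# 1. List of example chart types, e.g. for the chart_type field of BaseChartPlan.
chart_types = sorted(vocab)

# 2. {chart_type: example_code} lookup used to suggest/augment generated code.
example_code = {name: entry["code"] for name, entry in vocab.items()}
```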

For example,

augment=False: [screenshot]

augment=True: [screenshot]

TODOs:

  • Decide whether rules like color preference should be enforced even when augment=False

LIMITATIONS of vizro-ai:

  • Because the code generated by .plot needs to be compatible with .dashboard, we enforce that the code is always in this format:

def custom_chart(data_frame):
    fig = < do something >
    return fig

This means the vizro-ai generated code can't be as flexible as the Visual Vocabulary code, e.g.:

def waterfall(
    data_frame: pd.DataFrame,
    x: str,
    y: str,
    measure: list[str],
):
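
One way to reconcile the two forms (a sketch only, not the actual vizro-ai behaviour: the column names "category"/"value" are illustrative, and the returned dict stands in for the plotly figure the real chart code would build) is to bake the extra parameters into the enforced single-argument wrapper:

```python
import pandas as pd


def waterfall(data_frame, x, y, measure):
    # Flexible Visual-Vocabulary-style signature: the caller chooses the
    # columns and measures. The dict stands in for a plotly figure.
    return {"x": list(data_frame[x]), "y": list(data_frame[y]), "measure": measure}


def custom_chart(data_frame):
    # Enforced vizro-ai format: exactly one data_frame argument, returns fig.
    # The extra parameters are hard-coded into the generated wrapper.
    fig = waterfall(
        data_frame, x="category", y="value", measure=["relative"] * len(data_frame)
    )
    return fig


df = pd.DataFrame({"category": ["a", "b"], "value": [1, -2]})
fig = custom_chart(df)
```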


Notice

  • I acknowledge and agree that, by checking this box and clicking "Submit Pull Request":

    • I submit this contribution under the Apache 2.0 license and represent that I am entitled to do so on behalf of myself, my employer, or relevant third parties, as applicable.
    • I certify that (a) this contribution is my original creation and / or (b) to the extent it is not my original creation, I am authorized to submit this contribution on behalf of the original creator(s) or their licensees.
    • I certify that the use of this contribution as authorized by the Apache 2.0 license does not violate the intellectual property rights of anyone else.
    • I have not referenced individuals, products or companies in any commits, directly or indirectly.
    • I have not added data or restricted code in any commits, directly or indirectly.

@github-actions github-actions bot added the Vizro-AI 🤖 Issue/PR that addresses Vizro-AI package label Mar 10, 2025
github-actions bot commented Mar 12, 2025

View the example dashboards of the current commit live on PyCafe ☕ 🚀

Updated on: 2025-03-14 00:14:56 UTC
Commit: de77f27

Compare the examples using the commit's wheel file vs the latest released version:

  • vizro-core/examples/scratch_dev: view with commit's wheel vs view with latest release
  • vizro-core/examples/dev/: view with commit's wheel vs view with latest release
  • vizro-core/examples/visual-vocabulary/: view with commit's wheel vs view with latest release
  • vizro-core/examples/tutorial/: view with commit's wheel vs view with latest release
  • vizro-ai/examples/dashboard_ui/: view with commit's wheel vs view with latest release

@lingyielia lingyielia marked this pull request as ready for review March 13, 2025 05:05
@antonymilne (Contributor) left a comment


I just skimmed through at a high-level and generally looks good! 🙂 I just left a few suggestions.

I like the idea of including the "how should the graph be used" notes. Does it actually improve behaviour of the model?

@lingyielia (Contributor Author)

I like the idea of including the "how should the graph be used" notes. Does it actually improve behaviour of the model?

The "#### What is..." and "When should I use it?" sections are not used yet. I extracted them anyway because they look informative and could be useful if we ever need to build an agent to assist chart selection, for example.

@maxschulz-COL (Contributor) left a comment


Tried to skim over the vizro-ai part (not the JSON creation), and I think it's a bit late for me, as I'm struggling to make full sense of it :( This will have to wait until Monday if you would like a review from me...

In general I would suggest making the types a little clearer, for example what data I am providing where. That might make it a bit easier for me to digest.

General questions:

  • I take it you have opted for the route we discussed, where we do not package vivivo but provide it as an occasionally updated file?
  • You have taken the route of rerunning the request if augment is true?

As for the second point, I may have an alternative: have you tested the latency of a single short LLM request that just asks for the chart type and gives the available charts from vivivo as context (maybe with the when-to-use info), but of course allows the model to go beyond them if nothing fits?

If that is fast, I'd argue why not do that first, then send a single enhanced request that includes the example code for that chart type, and not have the model produce code twice.

If that works well, maybe we can even scrap augment? If it's hardly slower, then why not. We even have the minimal argument, which still makes it super quick.
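
The flow proposed here (a short chart-type request first, then one enhanced request with the example code) could be sketched like this. `complete` is a hypothetical stand-in for whatever LLM client vizro-ai uses, and the prompts are abbreviated:

```python
def generate_chart_code(user_request, vocab, complete):
    """Two-step flow: pick a chart type first, then generate code once.

    vocab: {chart_type: example_code} built from vivivo.
    complete: hypothetical LLM call, prompt (str) -> response (str).
    """
    # Step 1: short, cheap request that only asks for the chart type, with
    # the available vivivo chart types as context. The model is free to
    # answer with a type outside the list if nothing fits.
    chart_type = complete(
        f"Pick the best chart type for: {user_request}\n"
        f"Available: {', '.join(vocab)}"
    ).strip()

    # Step 2: single enhanced request that includes the example code for
    # the chosen chart type (if we have it), so code is produced only once.
    example = vocab.get(chart_type, "")
    return complete(
        f"Write the chart code for: {user_request}\nExample code:\n{example}"
    )
```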

@lingyielia (Contributor Author) commented Mar 14, 2025

  • I take it you have opted for the route we discussed, where we do not package vivivo but provide it as an occasionally updated file?
  • You have taken the route of rerunning the request if augment is true?

As for the second point, I may have an alternative: have you tested the latency of a single short LLM request that just asks for the chart type and gives the available charts from vivivo as context (maybe with the when-to-use info), but of course allows the model to go beyond them if nothing fits?

If that is fast, I'd argue why not do that first, then send a single enhanced request that includes the example code for that chart type, and not have the model produce code twice.

If that works well, maybe we can even scrap augment? If it's hardly slower, then why not. We even have the minimal argument, which still makes it super quick.

Yes! You and @antonymilne suggested a similar approach. And I really like the byproduct (the JSON representation of vivivo), which I believe could be useful in other GenAI settings. In terms of when to update the file, we could make hatch run examples:gen-vivivo an additional step in the vizro-ai release process.

I thought about the alternative and somehow decided not to pursue it. Let me give it another try: either I will remember why I quit last time or I will prove it's a good route. You can take another look on Monday.

@maxschulz-COL (Contributor)

I thought about the alternative and somehow decided not to pursue it. Let me give it another try: either I will remember why I quit last time or I will prove it's a good route. You can take another look on Monday.

Ok, interesting. Yes, thinking about it a little more, I think that could save some tokens and LLM confusion. Let me know once you have tried, or if you would like to catch up to discuss this a bit more. I saw that tomorrow is not possible anymore!

@@ -125,14 +125,20 @@ dependencies = [
"plotly==6.0.0" # to leverage new MapLibre features in visual-vocabulary,
]
installer = "uv"
scripts = {example = "cd examples/{args:scratch_dev}; python app.py"}
Contributor:

Sorry I wasn't clearer before, but this line should stay in the default environment; otherwise we won't be able to do hatch run example and would need to do hatch run examples:example instead, which is annoying (or I'm just lazy).

The idea of the jobs in the default environment that refer to other environments is that they're shortcuts, like hatch run lint, that we run often, so we don't have to explicitly specify the environment every time. For things like gen-vivivo that aren't run all the time we don't need the top-level wrapper, but for example we do still want it.

Contributor:

The whole hatch run examples:gen-vivivo flow feels much cleaner and simpler now there's just one hatch command for it all though, I like it 👍

Contributor Author:

You can still run

hatch run example
hatch run example visual-vocabulary

because in the default env there is a shortcut command:

example = "hatch run examples:example {args:scratch_dev}"  # shortcut script to underlying example environment script.

Contributor Author:

Would you like me to replace this command in the default env:

example = "hatch run examples:example {args:scratch_dev}"  # shortcut script to underlying example environment script.

with this command:

example = "cd examples/{args:scratch_dev}; python app.py"

so we can remove this line in example env:

example = "cd examples/{args:scratch_dev}; python app.py"

@@ -143,7 +143,7 @@ class ChartGroup:


part_to_whole_intro_text = """
#### Part-to-whole helps you show how one whole item breaks down into its component parts. If you consider the size of\
#### Part-to-whole helps you show how one whole item breaks down into its component parts. If you consider the size of \
Contributor:

Is this intentional? Does it still render OK, or has it squashed two words together?

Contributor Author:

Yes, it was squashing two words together and causing a linting error in the output JSON file.

Labels
Vizro-AI 🤖 Issue/PR that addresses Vizro-AI package

3 participants