Terminal colors #905

hudson-ai · 2024-06-18T23:24:29Z

This PR:

Automatically detects whether the user is running in the command-line or in a notebook
- When in a notebook, echo=True gives a pretty HTML display
- When in the command-line, raw text is printed to stdout (no more <IPython.core.display.HTML object>!)
Uses ANSI color codes to color generated text
- Green by default
- If colored package is installed (added as optional dependency), colors will depend on log-probabilities of generated tokens

Thank you @dimondjik for your implementation!

In action:

hudson-ai · 2024-06-18T23:35:27Z

guidance/models/_model.py

    from IPython.display import clear_output, display, HTML
-
-    ipython_is_imported = True
 except ImportError:
    ipython_is_imported = False


Do we still need this, or is it replaced by notebook_mode?

hudson-ai · 2024-06-18T23:36:50Z

guidance/models/_model.py

+    _ipython = get_ipython()
+    notebook_mode = (
+        _ipython is not None
+        and "IPKernelApp" in _ipython.config


Based on the way that tqdm.auto determines notebook context -- would be good to test this on multiple machines/platforms...

We had to tackle this detection back in InterpretML too, here's how we did it there: https://github.com/interpretml/interpret/blob/develop/python/interpret-core/interpret/provider/_environment.py. We could re-use some of this logic here perhaps

cc @nopdive

Sorry, hadn't been watching my notifications -- code seems reasonable. Yes, this should be tested (either manually/automatically is fine) on multiple platforms before merge: terminal (Window/Linux/Mac), vscode, jupyter notebook/lab, azure/google/amazon/databricks notebooks. We should be relatively okay if these target environments work.

hudson-ai · 2024-06-18T23:37:16Z

@Harsha-Nori @dimondjik :)

hudson-ai · 2024-06-19T00:55:12Z

setup.py

@@ -37,6 +37,7 @@
    "openai": ["openai>=1.0"],
    "schemas": ["jsonschema"],
    "server": ["fastapi", "uvicorn"],
+    "cli": ["colored"]


Not a super widely-used package, so I am open to alternative solutions if there are any concerns.

I've used colorama before, which claimed better Windows support if I remember right?

@nopdive might have thoughts here

colorama is a great library for printing "named" ANSI codes (like "green"), but afaik it doesn't support mapping rgb triples to ANSI codes. However, as you note, colorama does have better windows support via their init or just_fix_windows_console functions.

We can manually put together a lookup table if we only want to support a few shades from red to green or something like that, but I'll also take a look at some of the other libraries that colorama itself recommends :)

\begin{mutter}If you refactor so that notebook output and console output are separate implementations of the same base class, this problem largely goes away\end{mutter}

The problem still stands of choosing a way to map probabilities to colors, but yes at the very least the problem of parsing a particular rgba value goes away :)

@Harsha-Nori, I'm not familiar with colored, and colorama in its current state: not fussed which package is used as long as its maintained and we can confirm it works for our target set of environments.

Harsha-Nori · 2024-06-19T12:17:01Z

So happy to see this! I think it looks wonderful.

Would appreciate @nopdive 's review given Sam's experience in notebook env detection

Also, just a thought -- might be fun to have the commit co-authored by @dimondjik if comfortable!

riedgar-ms · 2024-06-19T12:42:00Z

Could this be refactored? It looks like it has a potentially fragile regex for parsing out styling information, and is sitting as another stage in the pipeline.

Should Engine acquire another property such as output_processor which would be some abstract base class (or rather, some reasonable Python facsimile thereof). Wherever our current code writes some output, it calls the appropriate method on the class. We then have implementations of the output_processor for Python Notebooks, console output etc.

This would also let us put all the styling information in the output_processor itself, rather than have it scattered across the main codebase.

hudson-ai · 2024-06-19T16:34:47Z

Could this be refactored? It looks like it has a potentially fragile regex for parsing out styling information, and is sitting as another stage in the pipeline.

Should Engine acquire another property such as output_processor which would be some abstract base class (or rather, some reasonable Python facsimile thereof). Wherever our current code writes some output, it calls the appropriate method on the class. We then have implementations of the output_processor for Python Notebooks, console output etc.

This would also let us put all the styling information in the output_processor itself, rather than have it scattered across the main codebase.

Reasonable concerns and a nice suggestion. It would be nice to encapsulate the existing pretty-output code regardless. I'll see what I can do 😎

hudson-ai · 2024-06-19T16:39:10Z

So happy to see this! I think it looks wonderful.

Would appreciate @nopdive 's review given Sam's experience in notebook env detection

Also, just a thought -- might be fun to have the commit co-authored by @dimondjik if comfortable!

@Harsha-Nori do you know if it's possible to add a co-author during the squash-and-merge process? Otherwise, I really think that the first commit on this PR was 99.99% @dimondjik ... maybe I can amend the history of this branch to properly credit them as an author/contributor :)

dimondjik · 2024-06-19T18:46:59Z

Nice! First of all, I wasn't expecting it to get anywhere at all, I'm glad I could help!
@hudson-ai I thank you for the PR from all the people working in terminal and not seeing colors like Notebook people XD

I've read the thread and I absolutely 100% sure that @riedgar-ms 's suggestion is the way.

But, I live by the motto "don't touch something that works", so from my perspective trying to rewrite something in Engine, when we can just add extra layer (yes, it is making the output somewhat slow, sadly), so two and a half programmers can see colors in terminal when debugging. I believe in production - colors in terminal is not that important. And also I am too lazy to dig inside the code.

So, I'm not here to just talk, when I was playing with the llm I found out that this approach "eats" any html-like tag, even generated by the llm. I.e. Mistral's "<s></s>".

So here's my local version I was updating for myself:

class ModelStateHTMLParser:
    def __init__(self):
        self.__prev_data_len = 0

        self.__ptr = (
            re.compile(r"<\|\|_html:([\S\s]+?)_\|\|>|([\S\s]+?)(?=<\|\|_html:|\Z)"))

        self.__color_param_ptr = (
            re.compile(r"rgba\(([0-9]+?\.??[0-9]*?),\s*?([0-9]+?\.??[0-9]*?),\s*?([0-9]+?\.??[0-9]*?)"))

    def feed(self, data):
        data_trunc = data[self.__prev_data_len:]
        self.__prev_data_len = len(data)

        for res in self.__ptr.finditer(data_trunc):
            if res.group(1) is not None:
                try:
                    r, g, b = [round(float(n)) for n in self.__color_param_ptr.search(res.group(1)).groups()]
                    print('\033[38;2;{};{};{}m'.format(r, g, b), end='', flush=True)
                except AttributeError:
                    print('\033[0m', end='', flush=True)
            if res.group(2) is not None:
                print(res.group(2), end='', flush=True)

It uses colors from the html markdown dependency-less, just ANSI escape
Almost bulletproof regex for tag detection
Last length of the prompt is stored inside the class now, otherwise what's the point not making "feed" static

@hudson-ai I'm ready to push to terminal_colors, just give me the green light XD

Anyways, you guys are awesome, best wishess

hudson-ai · 2024-06-19T20:02:34Z

Marked this as a draft, as there is clearly a good amount of work left to be done.

@dimondjik thank you for the additions! Definitely good to be careful about eating html tags produced by the model, and I like that this allows us to drop the dependency I added.

Also as a terminal user rather than a notebook user, I'm super excited about this 😆

Maybe the best way for us to collaborate is for you to open a PR to my branch? No need to be super formal or anything :)
I'd just ask that you return from feed rather than printing so that we can handle I/O further up the stack.

codecov-commenter · 2024-06-21T21:04:53Z

⚠️ Please install the to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 93.33333% with 2 lines in your changes missing coverage. Please review.

Project coverage is 58.30%. Comparing base (6e4ee06) to head (0b13ca8).

Files	Patch %	Lines
guidance/_utils.py	94.44%	1 Missing ⚠️
guidance/models/_model.py	91.66%	1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #905      +/-   ##
==========================================
+ Coverage   55.54%   58.30%   +2.76%     
==========================================
  Files          63       63              
  Lines        4733     4754      +21     
==========================================
+ Hits         2629     2772     +143     
+ Misses       2104     1982     -122

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

nopdive

LGTM on confirmation that this works across target environments before merge.

nopdive · 2024-06-25T17:11:55Z

guidance/models/_model.py

+    _ipython = get_ipython()
+    notebook_mode = (
+        _ipython is not None
+        and "IPKernelApp" in _ipython.config


Sorry, hadn't been watching my notifications -- code seems reasonable. Yes, this should be tested (either manually/automatically is fine) on multiple platforms before merge: terminal (Window/Linux/Mac), vscode, jupyter notebook/lab, azure/google/amazon/databricks notebooks. We should be relatively okay if these target environments work.

nopdive · 2024-06-25T17:30:41Z

setup.py

@@ -37,6 +37,7 @@
    "openai": ["openai>=1.0"],
    "schemas": ["jsonschema"],
    "server": ["fastapi", "uvicorn"],
+    "cli": ["colored"]


@Harsha-Nori, I'm not familiar with colored, and colorama in its current state: not fussed which package is used as long as its maintained and we can confirm it works for our target set of environments.

nopdive · 2024-06-25T17:32:30Z

guidance/_utils.py

@@ -261,3 +270,38 @@ def softmax(array: np.ndarray, axis: int = -1) -> np.ndarray:
    array_maxs = np.amax(array, axis=axis, keepdims=True)
    exp_x_shifted = np.exp(array - array_maxs)
    return exp_x_shifted / np.sum(exp_x_shifted, axis=axis, keepdims=True)
+
+
+class ModelStateHTMLParser(HTMLParser):


If it's only console related, have the name/module relate to console.

On a similar note, might be a good time to consider re-architecting visualization to have sibling classes between notebook and terminal.

@nking-1 and I are taking a look at rethinking how we're representing internal model state in order to make this possible (hence why this PR appears a bit stalled out at the moment). Currently, formatting information is very very closely entangled with internal representation, but yes ideally we can abstract that away

nopdive · 2024-06-25T17:37:00Z

guidance/models/_model.py

@@ -862,6 +870,8 @@ def __init__(self, engine, echo=True, **kwargs):
        self._event_parent = None
        self._last_display = 0  # used to track the last display call to enable throttling
        self._last_event_stream = 0  # used to track the last event streaming call to enable throttling
+        self._state_html_parser = ModelStateHTMLParser() # used to parse the state for cli display


Same as before, consider making naming specific to console if it's only used for console mode.

hudson-ai · 2024-06-25T18:21:25Z

@nopdive re: colored/colorama, @dimondjik shared some code that should let us avoid adding a dependency at all (although I want to triple check for platform differences when it comes to ansi code support)

hudson-ai added 3 commits June 18, 2024 15:41

Terminal coloring courtesy of @dimondjik

6d3800f

Dynamic colors with colored

38fc7be

Fix notebook_mode flag

5a08f12

hudson-ai commented Jun 18, 2024

View reviewed changes

mypy

aaffff2

hudson-ai requested a review from Harsha-Nori June 18, 2024 23:41

hudson-ai commented Jun 19, 2024

View reviewed changes

hudson-ai marked this pull request as draft June 19, 2024 19:53

Merge branch 'main' into terminal_colors

acea90e

nopdive approved these changes Jun 25, 2024

View reviewed changes

hudson-ai mentioned this pull request Jun 25, 2024

[WIP] Structured model state #929

Open

3 tasks

dimondjik and others added 3 commits July 2, 2024 09:24

Colored cli base class

cdb8314

drop colored dep

43da50b

Merge branch 'main' into terminal_colors

0b13ca8

paulbkoch force-pushed the main branch from c291eda to c4531ea Compare February 3, 2025 08:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Terminal colors #905

Terminal colors #905

hudson-ai commented Jun 18, 2024

hudson-ai Jun 18, 2024

hudson-ai Jun 18, 2024

Harsha-Nori Jun 19, 2024

nopdive Jun 25, 2024

hudson-ai commented Jun 18, 2024

hudson-ai Jun 19, 2024

Harsha-Nori Jun 19, 2024

hudson-ai Jun 19, 2024

riedgar-ms Jun 19, 2024

hudson-ai Jun 19, 2024

nopdive Jun 25, 2024

Harsha-Nori commented Jun 19, 2024

riedgar-ms commented Jun 19, 2024

hudson-ai commented Jun 19, 2024

hudson-ai commented Jun 19, 2024

dimondjik commented Jun 19, 2024 •

edited

Loading

hudson-ai commented Jun 19, 2024

codecov-commenter commented Jun 21, 2024 •

edited

Loading

nopdive left a comment

nopdive Jun 25, 2024

nopdive Jun 25, 2024

nopdive Jun 25, 2024

hudson-ai Jun 25, 2024

nopdive Jun 25, 2024

hudson-ai commented Jun 25, 2024

Terminal colors #905

Are you sure you want to change the base?

Terminal colors #905

Conversation

hudson-ai commented Jun 18, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hudson-ai commented Jun 18, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Harsha-Nori commented Jun 19, 2024

riedgar-ms commented Jun 19, 2024

hudson-ai commented Jun 19, 2024

hudson-ai commented Jun 19, 2024

dimondjik commented Jun 19, 2024 • edited Loading

hudson-ai commented Jun 19, 2024

codecov-commenter commented Jun 21, 2024 • edited Loading

Codecov Report

nopdive left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hudson-ai commented Jun 25, 2024

dimondjik commented Jun 19, 2024 •

edited

Loading

codecov-commenter commented Jun 21, 2024 •

edited

Loading