Implement "folded table" output, as an opt-in feature for all tabular outputs#1231
Implement "folded table" output, as an opt-in feature for all tabular outputs#1231sirosen merged 12 commits intoglobus:mainfrom
Conversation
801637d to
54315b6
Compare
This is an alternative output mode which, when selected, prints tabular data with row separators. Based on terminal width detection, the output can be "folded" to stack up adjacent elements into columns, making rows more horizontally compact. The table folding first tries by halves, then thirds, and finally resorts to folding by N where N is the number of columns.
Also make the styling fallback to a line-oriented style when output is not an interactive terminal.
Unit tests of the Row class and some higher level integration tests of the printer itself to ensure it produces proper tabular output both with and without folding enabled.
'GLOBUS_CLI_FOLD_TABLES=1' enables table folding for all tabular outputs. This can be used to preview the behavior and, in the future, to opt-out of folded table output for the entire CLI.
54315b6 to
35d13a5
Compare
Per discussion, switch from opt-in to opt-out.
6034a55 to
5e89019
Compare
Instrument the selection of various box drawing characters instead of plain ASCII characters for folded table output. This enables some clearer rendering without noisy/confusing lines that make it hard to see an element of the table as singular.
Using the half-dash char looks good in many fonts, but in some renderings it appears to be out of the horizontal line with the rest of the dashed line. In order to reduce the potential for this to be an oddity we see with some users' fonts in the future, simply avoid using the half-dash chars.
|
I've made several adjustments after a team review of the high-level change:
One of the concerns raised was that we want each entity ( I'm now marking this ready for review. |
derek-globus
left a comment
There was a problem hiding this comment.
The output of this functions well and renders lists more nicely on smaller terminals as the PR sets out to do; no complaints there.
Within the code though, the implementation of OutputStyle is pretty difficult to parse & mixes abstraction levels imo. I've got a recommendation in one of the comments about splitting that up into 2 or 3 different categories of "style" to help readers comprehend what stylization entails.
| double = enum.auto() | ||
| double_transition = enum.auto() | ||
| top = enum.auto() | ||
| bottom = enum.auto() |
There was a problem hiding this comment.
This being represented as a boolean flag makes it pretty difficult to understand.
From a lot of back and forth, I think I get the idea is that:
noneanddecorateddisambiguate broadly "how am I styling this"topandbottomdelineate (for specifically separator lines) whether to modify the cornersdoubleanddouble_transitiondelineate stylization characters within a separator line.
Assuming I got that right, it took a lot of imperative piecing together to figure out.
Maybe some comments would help a reader grok this quicker but I think it's just not a great way to model this. The approach merges different abstraction levels (e.g., "decorated" and "bottom") into the same "bag" of style-elements, requiring the reader to look at all of the usage sites to understand anything about what they are.
My recommendation would be to model the "style of output" (e.g., none or decorated) distinctly from the "category of thing being output".
Something like the following would convey the same information more explicitly in method signatures with informative types of stylization than the permutation bag of styles offers.
def _separator_line(
col_widths: tuple[int, ...],
style: t.Literal["none", "decorator"],
fill_type: t.Literal["single", "double"],
line_type: t.Literal["top", "bottom", "transition"]
) -> str
...
class Row:
...
def serialize(
use_col_widths: tuple[int, ...],
style: t.Literal["none", "decorator"]
) -> str
...Example Note:
I used literal types just because that helped me include the data type in the signature; for a real implementation those could be modeled as enums or dataclasses or whatever. The important part I'm trying to highlight is that different "types" of stylization are identified concretely in the signature instead of mushed together into style flags.
There was a problem hiding this comment.
Also worth noting that style in this pared down none vs decorated looks like it's just a renaming of is_folded. Keeping it as a bool instead of a renamed "style type" may help remove an abstraction that actually helps disambiguate at usage site why we might be rendering none-style vs decorated-style.
There was a problem hiding this comment.
I'm starting to tinker on this now. My first step is going to be to try removing decorated as a style -- using the is_folded bool more -- and see how that looks. After that, I think I will want to separate "style" from "line type" (which looks to me like the right split).
I'm sticking with enums, but as you say that's a detail. 🤞 it goes smoothly, but I agree with the overall feedback. Too many ideas got mushed into one here.
There was a problem hiding this comment.
I've done a version of this, but, unlike the other threads, I'm leaving it unresolved until you have a look. There's a new enum for "separator row type", and I feel good about the changes, but it's not how I imagined this going when I first was digging in. If you find the results odd, let's iterate one more time at least.
Define an enum of separator row types, including various "box drawing" row types and one (and only one) ascii row type. `_separator_line` now takes one of these row types, instead of a "style" parameter, and uses it to determine how printing should be done. This also now supports "box_intra_row_separator" as a type, and therefore is used to simplify row serialization.
The information passed via OutputStyle is always available in one of three places, depending on context: - the table is marked "folded" - the row has multiple subrows - the separator row type It turns out there is no need for a replacement for this enum.
Co-authored-by: derek-globus <113056046+derek-globus@users.noreply.github.com>
Define a small dataclass, pop it into a mapping of row types -> styles, and use it where style information is interpreted. Co-authored-by: derek-globus <113056046+derek-globus@users.noreply.github.com>
|
@derek-globus, I've pushed through changes which I think properly incorporate your feedback. Doing a lot of fiddling with the row styling is exactly the sort of thing which I think makes the case for us for "probably add |
derek-globus
left a comment
There was a problem hiding this comment.
Looks good!
Curious to get your thoughts on my one nit-picky thread but feel free to merge without that necessarily resolving. I'm plenty happy with it as-is.
| class SeparatorRowType(enum.Enum): | ||
| # top of a table | ||
| box_top = enum.auto() | ||
| # between the header and the rest of the table (box drawing chars) | ||
| box_header_separator = enum.auto() | ||
| # the same, but for ASCII tables | ||
| ascii_header_separator = enum.auto() | ||
| # between rows of the table | ||
| box_row_separator = enum.auto() | ||
| # between element lines inside of a row of a table | ||
| box_intra_row_separator = enum.auto() | ||
| # bottom of a table | ||
| box_bottom = enum.auto() | ||
|
|
||
|
|
||
| @dataclasses.dataclass | ||
| class SeparatorRowStyle: | ||
| fill: str | ||
| leader: str | ||
| trailer: str | ||
| middle: str | ||
|
|
||
|
|
||
| SEPARATOR_ROW_STYLE_CHART: dict[SeparatorRowType, SeparatorRowStyle] = { | ||
| SeparatorRowType.ascii_header_separator: SeparatorRowStyle( | ||
| fill="-", leader="", trailer="", middle="+" | ||
| ), | ||
| SeparatorRowType.box_top: SeparatorRowStyle( | ||
| fill="═", leader="╒═", trailer="═╕", middle="╤" | ||
| ), | ||
| SeparatorRowType.box_header_separator: SeparatorRowStyle( | ||
| fill="═", leader="╞═", trailer="═╡", middle="╪" | ||
| ), | ||
| SeparatorRowType.box_row_separator: SeparatorRowStyle( | ||
| fill="─", leader="├─", trailer="─┤", middle="┼" | ||
| ), | ||
| SeparatorRowType.box_intra_row_separator: SeparatorRowStyle( | ||
| fill="─", leader="├─", trailer="─┤", middle="┼" | ||
| ), | ||
| SeparatorRowType.box_bottom: SeparatorRowStyle( | ||
| fill="─", leader="└─", trailer="─┘", middle="┴" | ||
| ), | ||
| } |
There was a problem hiding this comment.
Did you try the charachter-data-in-enum version?
Your version is certainly legible enough that I wouldn't block approval over this style. But curious to hear your thought process as to why you prefer the multi-data style over the shorter one.
class SeparatorRowType(enum.Enum):
# top of a table
box_top = ("═", "╒═", "═╕", "╤")
# between the header and the rest of the table (box drawing chars)
box_header_separator = ("═", "╞═", "═╡", "╪")
# the same, but for ASCII tables
ascii_header_separator = ("-", "", "", "+")
# between rows of the table
box_row_separator = ("─", "├─", "─┤", "┼")
# between element lines inside of a row of a table
box_intra_row_separator = ("─", "├─", "─┤", "┼")
# bottom of a table
box_bottom = ("─", "└─", "─┘", "┴")
def __init__(self, fill: str, leader: str, trailer: str, middle: str) -> None:
self.fill = fill
self.leader = leader
self.trailer = trailer
self.middle = middleThere was a problem hiding this comment.
I have some mixed feelings about enums which contain nontrivial data -- anything beyond a single primitive type like an int (for flags) or a string.
The reason is that it becomes harder to reason about what a member of the enum "really is".
Consider:
>>> import enum
>>> class FooEnum(enum.Enum):
... a = (1, 2)
... b = (2, 3)
...
>>> FooEnum.a
<FooEnum.a: (1, 2)>
>>> FooEnum.b
<FooEnum.b: (2, 3)>
>>> isinstance(FooEnum.a, tuple)
False
>>> FooEnum.a[0]
Traceback (most recent call last):
File "<python-input-7>", line 1, in <module>
FooEnum.a[0]
~~~~~~~~~^^^
TypeError: 'FooEnum' object is not subscriptable
>>> FooEnum.a.value
(1, 2)So it isn't a tuple, but it's an enum member containing a tuple. And you need to remember to always "dereference" the enum member to its value to use it. This is okay! It works, and it's type safe, and the enum members are carefully kept distinct from the values. In practice, I worry that the distinction (enum member vs the value it contains) could be confusing.
When people "hack" this by making an enum class which also inherits from another type, we switch from my mixed feelings to fully negative feelings! 😆
Then you're just mixing together different abstractions and praying that it won't bite you.
That said, I agree that it's sort of nice to associate each definition with its associated styles. I'll play with this a little more and if I can come up with a version that I like, I'll submit it for final review.
There was a problem hiding this comment.
Actually, to be up front, I am right now playing with the enum-assignment version, but with a dataclass (renamed from "style" to "settings"). Because it is okay in my book to assign values to an enum as long as you don't do the forbidden thing of making the enum inherit from that class of values.
My one worry is whether or not type checking will understand what the .value type is of an enum whose value types are homogeneous.
There was a problem hiding this comment.
Type checking was fine but I'm scrapping it because it is, as past experience has shown, weird as heck.
>>> import enum
>>> class FooEnum(enum.Enum):
... a = (1, 2)
... b = (2, 3)
... c = (2, 3)
...
>>> FooEnum.b
<FooEnum.b: (2, 3)>
>>> FooEnum.c
<FooEnum.b: (2, 3)>
>>> FooEnum.c is FooEnum.b
True☠️
This could be viewed as an alternative implementation / reimplementation of our table output printer.
Similar to the current table output, it loads the data to print into a data structure which represents our desired output.
Unlike the current table output, each row is treated as a meaningful data structure as well.
Instead of a grid of cells for the whole table, a table is a list of rows, each of which contains a grid.
Rows and tables are not responsible for correctness invariants like "every row has the same layout" -- the enforcement of such rules is left solely to the higher-level printer.
Rows define "folding", which is an accordion-like fold that converts from
to
or
and so forth for different fold sizes.
Rows do not support folding when they are already folded -- only a 1-height row can be folded.
As such, Rows and Tables (lists of Rows) are immutable, and the folding methods produce net-new rows and new tables.
The printer handles them as such, and can therefore compare a "fold by 2" table with a "fold by 3" table.
With the "folding table" implemented and checked against terminal width to determine how much/if it should fold, there is the matter of output formatting for the rows.
Each row knows how to format itself, and accepts a set of "style" flags (an enum) to control specifics.
The same "style" flags can control the printing of separator lines for the header and foot of the table.
Finally, there is the matter of control.
When the environment is not interactive (e.g., a pipeline), folding is always disabled.
The table folding output is off by default, but can be enabled as a replacement for all table printing by setting the new env var
GLOBUS_CLI_FOLD_TABLES=1.The intent, in this PR, is that the new folded table would fully replace our table output in the future, and users could opt out of the behavior with
GLOBUS_CLI_FOLD_TABLES=0.When a table is not folded, the output is almost identical to our current non-folded table. This is intentional.