Add hash for block elements #10

Open
petdance opened this issue Mar 12, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@petdance
Collaborator

Migrated from RT originally by [email protected]

It would be nice to have a list of elements which "break the line" when textified. I actually have this in my HTML::AsText::Fix:

# source: http://en.wikipedia.org/wiki/HTML_element#Block_elements
%isBlockElement = map {; $_ => 1 } qw(
  p
  h1 h2 h3 h4 h5 h6
  dl dt dd
  ol ul li
  dir
  address
  blockquote
  center
  del
  div
  hr
  ins
  noscript script
  pre
);

Not sure what to do with <br>: it breaks the line, but it's not a block element.
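One possible way to handle the <br> question is to keep it out of %isBlockElement and track it in a second hash. The sketch below is only an illustration: %isLineBreak and breaks_line() are hypothetical names that do not exist in HTML::AsText::Fix or any released module, and the block list is copied unchanged from the snippet above.

```perl
use strict;
use warnings;

# Copy of the list proposed above, unchanged.
my %isBlockElement = map {; $_ => 1 } qw(
  p h1 h2 h3 h4 h5 h6 dl dt dd ol ul li dir address
  blockquote center del div hr ins noscript script pre
);

# Assumption: <br> gets its own (hypothetical) hash, because it breaks
# the line without being a block element.
my %isLineBreak = ( br => 1 );

# Hypothetical helper: true if textified output should start a new
# line at this tag. Tag names are compared case-insensitively.
sub breaks_line {
    my ($tag) = @_;
    $tag = lc $tag;
    return ( $isBlockElement{$tag} || $isLineBreak{$tag} ) ? 1 : 0;
}

for my $tag (qw(div br span)) {
    printf "%-5s %s\n", $tag, breaks_line($tag) ? 'breaks line' : 'inline';
}
```

Keeping the two hashes separate means %isBlockElement stays a pure "block element" list, while a textifier can still ask one question about line breaks.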

@petdance petdance added the enhancement New feature or request label Mar 12, 2024
@petdance
Collaborator Author

Comments by @PhilterPaper from RT

I think it would be useful to list all tags which affect the screen display AND are NOT flow/inline/phrasal. That is, one of their characteristics would be that they cause line breaks in most (if not all) cases. %isBlockElement is fine as a name, although for consistency with %isPhraseMarkup, perhaps it should be %isBlockMarkup?

There is still the question of whether tags which put nothing directly on the screen (e.g., ) but might have children which do should go into %isBlockElement or %isPhraseMarkup, or get a new list: %isNonDisplayMarkup? Also, should %isBlockElement contain children which can only appear under another block element (e.g., thead, tbody, tfoot, tr, caption, etc. under table), or should it list just table? It's not clear how these lists are intended to be used. Presumably a given tag should appear only once in a "basic" list, and possibly again in composite lists.

Changes needed for %isBlockElement:

  1. remove 'ins' and 'del'. They belong in %isPhraseMarkup.
  2. add 'menu', 'map', 'area', 'marquee', 'noscript', 'script', 'frameset', 'frame', 'noframes', 'form', 'table', 'search', 'multicol', 'layer', 'nolayer', 'bgsound', 'applet'. Some of these don't create output, so it's questionable where they should go. It's probably safer in %isBlockElement than in %isPhraseMarkup (less chance of disrupting the flow). Some have children which produce output to the page, while others don't.

That should bring it up to date for HTML v4 (see also GitHub #5).
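The two changes listed above could be applied to the original hash roughly as follows. This is a sketch, not a patch against any module: it starts from the list in the first comment and assumes nothing beyond it. Note that 'noscript' and 'script' are already in that list, so re-adding them is a harmless no-op.

```perl
use strict;
use warnings;

# Starting point: the list from the original report.
my %isBlockElement = map {; $_ => 1 } qw(
  p h1 h2 h3 h4 h5 h6 dl dt dd ol ul li dir address
  blockquote center del div hr ins noscript script pre
);

# Change 1: 'ins' and 'del' are treated as phrase markup, so drop them.
delete @isBlockElement{qw(ins del)};

# Change 2: add the suggested tags. Assigning to a hash key that
# already exists (noscript, script) simply leaves it set.
$isBlockElement{$_} = 1 for qw(
  menu map area marquee noscript script frameset frame noframes
  form table search multicol layer nolayer bgsound applet
);

print join( ' ', sort keys %isBlockElement ), "\n";
```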
