Whitespace Removal Module

Purpose

Removes textual elements containing only whitespace as content.

What it does

Filters out all text elements that have no content other than whitespace.

Dependencies

None

Parameters

minWidth: The width threshold used to decide whether a whitespace element is considered a candidate for removal.

How it works

Each whitespace-only text element is first checked to see whether its width is less than minWidth, then checked to see whether it overlaps another text element (a very common case), and is then deleted.
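
The checks described above can be sketched roughly as follows. This is an illustration only, not the module's actual code: the `TextElement` class, its bounding-box fields, and the helper names are all hypothetical, and the sketch assumes that an element is deleted only when both checks succeed (narrower than `min_width` and overlapping some other element), which the description does not state explicitly.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextElement:
    """Hypothetical text element: content plus an axis-aligned bounding box."""
    content: str
    left: float
    top: float
    width: float
    height: float


def is_whitespace(el: TextElement) -> bool:
    # True for empty strings and strings made only of whitespace characters.
    return el.content.strip() == ""


def overlaps(a: TextElement, b: TextElement) -> bool:
    # Two boxes overlap when their extents intersect on both axes.
    return (
        a.left < b.left + b.width
        and b.left < a.left + a.width
        and a.top < b.top + b.height
        and b.top < a.top + a.height
    )


def remove_whitespace(elements: List[TextElement], min_width: float) -> List[TextElement]:
    kept = []
    for el in elements:
        # Assumption: both conditions must hold for an element to be removed.
        removable = (
            is_whitespace(el)
            and el.width < min_width
            and any(overlaps(el, other) for other in elements if other is not el)
        )
        if not removable:
            kept.append(el)
    return kept


elements = [
    TextElement("Hello", 0, 0, 50, 10),
    TextElement(" ", 48, 0, 4, 10),      # narrow and overlapping "Hello"
    TextElement("   ", 200, 0, 40, 10),  # wide and isolated
]
result = remove_whitespace(elements, min_width=10)
print([el.content for el in result])
```

In this sketch the narrow space that overlaps "Hello" is dropped, while the wide, isolated run of spaces is kept because it fails the width check.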

Accuracy

Good. The module handles the most common cases, but its coverage is based on observed documents only. New edge cases may appear and will need to be handled in the future.