getText() returns text other drivers do not #153

alexpott · 2020-10-08T21:19:08Z

\Behat\Mink\Driver\BrowserKitDriver::getText() will return text in the head section and also any JSON on the page that's contained in a script tag in the HTML body. \Behat\Mink\Driver\Selenium2Driver::getText(), for example, will not return text from the head section or script tags in the body section. This looks inconsistent, given that the Mink documentation states:

getText() will strip tags and unprinted characters out of the response, including newlines. So it’ll basically return the text that the user sees on the page.

I'm not sure if this is a Symfony\DomCrawler issue or not.

For a discussion of the effects of this, see https://www.drupal.org/project/drupal/issues/3175718

jonathanjfshaw · 2020-10-09T10:44:50Z

DomCrawler is simply using PHP's DOMNode: https://www.php.net/manual/en/class.domnode.php#domnode.props.textcontent
which implements the W3C spec: https://www.w3.org/TR/2003/WD-DOM-Level-3-Core-20030226/DOM3-Core.html#core-ID-1312295772

alexpott · 2020-10-09T17:12:54Z

@jonathanjfshaw yep and it's returning what document.body.textContent in the browser console does. The point is that this is not what \Behat\Mink\Driver\Selenium2Driver::getText() returns and it is returning stuff that is not visible.
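To illustrate the behaviour being discussed, here is a minimal sketch that uses plain DOMDocument rather than Mink or DomCrawler. The sample markup is made up for the example, and it only shows how textContent treats head and script content, not how either driver is implemented.

```php
<?php
// Hypothetical page, used only for illustration.
$html = '<html><head><title>Page title</title></head>'
      . '<body><p>Visible text</p>'
      . '<script type="application/json">{"hidden":"value"}</script>'
      . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);

// textContent concatenates every descendant text node, so the <title>
// text and the JSON inside <script> are included alongside the paragraph.
$text = $doc->documentElement->textContent;

echo $text, PHP_EOL;
```

A real browser would show only the paragraph to the user, which is the gap between the two drivers described above.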

aik099 · 2020-10-22T18:09:39Z

I see no issue here.

The Selenium driver is talking to a real browser and can ask it to return only the text visible to a user. BrowserKit, being a headless driver, only looks at the HTML tags and parses them as best it can. Stripping all HTML tags therefore leaves their content in place, which produces the effect you're getting.

@alexpott, I recommend calling the getText method on the BODY NodeElement (a PHP class in Mink) of the document, not on the whole document. This way you won't get any extra stuff (at least I hope so).

The code below (untested, may not work) is how I would get the contents of a document:

$body_text = $session->getPage()->find('xpath', '//body')->getText();

alexpott · 2021-05-22T21:08:06Z

@aik099 body can contain script tags. Adding script tags just before closing the body tag is often advocated for performance reasons.
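One possible workaround, sketched here with plain DOMDocument and DOMXPath rather than any Mink API, is to drop script and style elements from the body before reading its text. The function name visibleBodyText and the sample markup are hypothetical, and the sketch does not account for elements hidden by CSS, so it only approximates what a real browser reports.

```php
<?php
// Hypothetical helper: approximate the user-visible body text by
// removing <script> and <style> elements before reading textContent.
function visibleBodyText(string $html): string
{
    $doc = new DOMDocument();

    // Real-world markup is often imperfect; collect libxml warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Copy the matches into an array first so that removing nodes
    // does not disturb the node list while we iterate over it.
    $hidden = iterator_to_array($xpath->query('//body//script | //body//style'));
    foreach ($hidden as $node) {
        $node->parentNode->removeChild($node);
    }

    $body = $xpath->query('//body')->item(0);
    if ($body === null) {
        return '';
    }

    // Collapse runs of whitespace, similar in spirit to what getText() documents.
    return trim((string) preg_replace('/\s+/', ' ', $body->textContent));
}

$html = '<html><head><title>T</title></head>'
      . '<body><p>Hello</p><script>var a = "secret";</script></body></html>';

echo visibleBodyText($html), PHP_EOL;
```

Restricting the query to descendants of body also sidesteps the head text, so the same helper covers both cases raised in this issue.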

claudiu-cristea mentioned this issue Jul 12, 2022

Port https://www.drupal.org/project/drupal/issues/3175718 jhedstrom/drupalextension#612

Closed

claudiu-cristea mentioned this issue Nov 16, 2022

Port https://www.drupal.org/project/drupal/issues/3175718 jhedstrom/drupalextension#629

Merged

MiroslavRusev mentioned this issue Jan 30, 2024

Drupal finder exception causes error - Cannot locate Drupal on a standalone behat installation. jhedstrom/drupalextension#658

Open
