Wa72\HtmlPageDom\HtmlPage

This class represents a complete HTML document.

It offers convenience methods for getting and setting elements of the document, such as setTitle(), getTitle(), setMeta($name, $content), and getBody().

It uses HtmlPageCrawler to navigate and manipulate the DOM tree.
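As a rough illustration of the convenience methods listed above, the following sketch creates an empty page, sets its title, and serializes it. It assumes the wa72/htmlpagedom package is installed via Composer and that `vendor/autoload.php` sits next to the script; the title text is made up for the example.

```php
<?php
// Assumption: the library was installed with Composer into ./vendor.
require __DIR__ . '/vendor/autoload.php';

use Wa72\HtmlPageDom\HtmlPage;

// Create a page without passing any arguments.
$page = new HtmlPage();

// Set the page title, then read it back.
$page->setTitle('Example page');
$title = $page->getTitle();

// Called without a filename, save() returns the HTML code as a string.
$html = $page->save();

echo $title, "\n";
echo $html, "\n";
```

Passing a filename to save() should write the same markup to disk instead of returning it.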

Implements:

Stringable

Methods

| Name | Description |
|------|-------------|
| __clone | |
| __construct | |
| __toString | |
| filter | Filter nodes by using a CSS selector |
| filterXPath | Filter nodes by XPath expression |
| getBaseHref | Get the href attribute from the base tag, null if not present in document |
| getBody | Get the document's body wrapped in a HtmlPageCrawler instance |
| getBodyNode | Get the document's body as DOMElement |
| getCrawler | Get a HtmlPageCrawler object containing the root node of the HTML document |
| getDOMDocument | Get a DOMDocument object for the HTML document |
| getElementById | Get an element in the document by its id attribute |
| getHead | Get the document's HEAD section wrapped in a HtmlPageCrawler instance |
| getHeadNode | Get the document's HEAD section as DOMElement |
| getMeta | Get the content attribute of a meta tag with the specified name attribute |
| getTitle | Get the page title of the HTML document |
| indent | Indent the HTML document |
| minify | Minify the HTML document |
| removeMeta | Remove all meta tags with the specified name attribute |
| save | Save this document to an HTML file or return HTML code as string |
| setBaseHref | Set the base tag with href attribute set to parameter $url |
| setHtmlById | Sets innerHTML content of an element specified by elementId |
| setMeta | Set a META tag with specified 'name' and 'content' attributes |
| setTitle | Sets the page title of the HTML document |
| trimNewlines | Remove newlines from string and minimize whitespace (multiple whitespace characters replaced by one space) |

HtmlPage::__clone

Description

 __clone (void)

Parameters

This function has no parameters.

Return Values

void


HtmlPage::__construct

Description

 __construct (void)

Parameters

This function has no parameters.

Return Values

void


HtmlPage::__toString

Description

 __toString (void)

Parameters

This function has no parameters.

Return Values

string


HtmlPage::filter

Description

public filter (string $selector)

Filter nodes by using a CSS selector

Parameters

  • (string) $selector : CSS selector

Return Values

\HtmlPageCrawler


HtmlPage::filterXPath

Description

public filterXPath (string $xpath)

Filter nodes by XPath expression

Parameters

  • (string) $xpath : XPath expression

Return Values

\HtmlPageCrawler


HtmlPage::getBaseHref

Description

public getBaseHref (void)

Get the href attribute from the base tag, null if not present in document

Parameters

This function has no parameters.

Return Values

null|string


HtmlPage::getBody

Description

public getBody (void)

Get the document's body wrapped in a HtmlPageCrawler instance

Parameters

This function has no parameters.

Return Values

\HtmlPageCrawler


HtmlPage::getBodyNode

Description

public getBodyNode (void)

Get the document's body as DOMElement

Parameters

This function has no parameters.

Return Values

\DOMElement


HtmlPage::getCrawler

Description

public getCrawler (void)

Get a HtmlPageCrawler object containing the root node of the HTML document

Parameters

This function has no parameters.

Return Values

\HtmlPageCrawler


HtmlPage::getDOMDocument

Description

public getDOMDocument (void)

Get a DOMDocument object for the HTML document

Parameters

This function has no parameters.

Return Values

\DOMDocument


HtmlPage::getElementById

Description

public getElementById (string $id)

Get an element in the document by its id attribute

Parameters

  • (string) $id

Return Values

\HtmlPageCrawler


HtmlPage::getHead

Description

public getHead (void)

Get the document's HEAD section wrapped in a HtmlPageCrawler instance

Parameters

This function has no parameters.

Return Values

\HtmlPageCrawler


HtmlPage::getHeadNode

Description

public getHeadNode (void)

Get the document's HEAD section as DOMElement

Parameters

This function has no parameters.

Return Values

\DOMElement


HtmlPage::getMeta

Description

public getMeta (string $name)

Get the content attribute of a meta tag with the specified name attribute

Parameters

  • (string) $name

Return Values

null|string


HtmlPage::getTitle

Description

public getTitle (void)

Get the page title of the HTML document

Parameters

This function has no parameters.

Return Values

null|string


HtmlPage::indent

Description

public indent (array $options)

Indent the HTML document

Parameters

  • (array) $options : Options passed to PrettyMin::__construct()

Return Values

\HtmlPage

Throws Exceptions

\Exception


HtmlPage::minify

Description

public minify (array $options)

Minify the HTML document

Parameters

  • (array) $options : Options passed to PrettyMin::__construct()

Return Values

\HtmlPage

Throws Exceptions

\Exception


HtmlPage::removeMeta

Description

public removeMeta (string $name)

Remove all meta tags with the specified name attribute

Parameters

  • (string) $name

Return Values

void


HtmlPage::save

Description

public save (string $filename)

Save this document to an HTML file or return HTML code as string

Parameters

  • (string) $filename : If provided, the output is saved to this file; otherwise the HTML code is returned as a string

Return Values

string|void


HtmlPage::setBaseHref

Description

public setBaseHref (string $url)

Set the base tag with href attribute set to parameter $url

Parameters

  • (string) $url

Return Values

void


HtmlPage::setHtmlById

Description

public setHtmlById (string $elementId, string $html)

Sets innerHTML content of an element specified by elementId

Parameters

  • (string) $elementId
  • (string) $html

Return Values

void


HtmlPage::setMeta

Description

public setMeta ($name, $content)

Set a META tag with specified 'name' and 'content' attributes

Parameters

  • $name : value of the 'name' attribute
  • $content : value of the 'content' attribute

Return Values

void
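
The meta-tag methods (setMeta, getMeta, removeMeta) are easiest to follow as a round trip. The sketch below is illustrative only: it assumes a Composer install of wa72/htmlpagedom, and the tag name and content are invented for the example.

```php
<?php
// Assumption: the library was installed with Composer into ./vendor.
require __DIR__ . '/vendor/autoload.php';

use Wa72\HtmlPageDom\HtmlPage;

$page = new HtmlPage();

// Add a meta tag with name="description".
$page->setMeta('description', 'A short demo page');

// Read back the content attribute of that tag.
$description = $page->getMeta('description');

// Remove all meta tags named "description". getMeta() is documented as
// returning null|string, so null is expected once the tag is gone.
$page->removeMeta('description');
$afterRemoval = $page->getMeta('description');

var_dump($description, $afterRemoval);
```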


HtmlPage::setTitle

Description

public setTitle (string $title)

Sets the page title of the HTML document

Parameters

  • (string) $title

Return Values

void


HtmlPage::trimNewlines

Description

public static trimNewlines (string $string)

Remove newlines from a string and minimize whitespace (multiple whitespace characters are replaced by one space).

This is useful for cleaning up text retrieved by HtmlPageCrawler::text() (the nodeValue of a DOMNode).

Parameters

  • (string) $string

Return Values

string
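
To make the described behaviour concrete, here is a minimal standalone sketch of the same idea using only PHP built-ins. The function name `collapseWhitespace` is hypothetical and this is not the library's source; trimming the ends of the result is an assumption that goes beyond the description above.

```php
<?php
// Hypothetical stand-in for illustration; not the library's implementation.
function collapseWhitespace(string $string): string
{
    // Replace every run of whitespace (spaces, tabs, \r, \n) with one space.
    $collapsed = preg_replace('/\s+/', ' ', $string);

    // Assumption: also strip the single space left at either end.
    return trim($collapsed);
}

echo collapseWhitespace("  Hello\n   world \r\n"), "\n"; // prints "Hello world"
```

With the library installed, the equivalent call would be the static method HtmlPage::trimNewlines($string).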