
<!DOCTYPE html>
<html lang="en">
<head>

<meta charset="utf-8" />
<link rel="icon" href="favicon.png" />
<title>Extending Prism ▲ Prism</title>
<link rel="stylesheet" href="style.css" />
<link rel="stylesheet" href="themes/prism.css" data-noprefix />
<link rel="stylesheet" href="plugins/line-highlight/prism-line-highlight.css" data-noprefix />
<script src="scripts/prefixfree.min.js"></script>
<style>
ol.indent {
	margin: 1em 0;
	padding-left: 2em;
}
</style>

<script>var _gaq = [['_setAccount', 'UA-33746269-1'], ['_trackPageview']];</script>
<script src="https://www.google-analytics.com/ga.js" async></script>
</head>
<body class="language-javascript">

<header>
	<div class="intro" data-src="templates/header-main.html" data-type="text/html"></div>

	<h2>Extending Prism</h2>
	<p>Prism is awesome out of the box, but it’s even awesomer when it’s customized to your own needs. This section will help you write new language definitions and plugins, and with all-around Prism hacking.</p>
</header>

<section id="language-definitions">
	<h1>Language definitions</h1>

	<p>Every language is defined as a set of tokens, which are expressed as regular expressions. For example, this is the language definition for CSS:</p>
	<pre data-src="components/prism-css.js"></pre>

	<p>A regular expression literal is the simplest way to express a token. An alternative way, with more options, is to use an object literal. With that notation, the regular expression describing the token is the value of the <code>pattern</code> property:</p>
	<pre><code class="language-javascript">...
'tokenname': {
	pattern: /regex/
}
...</code></pre>
	<p>So far the functionality is exactly the same between the short and extended notations. However, the extended notation allows for additional options:</p>

	<dl>
		<dt>inside</dt>
		<dd>This property accepts another object literal, with tokens that are allowed to be nested in this token.
		This makes it easier to define certain languages. However, keep in mind that they’re slower and if coded poorly, can even result in infinite recursion.
		For an example of nested tokens, check out the Markup language definition:
		<pre data-src="components/prism-markup.js"></pre></dd>

		<dt>lookbehind</dt>
		<dd>This option mitigates the lack of lookbehind in older JavaScript engines. When set to <code>true</code>,
		the first capturing group in the regex <code>pattern</code> is discarded when matching this token, so it effectively behaves
		as if it were a lookbehind. For an example of this, check out the C-like language definition, in particular the comment and class-name tokens:
		<pre data-src="components/prism-clike.js"></pre></dd>

		<dt>rest</dt>
		<dd>Accepts an object literal with tokens and appends them to the end of the current object literal. Useful for referring to tokens defined elsewhere. For an example where <code>rest</code> is useful, check the Markup definitions above.</dd>

		<dt>alias</dt>
		<dd>This option can be used to define one or more aliases for the matched token. As a result,
		the styles of the token and its aliases are combined. This is useful for combining the styling of a well-known
		token, which is already supported by most themes, with a semantically correct token name. The option
		can be set to a string literal or an array of string literals. In the following example, the token
		name <code>latex-equation</code> is not supported by any theme, but it will be highlighted the same as a string.
		<pre><code class="language-javascript">{
	'latex-equation': {
		pattern: /\$(\\?.)*?\$/g,
		alias: 'string'
	}
}</code></pre></dd>

		<dt>greedy</dt>
		<dd>This is a boolean property. It is intended to solve a common problem with
		patterns that match long strings like comments, regex or string literals. For example,
		comments are parsed first, but if the string <code class="language-javascript">/* foo */</code>
		appears inside a string, you would not want it to be highlighted as a comment.
		The <code>greedy</code> property allows a pattern to ignore previous matches of other patterns and
		overwrite them when necessary. Use this flag with restraint, as it incurs a small performance overhead.
		The following example demonstrates its usage:
		<pre><code class="language-javascript">'string': {
	pattern: /(["'])(\\(?:\r\n|[\s\S])|(?!\1)[^\\\r\n])*\1/,
	greedy: true
}</code></pre>
	</dl>
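	<p>To illustrate what the <code>lookbehind</code> option described above does, here is a minimal sketch that runs without Prism. The helper <code>matchWithLookbehind</code> is hypothetical and not part of Prism's API; it only mimics the described behaviour of dropping the first capturing group from the match.</p>

```javascript
// Hypothetical helper, not part of Prism: emulates the `lookbehind` option
// by dropping the text of the first capturing group from the match.
function matchWithLookbehind(pattern, text) {
	const match = pattern.exec(text);
	if (!match) {
		return null;
	}
	// Length of the "lookbehind" prefix (0 if the group did not participate).
	const prefixLength = match[1] ? match[1].length : 0;
	return {
		index: match.index + prefixLength,
		value: match[0].slice(prefixLength)
	};
}

// Match a class name only when it follows the keyword `class`.
const className = /(\bclass\s+)\w+/;
const result = matchWithLookbehind(className, 'class Foo {}');
console.log(result); // { index: 6, value: 'Foo' }
```

	<p>The keyword and the whitespace after it are required for the match but are not part of the resulting token, which is the effect a real lookbehind assertion would have.</p>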

	<p>Unless explicitly allowed through the <code>inside</code> property, each token cannot contain other tokens, so their order is significant. Although per the ECMAScript specification, objects are not required to have a specific ordering of their properties, in practice they do in every modern browser.</p>

	<p>In most languages there are multiple ways of declaring the same constructs (e.g. comments, strings, ...), and sometimes it is difficult or impractical to match all of them with a single regular expression. To add multiple regular expressions for one token name, an array can be used:</p>

	<pre><code class="language-javascript">...
'tokenname': [ /regex0/, /regex1/, { pattern: /regex2/ } ]
...</code></pre>
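
	<p>The three notations (regular expression, object literal, and array of either) carry the same kind of information. The following sketch shows one way to bring them into a uniform shape. The helper <code>normalizePatterns</code> and the sample grammar are assumptions made for illustration, not Prism code.</p>

```javascript
// Hypothetical helper, not part of Prism: brings the three notations
// (regex, object literal, array of either) into one uniform shape.
function normalizePatterns(definition) {
	const list = Array.isArray(definition) ? definition : [definition];
	return list.map(function (entry) {
		// A bare RegExp is shorthand for { pattern: regex }.
		return entry instanceof RegExp ? { pattern: entry } : entry;
	});
}

// Sample grammar invented for this example.
const grammar = {
	'comment': /\/\/.*/,
	'string': [/"[^"]*"/, { pattern: /'[^']*'/, greedy: true }]
};

const normalized = normalizePatterns(grammar['string']);
console.log(normalized.length); // 2
console.log(normalized[1].greedy); // true
```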

	<section>
		<h1><code>Prism.languages.insertBefore(inside, before, insert<span class="optional" title="Default value: Prism.languages">, root</span>)</code></h1>

		<p>This is a helper method to ease modifying existing languages. For example, the CSS language definition not only defines CSS highlighting for CSS documents,
		but also needs to define highlighting for CSS embedded in HTML through <code class="language-markup">&lt;style></code> elements. To do this, it needs to modify
		<code>Prism.languages.markup</code> and add the appropriate tokens. However, <code>Prism.languages.markup</code>
		is a regular JavaScript object literal, so if you do this:</p>

		<pre><code>Prism.languages.markup.style = {
	/* tokens */
};</code></pre>

		<p>then the <code>style</code> token will be added (and processed) at the end. <code>Prism.languages.insertBefore</code> allows you to insert
		tokens <em>before</em> existing tokens. For the CSS example above, you would use it like this:</p>

		<pre><code>Prism.languages.insertBefore('markup', 'cdata', {
	'style': {
		/* tokens */
	}
});</code></pre>

		<h2>Parameters</h2>
		<dl>
			<dt>inside</dt>
			<dd>The property of <code>root</code> that contains the object to be modified.</dd>

			<dt>before</dt>
			<dd>Key to insert before (String)</dd>

			<dt>insert</dt>
			<dd>An object containing the key-value pairs to be inserted</dd>

			<dt>root</dt>
			<dd>The root object, i.e. the object that contains the object that will be modified. Optional, default value is <code>Prism.languages</code>.</dd>
		</dl>
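
		<p>Conceptually, inserting before a key means rebuilding the object in a new property order. The sketch below shows that idea on plain objects. It is a simplification written for this page, not Prism's implementation, and the name <code>insertBeforeKey</code> is hypothetical.</p>

```javascript
// Simplified sketch, not Prism's implementation: returns a new object in which
// the entries of `insert` appear immediately before the key `before`.
function insertBeforeKey(grammar, before, insert) {
	const result = {};
	for (const key of Object.keys(grammar)) {
		if (key === before) {
			Object.assign(result, insert);
		}
		// Skip old entries that are replaced by an inserted key.
		if (!Object.prototype.hasOwnProperty.call(insert, key)) {
			result[key] = grammar[key];
		}
	}
	return result;
}

// Placeholder patterns, for illustration only.
const markup = { 'comment': /c/, 'cdata': /d/, 'tag': /t/ };
const extended = insertBeforeKey(markup, 'cdata', { 'style': /s/ });
console.log(Object.keys(extended)); // [ 'comment', 'style', 'cdata', 'tag' ]
```

		<p>Unlike this sketch, which returns a copy, <code>Prism.languages.insertBefore</code> works on the language stored in <code>root</code>, so prefer the real helper when modifying Prism languages.</p>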
	</section>
</section>

<section id="creating-a-new-language-definition" class="language-none">
	<h1>Creating a new language definition</h1>

	<p>This section will explain the usual workflow of creating a new language definition.</p>

	<p>As an example, we will create the language definition of the fictional <em>Foo's Binary, Artistic Robots&trade;</em> language or just Foo Bar for short.</p>

	<ol class="indent">
		<li>
			<p>Create a new file <code>components/prism-foo-bar.js</code>.</p>

			<p>In this example, we choose <code>foo-bar</code> as the id of the new language. The language id has to be unique and should work well with the <code>language-xxxx</code> CSS class name Prism uses to refer to language definitions. Your language id should ideally match the regular expression <code>/^[a-z][a-z\d]*(?:-[a-z][a-z\d]*)*$/</code>.</p>
		</li>
		<li>
			<p>Edit <code>components.json</code> to register the new language by adding it to the <code>languages</code> object. (Please note that all language entries are sorted alphabetically by title.) <br>
			Our new entry for this example will look like this:</p>

			<pre><code class="language-json">"foo-bar": {
	"title": "Foo Bar",
	"owner": "Your GitHub name"
}</code></pre>

			<p>If your language definition depends on any other languages, you have to specify this here as well by adding a <code class="language-js">"require"</code> property. E.g. <code class="language-js">"require": "clike"</code>, or <code class="language-js">"require" : ["markup", "css"]</code>. For more information on dependencies, read the <a href="#dependencies">declaring dependencies</a> section.</p>

			<p><em>Note:</em> Any changes made to <code>components.json</code> require a rebuild (see step 3).</p>
		</li>
		<li>
			<p>Rebuild Prism by running <code class="language-bash">npx gulp</code>.</p>

			<p>This will make your language available to the <a href="test.html">test page</a>, or more precisely, your local version of it. You can open your local <code>test.html</code> page in any browser, select your language, and see how your language definition highlights any code you input.</p>

			<p><em>Note:</em> You have to reload the test page to apply changes made to <code>prism-foo-bar.js</code>.</p>
		</li>
		<li>
			<p>Write the language definition.</p>

			<p>The <a href="#language-definitions">above section</a> already explains the makeup of language definitions.</p>
		</li>
		<li>
			<p>Add aliases.</p>

			<p>Aliases are useful if your language is known under more than one name or if there are very common abbreviations for your language (e.g. JS for JavaScript). Keep in mind that aliases are very similar to language ids in that they also have to be unique (i.e. there cannot be an alias which is the same as another alias or language id) and work as CSS class names.</p>

			<p>In this example, we will register the alias <code>foo</code> for <code>foo-bar</code> because Foo Bar code is stored in <code>.foo</code> files.</p>

			<p>To add the alias, we add this line at the end of <code>prism-foo-bar.js</code>:</p>

			<pre><code class="language-js">Prism.languages.foo = Prism.languages['foo-bar'];</code></pre>

			<p>Aliases also have to be registered in <code>components.json</code> by adding the <code>alias</code> property to the language entry. In this example, the updated entry will look like this:</p>

			<pre><code class="language-json">"foo-bar": {
	"title": "Foo Bar",
	"alias": "foo",
	"owner": "Your GitHub name"
}</code></pre>

			<p><em>Note:</em> <code>alias</code> can also be a string array if you need to register multiple aliases.</p>

			<p>Using <code>aliasTitles</code>, it's also possible to give aliases specific titles. This isn't necessary for our example, but the markup language shows where it is useful:</p>

			<pre><code class="language-json">"markup": {
	"title": "Markup",
	"alias": ["html", "xml", "svg", "mathml"],
	"aliasTitles": {
		"html": "HTML",
		"xml": "XML",
		"svg": "SVG",
		"mathml": "MathML"
	},
	"option": "default"
}</code></pre>
		</li>
		<li>
			<p>Add some tests.</p>

			<p>Create a folder <code>tests/languages/foo-bar/</code>. This is where your test files will live. The test format and how to run tests is described <a href="test-suite.html">here</a>.</p>

			<p>You should add a test for every major feature of your language. Test files should test the common case and certain edge cases (if any). Good examples are <a href="https://github.com/PrismJS/prism/tree/master/tests/languages/javascript">the tests of the JavaScript language</a>.</p>

			<p>You can use this template for new <code>.test</code> files:</p>

			<pre><code class="language-json">The code to test.

----------------------------------------------------

[ "JSON of the simplified token stream. We will add this later." ]

----------------------------------------------------

Brief description.</code></pre>

			<p>For every test file:</p>

			<ol class="indent">
				<li>
					<p>Add the code to test and a brief description.</p>
				</li>
				<li>
					<p>Verify that your language definition correctly highlights the test code. This can be done using the test page. <br>
					<em>Note:</em> Using the <em>Show tokens</em> option, you can see the token stream your language definition created.</p>
				</li>
				<li>
					<p>Once you have <strong>carefully checked</strong> that the test case is handled correctly (i.e. by using the test page), run the following command:</p>
					<code class="language-bash">npm run test:languages -- --language=foo-bar --pretty</code>
					<p>This command will check only your test files. The new test will fail because the specified JSON is incorrect, but the error message of the failed test will also include the JSON of the simplified token stream Prism created. This is what we're after. Replace the current incorrect JSON with the output labeled <em>Token Stream</em>. (Please also adjust the indentation. We use tabs.)</p>
				</li>
				<li>
					<p><strong>Carefully check</strong> that the token stream JSON you just inserted is what you expect.</p>
				</li>
				<li>Re-run <code class="language-bash">npm run test:languages -- --language=foo-bar --pretty</code> to verify that the test passes.</li>
			</ol>
		</li>
		<li>
			<p>Run <code class="language-bash">npm test</code> to check that <em>all</em> tests pass, not just your language tests.<br>
			This will usually pass without problems. If you can't get all the tests to pass, skip this step.</p>
		</li>
		<li>
			<p>Add an example page.</p>

			<p>Create a new file <code>examples/prism-foo-bar.html</code>. This will be the template containing the example markup. Just look at other examples to see how these files are structured. <br>
			We don't have any rules as to what counts as an example, so a single <em>Full example</em> section where you present the highlighting of the major features of the language is enough.</p>
		</li>
		<li>
			<p>Run <code class="language-bash">npx gulp</code> again.</p>
		</li>
	</ol>
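
	<p>Step 1 recommends a regular expression for language ids. As a quick check before settling on an id or alias, you could test candidates against it as sketched below. The helper <code>isValidLanguageId</code> is hypothetical and only wraps the expression quoted in step 1.</p>

```javascript
// The regular expression recommended in step 1 for language ids.
const LANGUAGE_ID = /^[a-z][a-z\d]*(?:-[a-z][a-z\d]*)*$/;

// Hypothetical helper for illustration only.
function isValidLanguageId(id) {
	return LANGUAGE_ID.test(id);
}

console.log(isValidLanguageId('foo-bar'));  // true
console.log(isValidLanguageId('c99'));      // true
console.log(isValidLanguageId('Foo Bar'));  // false (uppercase letters and a space)
console.log(isValidLanguageId('foo--bar')); // false (empty segment between hyphens)
```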
</section>


<section id="dependencies" class="language-none">
	<h1>Dependencies</h1>

	<p>Languages and plugins can depend on each other, so Prism has its own dependency system to declare and resolve dependencies.</p>

	<h2>Declaring dependencies</h2>

	<p>You declare a dependency by adding a property to the entry of your language or plugin in the <a href="https://github.com/PrismJS/prism/blob/master/components.json"><code>components.json</code></a> file. The name of the property is the dependency kind, and its value is the component id of the dependee. If your component depends on multiple languages or plugins, you can also declare an array of component ids.</p>

	<p>In the following example, we will use the <code>require</code> dependency kind to declare that a fictional language Foo depends on the JavaScript language and that another fictional language Bar depends on both JavaScript and CSS.</p>

	<pre class="language-json" data-line="8,12"><code>{
	"languages": {
		"javascript": { "title": "JavaScript" },
		"css": { "title": "CSS" },
		...,
		"foo": {
			"title": "Foo",
			"require": "javascript"
		},
		"bar": {
			"title": "Bar",
			"require": ["css", "javascript"]
		}
	}
}</code></pre>

	<h3>Dependency kinds</h3>

	<p>There are 3 kinds of dependencies:</p>

	<dl>
		<dt><code>require</code></dt>
		<dd>
			Prism will ensure that all dependees are loaded before the depender. <br>
			You are <strong>not</strong> allowed to modify the dependees unless they are also declared as <code>modify</code>.

			<p>This kind of dependency is most useful if you, for example, extend another language or use the dependee as an embedded language (e.g. like PHP is embedded in HTML).</p>
		</dd>
		<dt><code>optional</code></dt>
		<dd>
			Prism will ensure that an optional dependee is loaded before the depender if the dependee is loaded. Unlike <code>require</code> dependencies, which also guarantee that the dependees are loaded, <code>optional</code> dependencies only guarantee the order of loaded components. <br>
			You are <strong>not</strong> allowed to modify the dependees. If you need to modify the optional dependee, declare it as <code>modify</code> instead.

			<p>This kind of dependency is useful if you have embedded languages but you want to give the users a choice as to whether they want to include the embedded language. By using <code>optional</code> dependencies, users can better control the bundle size of Prism by only including the languages they need.<br>
			E.g. HTTP can highlight JSON and XML payloads but it doesn't force the user to include these languages.</p>
		</dd>
		<dt><code>modify</code></dt>
		<dd>
			This is an <code>optional</code> dependency which also declares that the depender might modify the dependees.

			<p>This kind of dependency is useful if your language modifies another language (e.g. by adding tokens).<br>
			E.g. CSS Extras adds new tokens to the CSS language.</p>
		</dd>
	</dl>

	<p>To summarize the properties of the different dependency kinds:</p>

	<table class="stylish">
		<tr>
			<th></th>
			<th>Non-optional</th>
			<th>Optional</th>
		</tr>
		<tr>
			<th>Read only</th>
			<td><code>require</code></td>
			<td><code>optional</code></td>
		</tr>
		<tr>
			<th>Modifiable</th>
			<td></td>
			<td><code>modify</code></td>
		</tr>
	</table>

	<style>
	table.stylish {
		border-collapse: collapse;
	}
	table.stylish, table.stylish tr, table.stylish td, table.stylish th {
		border: 1px solid #CCC;
	}
	table.stylish td, table.stylish th {
		padding: .5em .75em;
	}
	table.stylish th {
		background-color: #F8F8F8;
	}
	</style>

	<p>Note: You can declare a component as both <code>require</code> and <code>modify</code>.</p>

	<h2>Resolving dependencies</h2>

	<p>We consider the dependencies of components an implementation detail, so they may change from release to release. Prism will usually resolve dependencies for you automatically, so you won't have to worry about dependency loading if you <a href="download.html">download</a> a bundle or use the <code>loadLanguages</code> function in Node.js, the <a href="plugins/autoloader/">Autoloader</a>, or our Babel plugin.</p>

	<p>If you have to resolve dependencies yourself, use the <code>getLoader</code> function exported by <a href="https://github.com/PrismJS/prism/blob/master/dependencies.js"><code>dependencies.js</code></a>. Example:</p>

	<pre><code class="language-js">const getLoader = require('prismjs/dependencies');
const components = require('prismjs/components');

const componentsToLoad = ['markup', 'css', 'php'];
const loadedComponents = ['clike', 'javascript'];

const loader = getLoader(components, componentsToLoad, loadedComponents);
loader.load(id => {
	require(`prismjs/components/prism-${id}.min.js`);
});</code></pre>

	<p>For more details on the <code>getLoader</code> API, check out the <a href="https://github.com/PrismJS/prism/blob/master/dependencies.js">inline documentation</a>.</p>
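
	<p>To give an intuition for what resolving <code>require</code> dependencies involves, here is a minimal sketch of a depth-first ordering over entries shaped like the <code>components.json</code> example above. It is an illustration under simplifying assumptions (only <code>require</code> is handled, and nothing is loaded yet), not the algorithm used by <code>getLoader</code>. The function name <code>resolveLoadOrder</code> is hypothetical.</p>

```javascript
// Simplified sketch, not Prism's getLoader: orders component ids so that
// every `require` dependency comes before the component that needs it.
function resolveLoadOrder(languages, ids) {
	const order = [];
	const visiting = new Set();

	function visit(id) {
		if (order.includes(id)) {
			return; // already scheduled
		}
		if (visiting.has(id)) {
			throw new Error('Circular dependency at ' + id);
		}
		visiting.add(id);
		// `require` may be a single id or an array of ids.
		const required = [].concat(languages[id].require || []);
		required.forEach(visit);
		visiting.delete(id);
		order.push(id);
	}

	ids.forEach(visit);
	return order;
}

// Hypothetical entries, shaped like the components.json example above.
const languages = {
	javascript: { title: 'JavaScript' },
	css: { title: 'CSS' },
	foo: { title: 'Foo', require: 'javascript' },
	bar: { title: 'Bar', require: ['css', 'javascript'] }
};

console.log(resolveLoadOrder(languages, ['bar', 'foo']));
// [ 'css', 'javascript', 'bar', 'foo' ]
```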

</section>


<section id="writing-plugins">
	<h1>Writing plugins</h1>

	<p>Prism’s plugin architecture is fairly simple. To add a callback, you use <code class="language-javascript">Prism.hooks.add(hookname, callback)</code>.
	<code>hookname</code> is a string with the hook id, that uniquely identifies the hook your code should run at.
	<code>callback</code> is a function that accepts one parameter: an object with various variables that can be modified, since objects in JavaScript are passed by reference.
	For example, here’s a plugin from the Markup language definition that adds a tooltip to entity tokens which shows the actual character encoded:
	<pre><code class="language-javascript">Prism.hooks.add('wrap', function(env) {
	if (env.type === 'entity') {
		env.attributes['title'] = env.content.replace(/&amp;amp;/, '&amp;');
	}
});</code></pre>
	<p>Of course, to understand which hooks to use you would have to read Prism’s source. Imagine where you would add your code and then find the appropriate hook.
	If there is no hook you can use, you may <a href="https://github.com/PrismJS/prism/issues">request one to be added</a>, detailing why you need it there.
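
	<p>The mechanism itself can be pictured as a small registry of callbacks keyed by hook name. The following standalone sketch mirrors the <code>add</code>/<code>run</code> idea on a local <code>hooks</code> object. It is a simplified illustration rather than Prism's source, and the <code>important</code> class is made up for the example.</p>

```javascript
// Simplified sketch of a hook registry, not Prism's source code.
const hooks = {
	all: {},
	// Register a callback for the hook called `name`.
	add(name, callback) {
		(this.all[name] = this.all[name] || []).push(callback);
	},
	// Call every callback registered for `name` with the same `env` object.
	run(name, env) {
		(this.all[name] || []).forEach(function (callback) {
			callback(env);
		});
	}
};

hooks.add('wrap', function (env) {
	if (env.type === 'keyword') {
		env.classes.push('important'); // made-up class name
	}
});

// Callbacks mutate `env` in place, so the caller sees the change.
const env = { type: 'keyword', classes: ['token', 'keyword'] };
hooks.run('wrap', env);
console.log(env.classes); // [ 'token', 'keyword', 'important' ]
```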
</section>

<section id="api">
	<h1>API documentation</h1>

	<section id="highlight-all">
		<h1><code>Prism.highlightAll(async, callback)</code></h1>
		<p>This is the highest-level function in Prism’s API. It fetches all the elements that have a <code>.language-xxxx</code> class
		and then calls <code>Prism.highlightElement()</code> on each one of them.</p>

		<h2>Parameters</h2>
		<dl>
			<dt>async</dt>
			<dd>
				Whether to use Web Workers to improve performance and avoid blocking the UI when highlighting very large
				chunks of code. False by default
				(<a href="faq.html#why-is-asynchronous-highlighting-disabled-by-default">why?</a>).<br />
				Note: All language definitions required to highlight the code must be included in the main <code>prism.js</code>
				file for the async highlighting to work. You can build your own bundle on the <a href="download.html">Download page</a>.
			</dd>

			<dt>callback</dt>
			<dd>
				An optional callback to be invoked after the highlighting is done. Mostly useful when <code>async</code>
				is true, since in that case, the highlighting is done asynchronously.
			</dd>
		</dl>
	</section>

	<section id="highlight-all-under">
		<h1><code>Prism.highlightAllUnder(element, async, callback)</code></h1>
		<p>Fetches all the descendants of <code>element</code> that have a <code>.language-xxxx</code> class
			and then calls <code>Prism.highlightElement()</code> on each one of them.</p>

		<h2>Parameters</h2>
		<dl>
			<dt>element</dt>
			<dd>The root element, whose descendants that have a <code>.language-xxxx</code> class will be highlighted.</dd>

			<dt>async</dt>
			<dd>Same as in <a href="#highlight-all"><code>Prism.highlightAll()</code></a></dd>

			<dt>callback</dt>
			<dd>Same as in <a href="#highlight-all"><code>Prism.highlightAll()</code></a></dd>
		</dl>
	</section>

	<section id="highlight-element">
		<h1><code>Prism.highlightElement(element, async, callback)</code></h1>
		<p>Highlights the code inside a single element.</p>

		<h2>Parameters</h2>
		<dl>
			<dt>element</dt>
			<dd>The element containing the code. It must have a class of <code>language-xxxx</code> to be processed, where <code>xxxx</code> is a valid language identifier.</dd>

			<dt>async</dt>
			<dd>Same as in <a href="#highlight-all"><code>Prism.highlightAll()</code></a></dd>
			<dt>callback</dt>
			<dd>Same as in <a href="#highlight-all"><code>Prism.highlightAll()</code></a></dd>
		</dl>
	</section>

	<section id="highlight">
		<h1><code>Prism.highlight(text, grammar, language)</code></h1>
		<p>Low-level function, only use if you know what you’re doing.
		It accepts a string of text as input, the language definitions to use, and the name of the language, and returns a string with the HTML produced.</p>

		<h2>Parameters</h2>
		<dl>
			<dt>text</dt>
			<dd>A string with the code to be highlighted.</dd>
			<dt>grammar</dt>
			<dd>An object containing the tokens to use. Usually a language definition like <code>Prism.languages.markup</code></dd>
			<dt>language</dt>
			<dd>The name of the language definition passed to <code>grammar</code>. E.g. <code>markup</code> or <code>javascript</code></dd>
		</dl>

		<h2>Returns</h2>
		<p>The highlighted HTML</p>
	</section>

	<section id="tokenize">
		<h1><code>Prism.tokenize(text, grammar)</code></h1>
		<p>This is the heart of Prism, and the lowest-level function you can use. It accepts a string of text as input and the language definitions to use, and returns an array with the tokenized code.
		When the language definition includes nested tokens, the function is called recursively on each of these tokens. This method could be useful in other contexts as well, as a very crude parser.</p>

		<h2>Parameters</h2>
		<dl>
			<dt>text</dt>
			<dd>A string with the code to be highlighted.</dd>
			<dt>grammar</dt>
			<dd>An object containing the tokens to use. Usually a language definition like <code>Prism.languages.markup</code></dd>
		</dl>

		<h2>Returns</h2>
		<p>An array of strings, tokens (class <code>Prism.Token</code>) and other arrays.</p>
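
		<p>Because the result mixes strings, tokens, and nested arrays, code that consumes it is usually recursive. The sketch below walks such a structure and joins it back into plain text. The sample data is a hand-written stand-in (plain objects with <code>type</code> and <code>content</code> instead of real <code>Prism.Token</code> instances), and <code>tokensToText</code> is a hypothetical helper.</p>

```javascript
// Hypothetical stand-in for the output of Prism.tokenize(): plain objects
// with `type` and `content` replace real Prism.Token instances.
const tokens = [
	{ type: 'keyword', content: 'var' },
	' x ',
	{ type: 'operator', content: '=' },
	' ',
	{ type: 'string', content: ['"', 'hi', '"'] },
	{ type: 'punctuation', content: ';' }
];

// Recursively join strings, tokens, and nested arrays back into source text.
function tokensToText(stream) {
	if (typeof stream === 'string') {
		return stream;
	}
	if (Array.isArray(stream)) {
		return stream.map(tokensToText).join('');
	}
	return tokensToText(stream.content);
}

console.log(tokensToText(tokens)); // var x = "hi";
```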
	</section>
</section>

<footer data-src="templates/footer.html" data-type="text/html"></footer>

<script src="scripts/utopia.js"></script>
<script src="prism.js"></script>
<script src="plugins/autoloader/prism-autoloader.js"></script>
<script src="plugins/line-highlight/prism-line-highlight.js"></script>
<script src="components.js"></script>
<script src="scripts/code.js"></script>

</body>
</html>