Fork me on GitHub
Documentation: Index > Hacking > Writing A Formatter

Writing a Formatter

Luminous has two distinct stages in highlighting. The first consists of tokenizing the string (this is done by the scanners), and results in the string being represented in an intermediate format. The second stage occurs when a formatter is given the intermediate string and converts it to some output format.

The intermediate representation is a loose but well-formed XML structure, where the tags represent a way to embed the highlighting data.

Here's a brief example of a simple C program

123456
#include <stdio.h>
int main() {
  /* NOTE: something */
  float f = 1.0f;
  return 100 * f;
}

and its resulting XML:

123456
<PREPROCESSOR>#include &lt;<STRING>stdio.h</STRING>&gt;</PREPROCESSOR>
<TYPE>int</TYPE> main() {
  <COMMENT>/* <COMMENT_NOTE>NOTE:</COMMENT_NOTE> something */</COMMENT>
  <TYPE>float</TYPE> f <OPERATOR>=</OPERATOR> <NUMERIC>1.0f</NUMERIC>;
  <KEYWORD>return</KEYWORD> <NUMERIC>100</NUMERIC> <OPERATOR>*</OPERATOR> f;
}

A few things to notice:

  1. It's not quite valid XML because it lacks a root tag and it doesn't have a <?xml declaration. But apart from that, it's XML. Tags can nest and it should be well formed.
  2. The contents of the tags are HTML entity escaped, &, < and > become &amp;, &lt; and &gt; respectively.
  3. Some things which are not deemed important for highlighting aren't inside tags.
  4. We don't use (or need) attributes or self-closing tags.
Note: currently any multiline tokens are closed before the newline and re-opened after the newline. This is not yet configurable, it might be in future.

The Formatter Class

A formatter should subclass LuminousFormatter. It needs to implement the method:
format($str)
This method receives the XML string and should return the formatted string.

It should respect the following class properties if appropriate.

$wrap_length - int - word wrap at n characters, 0 or -1 means no wrap
$line_numbers - bool - whether or not to display line numbering
$link - bool - convert URLs to hyperlinks
$height - int or string - Constrain output to this height (applies mostly to HTML)
And if appropriate it should implement the following method:
set_theme($css)
This receives a CSS string representing the theme in the user's theme setting. The class LuminousCSSParser will parse this allowing you to translate colouring rules into whatever format is necessary (consult the LaTeX formatter to see this in action).

Using The Formatter

Insert an instance of the formatter as a setting, i.e.
123
<?php
$formatter = new MyFormatter();
Luminous::set('format', $formatter);

That formatter (actually a clone of it) will be used to format subsequent calls to highlight().