
The Cache

Introduction

Highlighting source code is fairly expensive from a computational point of view. Luminous is probably one of the slower highlighters around, for two main reasons:
  1. It's doing a lot of work to try to get highlighting as correct as possible
  2. Apart from the regular expression library, most of the logic is self-implemented in PHP, which seems to be a fairly slow language for this kind of thing.
For these reasons Luminous includes a caching system, so that each highlight need only be calculated once. Once you've set it up, it should be largely invisible to you.

Since Luminous 0.6.3 the cache can be stored either on the filesystem or in a MySQL table (support for other RDBMSs will hopefully come later).

Enabling/Disabling Caching

Caching is enabled or disabled per highlight, using the third argument of the call to luminous::highlight:

$use_cache = FALSE;
luminous::highlight('c', 'printf("hi\n");', $use_cache);

The default value is TRUE.

With either SQL or the filesystem, if the cache is unusable (e.g. there's a permissions problem or the database queries fail), Luminous will raise a PHP warning (not an exception), and the highlight will be calculated as normal.

Filesystem vs SQL

For most use cases the filesystem is the obvious choice: it's simpler and it's faster.

The main reasons to use the SQL cache are:

  1. It's neater to have everything hidden in an SQL table instead of lying around the filesystem.
  2. It may be more secure on your setup, as the cache directory may require 777 permissions.
  3. You are handling vast numbers of highlights and have filesystem constraints on the number of small files you can have. For example, ext3 formatted with -T largefile4 expects an average file size on the partition of 4MB; if there are a lot of smaller files, you risk running out of inodes.
  4. You're simply storing a lot of highlights. A 24-hourly purge removes filesystem cache files based on their most recent cache hit (internally we use mtime), which involves iterating over every file in the directory and looking at its mtime. If you're storing thousands of highlights or more, this could cause a brief IO bottleneck. The SQL cache instead purges on every write, but we can probably expect the database to handle this a lot better when there are a lot of cached items.
In summary: if you don't see the benefit of the SQL cache then don't use it, but if you do, don't be put off by the fact it's a bit slower.

Enabling the File System cache

The filesystem cache is used by default, but Luminous might not be able to create the cache directory itself.

Luminous uses the directory luminous/cache, which you might have to create and assign writable permissions to (probably chmod 777, but your server might accept other values).
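
If you would rather prepare the directory from PHP than from a shell, something along these lines may help. It is only a sketch: ensure_cache_dir is a hypothetical helper (not part of Luminous), the path assumes a luminous directory next to the calling script, and the 0775 default is a guess at a less permissive mode. As noted above, your server may need 0777.

```php
<?php
// Hypothetical helper: make sure a cache directory exists and is writable.
// Returns true if the directory can be used, false otherwise.
function ensure_cache_dir($dir, $mode = 0775) {
  if (!is_dir($dir)) {
    // The third argument also creates any missing parent directories.
    if (!@mkdir($dir, $mode, true) && !is_dir($dir)) {
      return false;
    }
  }
  if (!is_writable($dir)) {
    // Try the requested permissions; this fails if we don't own the directory.
    @chmod($dir, $mode);
    clearstatcache(true, $dir);
  }
  return is_dir($dir) && is_writable($dir);
}

// Assumed location: a 'luminous' directory next to this script.
$cache_dir = __DIR__ . '/luminous/cache';
if (!ensure_cache_dir($cache_dir)) {
  trigger_error("Luminous cache directory is not usable: $cache_dir", E_USER_WARNING);
}
```

Running this once (for example from a deployment script) should be enough; it does not need to run on every request.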

Enabling the SQL cache

Using the SQL cache is a little more complex and some of the responsibility falls on the programmer. To enable the SQL cache a configuration setting 'sql_function' must be set, which must be a function that performs SQL queries. The reason that this is left to the programmer is that Luminous doesn't want to tie itself to one family of SQL functions (e.g. MySQL vs PostgreSQL), and doesn't want to have to worry about your passwords, database names and so on.

The function should take a string (the SQL query) and return:

  1. false if the query had some kind of error
  2. true if the query was successful but didn't return anything
  3. An array of results if the query was a SELECT (or similar). The results should be a numerically indexed set of rows, and each row should be an associative array keyed by field name
An example implementation is:

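function query($sql) {
  // NOTE: the mysql_* functions were removed in PHP 7; on newer versions
  // use mysqli or PDO and keep the same return values.
  $connection = mysql_connect('server', 'user', 'password');
  if (!$connection || !mysql_select_db('your_database', $connection)) return false;
  $r = mysql_query($sql, $connection);
  // Failed queries return false; successful queries without a result set return true.
  if (is_bool($r)) return $r;
  // Collect the result set as a numerically indexed array of associative rows.
  $results = array();
  while ($row = mysql_fetch_assoc($r)) {
    $results[] = $row;
  }
  return $results;
}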

luminous::set('sql_function', 'query');
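
The same return-value contract can be met with other database libraries. As a rough sketch, here is what a PDO-based callback might look like. The function name luminous_pdo_query, the optional second parameter (used to hand in an existing connection) and the DSN and credentials are all assumptions made for illustration; Luminous itself only ever passes the SQL string.

```php
<?php
// Hypothetical PDO-based callback following the contract described above:
// false on error, true for queries without results, otherwise an array of rows.
function luminous_pdo_query($sql, $pdo = null) {
  static $connection = null;
  if ($pdo !== null) {
    $connection = $pdo; // let the caller supply an existing connection
  }
  try {
    if ($connection === null) {
      // Placeholder DSN and credentials.
      $connection = new PDO('mysql:host=server;dbname=your_database', 'user', 'password');
    }
    // Make failures throw so they are all handled by the catch block below.
    $connection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $statement = $connection->query($sql);
    // Statements such as INSERT or DELETE produce no result columns.
    if ($statement->columnCount() === 0) {
      return true;
    }
    return $statement->fetchAll(PDO::FETCH_ASSOC);
  } catch (PDOException $e) {
    return false;
  }
}

// Only register the callback when Luminous is actually loaded.
if (class_exists('luminous')) {
  luminous::set('sql_function', 'luminous_pdo_query');
}
```

Keeping the connection in a static variable avoids reconnecting for every query, which matters because the cache may issue more than one query per highlight.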

If you are using a framework, you should be able to plug this into your framework's database library without too much effort. Here's an example for CodeIgniter, implemented as a helper [codeigniter.com] for simplicity, but you could also plug it into a model or controller:

function luminous_sql($sql) {
  $CI =& get_instance();
  $CI->load->database();
  $q = $CI->db->query($sql);
  if (is_bool($q)) return $q;
  $ret = array();
  if ($q->num_rows()) {
    foreach($q->result_array() as $row)
      $ret[] = $row;
  }
  return $ret;
}

luminous::set('sql_function', 'luminous_sql');

Note that in CodeIgniter the database library will throw a fatal error if debug mode is enabled and the query fails. To avoid this, set $db['default']['db_debug'] to FALSE in application/config/database.php.

Security

You might be concerned that Luminous doesn't ask for an SQL-escaping function specific to your RDBMS. This isn't actually a problem. Luminous has strict control over the data being used in queries and doesn't need to escape it: string data is encoded as either b16 or b64 (which have only 'clean' characters). This is double-checked when building each query, and if somehow the data has been polluted the query is silently aborted.
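
To illustrate why encoded data needs no escaping, here is a small sketch of the kind of check described above. The helper names is_clean_b16 and is_clean_b64 are hypothetical and this is not Luminous's actual code; it only shows that base16 and base64 output can be validated against a small, quote-free alphabet.

```php
<?php
// Hypothetical guards: accept a value only if it consists purely of
// base16 (hex) or base64 characters. The D modifier stops '$' from
// matching before a trailing newline.
function is_clean_b16($s) {
  return is_string($s) && preg_match('/^[0-9a-fA-F]*$/D', $s) === 1;
}

function is_clean_b64($s) {
  return is_string($s) && preg_match('/^[A-Za-z0-9+\/]*={0,2}$/D', $s) === 1;
}

// Source code can contain quotes and anything else that would upset a query...
$source = "printf(\"it's\"); -- '; DROP TABLE cache;";
// ...but its base64 form only uses letters, digits, '+', '/' and '='.
$encoded = base64_encode($source);

var_dump(is_clean_b64($encoded));     // bool(true)
var_dump(is_clean_b64($source));      // bool(false)
var_dump(is_clean_b16(md5($source))); // bool(true)
```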

Cache FAQ

How is the cache ID calculated?

The ID used for a cache element is a checksum calculated from a number of things, including the input source code, the language, and other less obvious properties such as the version number of Luminous, and the various options which you set at runtime.

As a result, if you use a non-versioned copy of Luminous (i.e. if you just keep pulling the master branch), your highlights may not be recalculated, and you may not see the benefit of bug fixes or improvements unless you clear your cache when you update. It is easier to use versioned copies of Luminous.
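
The idea can be sketched roughly as follows. This is an illustration only: cache_id is a hypothetical function, the choice of md5 over a serialized array is an assumption, and the version strings and option names are placeholders. Luminous's real inputs and hash may differ.

```php
<?php
// Hypothetical sketch: derive a cache ID from everything that affects the output.
function cache_id($source, $language, $version, array $options) {
  ksort($options); // the order in which options were set should not change the ID
  return md5(serialize(array($source, $language, $version, $options)));
}

// Placeholder option names and version numbers.
$options = array('example-option-a' => true, 'example-option-b' => 'value');
$old = cache_id('printf("hi\n");', 'c', '0.6.6', $options);
$new = cache_id('printf("hi\n");', 'c', '0.6.7', $options);

// A changed version number produces a different ID, so stale entries are
// never looked up again. If the version string never changes, neither does the ID.
var_dump($old === $new); // bool(false)
```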

How often is the cache purged?

Elements in the cache are purged after a certain time of inactivity. The exact time is the 'cache-age' setting, luminous::set('cache-age', $age), which is given in seconds, and defaults to 90 days.

The timeout is measured from the last time the cache element was accessed. If using the filesystem, the last access is stored in the file's mtime.
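
For example, luminous::set('cache-age', 30 * 24 * 60 * 60) would shorten the timeout to 30 days. The mtime-based expiry itself can be pictured with the sketch below. purge_stale_files is a hypothetical function written for illustration, not the code Luminous runs.

```php
<?php
// Hypothetical sketch of an mtime-based purge.
// Deletes regular files in $dir whose mtime is more than $max_age seconds
// before $now, and returns the number of files removed.
function purge_stale_files($dir, $max_age, $now = null) {
  if ($now === null) $now = time();
  // Collect the stale paths first, then delete, so the directory is not
  // modified while it is being iterated.
  $stale = array();
  foreach (new DirectoryIterator($dir) as $entry) {
    // isFile() skips '.', '..' and subdirectories.
    if ($entry->isFile() && $now - $entry->getMTime() > $max_age) {
      $stale[] = $entry->getPathname();
    }
  }
  $removed = 0;
  foreach ($stale as $path) {
    if (@unlink($path)) $removed++;
  }
  return $removed;
}
```

Under this scheme a cache hit only needs to touch() the file to reset its age, which is why every file has to be inspected during a purge.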

How can I clear the cache manually?

For an SQL cache, just empty the table.

For the filesystem, remove everything in the luminous/cache/ directory.

Troubleshooting

If the cache fails for some reason, a PHP warning like the following is generated:

Notice:  Luminous cache errors were encountered.
See luminous::cache_errors() for details. in /home/mark/projects/luminous/src/luminous.php on line 631

This warning will be printed to your page (you can disable it by calling luminous::set('verbose', false), but you should only do this if you have some other error logging set up - keep reading).

Since 0.6.6 luminous::cache_errors() will contain more information:

$highlighted = luminous::highlight($lang, $code, true);
if ($e = luminous::cache_errors()) {
  echo '<pre>';
  echo implode("<br/>", $e);
  echo '</pre>';
}

NOTE: luminous::cache_errors() returns either an array or, if the cache is disabled, FALSE. It will only keep information for the most recent highlight.

The code above will print something similar to this:

Error writing to "/home/mark/projects/luminous/cache/7a7b9073efe10b64c322de36db0f0a09"
File exists: false
Readable?: false
Writable?: false
Your cache dir ("/home/mark/projects/luminous/cache/") is not writable!

What does this mean? It shows that Luminous cannot write into a file in the cache directory. The file doesn't exist (and it's not readable or writable), but in this case the last line is more insightful, which shows the cache directory is not writable. The solution is to give the cache directory write permissions.

If you don't want all this information being spammed out to your visitors, call luminous::set('verbose', false) to disable the warning, and set up your logging or email handler to create an entry when luminous::cache_errors() returns problems.
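
A minimal sketch of that setup might look like this. log_cache_errors is a hypothetical helper, and writing through PHP's error_log() is just one possible choice; you could equally pass a callback that sends an email or writes to your framework's logger.

```php
<?php
// Hypothetical helper: turn the value returned by luminous::cache_errors()
// (an array, or FALSE when the cache is disabled) into a single log entry.
// Returns true if something was logged.
function log_cache_errors($errors, $logger = 'error_log') {
  if (!is_array($errors) || count($errors) === 0) {
    return false; // nothing to report
  }
  call_user_func($logger, 'Luminous cache errors: ' . implode(' | ', $errors));
  return true;
}

// Only runs when Luminous is actually loaded.
if (class_exists('luminous')) {
  luminous::set('verbose', false); // suppress the on-page warning
  $highlighted = luminous::highlight('c', 'printf("hi\n");', true);
  // cache_errors() only describes the most recent highlight, so check it
  // straight after each call.
  log_cache_errors(luminous::cache_errors());
}
```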