Over the past months, I have been helping to build a new documentation website for the YAGPDB project using Hugo. To provide more accurate syntax highlighting for our code snippets, which use a custom scripting language, we needed to switch away from Hugo’s built-in Chroma-powered highlighting. [1]
After evaluating various alternative syntax highlighting solutions, we ultimately settled on Shiki. Shiki is powered by TextMate grammars, which are also used by VSCode. In practice, this means that Shiki can take advantage of the massive existing ecosystem of VSCode themes and language extensions to provide accurate, highly customizable highlighting with little effort.
Since there is a lack of existing resources on using different syntax highlighters with Hugo, I wrote up a brief guide in this blog post.
Some important caveats
Shiki is highly extensible and looks nice, but using it in a Hugo website has significant drawbacks. Notably, Shiki is only available as a JavaScript library; this not only ties you to the dreaded npm ecosystem but also means that Shiki cannot be integrated smoothly into Hugo’s build process. Instead, Shiki must run in a separate build step that post-processes the HTML files output by Hugo. [2]
Concretely, instead of building with
$ hugo
one must run
$ hugo && node scripts/highlight.mjs
or similar, where scripts/highlight.mjs overwrites the output HTML files in public/. This additional build step is also rather sluggish relative to Hugo; in our experience, Shiki takes ~1 second to highlight 60 files.
Moreover, Hugo’s built-in development server,
$ hugo server
cannot be made to work with Shiki under this approach, so codeblocks will render differently in development and production. In some sense, this is actually beneficial, as skipping Shiki in development preserves quick iteration cycles. (It is still, of course, possible to preview the production appearance by building via hugo && node scripts/highlight.mjs and then serving the public/ directory.)
If these caveats are acceptable to you, read on. For us, our theme—Doks—already tied us to the npm ecosystem, and having accurate highlighting for our custom scripting language was well worth the drawbacks.
A step-by-step guide
Disable Chroma. First, disable Hugo’s built-in syntax highlighting in your website configuration (config.toml or similar):
[markup]
[markup.highlight]
codeFences = false
Set up npm project. Next, install Node.js on your machine, create a new npm project in your website root, and install the packages required by the build script with
$ npm init # prompts for project name among other details; any will do
$ npm install glob # for locating built HTML files to highlight
$ npm install htmlparser2 domutils dom-serializer # for editing HTML files
$ npm install shiki
Add build script. Now, copy the build script, which highlights the HTML files output by Hugo, to scripts/highlight.mjs. A longer explanation of how the build script works is available in the appendix for the curious.
You must change the glob pattern on line 30 to match your file structure: in our case, we had documentation under content/docs, so the correct pattern for the output HTML files was public/docs/**/index.html.
For convenience, you may wish to add an npm script to package.json that runs both Hugo and the build script:
{
  ...
  "scripts": {
    "build": "hugo && npm run highlight",
    "highlight": "node scripts/highlight.mjs"
  }
}
You may now build your website with npm run build and view the result by serving the public directory with your webserver of choice.
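If you do not have a webserver handy, a few lines of Node are enough for a local preview. The following is only a rough sketch and not part of our build script: the names CONTENT_TYPES, toFilePath, and createStaticServer are made up for illustration, the content-type table is deliberately incomplete, and port 8080 is an arbitrary choice. It is meant for local use only.

```javascript
import { createServer } from 'node:http';
import { readFile } from 'node:fs/promises';
import { extname, join, resolve, sep } from 'node:path';

// Small, deliberately incomplete content-type table (illustrative only).
const CONTENT_TYPES = {
  '.html': 'text/html; charset=utf-8',
  '.css': 'text/css; charset=utf-8',
  '.js': 'text/javascript; charset=utf-8',
  '.svg': 'image/svg+xml',
};

// Translate a request URL into a file path inside `root`.
// Returns null when the decoded path would escape `root`.
function toFilePath(root, requestUrl) {
  const { pathname } = new URL(requestUrl, 'http://localhost');
  const base = resolve(root);
  const file = resolve(base, '.' + decodeURIComponent(pathname));
  if (file !== base && !file.startsWith(base + sep)) return null;
  // Hugo writes pretty URLs as <dir>/index.html, so directory-like
  // requests are mapped to the index file inside that directory.
  if (pathname.endsWith('/') || extname(pathname) === '') {
    return join(file, 'index.html');
  }
  return file;
}

// Build (but do not start) an HTTP server that serves files from `root`.
function createStaticServer(root) {
  return createServer(async (req, res) => {
    try {
      const file = toFilePath(root, req.url);
      if (!file) throw new Error('path escapes root');
      const body = await readFile(file);
      const type = CONTENT_TYPES[extname(file)] ?? 'application/octet-stream';
      res.writeHead(200, { 'Content-Type': type });
      res.end(body);
    } catch {
      // Missing files, malformed URLs, and escapes all end up here.
      res.writeHead(404, { 'Content-Type': 'text/plain' });
      res.end('Not found');
    }
  });
}

// Usage: createStaticServer('public').listen(8080);
```

Any real webserver will do a better job; the point is only that the output in public/ is plain static HTML once the build script has run.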
With that, we are done! The finishing touch is to modify your deployment process to invoke the build script in addition to Hugo; see the GitHub workflow we use in the YAGPDB documentation repository for inspiration.
Enjoy your new Shiki-powered codeblocks :)
Appendix: build script explanation
When Hugo processes markdown files, it transforms codeblocks to HTML structured as follows:
<pre>
<code class="language-xxx">
...
</code>
</pre>
so, for each output HTML file, we should
- target all pre nodes,
- read the language from the class attribute of the inner code node,
- and use Shiki to transform the content of the code node, then replace the pre node with the highlighted pre node.
The function highlightHtmlContent, which is the bulk of the build script, accomplishes this.
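import { parseDocument } from 'htmlparser2';
import { findAll, findOne, replaceElement, textContent } from 'domutils';
import render from 'dom-serializer';

function highlightHtmlContent(highlighter, htmlContent, { lightTheme, darkTheme }) {
  const doc = parseDocument(htmlContent);
  for (const preNode of findAll((e) => e.name === 'pre', doc.children)) {
    // Only pre nodes that wrap a code node are codeblocks.
    const codeNode = findOne((e) => e.name === 'code', preNode.children);
    if (!codeNode) continue;

    // Hugo records the language as class="language-xxx"; fall back to plain text.
    const lang = codeNode.attribs['class']?.replace(/^language-/, '') ?? 'text';
    const code = textContent(codeNode);
    const highlighted = highlighter.codeToHtml(code, {
      lang,
      themes: { light: lightTheme, dark: darkTheme },
    });

    // Shiki returns a complete pre element, so swap it in for the original.
    const highlightedPreNode = parseDocument(highlighted).children[0];
    replaceElement(preNode, highlightedPreNode);
  }
  return render(doc);
}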
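One detail worth noting: the lang lookup above assumes that the class attribute holds exactly one language-xxx token and that the highlighter has that language loaded. If either assumption fails (extra classes on the code node, or a fence tagged with a language Shiki was not given), codeToHtml may throw and abort the build. A more defensive lookup could look like the sketch below. The helper name languageFromClass is hypothetical and not part of our script; the list of loaded languages is assumed to come from the highlighter, for example via Shiki's getLoadedLanguages().

```javascript
// Hypothetical helper: pick a language id from a class attribute value.
// `loadedLangs` is assumed to be the list of language ids the highlighter knows.
function languageFromClass(className, loadedLangs, fallback = 'text') {
  // A class attribute may be missing or hold several space-separated tokens.
  const tokens = (className ?? '').split(/\s+/).filter(Boolean);
  const token = tokens.find((t) => t.startsWith('language-'));
  const lang = token ? token.slice('language-'.length) : fallback;
  // Unknown languages degrade to plain text instead of failing the build.
  return loadedLangs.includes(lang) ? lang : fallback;
}
```

With this in place, the lookup in highlightHtmlContent would become languageFromClass(codeNode.attribs['class'], loadedLangs). Whether silently falling back is preferable to a loud failure is a judgment call; for a documentation site, a failing build may be the better way to catch a typo in a fence.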
The rest of the build script is driver code that invokes highlightHtmlContent on each output file and overwrites the contents.
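For a rough idea of what such driver code involves, here is a simplified sketch. It is not the actual script: it walks the directory tree by hand instead of using the glob package, and it takes the transformation as a parameter so that it has no dependency on Shiki. The names findIndexFiles and rewriteFiles are invented for this example.

```javascript
import { readdir, readFile, writeFile } from 'node:fs/promises';
import { join } from 'node:path';

// Recursively collect every index.html under `root`
// (a stand-in for a glob pattern such as public/docs/**/index.html).
async function findIndexFiles(root) {
  const found = [];
  for (const entry of await readdir(root, { withFileTypes: true })) {
    const full = join(root, entry.name);
    if (entry.isDirectory()) {
      found.push(...(await findIndexFiles(full)));
    } else if (entry.name === 'index.html') {
      found.push(full);
    }
  }
  return found.sort();
}

// Apply `transform` to each file's HTML and overwrite the file in place.
// `transform` may be sync or async; unchanged files are not rewritten.
async function rewriteFiles(root, transform) {
  const files = await findIndexFiles(root);
  await Promise.all(
    files.map(async (file) => {
      const before = await readFile(file, 'utf8');
      const after = await transform(before);
      if (after !== before) await writeFile(file, after);
    }),
  );
  return files;
}
```

In a real script, transform would wrap highlightHtmlContent with a highlighter created once up front (recent Shiki versions expose createHighlighter for this) and your chosen light and dark themes, along the lines of rewriteFiles('public/docs', (html) => highlightHtmlContent(highlighter, html, themes)).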
[1] Though Chroma is extensible with custom lexers, Hugo does not surface this extension point at the moment.
[2] In the future, Hugo may support running WASM scripts as part of the build, potentially ameliorating this issue as Shiki can then, in theory, be compiled to WASM and invoked. However, the separate build script is the best we can do for now.