Pygments extensions

Click Extra plugs into Pygments to render ANSI codes in various kinds of terminal output.

Important

For these helpers to work, you need to install click_extra’s additional dependencies from the pygments extra group:

$ pip install click_extra[pygments]

Integration

As soon as click-extra is installed, all its additional components are automatically registered with Pygments.

Here is a quick way to check the new plugins are visible to Pygments’ regular API:

  • Formatter:

    >>> from pygments.formatters import get_formatter_by_name
    >>> get_formatter_by_name("ansi-html")
    <click_extra.pygments.AnsiHtmlFormatter object at 0x1011ff1d0>
    
  • Filter:

    >>> from pygments.filters import get_filter_by_name
    >>> get_filter_by_name("ansi-filter")
    <click_extra.pygments.AnsiFilter object at 0x103aaa790>
    
  • Lexers:

    >>> from pygments.lexers import get_lexer_by_name
    >>> get_lexer_by_name("ansi-shell-session")
    <pygments.lexers.AnsiBashSessionLexer>
    

Tip

If click-extra is installed but you don’t see these new components, you are probably running the snippets above in the wrong Python interpreter.

For instance, you may be running them in a virtual environment. In that case, make sure the virtual environment is activated, and you can import click_extra from it.
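
To narrow down an interpreter mix-up like the one described above, it can help to ask the running interpreter which Pygments plugins it actually sees. The sketch below assumes Click Extra registers its components through the standard pygments.* entry-point groups; that is Pygments' usual plugin mechanism, but it is not stated on this page. list_pygments_plugins is a hypothetical helper name, and the names it reports are whatever each package declared, which may differ from the aliases used above.

```python
import sys
from importlib.metadata import entry_points

# Entry-point groups that Pygments scans for third-party plugins.
PYGMENTS_GROUPS = (
    "pygments.lexers",
    "pygments.formatters",
    "pygments.filters",
    "pygments.styles",
)


def list_pygments_plugins():
    """Return {group: sorted entry-point names} for the current interpreter."""
    eps = entry_points()
    plugins = {}
    for group in PYGMENTS_GROUPS:
        if hasattr(eps, "select"):
            # Python 3.10+ exposes a selectable collection.
            selected = eps.select(group=group)
        else:
            # Python 3.8 and 3.9 return a plain dict keyed by group.
            selected = eps.get(group, ())
        plugins[group] = sorted(ep.name for ep in selected)
    return plugins


if __name__ == "__main__":
    # The interpreter path tells you which environment is being inspected.
    print("Interpreter:", sys.executable)
    for group, names in list_pygments_plugins().items():
        print(f"{group}: {', '.join(names) or '(none)'}")
```

If the interpreter path printed first is not the one from your virtual environment, the empty groups are explained.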

ANSI HTML formatter

The new ansi-html formatter interprets ANSI Pygments tokens and renders them into HTML. It is also responsible for producing the corresponding CSS style to color the HTML elements.

Warning

This ansi-html formatter is designed to only work with the ansi-color lexer. These two components are the only ones capable of producing ANSI tokens (ansi-color) and rendering them in HTML (ansi-html).

ansi-color is implemented by pygments_ansi_color.AnsiColorLexer, on which Click Extra depends. So once Click Extra is installed, ansi-color is available to Pygments:

>>> from pygments.lexers import get_lexer_by_name
>>> get_lexer_by_name("ansi-color")
<pygments.lexers.AnsiColorLexer>

Formatter usage

To test it, let’s generate a cowsay.ans file that is full of ANSI colors:

$ fortune | cowsay | lolcat --force > ./cowsay.ans
$ cat ./cowsay.ans
 ________________________________ 
/ Reality is for people who lack \
\ imagination.                   /
 -------------------------------- 
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

We can run our formatter on that file:

from pathlib import Path

from pygments import highlight
from pygments.lexers import get_lexer_by_name
from pygments.formatters import get_formatter_by_name

lexer = get_lexer_by_name("ansi-color")
formatter = get_formatter_by_name("ansi-html")

ansi_content = Path("./cowsay.ans").read_text()

print(highlight(ansi_content, lexer, formatter))

Hint

The ansi-color lexer parses raw ANSI codes and transforms them into custom Pygments tokens, for the formatter to render.

Pygments’ highlight() is the utility function tying the lexer and formatter together to generate the final output.

The code above prints the following HTML:

<div class="highlight">
 <pre>
      <span></span>
      <span class="-Color -Color-C154 -C-C154"> __</span>
      <span class="-Color -Color-C148 -C-C148">_</span>
      <span class="-Color -Color-C184 -C-C184">___________</span>
      <span class="-Color -Color-C178 -C-C178">_</span>
      <span class="-Color -Color-C214 -C-C214">_________</span>
      <span class="-Color -Color-C208 -C-C208">________ </span>
      <span class="-Color -Color-C148 -C-C148">/</span>
      <span class="-Color -Color-C184 -C-C184"> Reality is</span>
      <span class="-Color -Color-C178 -C-C178"> </span>
      <span class="-Color -Color-C214 -C-C214">for people</span>
      <span class="-Color -Color-C208 -C-C208"> who lack</span></pre>
</div>

And here is how to obtain the corresponding CSS style:

print(formatter.get_style_defs(".highlight"))

pre { line-height: 125%; }
.highlight .hll { background-color: #ffffcc }
.highlight { background: #f8f8f8; }
.highlight .c { color: #3D7B7B; font-style: italic } /* Comment */
.highlight .err { border: 1px solid #FF0000 } /* Error */
.highlight .o { color: #666666 } /* Operator */
.highlight .-C-BGBlack { background-color: #000000 } /* C.BGBlack */
.highlight .-C-BGBlue { background-color: #3465a4 } /* C.BGBlue */
.highlight .-C-BGBrightBlack { background-color: #676767 } /* C.BGBrightBlack */
.highlight .-C-BGBrightBlue { background-color: #6871ff } /* C.BGBrightBlue */
.highlight .-C-BGC0 { background-color: #000000 } /* C.BGC0 */
.highlight .-C-BGC100 { background-color: #878700 } /* C.BGC100 */
.highlight .-C-BGC101 { background-color: #87875f } /* C.BGC101 */
/* … */
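
The HTML fragment and the CSS rules shown above can be stitched into one self-contained page. The sketch below is one way to do it. render_page is a hypothetical helper, and it is demonstrated with Pygments' built-in TextLexer and HtmlFormatter so that it runs without Click Extra. With Click Extra installed, passing the ansi-color lexer and ansi-html formatter obtained earlier should work the same way.

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import TextLexer


def render_page(content, lexer, formatter, title="ANSI output"):
    """Combine highlighted HTML and its CSS into a standalone page."""
    # highlight() returns the <div class="highlight"> fragment.
    body = highlight(content, lexer, formatter)
    # Scope the generated rules to that same wrapper class.
    css = formatter.get_style_defs(".highlight")
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{title}</title>\n'
        f"<style>\n{css}\n</style></head>\n"
        f"<body>\n{body}</body></html>\n"
    )


# Stand-in components: swap in get_lexer_by_name("ansi-color") and
# get_formatter_by_name("ansi-html") when Click Extra is installed.
page = render_page("plain text\n", TextLexer(), HtmlFormatter())
print(page)
```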

Caution

The ansi-color lexer/ansi-html formatter combo can only render pure ANSI content. It cannot interpret the regular Pygments tokens produced by the usual language lexers.

That’s why we also maintain a collection of ANSI-capable lexers for numerous languages, as detailed below.

ANSI filter

Todo

Write example and tutorial.

ANSI language lexers

Some of the languages supported by Pygments mix command lines or code with generic output.

For example, the console lexer can be used to highlight shell sessions. The general structure of the shell session will be highlighted by the console lexer, including the leading prompt. But the ANSI codes in the output will not be interpreted by console and will be rendered as plain text.

To fix that, Click Extra implements ANSI-capable lexers. These can parse both the language syntax and the ANSI codes in the output. So you can use the ansi-console lexer instead of console, and this variant, prefixed with ansi-, will highlight shell sessions with ANSI codes.
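
The limitation of the stock console lexer described above can be observed directly in its token stream. This is a minimal sketch that only relies on Pygments itself. SESSION and output_tokens are names made up for the illustration. Going by the description above, passing "ansi-console" instead (with Click Extra installed) should yield color tokens in place of the raw escape codes.

```python
from pygments.lexers import get_lexer_by_name
from pygments.token import Generic

# A tiny shell session whose output line is wrapped in green (SGR code 32).
SESSION = "$ ls\n\x1b[32mREADME.md\x1b[0m\n"


def output_tokens(lexer_name, session=SESSION):
    """Return the (token type, text) pairs produced by the named lexer."""
    lexer = get_lexer_by_name(lexer_name)
    return list(lexer.get_tokens(session))


tokens = output_tokens("console")
for token_type, text in tokens:
    print(token_type, repr(text))

# Collect the text of every Generic.Output token: the escape codes are
# still in there, untouched.
raw_output = [text for token_type, text in tokens if token_type is Generic.Output]
```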

Lexer variants

Here is the list of new ANSI-capable lexers and the original lexers they are based on:

Original Lexer          Original IDs                                ANSI variants
----------------------  ------------------------------------------  ---------------------------------------------------------
BashSessionLexer        console, shell-session                      ansi-console, ansi-shell-session
DylanConsoleLexer       dylan-console, dylan-repl                   ansi-dylan-console, ansi-dylan-repl
ElixirConsoleLexer      iex                                         ansi-iex
ErlangShellLexer        erl                                         ansi-erl
GAPConsoleLexer         gap-console, gap-repl                       ansi-gap-console, ansi-gap-repl
JuliaConsoleLexer       jlcon, julia-repl                           ansi-jlcon, ansi-julia-repl
MSDOSSessionLexer       doscon                                      ansi-doscon
MatlabSessionLexer      matlabsession                               ansi-matlabsession
OutputLexer             output                                      ansi-output
PostgresConsoleLexer    postgres-console, postgresql-console, psql  ansi-postgres-console, ansi-postgresql-console, ansi-psql
PowerShellSessionLexer  ps1con, pwsh-session                        ansi-ps1con, ansi-pwsh-session
PsyshConsoleLexer       psysh                                       ansi-psysh
PythonConsoleLexer      pycon, python-console                       ansi-pycon, ansi-python-console
RConsoleLexer           rconsole, rout                              ansi-rconsole, ansi-rout
RubyConsoleLexer        irb, rbcon                                  ansi-irb, ansi-rbcon
SqliteConsoleLexer      sqlite3                                     ansi-sqlite3
TcshSessionLexer        tcshcon                                     ansi-tcshcon

Lexers usage

Let’s test one of these lexers. We are familiar with Python so we’ll focus on the pycon Python console lexer.

First, we will generate some random art in an interactive Python shell:

>>> import itertools
>>> colors = [f"\033[3{i}m{{}}\033[0m" for i in range(1, 7)]
>>> rainbow = itertools.cycle(colors)
>>> letters = [next(rainbow).format(c) for c in "║▌█║ ANSI Art ▌│║▌"]
>>> art = "".join(letters)
>>> art
'\x1b[35m║\x1b[0m\x1b[36m▌\x1b[0m\x1b[31m█\x1b[0m\x1b[32m║\x1b[0m\x1b[33m \x1b[0m\x1b[34mA\x1b[0m\x1b[35mN\x1b[0m\x1b[36mS\x1b[0m\x1b[31mI\x1b[0m\x1b[32m \x1b[0m\x1b[33mA\x1b[0m\x1b[34mr\x1b[0m\x1b[35mt\x1b[0m\x1b[36m \x1b[0m\x1b[31m▌\x1b[0m\x1b[32m│\x1b[0m\x1b[33m║\x1b[0m\x1b[34m▌\x1b[0m'

The code block above is a typical Python console session. It contains the interactive prompt (>>>), pure Python code, and the output of these invocations. It is rendered here with Pygments’ original pycon lexer.

You can see that the raw Python string art contains ANSI escape sequences (\x1b[XXm). When we print this string and give the result to Pygments, the ANSI codes are not interpreted and the output is rendered as-is:

>>> print(art)
║▌█║ ANSI Art ▌│║▌

If you run the snippet above in your own Python console, you will see that the result of print(art) is colored.

That’s why you need Click Extra’s lexers. If we switch to the new ansi-pycon lexer, the output is colored, replicating exactly what you would expect in your console:

>>> print(art)
║▌█║ ANSI Art ▌│║▌

See also

All these new lexers can be used in Sphinx out of the box, with a bit of configuration.

Lexer design

We can check how pygments_ansi_color’s ansi-color lexer transforms a raw string into ANSI tokens:

>>> from pygments.lexers import get_lexer_by_name
>>> ansi_lexer = get_lexer_by_name("ansi-color")
>>> tokens = ansi_lexer.get_tokens(art)
>>> tuple(tokens)
((Token.Color.Magenta, '║'), (Token.Text, ''), (Token.Color.Cyan, '▌'), (Token.Text, ''), (Token.Color.Red, '█'), (Token.Text, ''), (Token.Color.Green, '║'), (Token.Text, ''), (Token.Color.Yellow, ' '), (Token.Text, ''), (Token.Color.Blue, 'A'), (Token.Text, ''), (Token.Color.Magenta, 'N'), (Token.Text, ''), (Token.Color.Cyan, 'S'), (Token.Text, ''), (Token.Color.Red, 'I'), (Token.Text, ''), (Token.Color.Green, ' '), (Token.Text, ''), (Token.Color.Yellow, 'A'), (Token.Text, ''), (Token.Color.Blue, 'r'), (Token.Text, ''), (Token.Color.Magenta, 't'), (Token.Text, ''), (Token.Color.Cyan, ' '), (Token.Text, ''), (Token.Color.Red, '▌'), (Token.Text, ''), (Token.Color.Green, '│'), (Token.Text, ''), (Token.Color.Yellow, '║'), (Token.Text, ''), (Token.Color.Blue, '▌'), (Token.Text, '\n'))

See how the raw string is split into Pygments tokens, including the new Token.Color tokens. These tokens are then ready to be rendered by our own ansi-html formatter.
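
To make the splitting above more concrete, here is a deliberately simplified, standard-library-only imitation of the idea. It is not pygments_ansi_color's implementation: split_ansi is a hypothetical function, it only understands the eight basic foreground codes (30 to 37) and the reset code, and it skips the empty text chunks that the real lexer emits between sequences.

```python
import re

# Matches SGR ("Select Graphic Rendition") sequences such as "\x1b[35m".
SGR_PATTERN = re.compile(r"\x1b\[([0-9;]*)m")

# Standard foreground color codes 30-37.
FOREGROUND = {
    30: "Black", 31: "Red", 32: "Green", 33: "Yellow",
    34: "Blue", 35: "Magenta", 36: "Cyan", 37: "White",
}


def split_ansi(text):
    """Split text into (color name or None, chunk) pairs."""
    pairs = []
    color = None
    position = 0
    for match in SGR_PATTERN.finditer(text):
        # Text found before this escape sequence keeps the current color.
        chunk = text[position:match.start()]
        if chunk:
            pairs.append((color, chunk))
        for code in match.group(1).split(";"):
            number = int(code or 0)  # an empty parameter means reset
            if number == 0:
                color = None
            elif number in FOREGROUND:
                color = FOREGROUND[number]
        position = match.end()
    # Whatever follows the last escape sequence.
    if text[position:]:
        pairs.append((color, text[position:]))
    return pairs


print(split_ansi("\x1b[35m║\x1b[0m\x1b[36m▌\x1b[0m plain"))
```

The pairs play the role of the (Token.Color.Magenta, '║') tuples above, with None standing in for uncolored text.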

pygmentize command line

Because they’re properly registered with Pygments, all these new components can be invoked with the pygmentize CLI.

For example, here is how we can render the cowsay.ans file from the example above into a standalone HTML file:

$ pygmentize -f ansi-html -O full -o cowsay.html ./cowsay.ans
$ cat cowsay.html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<!--
generated by Pygments <https://pygments.org/>
Copyright 2006-2023 by the Pygments team.
Licensed under the BSD license, see LICENSE for details.
-->
<html>
 <head>
  <title>
  </title>
  <meta content="text/html; charset=utf-8" http-equiv="content-type"/>
  <style type="text/css">
   /*
         generated by Pygments <https://pygments.org/>
         Copyright 2006-2023 by the Pygments team.
         Licensed under the BSD license, see LICENSE for details.
         */
         pre { line-height: 125%; }
         body { background: #f8f8f8; }
         body .c { color: #3D7B7B; font-style: italic } /* Comment */
         body .err { border: 1px solid #FF0000 } /* Error */
         body .o { color: #666666 } /* Operator */
         body .-Color-BGBlack { background-color: #000000 } /* Color.BGBlack */
         body .-Color-BGBlue { background-color: #3465a4 } /* Color.BGBlue */
         body .-Color-BGBrightBlack { background-color: #676767 } /* Color.BGBrightBlack */
         body .-Color-BGBrightBlue { background-color: #6871ff } /* Color.BGBrightBlue */
         body .-Color-BGCyan { background-color: #34e2e2 } /* Color.BGCyan */
         body .-Color-BGGreen { background-color: #8ae234 } /* Color.BGGreen */
         /* … */
  </style>
 </head>
 <body>
  <h2>
  </h2>
  <div class="highlight">
   <pre>
            <span></span>
            <span class="-Color -Color-C154"> __</span>
            <span class="-Color -Color-C148">_</span>
            <span class="-Color -Color-C184">___________</span>
            <span class="-Color -Color-C178">_</span>
            <span class="-Color -Color-C214">_________</span>
            <span class="-Color -Color-C208">________ </span>
            <span class="-Color -Color-C148">/</span>
            <span class="-Color -Color-C184"> Reality is</span>
            <span class="-Color -Color-C178"> </span>
            <span class="-Color -Color-C214">for people</span>
            <span class="-Color -Color-C208"> who lack</span></pre>
  </div>
 </body>
</html>

click_extra.pygments API

classDiagram
  ExtendedColorHtmlFormatterMixin <|-- AnsiHtmlFormatter
  Filter <|-- AnsiFilter
  HtmlFormatter <|-- AnsiHtmlFormatter
  Lexer <|-- AnsiLexerFiltersMixin
  LexerMeta <|-- AnsiSessionLexer

Helpers and utilities to allow Pygments to parse and render ANSI codes.

click_extra.pygments.DEFAULT_TOKEN_TYPE = ('Generic', 'Output')

Default Pygments’ token type to render with ANSI support.

We default to Generic.Output tokens, as this is the token type used by all REPL-like and terminal lexers.

class click_extra.pygments.AnsiFilter(**options)

Bases: Filter

Custom filter transforming a particular kind of token (Generic.Output by default) into ANSI tokens.

Initialize an AnsiColorLexer and configure the token_type to be colorized.

Todo

Allow multiple token_type values to be configured for colorization (if conventions change on Pygments’ side).

filter(lexer, stream)

Transform each token of token_type type into a stream of ANSI tokens.

Return type:

Iterator[tuple[_TokenType, str]]
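
The general shape of such a filter can be sketched with Pygments' public Filter base class. This is an illustration of the pattern described above, not Click Extra's actual code. DelegatingFilter and its sub_lexer option are hypothetical names, and the built-in TextLexer stands in for AnsiColorLexer so the example runs without extra dependencies.

```python
from pygments.filter import Filter
from pygments.lexers import TextLexer
from pygments.token import Generic, Token


class DelegatingFilter(Filter):
    """Hypothetical filter re-lexing one token type with another lexer."""

    def __init__(self, **options):
        super().__init__(**options)
        # Which tokens to intercept, and which lexer re-tokenizes them.
        self.token_type = options.get("token_type", Generic.Output)
        self.sub_lexer = options.get("sub_lexer") or TextLexer()

    def filter(self, lexer, stream):
        for token_type, value in stream:
            if token_type is self.token_type:
                # Replace the single token with the sub-lexer's tokens.
                yield from self.sub_lexer.get_tokens(value)
            else:
                # Every other token passes through unchanged.
                yield token_type, value


stream = [(Generic.Prompt, ">>> "), (Generic.Output, "hello\n")]
result = list(DelegatingFilter().filter(None, stream))
print(result)
```

With the stand-in lexer, the Generic.Output token simply comes back as a Token.Text token. An ANSI-aware sub-lexer would return color tokens at that point instead.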

class click_extra.pygments.AnsiSessionLexer(name, bases, dct)

Bases: LexerMeta

Custom metaclass used as a class factory to derive an ANSI variant of default shell session lexers.

Set up class properties’ defaults for new ANSI-capable lexers.

  • Adds an ANSI prefix to the lexer’s name.

  • Replaces all alias IDs from the parent lexer with variants prefixed with ansi-.
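
The two rules above can be sketched with a plain metaclass. The example is dependency-free and only illustrates the class-factory idea. AnsiVariantMeta and FakeConsoleLexer are hypothetical stand-ins for the real metaclass and for a Pygments session lexer, and the exact format of the prefixed name is an assumption.

```python
class AnsiVariantMeta(type):
    """Hypothetical stand-in for the metaclass described above."""

    def __new__(mcs, name, bases, dct):
        parent = bases[0]
        # Prefix the human-readable name (exact format is an assumption).
        dct.setdefault("name", f"ANSI {parent.name}")
        # Derive every alias from the parent's, prefixed with "ansi-".
        dct.setdefault(
            "aliases", tuple(f"ansi-{alias}" for alias in parent.aliases)
        )
        return super().__new__(mcs, name, bases, dct)


class FakeConsoleLexer:
    """Mimics the two attributes of a lexer that the sketch relies on."""

    name = "Fake Console"
    aliases = ("console", "shell-session")


# Use the metaclass as a class factory, as its (name, bases, dct) signature suggests.
AnsiFakeConsoleLexer = AnsiVariantMeta(
    "AnsiFakeConsoleLexer", (FakeConsoleLexer,), {}
)
print(AnsiFakeConsoleLexer.name, AnsiFakeConsoleLexer.aliases)
```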

class click_extra.pygments.AnsiLexerFiltersMixin(*args, **kwargs)

Bases: Lexer

Adds a TokenMergeFilter and AnsiOutputFilter to the list of filters.

The session lexers we inherit from parse the code block line by line so they can differentiate inputs and outputs. Each output line ends up encapsulated into a Generic.Output token. We apply the TokenMergeFilter filter to reduce noise and merge contiguous output lines into a single token.

Then we apply our custom AnsiOutputFilter to transform any Generic.Output monoblocks into ANSI tokens.
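
The merging step can be pictured with a few lines of standard-library Python. This sketch only mirrors the idea of collapsing contiguous tokens of the same type. It does not reproduce TokenMergeFilter's implementation, merge_contiguous is a hypothetical name, and token types are plain strings to keep the example dependency-free.

```python
from itertools import groupby
from operator import itemgetter


def merge_contiguous(tokens):
    """Merge adjacent (token type, text) pairs sharing the same type."""
    for token_type, group in groupby(tokens, key=itemgetter(0)):
        yield token_type, "".join(text for _, text in group)


# What a line-by-line session lexer might emit for two lines of output.
line_by_line = [
    ("Generic.Prompt", ">>> "),
    ("Text", "print(art)\n"),
    ("Generic.Output", "first line\n"),
    ("Generic.Output", "second line\n"),
]
merged = list(merge_contiguous(line_by_line))
print(merged)
```

After merging, the two output lines form one block, so the ANSI step handles them in a single pass.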

click_extra.pygments.collect_session_lexers()

Retrieve all lexers producing shell-like sessions in Pygments.

This function contains a manually maintained list of lexers, to which we dynamically add lexers inheriting from ShellSessionBaseLexer.

Return type:

Iterator[type[Lexer]]

Hint

To help maintain this list, there is a test that will fail if a new REPL/terminal-like lexer is added to Pygments but not referenced here.

click_extra.pygments.lexer_map = {
    <class 'pygments.lexers.algebra.GAPConsoleLexer'>: <class 'pygments.lexer.AnsiGAPConsoleLexer'>,
    <class 'pygments.lexers.dylan.DylanConsoleLexer'>: <class 'pygments.lexer.AnsiDylanConsoleLexer'>,
    <class 'pygments.lexers.erlang.ElixirConsoleLexer'>: <class 'pygments.lexer.AnsiElixirConsoleLexer'>,
    <class 'pygments.lexers.erlang.ErlangShellLexer'>: <class 'pygments.lexer.AnsiErlangShellLexer'>,
    <class 'pygments.lexers.julia.JuliaConsoleLexer'>: <class 'pygments.lexer.AnsiJuliaConsoleLexer'>,
    <class 'pygments.lexers.matlab.MatlabSessionLexer'>: <class 'pygments.lexer.AnsiMatlabSessionLexer'>,
    <class 'pygments.lexers.php.PsyshConsoleLexer'>: <class 'pygments.lexer.AnsiPsyshConsoleLexer'>,
    <class 'pygments.lexers.python.PythonConsoleLexer'>: <class 'pygments.lexer.AnsiPythonConsoleLexer'>,
    <class 'pygments.lexers.r.RConsoleLexer'>: <class 'pygments.lexer.AnsiRConsoleLexer'>,
    <class 'pygments.lexers.ruby.RubyConsoleLexer'>: <class 'pygments.lexer.AnsiRubyConsoleLexer'>,
    <class 'pygments.lexers.shell.BashSessionLexer'>: <class 'pygments.lexer.AnsiBashSessionLexer'>,
    <class 'pygments.lexers.shell.MSDOSSessionLexer'>: <class 'pygments.lexer.AnsiMSDOSSessionLexer'>,
    <class 'pygments.lexers.shell.PowerShellSessionLexer'>: <class 'pygments.lexer.AnsiPowerShellSessionLexer'>,
    <class 'pygments.lexers.shell.TcshSessionLexer'>: <class 'pygments.lexer.AnsiTcshSessionLexer'>,
    <class 'pygments.lexers.special.OutputLexer'>: <class 'pygments.lexer.AnsiOutputLexer'>,
    <class 'pygments.lexers.sql.PostgresConsoleLexer'>: <class 'pygments.lexer.AnsiPostgresConsoleLexer'>,
    <class 'pygments.lexers.sql.SqliteConsoleLexer'>: <class 'pygments.lexer.AnsiSqliteConsoleLexer'>
}

Maps each original lexer to its ANSI variant.

class click_extra.pygments.AnsiHtmlFormatter(**kwargs)

Bases: ExtendedColorHtmlFormatterMixin, HtmlFormatter

Extend standard Pygments’ HtmlFormatter.

Adds support for ANSI 256 colors.

Intercept the style argument to augment it with ANSI colors support.

Creates a new style instance that inherits from the one provided by the user, but updates its styles attribute to add ANSI colors support from pygments_ansi_color.
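
The style augmentation described above can be approximated with a small class factory. This is a sketch of the general technique (subclass the user's style and extend its styles mapping), not the formatter's actual code. with_extra_colors is a hypothetical helper, and the color value is made up rather than taken from pygments_ansi_color's palette.

```python
from pygments.styles import get_style_by_name
from pygments.token import Token


def with_extra_colors(base_style, extra_styles):
    """Return a subclass of base_style whose styles include extra_styles."""

    class AugmentedStyle(base_style):
        # Keep every rule from the parent, then layer the new tokens on top.
        styles = {**base_style.styles, **extra_styles}

    return AugmentedStyle


base = get_style_by_name("default")
# Hypothetical color value, chosen only for the demonstration.
augmented = with_extra_colors(base, {Token.Color.Red: "#cc0000"})
print(augmented.style_for_token(Token.Color.Red)["color"])
```

Because the new class inherits from the style chosen by the user, regular tokens such as comments keep their original look while the extra color tokens gain a definition.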

name = 'ANSI HTML'

Full name for the formatter, in human-readable form.

aliases = ['ansi-html']

A list of short, unique identifiers that can be used to look up the formatter from a list, e.g. using get_formatter_by_name().