Skip to content
forked from miekg/pandoc2rfc

Use pandoc to create XML suitable for xml2rfc

Notifications You must be signed in to change notification settings

matje/pandoc2rfc

 
 

Repository files navigation

Introduction

This memo presents a technique for using Pandoc syntax as a source format for documents in the Internet-Drafts (I-Ds) and Request for Comments (RFC) series.

Pandoc is an "almost plain text" format and therefor particularly well suited for editing RFC-like documents.

Github reader: see draft.example.txt for the actual rendering.

Pandoc to RFC

Pandoc2rfc - designed to do the right thing, until it doesn't.

When writing we directly wrote the XML. Needless to say is was kinda tedious even thought the XML of xml2rfc is very "light".

Nowadays I'm a fan of the Markdown syntax and especially the syntax as supported (created?) by Pandoc.

So for my next RFC (if ever!) I decided I wanted to use Pandoc. As xml2rfc uses XML I thought the easiest way would be to create docbook XML and transform that using XSLT. Pandoc2rfc does just that. The conversions are, in some way amusing, as we start off with (almost) plain text, use elaborate XML and end up with plain text again.

+-------------------+   pandoc   +---------+  
| ALMOST PLAIN TEXT |   ------>  | DOCBOOK |  
+-------------------+            +---------+  
              |                       |
non-existent  |                       | xsltproc
 quicker way  |                       |
              v                       v
      +------------+    xml2rfc  +---------+ 
      | PLAIN TEXT |  <--------  | XML2RFC | 
      +------------+             +---------+ 
Figure: Attempt to justify Pandoc2rfc.

The XML generated (the output after the xsltproc step in ) is suitable for inclusion in either the middle or back section of an RFC. The easiest way is to create a template XML file and include the appropriate XML:

<?xml version='1.0' ?>
<!DOCTYPE rfc SYSTEM 'rfc2629.dtd'>

<rfc ipr='trust200902' docName='draft-gieben-pandoc-rfcs-01'>
 <front>
    <title>Writing I-Ds and RFCs using Pandoc</title>
</front>

<middle>
    <?rfc include="middle.xml"?>
</middle>

<back>
    <?rfc include="back.xml"?>
</back>

</rfc>
Figure: A minimal template.xml.

See the Makefile for an example of this. In this case you need to edit 3 documents:

  1. middle.mdk;
  2. back.mkd;
  3. template.xml (probably a fairly static file).

The draft (draft.txt) is automatically created when you call make. This also works for this README.mkd.

It needs xsltproc and pandoc to be installed. See the Pandoc user manual for the details on how to type in Pandoc style.

Supported

  • Sections with an anchor and title attributes;
  • Tables with an anchor and preamble;
  • Lists
    • style=symbols;
    • style=numbers;
    • style=letters (lower- and uppercase);
    • style=format %i, roman lowercase numerals;
    • style=format %I, roman uppercase numerals;
    • style=hanging;
    • style=empty;
  • Figure/artwork with an anchor and preamble;
  • Block quote - not supported by xml2rfc, this is converted to <list style="hanging> paragraph;
  • References
    • external (eref);
    • internal (xref), you can refer to:
      • sections;
      • tables;
      • figures.
  • Citations, by using interal references;
  • Spanx style=verb, style=emph and style=strong.
  • Indexes, by using footnotes;

Unsupported

  • Lists inside a table (xml2rfc doesn't handle this);
  • crefs: for comments (no input syntax available), use HTML comments: <!-- ... -->;

The heavy lifting is done by transform.xsl that transforms the XML. The homepage of Pandoc2rfc is this github repository.

Acknowledgements

The following people have helped to make Pandoc2rfc what it is today:

  • Benno Overeinder
  • Erlend Hamnaberg
  • Matthijs Mekking
  • Trygve Laugstøl

Pandoc Constructs

So, what syntax do you need to use to get the correct output? Well, it is just Pandoc. The best introduction to the Pandoc style is given in this README from Pandoc itself.

For convenience we list the most important ones in the following sections.

Paragraph

Paragraphs are separated with an empty line.

Section

Just use the normal sectioning commands available in Pandoc, for instance:

# Section1 One
Bla

Converts to: <section title="Section1 One" anchor="section1-one"> If you have another section that is also named "Section1 One", that anchor will be called "section1-one-1", but only when the sections are in the same source file!

Referencing the section is done with see [](#section1-one), as in see .

List Styles

A good number of styles are supported.

Symbol

A symbol list.

* Item one;
* Item two.

Converts to: <list style="symbol">

Number

A numbered list.

1. Item one;
2. Item two.

Converts to: <list style="numbers">

Empty

Using the default list markers from Pandoc:

A list using the default list markers.

#. First item;
#. Second item.

Converts to: <list style="empty">

Roman

Use the supported Pandoc syntax:

ii. First item;
ii. Second item.

Or:

II. First item;
II. Second item.

Converts to: <list style="format %i.">. Or in the case of uppercase, this yields: <list style="format %I.">

Letter

A numbered list.

a. Item one;
b. Item two.

Converts to: <list style="letters">, uppercasing the letters works too.

Hanging

This is more like a description list, so we need to use:

First item that needs clarification:

:   Explanation one
More stuff, because item is difficult to explain.
* item1
* item2

Second item that needs clarification:

:   Explanation two

Converts to: <list style="hanging"> and <t hangText="First item that...">

If you want a newline after the hangTexts, search for the string OPTION in transform.xsl and uncomment it.

Figure/Artwork

Indent the paragraph with 4 spaces.

Like this

Converts to: <figure><artwork> ... Note that xml2rfc supports a caption with <artwork>. Pandoc does not support this, but Pandoc2rfc does. If you add a Figure: some text as the last line, the artwork gets a <preamble> with the text after Figure: . It will also be possible to reference the artwork. If a caption is supplied the artwork will be centered.

The reference anchor attribute will be: fig: + first 10 (normalized) characters from the caption. Where normalized means:

  • Take the first 10 characters of the caption (i.e. this is the text after the string Figure: );
  • Spaces are translated to a minus -;
  • Uppercase letters translated to lowercase.

So the first artwork with a caption will get fig:a-minimal- as a reference. See for instance .

This anchoring is completely handled from within the xslt, so cool Pandoc stuff, like adding sequence numbers in case of duplicates isn't supported.

Block Quote

This is not supported by xml2rfc, but any paragraph like:

> quoted text

Converts to: <t><list style="hanging" hangIndent="3"> ... paragraph.

References

External

Any reference like:

[Click here](URI)

Converts to: <ulink target="URI">Click here ...

Internal

Any reference like:

[Click here](#localid)

Converts to: <link target="localid">Click here ...

For referring to RFCs (for which you manually need add the reference source in the template, use a include refs.xml or something), you can just use:

[](#RFC2119)

And it does the right thing. Referencing sections is done with:

See [](#pandoc-constructs)

The word 'Section' is inserted automatically: ... see ... For referencing figures/artworks see . For referencing tables see .

Spanx Styles

The verb style can be selected with back-tics: `text` Converts to: <spanx style="verb"> ...

And the emphasis style with asterisks: *text* or underscores: _text_ Converts to: <spanx style="emph"> ...

And the emphasis style with double asterisks: **text** Converts to: <spanx style="strong"> ...

Table

A table can be entered as:

  Right     Left     Center     Default                                                                  
  -------   ------ ----------   -------                                                                  
       12     12        12        12                                                                   
      123     123       123       123                                                                   
        1     1         1         1       

  Table: A caption describing the table.
  Figure: A caption describing the figure describing the table.

Is translated to <texttable> element in xml2rfc. You can choose multiple styles as input, but they all are converted to the same style (plain <texttable>) table in xml2rfc. The caption is always translated to a <preamble>. The <postamble> tag isn't supported.

If a table has a caption, it will also get a reference. The reference anchor attribute will be: tab: + first 10 (normalized) characters from the caption. Where normalized means:

  • Take the first 10 characters of the caption (i.e. this is the text after the string Table: );
  • Spaces are translated to a minus -;
  • Uppercase letters translated to lowercase.

So the first table with a caption will get tab:a-caption- for reference use. See for instance .

This anchoring is completely handled from within the xslt, so cool Pandoc stuff, like adding sequence numbers in case of duplicates isn't supported.

Indexes

The footnote syntax of Pandoc is slightly abused to support an index. Footnotes are entered in two steps, you have a marker in the text, and later you give actual footnote text. Like this:

[^1]

[^1]: footnote text

We re-use this syntax for the <iref> tag. The above text translates to:

<iref item="footnote text"/>

Sub items are also supported. Use an exclamation mark (!) to separate them:

[^1]: item!sub item

Security Considerations

This memo raises no security issues.

IANA Considerations

This memo has no actions for IANA.

About

Use pandoc to create XML suitable for xml2rfc

Resources

Stars

Watchers

Forks

Packages

No packages published