
Orgmode Elements

Karl Voit edited this page Jun 17, 2017 · 14 revisions

lazyblorg recognizes only a subset of the Org-mode syntax.

The following Org-mode elements are parsed and HTMLized by lazyblorg. As a fallback, all other Org-mode elements are converted using pandoc after the usual replacements (links to IDs,…) have been made.

Most Org-mode elements such as tables or blocks must not be indented to be recognized properly.

There must be an empty line between different Org-mode elements.

In the source code, you can check the following functions:

  • /lib/orgparser.py: parse_orgmode_file(…)
  • /lib/htmlizer.py: sanitize_and_htmlize_blog_content(…)

For an overview of which sanitizing operation is applied to which element, take a look at Sanitizing. It also describes the labels used in the sanitizing tables below.

Basic Text Formatting

Currently supported Org-mode text formatting:

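| Example  | Using Characters | Results in HTML | Notes          |
|----------+------------------+-----------------+----------------|
| bold     | asterisks        | <b>             |                |
| code     | tilde            | <code>          | See this issue |
| verbatim | equality signs   | <code>          |                |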

Not yet supported, raise an issue or send a merge request if you need it:

Links

Linking Other Blog Articles (Internal Links)

To link to another blog article within the same blog, use :ID: properties.

For example:

*** DONE This Is an Interesting Article                        :blog:
CLOSED: [2017-06-04 Sun 18:15]
:PROPERTIES:
:ID: 2017-06-04-interesting
:CREATED:  [2017-06-04 Sun 17:59]
:END:
:LOGBOOK:
- State "DONE"       from "NEXT"       [2017-06-04 Sun 18:15]
:END:
…
This is the interesting stuff.
…
*** DONE A Boring Article Referring To Another One     :blog:example:
CLOSED: [2017-05-27 Sat 17:43] SCHEDULED: <2017-05-28 Sun>
:PROPERTIES:
:CREATED:  [2017-05-27 Sat 17:30]
:ID: 2017-05-27-example-reference
:END:
:LOGBOOK:
- State "DONE"       from "NEXT"       [2017-05-27 Sat 17:43]
:END:
…
In case you want to read something really interesting,
visit [[id:2017-06-04-interesting][my other article]] in this blog.

See Sanitizing for all elements that may contain those internal links.
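
The replacement of such id: links can be sketched as follows. This is only an illustration and not the lazyblorg implementation: the function name replace_id_links, the url_by_id mapping, the example URL and the handling of unknown IDs are all assumptions.

```python
import re

# Matches [[id:SOME-ID][description]]; pattern is an assumption for this sketch
ID_LINK_RE = re.compile(r'\[\[id:([^\]]+)\]\[([^\]]+)\]\]')


def replace_id_links(text, url_by_id):
    """Replace Org-mode id: links by HTML anchors.

    url_by_id is a hypothetical dict mapping article IDs to their URLs.
    """
    def to_anchor(match):
        target_id, description = match.groups()
        url = url_by_id.get(target_id)
        if url is None:
            # Unknown ID: keep only the description (assumed behavior)
            return description
        return '<a href="{}">{}</a>'.format(url, description)

    return ID_LINK_RE.sub(to_anchor, text)


line = 'visit [[id:2017-06-04-interesting][my other article]] in this blog.'
print(replace_id_links(line, {'2017-06-04-interesting': '/2017/06/04/interesting/'}))
```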

Links to Web Pages (External Links)

Any http(s)-links within an Org-mode article will result in

<a href="…">…</a>

in the resulting HTML. See Sanitizing for all elements that may contain those links.

Embedding External Content

Please do read about HTML blocks to include HTML snippets such as tweets or YouTube videos.

Embed Tweets into Your Blog Article

There was no need to develop something specific. Just follow this procedure:

  1. insert an HTML block into your Org blog article
    • Easily done via entering <h + pressing the TAB key
  2. go to the tweet you would like to embed in your browser
    • select the three dots below the Tweet
    • select “Embed Tweet”
    • copy resulting snippet
  3. paste snippet into HTML block

Simple as that.

Example:

#+BEGIN_EXPORT HTML
<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">You gotta love <a href="https://twitter.com/slydigsband">@slydigsband</a> - tune into <a href="https://t.co/7yylPwDKvi">https://t.co/7yylPwDKvi</a><br>I just bought the album. Guys, you really rock! Thanks for your Vienna gig!</p>&mdash; Karl Voit (@n0v0id) <a href="https://twitter.com/n0v0id/status/776735121823174656">September 16, 2016</a></blockquote>
<script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>
#+END_EXPORT

Embed YouTube Videos

  1. insert an HTML block into your Org blog article
    • Easily done via entering <h + pressing the TAB key
  2. retrieve the YouTube URL of your video
  3. replace (only!) HTVgPw7TR_k in the snippet example below with the video ID of your video

Example:

#+BEGIN_EXPORT HTML
<iframe width="560" height="315" src="http://www.youtube.com/embed/HTVgPw7TR_k?rel=0" frameborder="0" allowfullscreen="allowfullscreen"></iframe>
#+END_EXPORT
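
If you embed videos often, the block above can be generated from a video URL. The following helper is a sketch only and not part of lazyblorg; the function name youtube_export_block and its parameters are made up for this example. It handles the common watch?v=… and youtu.be/… URL forms.

```python
from urllib.parse import urlparse, parse_qs


def youtube_export_block(url, width=560, height=315):
    """Build an Org-mode HTML export block for a YouTube video URL."""
    parsed = urlparse(url)
    if parsed.hostname == 'youtu.be':
        # Short links carry the video ID in the path
        video_id = parsed.path.lstrip('/')
    else:
        # Regular links carry the video ID in the "v" query parameter
        video_id = parse_qs(parsed.query)['v'][0]
    iframe = ('<iframe width="{w}" height="{h}" '
              'src="http://www.youtube.com/embed/{vid}?rel=0" '
              'frameborder="0" allowfullscreen="allowfullscreen"></iframe>'
              ).format(w=width, h=height, vid=video_id)
    return '\n'.join(['#+BEGIN_EXPORT HTML', iframe, '#+END_EXPORT'])


print(youtube_export_block('https://www.youtube.com/watch?v=HTVgPw7TR_k'))
```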

Headings

One or more asterisks followed by at least one whitespace character.

HTML chars Int. L. Ext. L. Text Format URL Ampersands Pandoc Templates involved
x x x x x #SECTION-TITLE#, #SECTION-LEVEL#

Like horizontal rulers, the first sub-heading separates the teaser part of an article from the rest of it. Teasers are shown on the entry page as well as in the teaser-only feed.

Headings That Start a Blog Article

A heading is recognized as a potential blog article heading when it has the tag :blog: (or the customized content of TAG_FOR_BLOG_ENTRY) and no NOEXPORT tag.
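
The two rules (heading syntax and blog tag) can be sketched like this. The regular expression and the function names parse_heading and is_blog_heading are assumptions made for this example and are not taken from orgparser.py; the value 'blog' for TAG_FOR_BLOG_ENTRY mirrors the tag used in the examples on this page.

```python
import re

# One or more asterisks, whitespace, the title and an optional trailing :tag1:tag2: part
HEADING_RE = re.compile(r'^(\*+)\s+(.*?)(?:\s+:([^\s]+):)?\s*$')

TAG_FOR_BLOG_ENTRY = 'blog'  # assumed value for this sketch


def parse_heading(line):
    """Return (level, title, tags) for a heading line, otherwise None."""
    match = HEADING_RE.match(line)
    if match is None:
        return None
    stars, title, tagstring = match.groups()
    tags = tagstring.split(':') if tagstring else []
    return len(stars), title, tags


def is_blog_heading(line):
    """True if the heading has the blog tag and no NOEXPORT tag."""
    parsed = parse_heading(line)
    if parsed is None:
        return False
    tags = parsed[2]
    return TAG_FOR_BLOG_ENTRY in tags and 'NOEXPORT' not in tags


print(parse_heading('*** DONE A Boring Article Referring To Another One     :blog:example:'))
```

Note that this sketch leaves the TODO keyword (DONE) in the title; stripping it is a separate step.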

Headings Within a Blog Article

Headings within a blog article are converted to HTML sub-headings of the article.

If such a heading has the NOEXPORT tag, this section and its sub-headings are omitted in the end result.

Drawers

Drawers must not be indented and must directly follow a heading. After the first empty line, the parser looks for Org-mode elements that are not drawers.

Drawers and their content are not visible in the HTML output. They are just parsed for the meta-data described in the next sections.

PROPERTIES Drawers

ID property: The first line starting with :ID: is interpreted as the line holding the ID of the heading.

CREATED property: line begins with :CREATED: (case insensitive) followed by at least one whitespace character and a timestamp.
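
A minimal sketch of both property rules, assuming inactive or active timestamps of the form [2017-06-04 Sun 17:59]. The patterns and the function name parse_properties are made up for this example; the real parser may differ in detail.

```python
import re
from datetime import datetime

ID_RE = re.compile(r'^:ID:\s+(\S+)\s*$')
# Date, optional weekday name, time; wrapped in [...] or <...>
CREATED_RE = re.compile(
    r'^:CREATED:\s+[\[<](\d{4}-\d{2}-\d{2})(?: \w+)? (\d{2}:\d{2})[\]>]',
    re.IGNORECASE)


def parse_properties(drawer_lines):
    """Extract the ID and the CREATED time-stamp from PROPERTIES drawer lines."""
    properties = {}
    for line in drawer_lines:
        id_match = ID_RE.match(line)
        if id_match:
            # Only the first :ID: line counts
            properties.setdefault('id', id_match.group(1))
            continue
        created_match = CREATED_RE.match(line)
        if created_match:
            stamp = created_match.group(1) + ' ' + created_match.group(2)
            properties['created'] = datetime.strptime(stamp, '%Y-%m-%d %H:%M')
    return properties


print(parse_properties([
    ':PROPERTIES:',
    ':ID: 2017-06-04-interesting',
    ':CREATED:  [2017-06-04 Sun 17:59]',
    ':END:',
]))
```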

LOGBOOK Drawers

State transitions: within LOGBOOK drawer: line begins with - State followed by at least one whitespace character, from, at least one whitespace character and text.

The most recent finished time-stamp is used to mark the last update on the article (latestupdateTS).

The oldest finished time-stamp is used to mark the first publishing date (firstpublishTS).
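
How the two time-stamps can be derived from the state transitions is sketched below. It assumes that DONE is the only finished state and that the LOGBOOK lines look like the ones in the examples on this page; STATE_RE, FINISHED_STATES and publish_timestamps are names invented for this illustration.

```python
import re
from datetime import datetime

# - State "DONE"   from "NEXT"   [2017-06-04 Sun 18:15]
STATE_RE = re.compile(
    r'^- State\s+"(\w+)"\s+from\s+(?:"\w*"\s+)?'
    r'\[(\d{4}-\d{2}-\d{2})(?: \w+)? (\d{2}:\d{2})\]')

FINISHED_STATES = ('DONE',)  # assumption: only DONE counts as finished


def publish_timestamps(logbook_lines):
    """Return the oldest and the most recent finished time-stamp, or None."""
    stamps = []
    for line in logbook_lines:
        match = STATE_RE.match(line.strip())
        if match and match.group(1) in FINISHED_STATES:
            stamp = match.group(2) + ' ' + match.group(3)
            stamps.append(datetime.strptime(stamp, '%Y-%m-%d %H:%M'))
    if not stamps:
        return None
    return {'firstpublishTS': min(stamps), 'latestupdateTS': max(stamps)}


print(publish_timestamps([
    '- State "DONE"       from "NEXT"       [2017-06-10 Sat 09:30]',
    '- State "NEXT"       from "DONE"       [2017-06-09 Fri 20:00]',
    '- State "DONE"       from "NEXT"       [2017-06-04 Sun 18:15]',
]))
```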

Paragraphs

If the parser is in the state where it is looking for content and this content is not a drawer, horizontal ruler, empty line, block, colon block, table, heading or list, it is classified as a plain paragraph.

HTML chars Int. L. Ext. L. Text Format URL Ampersands Pandoc Templates involved
x x x x x #PAR-CONTENT#

Horizontal Rulers

At least five minus characters - followed only by optional whitespace characters.

Like the first sub-heading, horizontal rulers do separate the teaser part of an article from the rest of it. Teasers are shown on the entry page as well as in the teaser-only feed.
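
The teaser rule can be sketched as a split at the first horizontal ruler or sub-heading, whichever comes first. The patterns and the function name split_teaser are assumptions for this example; treating the whole body as teaser when no separator exists is an assumption as well.

```python
import re

HR_RE = re.compile(r'^-{5,}\s*$')          # at least five minus characters
SUBHEADING_RE = re.compile(r'^\*+\s')      # asterisks followed by whitespace


def split_teaser(body_lines):
    """Split article body lines into (teaser, rest)."""
    for index, line in enumerate(body_lines):
        if HR_RE.match(line) or SUBHEADING_RE.match(line):
            return body_lines[:index], body_lines[index:]
    # No separator found: everything is considered teaser (assumed behavior)
    return body_lines, []


teaser, rest = split_teaser(['Short intro.', '', '-----', 'The long part.'])
print(teaser)
print(rest)
```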

Lists

The very first list item must not be indented.

Lists are recognized with -, or +, or any number followed by a dot, followed by a space character, an optional checkbox ([ ] filled with one arbitrary character) and text.

Currently, blog articles must not start with a list. See this issue. Also, a blog article must not end with a list. See this issue.

Lists must not be interrupted by empty lines: an empty line ends the current list.
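
A sketch of the list item rule described above. The pattern and the function name parse_list_item are made up for this example and are not copied from the parser.

```python
import re

# Optional indentation, bullet (-, + or number with dot), one space,
# optional checkbox with exactly one character, then the item text
LIST_ITEM_RE = re.compile(r'^(\s*)([-+]|\d+\.) (?:\[(.)\] )?(.+)$')


def parse_list_item(line):
    """Return a dict describing the list item, or None if the line is no list item."""
    match = LIST_ITEM_RE.match(line)
    if match is None:
        return None
    indent, bullet, checkbox, text = match.groups()
    return {
        'indent': len(indent),
        'bullet': bullet,
        'checkbox': checkbox,  # None if the item has no checkbox
        'text': text,
    }


print(parse_list_item('- [X] write the article'))
print(parse_list_item('  2. second step'))
```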

HTML chars Int. L. Ext. L. Text Format URL Ampersands Pandoc Templates involved
x x x x x #CONTENT#

Tables

Tables must not be indented. They consist of lines that start with a | character.

Table formulas are ignored and therefore omitted.

Tables are HTMLized via pandoc.

Tables do miss some sanitizing:

HTML chars Int. L. Ext. L. Text Format URL Ampersands Pandoc Templates involved
x x

Blocks

Blocks must not be indented. They start with #+BEGIN_ followed by one of the supported block types (below; case insensitive) and parameters. Blocks end with the first #+END found.

Optional names can be prepended to a block: lines starting with #+NAME: (case insensitive). Those names are used as captions for the blocks. Please note the different handling for HTML blocks.
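
Reading one block including its optional name could look like the following sketch. The result shape follows the internal representation listed in the element table further down (type, name or None, list of lines); the patterns and the function name read_block are assumptions for this example.

```python
import re

NAME_RE = re.compile(r'^#\+NAME:\s+(.+?)\s*$', re.IGNORECASE)
BEGIN_RE = re.compile(
    r'^#\+BEGIN_(SRC|EXPORT|EXAMPLE|VERSE|QUOTE|CENTER|ASCII)\b\s*(.*)$',
    re.IGNORECASE)
END_RE = re.compile(r'^#\+END', re.IGNORECASE)


def read_block(lines):
    """Parse a block (with optional #+NAME: line) from the start of lines."""
    index = 0
    name = None
    if index < len(lines):
        name_match = NAME_RE.match(lines[index])
        if name_match:
            name = name_match.group(1)
            index += 1
    if index >= len(lines):
        return None
    begin_match = BEGIN_RE.match(lines[index])
    if begin_match is None:
        return None
    block_type = begin_match.group(1).lower()
    parameters = begin_match.group(2).split()
    if block_type == 'export' and parameters:
        # EXPORT blocks are distinguished by their backend: html or latex
        block_type = parameters[0].lower()
    content = []
    for line in lines[index + 1:]:
        if END_RE.match(line):
            break  # the first #+END found closes the block
        content.append(line)
    return [block_type + '-block', name, content]


print(read_block([
    '#+NAME: This is the caption',
    '#+BEGIN_EXPORT html',
    '<a href="http://Karl-Voit.at">Link to my webpage</a>',
    '#+END_EXPORT',
]))
```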

Enforcing line breaks is not yet supported with lazyblorg. Open an issue if you need it.

SRC

SRC-blocks do miss some sanitizing:

HTML chars Int. L. Ext. L. Text Format URL Ampersands Pandoc Templates involved
x #NAME#

EXPORT

Note: lazyblorg looks into block_type_export_backend which holds either HTML or LATEX.

HTML

HTML-blocks do miss some sanitizing:

HTML chars Int. L. Ext. L. Text Format URL Ampersands Pandoc Templates involved
x #NAME#

Named HTML Blocks

If an HTML block has a name associated, it gets included into the blog article with a caption and the HTML source rendered as visible content.

#+NAME: This is the caption
#+BEGIN_EXPORT html
<a href="http://Karl-Voit.at">Link to my webpage</a>
#+END_EXPORT

… results in something similar to:

<a href="http://Karl-Voit.at">Link to my webpage</a>

The caption/name/title is omitted at the moment. See this related issue.

HTML Blocks Without a Name

If an HTML block has no name associated, its HTML source code is written as is into the HTML output of the blog article. This way, you can add HTML snippets for tweets, videos or other content directly into your lazyblorg article.

#+BEGIN_EXPORT html
<a href="http://Karl-Voit.at">Link to my webpage</a>
#+END_EXPORT

… results in something similar to: Link to my webpage

LATEX

LaTeX blocks are HTMLized via pandoc.

LaTeX-blocks do miss some sanitizing:

HTML chars Int. L. Ext. L. Text Format URL Ampersands Pandoc Templates involved
x x

EXAMPLE

EXAMPLE-blocks do miss some sanitizing:

HTML chars Int. L. Ext. L. Text Format URL Ampersands Pandoc Templates involved
x #NAME#

VERSE

HTML chars Int. L. Ext. L. Text Format URL Ampersands Pandoc Templates involved
x x x x #NAME#

QUOTE

HTML chars Int. L. Ext. L. Text Format URL Ampersands Pandoc Templates involved
x x x x #NAME#

CENTER

Parsed but has no effect. Raise an issue or send a merge request in case you need it.

ASCII

Parsed but has no effect. Raise an issue or send a merge request in case you need it.

Colon Blocks

Colon blocks are blocks such as the following example:

: Lines that start with a colon and a space
: are colon blocks.

Colon blocks are the safest way to embed characters of any kind. Their text format and line breaks are preserved, nothing gets interpreted or replaced. Therefore, colon-blocks do miss some sanitizing on purpose:

HTML chars Int. L. Ext. L. Text Format URL Ampersands Pandoc Templates involved
x #NAME#

Comment Lines

Comment lines start with a hash character followed by at least one space:

# This is an example

They are ignored by the parser and therefore omitted in the resulting article.

Images

First: you can embed image files hosted on the web via HTML snippets such as:

#+BEGIN_EXPORT html
<img src="http://example.com/images/Joshua_Tree.jpg" alt="A beautiful tree" />
#+END_EXPORT

The following section describes how to embed image files that are located on your computer and that should be copied and published with your blog data as well.

Configuration

In config.py, there are three options related to images:

  1. CUSTOMIZED_IMAGE_LINK_KEY
  2. MEMACS_FILE_WITH_IMAGE_FILE_INDEX
    • Read the Memacs readme to learn about Memacs and its filenametimestamps module.
    • Basically it’s a path to a text file (the Memacs index) that holds lines like:
      ** <2006-01-06 10:16> [[2006-01-06 My Document.pdf|/home/user/projects/foobar/2006-01-06 My Document.pdf]]
              
    • All links to files within the Memacs index are indexed for lazyblorg images as well.
      • You should exclude the directories with your generated blog data from being indexed by Memacs.
    • Use an empty string to disable the Memacs index for lazyblorg.
  3. PARENT_DIRECTORY_WITH_IMAGE_ORIGINALS
    • This is the good old-fashioned method: put in a relative or an absolute path to a directory. This directory and all of its sub-directories are traversed and all filenames are indexed for lazyblorg images.
    • Use an empty string to disable including images via traversing the file system.

So if you want to embed image files, you must configure CUSTOMIZED_IMAGE_LINK_KEY and at least one of the other options (Memacs and/or directory traversal) as well.
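
As an illustration, the relevant part of config.py could look like this. All values are placeholders and not the shipped defaults: 'tsfile' is assumed from the link examples on this page and the directory path is made up.

```python
# config.py (excerpt); values below are placeholders for illustration only

# Link type used for image links such as [[tsfile:...]] (assumed from the examples)
CUSTOMIZED_IMAGE_LINK_KEY = 'tsfile'

# Path to the Memacs index file; an empty string disables the Memacs index
MEMACS_FILE_WITH_IMAGE_FILE_INDEX = ''

# Directory which is traversed (including sub-directories) for image files;
# an empty string disables the directory traversal (hypothetical path)
PARENT_DIRECTORY_WITH_IMAGE_ORIGINALS = '/home/user/images'
```

In this sketch, the Memacs index is disabled and only the directory traversal is used.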

Embedding Image Files

You have many different choices on how to embed image files in a blog article of yours.

The simplest one:

[[tsfile:2017-03-11T18.29.21 Stars at night -- mytag.jpg]]

As you can see, you don’t have to care about the directory in which the image file is located. Using the Memacs and/or the file traversal method, image files are located and copied independent of their current location.

This is quite handy when image files are moved between different directories or directories are renamed on the way.

On the downside, you get random results when using the very same file name for different image files. However, I tend to classify this as an edge-case.

Of course, you can use your Org-mode link description as usual:

[[tsfile:2017-03-11T18.29.21 Stars at night -- mytag.jpg][This photo of stars at night is awesome]]

The description This photo of stars at night is awesome becomes the HTML caption of the image in the blog.

But hey, there is so much more: You can use the #+CAPTION: feature as well. This will supersede the Org-mode link description. Parameters within #+ATTR_HTML: can be used to define alt text, alignment, or the width of an image:

#+CAPTION: Some beautiful stars in a tree
#+ATTR_HTML: :alt Stars in a Tree :align right :width 300
[[tsfile:2017-03-11T18.29.21 Stars at night -- mytag.jpg][This description is superseded by the CAPTION line]]

This results in HTML code like the following:

<figure class="image-right">
<img src="2017-03-11T18.29.21 Stars at night -- mytag - scaled width 300.jpg" alt="Stars in a Tree" width="300" />
<figcaption>Some beautiful stars in a tree</figcaption>
</figure>

As you can see, when using the width attribute, the resulting image is also scaled to this width in order to maximize transmission speed and browser performance. Currently, there is only support for defining the width and not the height.

You can use multiple ATTR_HTML lines to define multiple parameters in multiple lines:

#+CAPTION: This is going to be the caption
#+ATTR_HTML: :alt This is going to be the alt parameter of the img tag
#+ATTR_HTML: :title The title (like all other "unknown" attributes) is ignored
#+ATTR_HTML: :align right :width 300
[[tsfile:2017-03-11T18.29.21 Stars at night -- mytag.jpg][Remember, if there is a CAPTION, this title gets ignored]]
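
Collecting the parameters from one or more ATTR_HTML lines can be sketched as follows. The result resembles the attribute dictionary shown for tsfile-links in the element table further down. The patterns, KNOWN_KEYS and the function name parse_attr_html are assumptions for this example.

```python
import re

ATTR_RE = re.compile(r'^#\+ATTR_HTML:\s*(.*)$', re.IGNORECASE)
# A ":key value" pair ends before the next " :key " or at the end of the line
PAIR_RE = re.compile(r':(\w+)\s+(.*?)(?=\s+:\w+\s|\s*$)')

KNOWN_KEYS = ('alt', 'align', 'width')  # everything else is ignored


def parse_attr_html(lines):
    """Merge the known parameters of all ATTR_HTML lines into one dict."""
    attributes = {}
    for line in lines:
        match = ATTR_RE.match(line)
        if match is None:
            continue
        for key, value in PAIR_RE.findall(match.group(1)):
            if key in KNOWN_KEYS:
                attributes[key] = value
    return attributes


print(parse_attr_html([
    '#+ATTR_HTML: :alt Stars in a Tree :align right :width 300',
]))
```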

Currently, lazyblorg supports the following alignment parameters:

  • :align left
    • left-aligned image, nothing on the right hand side
  • :align right
    • right-aligned image, nothing on the left hand side
  • :align center
    • centered image, nothing on the right or left of it
    • This is the default alignment if there is no align parameter found.
  • :align float-left
    • left-aligned image where the follow-up paragraph floats around the image
  • :align float-right
    • right-aligned image where the follow-up paragraph floats around the image

Be careful with the float- options: be sure to follow-up those images with paragraphs that contain enough text to float until the bottom of the image. Otherwise, there could be overlapping page elements.

Smart Image File Search

If you use an image management method similar to mine, you might face the following situation as well. You have taken a great photograph and renamed it 2017-03-11T18.29.21 Stars at night.jpg which is a reasonable choice.

You’re embedding this photograph in a blog article:

[[tsfile:2017-03-11T18.29.21 Stars at night.jpg][What a beautiful night]]

A few weeks later, you decide to add some tags to the files from the recent photo sessions. During this task, the image file from your blog post gets renamed and now has a filetag such as: 2017-03-11T18.29.21 Stars at night -- tree night springbreak.jpg

Normally, this would result in a broken link in your lazyblorg blog article. But lazyblorg is not normal: I’ve implemented an algorithm that detects that the file starts with an adapted ISO time-stamp: 2017-03-11T18.29.21.

If there is another unique filename that starts with the very same time-stamp, this file is assumed to be the same image file as stated in the Org-mode source of the blog article. In this case, lazyblorg prints out a warning in the logs and uses this image file instead of the broken filename.
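
The idea can be sketched like this. It is a simplified illustration and not the actual code: the function name locate_image and the flat list of indexed file names are assumptions, and the time-stamp pattern requires seconds as in the example above.

```python
import logging
import re

# Adapted ISO time-stamp at the beginning of a file name, e.g. 2017-03-11T18.29.21
TIMESTAMP_RE = re.compile(r'^\d{4}-\d{2}-\d{2}T\d{2}\.\d{2}\.\d{2}')


def locate_image(filename, indexed_filenames):
    """Return the indexed file name to use for filename, or None if not found."""
    if filename in indexed_filenames:
        return filename
    match = TIMESTAMP_RE.match(filename)
    if match is None:
        return None
    timestamp = match.group(0)
    candidates = [name for name in indexed_filenames if name.startswith(timestamp)]
    if len(candidates) == 1:
        # Exactly one other file shares the time-stamp: assume it got renamed
        logging.warning('"%s" not found, using "%s" instead', filename, candidates[0])
        return candidates[0]
    return None  # no candidate or ambiguous


index = [
    '2017-03-11T18.29.21 Stars at night -- tree night springbreak.jpg',
    '2017-03-12T08.00.00 Breakfast.jpg',
]
print(locate_image('2017-03-11T18.29.21 Stars at night.jpg', index))
```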

Isn’t this great? I’m loving it!

Easy Embedding Images with Yasnippet

If you are as lazy as I am (at least you’re using a blog software that contains the word «lazy» in its name!) you won’t type all those things on your own.

I’m sure you already know yasnippet and its virtues.

In this case, a snippet might look like the following:

# name : ATTR_HTML block with CAPTION for lazyblorg images
# --
#+CAPTION: ${3:caption}
#+ATTR_HTML: :alt ${4:alternative-text for the image}
#+ATTR_HTML: :align ${5:$$(yas-choose-value '("left" "center" "right" "float-left" "float-right"))} :width ${6:width in pixel}
[[tsfile:$1][: ${2:$$(unless yas-modified-p
 (let ((field (nth 0 (yas--snippet-fields (first (yas--snippets-at-point))))))
   (concat (buffer-substring (yas--field-start field) (yas--field-end field)))))}]] $0

It asks you for all parameters and generates a perfectly fine result for your blog article.

Enjoy.

Org-Mode Elements and Their Placeholders

Org elements: from ox-ascii.el (Org-mode)

Org Element [fn:earmarked] [fn:lowprio] implemented since [fn:internalrepresentation] HTML5
external hyperlinks <2014-01-30 Thu> a
internal links <2014-03-03 Mon> a
bold <2014-01-30 Thu> b
center-block x
clock x
code <2014-01-30 Thu> code
drawer x
dynamic-block x
entity
example-block <2014-01-30 Thu> ['example-block', 'name or None', [u'first line', u'second line']] FIXXME
example “colon-block” <2014-08-10 Sun> ['colon-block', False, [u'first line', u'second line']] pre
export-block x
export-snippet x
fixed-width x
footnote-definition x
footnote-reference x
headline <2014-01-30 Thu> ['heading', {'level': 3, 'title': u'my title'}] section+header+h1
horizontal-rule <2014-01-31 Fri> ['hr'] (ignored and only interpreted to mark end of standfirst)
inline-src-block x
inlinetask x
inner-template x
italic x
item
keyword x
latex-environment <2014-01-30 Thu> [fn:pypandoc] ['latex-block', 'name or None', [u'first line', u'second line']]
latex-fragment x
line-break x
link x
paragraph <2014-01-30 Thu> ['par', u'line1', u'line2'] p
plain-list x ['list-itemize', [u'first line', u'second line']] ul+li
plain-text <2014-01-30 Thu> see: paragraph
planning x
quote-block <2014-01-30 Thu> ['quote-block', 'name or None', [u'first line', u'second line']] blockquote
quote-section ?
radio-target x
section <2014-01-30 Thu> ['heading', {'title': u'Sub-heading foo', 'level': 3}] h2, h3, …
special-block x
src-block <2014-01-30 Thu> ['src-block', 'name or None', [u'first line', u'second line']] pre
statistics-cookie x
strike-through x
subscript x
superscript x
table x [fn:pypandoc]
table-cell x
table-row x
target
template x
timestamp x
underline x
verbatim x pre
verse-block <2014-01-30 Thu> ['verse-block', 'name or None', [u'first line', u'second line']] pre
html-block <2014-01-30 Thu> ['html-block', 'name or None', [u'first line', u'second line']] pre (if no #+NAME: then insert directly!)
tsfile-links <2017-06-17 Sat> ['cust_link_image', u'2017-03-11T18.29.20 Stars.jpg', {u'width': u'300', u'alt': u'Stars in a Tree', u'align': u'right'}] figure, img + attributes, figcaption
the rest [fn:pypandoc]

NOTE: OrgParser is using “par” for anything it can not interpret as something else.

[fn:earmarked] Planned to be implemented soon (or at all :-)

[fn:lowprio] This feature is low on my personal development list (may take some time or might never get implemented)

[fn:pypandoc] This element gets converted using pypandoc (and additional sanitizing)

[fn:internalrepresentation] usually in list: blog_data['id-of-entry']['content']

  • Blocks: (beginning with BEGIN_)

The list of the placeholders and their occurrence might be a bit outdated. Please refer to blog format and the source code for the most current version.

placeholder description gets sanitized source
#ARTICLE-TITLE# heading/title of the blog article x Org: heading
#ARTICLE-ID# id of the article PROPERTIES-drawer
#ABOUT-BLOG# a line of text which describes the blog in general FIXXME
#BLOGNAME# short name of the blog FIXXME
#ARTICLE-YEAR# four digit year of the article (folder path) Org: CREATED-time-stamp
#ARTICLE-MONTH# two digit month of the article (folder path) Org: CREATED-time-stamp
#ARTICLE-DAY# two digit day of the article (folder path) Org: CREATED-time-stamp
#ARTICLE-PUBLISHED-HTML-DATETIME# time-stamp of publishing in HTML Org: CREATED-time-stamp
#ARTICLE-PUBLISHED-HUMAN-READABLE# time-stamp of publishing in Org: CREATED-time-stamp
#TAGNAME# string of a tag Org: tags of Org-heading
#SECTION-TITLE# title of the next heading/section x Org: heading of Org sub-heading
#SECTION-LEVEL# relative level of the next heading/section Org: level of heading - level of article + 1
#PAR-CONTENT# x Org: content which is not recognized as something special
#A-URL# URL of a hyperlink Org: Org-link
#CONTENT# description of the hyperlink Org: Org-link
#CONTENT# text of the list item x Org: item content of Org list
#NAME# Org-mode name of a block Org: #+NAME: declaration
What template-name placeholder replacements
article article-header ARTICLE-TITLE, ABOUT-BLOG, BLOGNAME, ARTICLE-(ID,YEAR,MONTH,DAY,PUB*)
article-header-begin ARTICLE-TITLE, ABOUT-BLOG, BLOGNAME, ARTICLE-(ID,YEAR,MONTH,DAY,PUB*)
article-header-end ARTICLE-TITLE, ABOUT-BLOG, BLOGNAME, ARTICLE-(ID,YEAR,MONTH,DAY,PUB*)
article-tags-begin
article-usertag TAGNAME
article-autotag TAGNAME
article-tags-end ARTICLE-TITLE, ABOUT-BLOG, BLOGNAME, ARTICLE-(ID,YEAR,MONTH,DAY,PUB*)
article-footer ARTICLE-TITLE, ABOUT-BLOG, BLOGNAME, ARTICLE-(ID,YEAR,MONTH,DAY,PUB*)
article-end ARTICLE-TITLE, ABOUT-BLOG, BLOGNAME, ARTICLE-(ID,YEAR,MONTH,DAY,PUB*)
persistent-header ARTICLE-TITLE, ABOUT-BLOG, BLOGNAME, ARTICLE-(ID,YEAR,MONTH,DAY,PUB*)
persistent-header-begin ARTICLE-TITLE, ABOUT-BLOG, BLOGNAME, ARTICLE-(ID,YEAR,MONTH,DAY,PUB*)
persistent-header-end ARTICLE-TITLE, ABOUT-BLOG, BLOGNAME, ARTICLE-(ID,YEAR,MONTH,DAY,PUB*)
persistent-footer ARTICLE-TITLE, ABOUT-BLOG, BLOGNAME, ARTICLE-(ID,YEAR,MONTH,DAY,PUB*)
persistent-end ARTICLE-TITLE, ABOUT-BLOG, BLOGNAME, ARTICLE-(ID,YEAR,MONTH,DAY,PUB*)
headline section-begin SECTION-TITLE, SECTION-LEVEL
paragraph, plain-text paragraph PAR-CONTENT
URLs a-href A-URL, CONTENT
plain-list ul-begin
ul-item
ul-end
pre-formatted text pre-begin
pre-end
html-block html-begin NAME
html-end
src-block src-begin
src-end
named-src-begin
named-src-end
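
How placeholders get replaced in a template can be sketched as follows. The template string, the function names fill_template and section_level, and the use of html.escape for sanitizing are assumptions for this illustration; the level formula is the one given in the placeholder table (level of heading - level of article + 1).

```python
import html


def section_level(heading_level, article_level):
    """Relative level of a sub-heading within its article."""
    return heading_level - article_level + 1


def fill_template(template, values):
    """Replace every #NAME# placeholder by its value."""
    result = template
    for placeholder, value in values.items():
        result = result.replace('#' + placeholder + '#', value)
    return result


# Hypothetical template snippet; the real templates are part of the blog format
template = '<h#SECTION-LEVEL#>#SECTION-TITLE#</h#SECTION-LEVEL#>'

# Article heading on level 3 (***), sub-heading on level 4 (****)
rendered = fill_template(template, {
    'SECTION-LEVEL': str(section_level(4, 3)),
    'SECTION-TITLE': html.escape('Pros & Cons'),  # sanitized title
})
print(rendered)
```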