markdown

stable

Parse and analyze Markdown text using the pulldown-cmark engine, with support for conversion to HTML, content extraction, and plain-text rendering.

use plugin markdown::{to_html, extract_headings, extract_links, …}
14 functions Data Formats
Functions (14)
  1. to_html: Convert Markdown string to HTML
  2. extract_headings: Extract all headings with level and text
  3. extract_links: Extract all hyperlinks with text and URL
  4. extract_code_blocks: Extract fenced and indented code blocks
  5. strip: Strip Markdown syntax, return plain text
  6. extract_images: Extract image alt text and URLs
  7. extract_tables: Extract tables as nested cell arrays
  8. extract_frontmatter: Extract YAML frontmatter between `---` delimiters
  9. toc: Build table of contents with slugified IDs
  10. word_count: Count words across all text nodes
  11. reading_time: Estimate reading time in minutes
  12. to_plaintext: Strip Markdown and frontmatter, return plain text
  13. extract_task_items: Extract GFM task list items with checked state
  14. parse_to_ast: Parse Markdown into a flat list of AST events

Overview

markdown is a stateless analysis toolkit built on the pulldown-cmark engine. Every function takes a Markdown string and returns a plain Zolo value (a string, an integer, a table, or nil when extract_frontmatter finds no frontmatter), so there is no document handle to manage and no parser to set up. The parser runs with GitHub-Flavored extensions enabled (tables, footnotes, strikethrough, and task lists), which means the extraction helpers understand modern Markdown out of the box.

Reach for it whenever you need to render Markdown to HTML, pull structured content out of a document (headings, links, images, code blocks, tables, task items, frontmatter), or reduce prose to a searchable plain-text form. Higher-level helpers like toc, word_count, and reading_time compose these primitives into common documentation chores, while parse_to_ast exposes the raw event stream when you need full control.

Common patterns

Render a document and summarize it for a content index:

use plugin markdown::{to_html, word_count, reading_time}

let doc = "# Release notes\n\nWe shipped **custom iterators** today."
let html = to_html(doc)
print("words: {word_count(doc)}, read: {reading_time(doc)} min")
print(html)

Build a navigable table of contents with anchor links:

use plugin markdown::{toc}

let doc = "# Intro\n## Setup\n## Usage\n### Advanced"
let entries = toc(doc)
for i = 1, 4 {
  let e = entries[i]
  print("[{e["text"]}](#{e["id"]}) (H{e["level"]})")
}

Strip frontmatter and collect every outbound link from a post:

use plugin markdown::{to_plaintext, extract_links}

let post = "---\ntitle: Hi\n---\n\nSee [Zolo](https://zolo-lang.com) and [docs](https://zolo-lang.com/docs)."
let body = to_plaintext(post)
let links = extract_links(post)
print("plain: {body}")
print("first link: {links[1]["url"]}")

Convert Markdown string to HTML

Converts a Markdown string to an HTML string using the pulldown-cmark parser with tables, footnotes, strikethrough, and task list extensions enabled.

use plugin markdown::{to_html}

let html = to_html("# Hello\n\nThis is **bold** text.")
print(html)

GFM extensions are on, so tables and task lists render too:

use plugin markdown::{to_html}

let doc = "- [x] done\n- [ ] todo\n\n| A | B |\n|---|---|\n| 1 | 2 |"
print(to_html(doc))

Extract all headings with level and text

Returns a table of {level, text} entries for each heading in the document. Level is an integer from 1 to 6.

use plugin markdown::{extract_headings}

let doc = "# Title\n## Section\n### Sub"
let headings = extract_headings(doc)
let h = headings[1]
print("{h["text"]} (H{h["level"]})")
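
Extract all hyperlinks with text and URL

Returns one entry per hyperlink in the document. The url field appears in the Common patterns example; a text field holding the link label is an assumption based on the function summary, so the sketch below reads only url. The loop bounds assume the two links in this sample document.

```zolo
use plugin markdown::{extract_links}

let doc = "See [Zolo](https://zolo-lang.com) and [docs](https://zolo-lang.com/docs)."
let links = extract_links(doc)
for i = 1, 2 {
  print(links[i]["url"])
}
```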

Extract fenced and indented code blocks

Returns a table of {language, code} entries for each fenced or indented code block. language is an empty string for indented blocks.

use plugin markdown::{extract_code_blocks}

let doc = "```zolo\nlet x = 1\n```"
let blocks = extract_code_blocks(doc)
print(blocks[1]["language"])
print(blocks[1]["code"])

Filter a multi-block document down to just the Zolo snippets:

use plugin markdown::{extract_code_blocks}

let doc = "```rust\nfn main() {}\n```\n\n```zolo\nlet x = 1\n```"
let blocks = extract_code_blocks(doc)
for i = 1, 2 {
  let b = blocks[i]
  if b["language"] == "zolo" {
    print("zolo block: {b["code"]}")
  }
}

Strip Markdown syntax, return plain text

Strips all Markdown syntax and returns the raw text content, preserving paragraph and heading line breaks.

use plugin markdown::{strip}

let plain = strip("# Heading\n\nSome **bold** text.")
print(plain)

Extract image alt text and URLs

Returns a table of {alt, url} entries for each image reference in the document.

use plugin markdown::{extract_images}

let doc = "![Logo](https://example.com/logo.png)"
let images = extract_images(doc)
print(images[1]["alt"])

Extract tables as nested cell arrays

Parses Markdown tables and returns a nested structure: a table with one entry per Markdown table, where each entry is a table of rows and each row is a table of cell strings. The first row of each entry is the header row.

use plugin markdown::{extract_tables}

let doc = "| Name | Age |\n|------|-----|\n| Alice | 30 |"
let tables = extract_tables(doc)
let first_table = tables[1]
let header_row = first_table[1]
print(header_row[1])
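
To read the data rows as well as the header, index past the first row. A minimal sketch, assuming a two-column table with two data rows so that the loop bounds 2 to 3 cover them:

```zolo
use plugin markdown::{extract_tables}

let doc = "| Name | Age |\n|------|-----|\n| Alice | 30 |\n| Bob | 25 |"
let tables = extract_tables(doc)
let rows = tables[1]
for i = 2, 3 {
  let row = rows[i]
  print("{row[1]} is {row[2]}")
}
```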

Extract YAML frontmatter between `---` delimiters

Extracts the raw YAML content between the opening and closing --- delimiters at the top of the document. Returns nil if no frontmatter is present.

use plugin markdown::{extract_frontmatter}

let doc = "---\ntitle: Hello\nauthor: Alice\n---\n\n# Content"
let fm = extract_frontmatter(doc)
print(fm)
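
Because the function returns nil when there is no frontmatter, callers may want to guard before using the result. A minimal sketch, assuming the nil comparison and if/else forms used elsewhere on this page also work as a statement:

```zolo
use plugin markdown::{extract_frontmatter}

let doc = "# No frontmatter here"
let fm = extract_frontmatter(doc)
if fm == nil {
  print("no frontmatter")
} else {
  print(fm)
}
```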

Build table of contents with slugified IDs

Builds a table of contents from all headings in the document. Each entry contains {level, text, id} where id is a slugified anchor string.

use plugin markdown::{toc}

let doc = "# Introduction\n## Getting Started\n## Advanced Usage"
let entries = toc(doc)
for i = 1, 3 {
  let entry = entries[i]
  print("{entry["level"]}: {entry["text"]} -> #{entry["id"]}")
}

Count words across all text nodes

Counts all whitespace-separated words in text and code nodes of the document.

use plugin markdown::{word_count}

let doc = "# Hello World\n\nThis document has several words in it."
let count = word_count(doc)
print("Words: {count}")

Code spans and fenced blocks count toward the total, so the count reflects everything a reader sees:

use plugin markdown::{word_count}

let doc = "Run `cargo build` then read the next two words."
print("Words: {word_count(doc)}")

Estimate reading time in minutes

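Estimates the reading time in minutes based on word count. The optional wpm parameter sets the reading speed in words per minute (default 200). The result is rounded up (ceiling division), so any document with at least one word reports at least 1 minute. For example, 450 words at the default 200 wpm gives ceil(450 / 200) = 3 minutes, and at 300 wpm gives ceil(450 / 300) = 2 minutes.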

use plugin markdown::{reading_time}

let doc = "..."
let mins = reading_time(doc)
let mins_fast = reading_time(doc, 300)
print("~{mins} min read")
print("~{mins_fast} min read at 300 wpm")

Strip Markdown and frontmatter, return plain text

Works like strip, but first removes any YAML frontmatter block before extracting the plain text. Useful for indexing or summarizing documents that include frontmatter.

use plugin markdown::{to_plaintext}

let doc = "---\ntitle: My Post\n---\n\n# Hello\n\nSome **text**."
let plain = to_plaintext(doc)
print(plain)
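
To see how the two functions differ on a document that has frontmatter, both can be run on the same input. A small sketch using only the two functions documented on this page; how strip renders the frontmatter lines is not specified here, so no expected output is shown:

```zolo
use plugin markdown::{strip, to_plaintext}

let doc = "---\ntitle: My Post\n---\n\n# Hello\n\nSome **text**."
print("strip: {strip(doc)}")
print("to_plaintext: {to_plaintext(doc)}")
```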

Extract GFM task list items with checked state

Extracts GitHub-Flavored Markdown task list items (- [ ] task / - [x] done). Returns a table of {checked, text} entries.

use plugin markdown::{extract_task_items}

let doc = "- [x] Write tests\n- [ ] Deploy to prod\n- [x] Update docs"
let items = extract_task_items(doc)
for i = 1, 3 {
  let item = items[i]
  let status = if item["checked"] { "done" } else { "todo" }
  print("{status}: {item["text"]}")
}

Parse Markdown into a flat list of AST events

Parses the Markdown document and returns a flat list of pulldown-cmark events as tables. Each event has at least an event field ("start", "end", "text", "code", etc.), plus additional fields that depend on its type (e.g. tag, level, url, language).

use plugin markdown::{parse_to_ast}

let doc = "# Hello\n\nParagraph with [link](https://example.com)."
let events = parse_to_ast(doc)
let first = events[1]
print("{first["event"]}: {first["tag"]}")

Walk the event stream to collect text emitted inside headings only:

use plugin markdown::{parse_to_ast}

let doc = "# Title\n\nBody text.\n\n## Section"
let events = parse_to_ast(doc)
let depth = 0
for i = 1, 999 {
  let e = events[i]
  if e == nil { break }
  if e["event"] == "start" and e["tag"] == "heading" { depth = depth + 1 }
  if e["event"] == "end" and e["tag"] == "heading" { depth = depth - 1 }
  if depth > 0 and e["event"] == "text" {
    print("heading text: {e["text"]}")
  }
}