How to Recover a Pelican Site

Written by Katrina Ellison Geltman on Sun 23 March 2025.

Perhaps you, like me, have lost the original content used to generate a Pelican website. You have the site itself, but without the original settings and Markdown files, it's difficult to add any new content.

This article is about how to recover that content.

Install Python and Pelican

# Ensure Python >= 3.9
python3 --version

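# Install venv (Debian/Ubuntu; match the package to your Python version)
sudo apt install python3.12-venv

# Create the virtual environment used below
python3 -m venv pelican-dev-env

# Install Pelican with Markdown support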
./pelican-dev-env/bin/python -m ensurepip --default-pip
./pelican-dev-env/bin/python -m pip install "pelican[markdown]"

Serve Your Site Locally

When you want to preview your site, run this command in your project's root directory, then visit localhost:3333 in your browser. It serves everything in the output/ directory.

python3 -m http.server -d output 3333
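
If you would rather start the preview from a script, the standard library exposes the same server programmatically. This is only a minimal sketch and not part of my original workflow: the function name make_preview_server is made up for illustration, and the output directory and port 3333 are simply the values from the command above.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_preview_server(directory="output", port=3333):
    """Build (but do not start) an HTTP server for the files in `directory`."""
    # partial() pins the directory, so every request is resolved inside it.
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    # Assumption: bind to localhost only, so the preview stays on this machine.
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling make_preview_server().serve_forever() from the project's root directory should then serve output/ until you stop it with Ctrl+C.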

Install a Theme

Download one of the themes shown at Pelican Themes (most of them are available on GitHub). Put it in a directory that's outside your content/ directory. I chose to make a directory called themes/ and put it there, but you could also keep it completely outside of your site's code.

Wherever you put it, add it to your pelicanconf.py like this:

THEME = './themes/foundation-default-colours'

Add Settings to pelicanconf.py

You will likely need some settings in the file pelicanconf.py. These were mine:

THEME = './themes/foundation-default-colours'
SITENAME = "Katrina Ellison Geltman"
FOUNDATION_FOOTER_TEXT = ' '
FOUNDATION_PYGMENT_THEME = 'bw'
MONTH_ARCHIVE_SAVE_AS = 'posts/{date:%Y}/{date:%m}/index.html'
STATIC_PATHS = ['images', 'theme/img']
LINKS = [
    ['LinkedIn', 'http://www.linkedin.com/in/katrinaellison'],
    ['GitHub', 'http://www.github.com/katrinae'],
    ['SlideShare', 'http://www.slideshare.net/kellison00'],
]
FOUNDATION_ALTERNATE_FONTS = 'True'

See a full list of settings on the Pelican website.
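
Pelican only picks up the upper-case names in pelicanconf.py, so a quick sanity check before generating anything can save some confusion. The sketch below is an assumption-laden helper, not Pelican's own loader: load_settings and missing_theme are hypothetical names, and it assumes THEME is a path relative to the project root, as it is in my settings.

```python
import runpy
from pathlib import Path


def load_settings(conf_path):
    """Run a pelicanconf.py-style file and return its upper-case names."""
    namespace = runpy.run_path(str(conf_path))
    return {name: value for name, value in namespace.items() if name.isupper()}


def missing_theme(settings, project_root="."):
    """Return the THEME path if it is set but is not a directory, else None."""
    theme = settings.get("THEME")
    if theme is None:
        return None
    # Assumption: THEME is a filesystem path, not the name of an installed theme.
    theme_path = Path(project_root) / theme
    return None if theme_path.is_dir() else str(theme_path)
```

For example, missing_theme(load_settings("pelicanconf.py")) returns None when the theme directory is where the settings say it is.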

Generate Input Markdown Files

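I used this script (generated by Copilot) to convert my output HTML files back to Markdown. It depends on the html2text package, which you can install into the same environment with ./pelican-dev-env/bin/python -m pip install html2text.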

import os
import html2text

# Function to convert HTML file to Markdown
def convert_html_to_markdown(html_file_path, markdown_file_path):
    with open(html_file_path, 'r', encoding='utf-8') as html_file:
        html_content = html_file.read()

    markdown_content = html2text.html2text(html_content)

    with open(markdown_file_path, 'w', encoding='utf-8') as markdown_file:
        markdown_file.write(markdown_content)

# Directory containing the HTML files
html_directory = '/home/katrina/pelican-dev/katrinae.github.io/'
# Directory to save the converted Markdown files
markdown_directory = '/home/katrina/pelican-dev/pelican-site/content/'

# Ensure the markdown directory exists
os.makedirs(markdown_directory, exist_ok=True)

# Convert each HTML file in the directory to Markdown
for html_file_name in os.listdir(html_directory):
    if html_file_name.endswith('.html'):
        html_file_path = os.path.join(html_directory, html_file_name)
        markdown_file_name = os.path.splitext(html_file_name)[0] + '.md'
        markdown_file_path = os.path.join(markdown_directory, markdown_file_name)

        convert_html_to_markdown(html_file_path, markdown_file_path)
        print(f'Converted {html_file_name} to {markdown_file_name}')
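
The script only looks at the top level of html_directory. If your old site keeps pages in subdirectories, something like the sketch below could list the nested files too. The function name plan_conversions is hypothetical, and it only plans the work (pairing each HTML file with a Markdown path) rather than converting anything.

```python
from pathlib import Path


def plan_conversions(html_root, markdown_root):
    """Pair every .html file under html_root with a .md path under markdown_root."""
    html_root, markdown_root = Path(html_root), Path(markdown_root)
    pairs = []
    for html_path in sorted(html_root.rglob("*.html")):
        # Keep the same relative layout, swapping only the file extension.
        relative = html_path.relative_to(html_root).with_suffix(".md")
        pairs.append((html_path, markdown_root / relative))
    return pairs
```

Each pair could then be handed to convert_html_to_markdown once the destination's parent directory exists. Keep in mind that generated pages such as indexes and archives are HTML too, so you will probably want to skip those rather than convert them.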

You also have to add some metadata to the beginning of each Markdown file, like this:

Title: How to Recover a Pelican Site
Date: Sunday, March 23 2025
Author: Katrina Ellison Geltman
Slug: site-recovery

The slug determines the page's file name and URL: for example, the slug site-recovery produces site-recovery.html.
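
Adding that header by hand gets tedious with more than a few posts. Here is a rough sketch of how it might be scripted. The function names are made up for illustration, and it assumes the Markdown file's name (without .md) is the slug you want; you still have to supply the title, date, and author yourself.

```python
from pathlib import Path


def build_metadata(title, date, author, slug):
    """Return a Pelican-style metadata header followed by a blank line."""
    fields = [("Title", title), ("Date", date), ("Author", author), ("Slug", slug)]
    return "".join(f"{key}: {value}\n" for key, value in fields) + "\n"


def prepend_metadata(markdown_path, title, date, author):
    """Add a metadata header to a Markdown file unless it already has one."""
    markdown_path = Path(markdown_path)
    text = markdown_path.read_text(encoding="utf-8")
    if text.startswith("Title:"):
        return  # Already has a header; leave the file untouched.
    # Assumption: the file name without its extension is the desired slug.
    header = build_metadata(title, date, author, slug=markdown_path.stem)
    markdown_path.write_text(header + text, encoding="utf-8")
```

Running it twice on the same file is harmless, because a file that already starts with Title: is skipped.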

Copy Images

I put my images in content/images/ and then added STATIC_PATHS = ['images'] to my pelicanconf.py. This copies the images to output/images/ so they can be served at images/.
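
If the old site still has its images, copying them across can be scripted as well. This is a small sketch under the assumption that the old site keeps them in an images/ directory at its root; copy_images is a hypothetical helper name, and both paths are whatever you pass in.

```python
import shutil
from pathlib import Path


def copy_images(old_site, content_dir):
    """Copy old_site/images into content_dir/images and list what is there."""
    src = Path(old_site) / "images"
    dst = Path(content_dir) / "images"
    # dirs_exist_ok lets the copy be re-run without failing on existing folders.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sorted(p.relative_to(dst).as_posix() for p in dst.rglob("*") if p.is_file())
```

The returned list makes it easy to spot anything that did not come across.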

Generate Output

When everything is in place, generate output with:

python -m pelican content -s pelicanconf.py

Then compare it to your old output.
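
For more than a handful of pages, a script can do a first pass at that comparison. The sketch below uses the standard library's filecmp module; compare_sites is a made-up name, and the two directories are whatever paths hold your old site and the new output/. Regenerated HTML may well differ in small ways from the original, so treat the result as a list of things to look at rather than a verdict.

```python
import filecmp
from pathlib import Path


def compare_sites(old_dir, new_dir):
    """Recursively compare two directory trees and collect the differences."""
    report = {"only_in_old": [], "only_in_new": [], "different": []}

    def walk(cmp, prefix):
        report["only_in_old"] += [(prefix / name).as_posix() for name in cmp.left_only]
        report["only_in_new"] += [(prefix / name).as_posix() for name in cmp.right_only]
        # filecmp's default is a shallow check: files whose type, size, and
        # modification time all match are treated as equal without being read.
        report["different"] += [(prefix / name).as_posix() for name in cmp.diff_files]
        for name, sub in cmp.subdirs.items():
            walk(sub, prefix / name)

    walk(filecmp.dircmp(old_dir, new_dir), Path("."))
    return report
```

Files listed under only_in_old are the ones most likely to point at content you have not recovered yet.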
