How to Recover a Pelican Site
Written by Katrina Ellison Geltman on Sun 23 March 2025.
Perhaps you, like me, have lost the original content used to generate a Pelican website. You have the site itself, but without the original settings and Markdown files, it's difficult to add any new content.
This article is about how to recover that content.
Install Python and Pelican
# Ensure Python >= 3.9
python3 --version
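# Install venv (the package name matches your Python version; this one is for 3.12)
sudo apt install python3.12-venv
# Create the virtual environment used by the commands below
python3 -m venv pelican-dev-env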
# Install pelican
./pelican-dev-env/bin/python -m ensurepip --default-pip
./pelican-dev-env/bin/python -m pip install "pelican[markdown]"
Serve Your Site Locally
When you want to preview your site, run this command in your project's root directory, then visit localhost:3333 in your browser. It serves everything in the output/ directory.
python3 -m http.server -d output 3333
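If you would rather start the preview server from a script, the same thing can be sketched with the standard library. This is a minimal sketch, not part of Pelican: the function name make_preview_server is my own, and unlike the command above it binds only to 127.0.0.1.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_preview_server(directory="output", port=3333):
    """Build a server that serves files from `directory` on localhost."""
    # SimpleHTTPRequestHandler accepts a `directory` keyword (Python 3.7+).
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# Usage (blocks until interrupted with Ctrl+C):
# make_preview_server().serve_forever()
```

Passing port=0 asks the operating system for any free port, which is handy if 3333 is already taken; the chosen port is then available as server.server_address[1].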
Install a Theme
Download one of the themes shown at Pelican Themes (most of them are available on GitHub). Put it in a directory that's outside your content/ directory. I chose to make a directory called themes/ and put it there, but you could also keep it completely outside of your site's code.
Wherever you put it, add it to your pelicanconf.py like this:
THEME = './themes/foundation-default-colours'
Add Settings to pelicanconf.py
You will likely need some settings in the file pelicanconf.py. These were mine:
THEME = './themes/foundation-default-colours'
SITENAME = "Katrina Ellison Geltman"
FOUNDATION_FOOTER_TEXT = ' '
FOUNDATION_PYGMENT_THEME = 'bw'
MONTH_ARCHIVE_SAVE_AS = 'posts/{date:%Y}/{date:%m}/index.html'
STATIC_PATHS = ['images', 'theme/img']
LINKS = [['LinkedIn', 'http://www.linkedin.com/in/katrinaellison'],
         ['GitHub', 'http://www.github.com/katrinae'],
         ['SlideShare', 'http://www.slideshare.net/kellison00']
        ]
FOUNDATION_ALTERNATE_FONTS = 'True'
See a full list of settings on the Pelican website.
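Since pelicanconf.py is plain Python, one way to sanity-check it before generating anything is to run it and look at what it defines. The sketch below uses only the standard library; the helper names load_settings and missing_paths are hypothetical, and the THEME check assumes you set THEME to a directory path (as I did) rather than to the name of an installed theme.

```python
import runpy
from pathlib import Path


def load_settings(conf_path="pelicanconf.py"):
    """Run a settings file and return its UPPERCASE names as a dict."""
    namespace = runpy.run_path(str(conf_path))
    return {name: value for name, value in namespace.items() if name.isupper()}


def missing_paths(settings, conf_dir="."):
    """Return the THEME value if it is not a directory under conf_dir."""
    problems = []
    theme = settings.get("THEME")
    if theme and not (Path(conf_dir) / theme).is_dir():
        problems.append(theme)
    return problems
```

For example, missing_paths(load_settings("pelicanconf.py")) returns an empty list when the theme directory is where the settings file says it is.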
Generate Input Markdown Files
I used this script (generated by Copilot) to convert my output HTML files back to Markdown.
import os
import html2text

# Function to convert HTML file to Markdown
def convert_html_to_markdown(html_file_path, markdown_file_path):
    with open(html_file_path, 'r', encoding='utf-8') as html_file:
        html_content = html_file.read()
    markdown_content = html2text.html2text(html_content)
    with open(markdown_file_path, 'w', encoding='utf-8') as markdown_file:
        markdown_file.write(markdown_content)

# Directory containing the HTML files
html_directory = '/home/katrina/pelican-dev/katrinae.github.io/'
# Directory to save the converted Markdown files
markdown_directory = '/home/katrina/pelican-dev/pelican-site/content/'

# Ensure the markdown directory exists
os.makedirs(markdown_directory, exist_ok=True)

# Convert each HTML file in the directory to Markdown
for html_file_name in os.listdir(html_directory):
    if html_file_name.endswith('.html'):
        html_file_path = os.path.join(html_directory, html_file_name)
        markdown_file_name = os.path.splitext(html_file_name)[0] + '.md'
        markdown_file_path = os.path.join(markdown_directory, markdown_file_name)
        convert_html_to_markdown(html_file_path, markdown_file_path)
        print(f'Converted {html_file_name} to {markdown_file_name}')
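The script above uses os.listdir, so it only sees HTML files at the top level of the site. If your articles live in subdirectories, a recursive variant might help. This is a sketch under that assumption; plan_conversions is a hypothetical helper that only works out which file maps to which, and leaves the conversion itself to convert_html_to_markdown.

```python
from pathlib import Path


def plan_conversions(html_root, markdown_root):
    """Pair each .html file under html_root with a .md path under markdown_root."""
    html_root = Path(html_root)
    markdown_root = Path(markdown_root)
    pairs = []
    # rglob walks subdirectories too; sorting keeps the order predictable.
    for html_path in sorted(html_root.rglob("*.html")):
        relative = html_path.relative_to(html_root)
        pairs.append((html_path, markdown_root / relative.with_suffix(".md")))
    return pairs
```

For each pair, create the target folder with markdown_path.parent.mkdir(parents=True, exist_ok=True) and then call convert_html_to_markdown(html_path, markdown_path). A recursive walk will probably also pick up generated pages such as archives and indexes, so expect to delete some of the results by hand.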
You also have to add some metadata to the beginning of each Markdown file, like this:
Title: How to Recover a Pelican Site
Date: Sunday, March 23 2025
Author: Katrina Ellison Geltman
Slug: site-recovery
The slug corresponds to the page's URL (i.e. site-recovery becomes site-recovery.html).
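Adding the metadata by hand gets tedious with more than a few files, so here is a minimal sketch of a helper that prepends it. The function name add_metadata is my own invention, and the title, date, author, and slug are values you supply yourself; nothing here tries to recover them from the HTML.

```python
def add_metadata(markdown_text, title, date, author, slug):
    """Prepend a Pelican-style metadata block to Markdown text."""
    header = (
        f"Title: {title}\n"
        f"Date: {date}\n"
        f"Author: {author}\n"
        f"Slug: {slug}\n"
    )
    # A blank line separates the metadata from the article body.
    return header + "\n" + markdown_text.lstrip("\n")
```

You would read each converted .md file, pass its text through add_metadata, and write the result back.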
Copy Images
I put my images in content/images/ and then added STATIC_PATHS = ['images'] to my pelicanconf.py. This copies the images to output/images/ so they can be served at images/.
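If the old site already has an images/ folder, the copy into content/images/ can be scripted. This is a sketch assuming that layout; copy_images is a hypothetical helper, and both directory arguments are whatever paths you use on your machine.

```python
import shutil
from pathlib import Path


def copy_images(old_site_dir, content_dir):
    """Copy old_site_dir/images into content_dir/images.

    Returns the files now present under content_dir/images.
    """
    source = Path(old_site_dir) / "images"
    destination = Path(content_dir) / "images"
    # dirs_exist_ok (Python 3.8+) lets the copy merge into an existing folder.
    shutil.copytree(source, destination, dirs_exist_ok=True)
    return sorted(
        p.relative_to(destination) for p in destination.rglob("*") if p.is_file()
    )
```

Running it a second time is harmless: files with the same name are simply overwritten.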
Generate Output
When everything is in place, generate output with:
python -m pelican content -s pelicanconf.py
Then compare it to your old output.
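One way to do that comparison is with the standard library's filecmp module. The sketch below is only a starting point: compare_sites is a hypothetical helper, and it reports which files exist on one side only or differ, not what the differences are.

```python
import filecmp


def compare_sites(old_dir, new_dir):
    """Recursively compare two directories of generated output."""
    report = {"only_old": [], "only_new": [], "different": []}

    def walk(comparison, prefix=""):
        report["only_old"] += [prefix + name for name in comparison.left_only]
        report["only_new"] += [prefix + name for name in comparison.right_only]
        report["different"] += [prefix + name for name in comparison.diff_files]
        # Descend into directories that exist on both sides.
        for name, sub in comparison.subdirs.items():
            walk(sub, prefix + name + "/")

    walk(filecmp.dircmp(old_dir, new_dir))
    return {key: sorted(value) for key, value in report.items()}
```

Expect most HTML files to show up as different at first, since the regenerated pages rarely match the old ones byte for byte. The only_old list is the more useful one: it shows pages and images you have not recovered yet.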