Sitemap Generation
Automatic XML sitemap and JSON search index generation for search engine discovery and site search.
Overview
The theme generates:
- XML Sitemap: For search engine crawlers
- JSON Search Index: For client-side search
- robots.txt: Crawler instructions
XML Sitemap
Automatic Generation
Enable the jekyll-sitemap plugin in _config.yml:
# _config.yml
plugins:
- jekyll-sitemap
Output
Generated at /sitemap.xml:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yoursite.com/</loc>
    <lastmod>2025-01-25T00:00:00+00:00</lastmod>
  </url>
  <url>
    <loc>https://yoursite.com/docs/</loc>
    <lastmod>2025-01-20T00:00:00+00:00</lastmod>
  </url>
</urlset>
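To confirm what the generated file actually contains, the output can be read back with a short script. The following is a minimal Python sketch, not part of the theme: the function name `read_sitemap` and the inline `SAMPLE` string are illustrative assumptions, and in practice the text would come from the built sitemap file.

```python
import xml.etree.ElementTree as ET

# Namespace declared by the sitemap protocol (same value as the xmlns above).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def read_sitemap(xml_text):
    """Return a list of (loc, lastmod) tuples; lastmod is None when absent.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS)
        entries.append((loc, lastmod.strip() if lastmod else None))
    return entries


# Sample input copied from the output shown above.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yoursite.com/</loc>
    <lastmod>2025-01-25T00:00:00+00:00</lastmod>
  </url>
  <url>
    <loc>https://yoursite.com/docs/</loc>
    <lastmod>2025-01-20T00:00:00+00:00</lastmod>
  </url>
</urlset>"""

for loc, lastmod in read_sitemap(SAMPLE):
    print(loc, lastmod)
```

The namespace mapping matters: without it, `findall("url")` returns nothing because every element in a sitemap lives in the sitemaps.org namespace.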
Custom Sitemap
To control the output yourself, create sitemap.xml manually as a Liquid template:
---
layout: null
---
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  {% for page in site.pages %}
  {% unless page.sitemap == false %}
  <url>
    <loc>{{ page.url | absolute_url }}</loc>
    <lastmod>{{ page.last_modified_at | default: site.time | date_to_xmlschema }}</lastmod>
    <changefreq>{{ page.sitemap.changefreq | default: 'monthly' }}</changefreq>
    <priority>{{ page.sitemap.priority | default: '0.5' }}</priority>
  </url>
  {% endunless %}
  {% endfor %}
</urlset>
JSON Search Index
Generated File
The index is written to search.json for client-side search:
[
  {
    "title": "Getting Started",
    "url": "/docs/getting-started/",
    "content": "Welcome to the documentation...",
    "categories": ["docs"],
    "tags": ["setup"]
  }
]
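To illustrate how a consumer might use this structure, here is a minimal Python sketch of a case-insensitive substring search over the index entries. It only demonstrates the data shape: real client-side search runs in the browser, and the `search` function and `INDEX_JSON` sample are assumptions made for this example, not code shipped with the theme.

```python
import json

# Sample index with the same fields as the generated file shown above.
INDEX_JSON = """[
  {
    "title": "Getting Started",
    "url": "/docs/getting-started/",
    "content": "Welcome to the documentation...",
    "categories": ["docs"],
    "tags": ["setup"]
  }
]"""


def search(index, query):
    """Return entries whose title, content, or tags contain the query.

    Hypothetical helper: plain substring matching, no ranking or stemming.
    """
    needle = query.lower()
    hits = []
    for entry in index:
        # Fields may be null in the JSON when a page does not define them.
        parts = [entry.get("title") or "", entry.get("content") or ""]
        parts += list(entry.get("tags") or [])
        if needle in " ".join(parts).lower():
            hits.append(entry)
    return hits


index = json.loads(INDEX_JSON)
for hit in search(index, "setup"):
    print(hit["title"], hit["url"])
```

The `or ""` and `or []` fallbacks are there because a page without tags or categories can serialize those fields as null.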
Search Template
---
layout: null
---
[
  {% assign pages = site.pages | where_exp: "page", "page.title" %}
  {% for page in pages %}
  {
    "title": {{ page.title | jsonify }},
    "url": {{ page.url | jsonify }},
    "content": {{ page.content | strip_html | truncatewords: 100 | jsonify }},
    "categories": {{ page.categories | jsonify }},
    "tags": {{ page.tags | jsonify }}
  }{% unless forloop.last %},{% endunless %}
  {% endfor %}
]
robots.txt
Basic Configuration
# robots.txt
User-agent: *
Allow: /
Sitemap: https://yoursite.com/sitemap.xml
Jekyll Template
---
layout: null
permalink: /robots.txt
---
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /private/
Sitemap: {{ "/sitemap.xml" | absolute_url }}
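Because the template mixes a broad Allow with narrower Disallow rules, it helps to check how a crawler would resolve them. The sketch below is a simplified Python model of the longest-match rule described in RFC 9309 (the most specific matching path wins, and Allow wins a tie). It assumes a single `User-agent: *` group and plain path prefixes without wildcards; `parse_rules`, `is_allowed`, and `ROBOTS_TXT` are hypothetical names. Parsers that apply the first matching rule instead can give different answers for the same file.

```python
# Rendered form of the template above, with an assumed site URL.
ROBOTS_TXT = """User-agent: *
Allow: /
Disallow: /admin/
Disallow: /private/
Sitemap: https://yoursite.com/sitemap.xml
"""


def parse_rules(robots_txt):
    """Collect (directive, path) pairs from Allow/Disallow lines.

    Simplification: the whole file is treated as one `User-agent: *` group.
    """
    rules = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field in ("allow", "disallow") and value:
            rules.append((field, value))
    return rules


def is_allowed(rules, path):
    """Longest matching prefix wins; on equal length, allow wins."""
    best_len, allowed = -1, True  # no matching rule means allowed
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and directive == "allow"
        if longer or tie_for_allow:
            best_len, allowed = len(prefix), directive == "allow"
    return allowed


RULES = parse_rules(ROBOTS_TXT)
for path in ("/docs/", "/admin/users", "/private/"):
    print(path, "allowed" if is_allowed(RULES, path) else "blocked")
```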
Excluding Pages
From XML Sitemap
---
sitemap: false
---
Or exclude a whole directory with front matter defaults:
# _config.yml
defaults:
  - scope:
      path: "admin/*"
    values:
      sitemap: false
From Search Index
---
search: false
---
Then skip pages that set this flag in the search template:
{% unless page.search == false %}
// Include in search index
{% endunless %}
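Priority and Frequency
The custom sitemap template above reads sitemap.priority and sitemap.changefreq; the jekyll-sitemap output shown earlier contains only loc and lastmod.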
Per-Page Settings
---
sitemap:
  priority: 0.8
  changefreq: weekly
---
Default Settings
# _config.yml
defaults:
  - scope:
      path: ""
      type: "posts"
    values:
      sitemap:
        changefreq: monthly
        priority: 0.7
  - scope:
      path: ""
      type: "pages"
    values:
      sitemap:
        changefreq: weekly
        priority: 0.5
Search Engine Submission
Google Search Console
- Go to Search Console
- Add your site
- Submit the sitemap URL: https://yoursite.com/sitemap.xml
Bing Webmaster Tools
- Go to Bing Webmaster Tools
- Add your site
- Submit sitemap
Validation
XML Validation
Test the generated file with an XML sitemap validator.
Google Search Console
Check sitemap status in Search Console → Sitemaps
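A few basic checks can also be run locally before submitting. This is a minimal Python sketch, assuming the sitemap text has already been read into a string. The function `check_sitemap` and the two sample strings are illustrative only; the allowed changefreq values and the 0.0 to 1.0 priority range come from the sitemaps.org protocol. It does not replace a full schema validator.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
CHANGEFREQ = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def check_sitemap(xml_text):
    """Return a list of problems found; an empty list means none were found.

    Hypothetical helper covering well-formedness, root element, absolute
    loc URLs, changefreq values, and priority range.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]
    problems = []
    if root.tag != f"{{{NS}}}urlset":
        problems.append(f"unexpected root element: {root.tag}")
    for url in root.iter(f"{{{NS}}}url"):
        loc = (url.findtext(f"{{{NS}}}loc") or "").strip()
        parsed = urlparse(loc)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"loc is not an absolute URL: {loc!r}")
        freq = url.findtext(f"{{{NS}}}changefreq")
        if freq is not None and freq.strip() not in CHANGEFREQ:
            problems.append(f"{loc}: invalid changefreq {freq.strip()!r}")
        priority = url.findtext(f"{{{NS}}}priority")
        if priority is not None:
            try:
                in_range = 0.0 <= float(priority) <= 1.0
            except ValueError:
                in_range = False
            if not in_range:
                problems.append(f"{loc}: priority out of range: {priority.strip()!r}")
    return problems


GOOD = (
    f'<urlset xmlns="{NS}"><url><loc>https://yoursite.com/</loc>'
    "<changefreq>weekly</changefreq><priority>0.8</priority></url></urlset>"
)
BAD = (
    f'<urlset xmlns="{NS}"><url><loc>/docs/</loc>'
    "<changefreq>sometimes</changefreq><priority>1.5</priority></url></urlset>"
)

print(check_sitemap(GOOD))
for problem in check_sitemap(BAD):
    print(problem)
```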
Troubleshooting
Sitemap Not Found
- Check plugin is installed
- Verify _site/sitemap.xml exists
- Check file permissions
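The file checks above can be scripted after a build. The following Python sketch assumes the default `_site` output directory; `check_build_output` is a hypothetical helper name, and it cannot tell whether the plugin itself is installed.

```python
import os
from pathlib import Path


def check_build_output(site_dir="_site", name="sitemap.xml"):
    """Report whether the built file exists, is readable, and is non-empty.

    Hypothetical helper; `site_dir` assumes the default output directory.
    """
    path = Path(site_dir) / name
    if not path.is_file():
        return f"missing: {path}"
    if not os.access(path, os.R_OK):
        return f"not readable: {path}"
    if path.stat().st_size == 0:
        return f"empty: {path}"
    return "ok"


# Run from the project root after building the site.
print(check_build_output())
```

The same call with `name="search.json"` or `name="robots.txt"` covers the other generated files.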
Pages Missing
- Verify page isn't excluded
- Check front matter for sitemap: false
- Ensure page has title
JSON Invalid
- Check for unescaped characters
- Validate JSON syntax
- Check Liquid template errors
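Syntax errors in the generated index are easiest to locate with a parser that reports the position. Below is a minimal Python sketch, assuming the contents of search.json have been read into a string. The function `check_search_index` and the choice of `title` and `url` as required keys are assumptions for illustration.

```python
import json

# Assumption: every entry should at least carry a title and a URL.
REQUIRED_KEYS = ("title", "url")


def check_search_index(text):
    """Return a list of problems; an empty list means none were found."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        # lineno/colno point at the first character the parser rejected.
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]
    if not isinstance(data, list):
        return ["top-level value should be an array of entries"]
    problems = []
    for i, entry in enumerate(data):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: expected an object")
            continue
        for key in REQUIRED_KEYS:
            if not entry.get(key):
                problems.append(f"entry {i}: missing or empty {key!r}")
    return problems


# A trailing comma is a typical result of a broken loop separator.
print(check_search_index('[{"title": "A", "url": "/a/"},]'))
print(check_search_index('[{"title": "A", "url": "/a/"}]'))
```

A trailing comma before the closing bracket usually points at the separator logic in the template loop, for example when entries are skipped inside the loop after the comma has already been emitted.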
Best Practices
Keep Sitemap Updated
- Regenerate on deploy
- Include lastmod dates
- Remove deleted pages
Optimize for Search
- Include all important pages
- Use descriptive titles
- Keep URLs clean
Monitor Performance
- Check indexing status
- Monitor crawl errors
- Review search analytics
Related
See also
- [[SEO]]
- [[Deployment]]
- [[Analytics]]