XML sitemaps: a no-nonsense guide
A sitemap helps search engines discover and prioritize your pages. Here is what to include, what to leave out, and how to keep it healthy.
SEO Pine · February 2, 2026 · 1 min read
An XML sitemap is a list of the URLs you want search engines to crawl, with optional hints about when each was last changed. It does not guarantee indexing, but it makes discovery faster and more reliable, especially for large or deep sites.
What to include
- Canonical URLs you want indexed, one entry each.
- A lastmod date that reflects real content changes.
- Only pages that return a 200 and are indexable.
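As a rough illustration of the rules above, the following Python sketch writes a minimal sitemap from a list of (URL, lastmod) pairs. The function name build_sitemap and the example.com URLs are placeholders invented for this example, not part of any particular tool. It assumes the input has already been reduced to canonical, indexable URLs that return a 200, and that lastmod values are W3C-style dates such as 2026-02-02.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML for (url, lastmod) pairs; lastmod may be None.

    Hypothetical helper: it assumes every URL passed in is canonical,
    indexable, and returns a 200.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" in the URL for us.
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            # Only emit lastmod when a real change date is known.
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([
        ("https://example.com/", "2026-02-02"),
        ("https://example.com/pricing", None),
    ]))
```

Leaving lastmod out when the change date is unknown is a deliberate choice here: a missing hint is better than a wrong one.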
What to leave out
- Noindex pages, redirects, and error pages.
- Duplicate or parameter URLs that are not canonical.
- Pages blocked in robots.txt.
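The include and exclude rules can be combined into a single filter. This is a sketch only: the Page record and its fields (status, noindex, canonical, blocked_by_robots) are hypothetical and stand in for whatever your crawler or CMS export actually provides.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical crawl record; field names are assumptions for this example."""
    url: str
    status: int              # HTTP status code returned for the URL
    noindex: bool            # True if the page carries a noindex directive
    canonical: str           # canonical URL the page declares
    blocked_by_robots: bool  # True if robots.txt disallows the URL


def belongs_in_sitemap(page):
    """Keep only 200, indexable, self-canonical pages that crawlers may fetch."""
    return (
        page.status == 200                 # drops redirects and error pages
        and not page.noindex               # drops noindex pages
        and page.canonical == page.url     # drops duplicates and parameter URLs
        and not page.blocked_by_robots     # drops pages blocked in robots.txt
    )


if __name__ == "__main__":
    pages = [
        Page("https://example.com/", 200, False, "https://example.com/", False),
        Page("https://example.com/?ref=x", 200, False, "https://example.com/", False),
        Page("https://example.com/old", 301, False, "https://example.com/old", False),
    ]
    print([p.url for p in pages if belongs_in_sitemap(p)])
```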
Reference it everywhere
Add a Sitemap directive with the sitemap's full URL to your robots.txt, and submit the sitemap in each search engine's webmaster console (for example, Google Search Console). Both help engines find it quickly.
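One way to confirm the robots.txt reference is in place is to parse the file and list the sitemaps it declares. The sketch below uses Python's standard urllib.robotparser (site_maps() needs Python 3.8 or later). The helper name declared_sitemaps and the example.com content are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def declared_sitemaps(robots_txt):
    """Return the sitemap URLs declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    # Example robots.txt content; the Sitemap line uses an absolute URL.
    robots_txt = "\n".join([
        "User-agent: *",
        "Disallow: /admin/",
        "Sitemap: https://example.com/sitemap.xml",
    ])
    print(declared_sitemaps(robots_txt))
```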
Keep it healthy
- Regenerate it when you add or remove pages.
- Keep each file under 50,000 URLs and 50 MB uncompressed. Beyond that, split it into several files and list them in a sitemap index.
- Audit for stale URLs that now redirect or 404.
- Keep lastmod dates accurate. Do not fake freshness.
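For sites that exceed the per-file URL limit, splitting can be sketched as follows. The helper names (chunk_urls, build_sitemap_index) and the example.com file locations are made up for this example, and the sketch only enforces the URL count, not the file-size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit from the sitemap protocol


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML that points at each child sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    urls = [f"https://example.com/p/{i}" for i in range(120_000)]
    chunks = chunk_urls(urls)
    # Hypothetical naming scheme for the child files.
    children = [
        f"https://example.com/sitemap-{n}.xml"
        for n in range(1, len(chunks) + 1)
    ]
    print(len(chunks), "files")
    print(build_sitemap_index(children))
```

Each chunk would then be written out as its own sitemap file, and only the index needs to be referenced in robots.txt and submitted.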
A sitemap is a discovery aid, not a ranking trick. Keep it honest and current.
The SEO Pine XML sitemap generator builds a valid sitemap from a list of URLs, with lastmod, changefreq, and priority.