A very simple news crawler with a funny name
Updated Sep 29, 2025 - Python
Ultimate Website Sitemap Parser
Generate an XML sitemap for a GitHub Pages site using GitHub Actions
A Python script to submit web pages to the Wayback Machine for archiving.
Extracts links from a sitemap and automatically pushes them to search engines via the Baidu, Bing, and Google APIs to speed up site indexing.
🕸️ Spider Sitemap - Simple Python 3 crawler that automatically navigates your website, discovers all pages, and generates a complete XML sitemap. Easy to configure and blazing fast!
Asynchronous website cache warmer
Generate a redirect map from two sitemaps for website migration.
A template Python script that automatically generates sitemap files from information in a production database.
Archive all webpages in a website that are not already archived by archive.org
django CMS page extension to handle sitemap customization
SitemapRAG is an open-source tool designed to leverage your website's sitemap as a Retrieval-Augmented Generation (RAG) source. It empowers developers to build intelligent, context-aware applications by extracting, indexing, and querying content directly from a sitemap.
Processes XML sitemaps and extracts URLs. Includes features such as support for both plain XML and compressed XML files, multiple input sources, protection against anti-bot measures, multi-threading, and automatic processing of nested sitemaps.
This Python library is designed to scrape sitemaps from websites, providing a simple and efficient way to gather information about the structure of a website.