Add LLM-powered content summarization feature with litellm integratio… #2
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This pull request introduces LLM-powered content summarization to the web crawler, allowing users to generate high-quality summaries of search results using GPT-4o-mini and other supported models. The changes update the documentation, add new dependencies, enhance the CLI workflow, and introduce new modules for summarization and improved search functionality.
LLM Summarization Feature
README.md
to announce LLM-powered summarization, document new features, configuration steps, example output, architecture, supported LLM providers, and production considerations. [1] [2]litellm
as a dependency inpyproject.toml
to enable LLM integration.CLI and Workflow Enhancements
src/main.py
to prompt users for AI summaries, allow configuration of the number of summaries, and integrate the newSummarizationService
for summarizing search results. [1] [2] [3]Search and Summarization Modules
src/search/searcher.py
implementingWebSearcher
for robust, multi-source web search with improved RSS parsing, Google News URL resolution, and result cleaning/sorting.src/summarizer/__init__.py
to expose summarization engine components for use in the crawler.