ArxivFlow - Periodic Track on arXiv Paper

English | 中文

Author: Tiger, member of HKUST Dial

Last update: September 09, 2025

🎯 Objectives

This workflow tracks daily updates on arXiv.org. Paper information is preprocessed and summarized by a series of modules, then posted to a Feishu group chat for reading. The target audience is the education and research community.

💰 Cost: less than 0.05 CNY per workflow execution.

✨ Key Features

📚 Automatically fetch latest arXiv papers
🤖 AI-powered paper summarization and filtering
📱 Auto-send to Feishu group chat
⏰ GitHub Actions automated scheduling
🛠️ Local debugging script support

📋 Prerequisites

Before getting started, please ensure you have prepared the following accounts and services:

Dify account - Free registration for building AI workflows
LLM Provider API - DeepSeek API recommended (cost-effective)
Jina API key - For web content extraction, new users get 1M free credits
Feishu Group Bot Webhook - For message pushing

🚀 Quick Start

Step 1: Setup Dify Workflow

1. Open Dify Console
   - Log in to Dify and find the "Studio" tab
2. Import Workflow
   - Create a new workflow by importing this DSL file
   - The DSL file contains the complete logic for paper fetching, processing, and pushing
3. Configure Environment Variables
   - Configure the necessary environment variables in the workflow settings
   - See the Environment Variables Configuration section below for details
4. Get API Token
   - Get your workflow API token from the workflow settings
   - This token is used for automated scheduling

Step 2: Setup Automated Scheduler (Recommended)

The project provides an integrated scheduler that triggers Dify-side workflows on a schedule.

Quick Setup:

Configure GitHub Secrets:
- Go to repository Settings > Secrets and variables > Actions > New repository secret
- Add secret DIFY_TOKENS: Your Dify workflow API token (separate multiple tokens with ;)
Enable GitHub Actions: Go to repository Actions tab and enable workflows
Automatic Execution: The scheduler will automatically run according to timing rules defined in dify-scheduler.yml. For syntax details, see cron.help.

Manual Execution:

GitHub Actions: Go to Actions tab > "Dify ArxivFlow Scheduler" > "Run workflow"

Local Testing:

npm install
# Set environment variables
export DIFY_TOKENS="your_workflow_token_here"
npm start

📱 Final Result

The scheduler will automatically:

✅ Execute your Dify workflow daily
📊 Log execution results and status
❌ Report any errors to GitHub Actions logs
🔄 Support multiple workflows if needed

🔧 Environment Variables Configuration

GitHub Actions Secrets (Required):

DIFY_TOKENS: Your Dify workflow API token; to run multiple workflows, separate their tokens with ;

Optional Configuration:

DIFY_BASE_URL: Dify API base URL (default: https://api.dify.ai/v1)
DIFY_INPUTS: Workflow input variables in JSON format (default: {})
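To illustrate how these three variables fit together, here is a minimal Python sketch that turns them into one HTTP request per token. It is not the project's scheduler (which runs via npm); the `/workflows/run` path, the `response_mode` value, and the `user` string are assumptions based on Dify's public workflow API, and `build_workflow_requests` is a hypothetical helper name.

```python
import json
import os
import urllib.request


def build_workflow_requests(env=None):
    """Build one POST request per token listed in DIFY_TOKENS (nothing is sent)."""
    env = os.environ if env is None else env
    base_url = env.get("DIFY_BASE_URL", "https://api.dify.ai/v1").rstrip("/")
    inputs = json.loads(env.get("DIFY_INPUTS", "{}"))

    # Tokens are separated by ";"; blank entries (e.g. a trailing ";") are skipped.
    tokens = [t.strip() for t in env.get("DIFY_TOKENS", "").split(";") if t.strip()]
    if not tokens:
        raise ValueError("DIFY_TOKENS is empty")

    body = json.dumps({
        "inputs": inputs,
        "response_mode": "blocking",    # assumption: wait for the run to finish
        "user": "arxivflow-scheduler",  # hypothetical caller identifier
    }).encode("utf-8")

    return [
        urllib.request.Request(
            f"{base_url}/workflows/run",  # assumed endpoint path
            data=body,
            headers={
                "Authorization": f"Bearer {token}",
                "Content-Type": "application/json",
            },
            method="POST",
        )
        for token in tokens
    ]
```

Each returned request could then be passed to `urllib.request.urlopen` to actually trigger the workflow. Building the requests separately from sending them keeps the configuration parsing easy to check locally.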

Dify Workflow Internal Environment Variables:

FEISHU_DEV / FEISHU_PROD: Feishu Group Bot Webhook for testing/production environments
JINA: API key for crawling arXiv search results
KEYWORDS: Keywords for arXiv paper search, comma-separated
- The number of keywords in KEYWORDS must match the number of scheduled runs per day defined by the timing rules in GitHub Actions
- Example: for 4 pushes per day, KEYWORDS needs 4 keywords and the timing rules need 4 time points
PAPER_NUM_MAX: Maximum number of papers per message (limited by Feishu message length)
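The rule that KEYWORDS and the timing rules must agree can be checked before deploying. The sketch below is a hypothetical helper, not part of the repository. It assumes the schedule is a five-field cron expression that runs every day and whose minute and hour fields are plain comma-separated values; ranges, steps, and wildcards in those two fields are rejected rather than interpreted.

```python
def count_daily_runs(cron_expr: str) -> int:
    """Count trigger times per day for a simple five-field cron expression."""
    fields = cron_expr.split()
    if len(fields) != 5:
        raise ValueError("expected a five-field cron expression")
    minute, hour = fields[0], fields[1]
    for field in (minute, hour):
        # This sketch only understands lists such as "0" or "1,5,9,13".
        if any(ch in field for ch in "*/-"):
            raise ValueError("only plain comma-separated values are supported")
    return len(minute.split(",")) * len(hour.split(","))


def keywords_match_schedule(keywords: str, cron_expr: str) -> bool:
    """Return True when the keyword count equals the number of daily runs."""
    parsed = [k.strip() for k in keywords.split(",") if k.strip()]
    return len(parsed) == count_daily_runs(cron_expr)
```

For example, `keywords_match_schedule("a, b, c, d", "0 1,5,9,13 * * *")` compares four keywords against four time points. The keyword values and the cron expression here are placeholders, not the project's actual settings.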

🛠️ Debugging Scripts

The /scripts folder contains scripts for local debugging and testing, simulating the processes used in Dify Workflow:

jina_extract.py: Simulates Jina API calls and paper information extraction logic
sample.text: Sample data returned by Jina API for local testing
extracted_papers.json: Example of structured paper data after extraction; it serves as input for the downstream LLM analysis in the workflow

These scripts help you test and debug paper extraction logic without consuming API credits.

Usage for Local Development:

cd scripts
python jina_extract.py
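For context on what the script simulates, the following sketch shows how a live Jina request for an arXiv search page might be constructed. It is not taken from jina_extract.py. The `https://r.jina.ai/` reader prefix, the arXiv search URL parameters, and the function name `build_jina_search_request` are all assumptions, and the function only builds the request without sending it, so no credits are consumed.

```python
import urllib.parse
import urllib.request


def build_jina_search_request(keyword: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a Jina Reader request for an arXiv search page."""
    # Assumed arXiv search URL layout; the real workflow may use other parameters.
    query = urllib.parse.urlencode({"query": keyword, "searchtype": "all"})
    target = f"https://arxiv.org/search/?{query}"
    # Assumed Jina Reader convention: prefix the target URL with r.jina.ai.
    return urllib.request.Request(
        f"https://r.jina.ai/{target}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
```

Passing the result to `urllib.request.urlopen` would perform the real call. For offline debugging, the saved response in sample.text stands in for that call.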

🤝 Acknowledgement

Dify Official Guidance: Link
Feishu - How to use Bot in Group Chat: Link (Chinese)
AWS Workshop: Lab3 - Building an AI Workflow with Dify (使用Dify构建AI Workflow): Link (Chinese)
arXiv Category: Link
Dify Schedule Project: Link - Inspiration for the automated scheduler implementation

📄 License

MIT License - See LICENSE file
