Skip to content

Conversation

KaalaAadmi
Copy link
Member

Signed-off-by: Arnav Bhattacharya [email protected]

✨ Feature / Enhancement PR

πŸ”— Epic / Issue

Link to the epic or parent issue:
Closes #317


πŸš€ Summary

This PR introduces a script to automate checking and fixing file headers for Python source files, ensuring consistent metadata like copyright, license, and file location. It also adds make targets for easy execution and updates the development documentation.


πŸ§ͺ Checks

  • make lint passes
  • make test passes
  • CHANGELOG updated (if user-facing)

πŸ““ Notes

This change addresses the need for consistent file headers, which improves code navigation and maintainability.

Key Additions:

  • New Script: A Python script at /.github/tools/fix_file_headers.py that can scan, check, and fix file headers in both interactive and automated modes.
  • Makefile Integration: New targets have been added to the root Makefile for easy header management:
    • make check-headers: Performs a dry run to find files with incorrect headers.
    • make fix-all-headers: Automatically fixes all incorrect headers.
    • make interactive-fix-headers: Prompts for confirmation before fixing each file.
    • make fix-header path="...": Fixes a specific file or directory.
    • make fix-header path="..." authors="...": Fixes a specific file or directory, with an option to override the author.
  • Documentation:
    • The CONTRIBUTING.md has been updated to include the new file header requirements.
    • Detailed usage instructions for the new script and make targets are now in the development documentation at docs/docs/development/documentation.md.

This implementation uses Python's ast module to safely parse and modify docstrings, preserving existing content while enforcing the new standard.

@crivetimihai crivetimihai added this to the Release 0.5.0 milestone Aug 4, 2025
@crivetimihai
Copy link
Member

Really great work on this! Thank you for contributing. I will work on the fixes outlined below, then merge.

βœ… Overall Assessment

This is a well-implemented feature that addresses the need for consistent file headers across the Python codebase. The script uses Python's AST module for safe parsing and modification, which is the correct approach.

πŸ“‹ Code Quality Checklist

Passes testing

  • make doctest test smoketest lint-web flake8 passed (expected, main code not modified)

Implementation Quality

  • AST-based parsing: Uses ast module for safe Python code manipulation
  • Error handling: Comprehensive try-except blocks for file parsing
  • Multiple modes: Supports check, fix-all, fix, and interactive modes
  • Preserves existing content: Correctly handles existing docstrings and header fields
  • Handles edge cases: Accounts for files without docstrings, shebangs, encoding declarations
  • Clear output: Good use of emojis and colors for status reporting
  • Exit codes: Proper use of sys.exit(0) for success and sys.exit(1) for failures
  • Passes pylint: 8.99/10
  • Passes mypy/pyright: missing some typer annotations.

Code Structure

  • Well-documented: Comprehensive module docstring with usage examples -> missing complete docstring / doctest
  • Modular design: Clean separation of concerns with focused functions
  • Type hints: Proper use of type annotations throughout
  • Constants: Good use of module-level constants for configuration

Integration

  • Makefile targets: All required targets are properly implemented
  • Documentation: Both CONTRIBUTING.md and development docs updated
  • Author override: Supports custom authors via command line

πŸ”§ Issues to Address

1. Line Length Warning ⚠️

# Line 175 is too long (200+ chars)
new_source_code = source_without_shebangs_and_docstring.replace(raw_docstring, new_docstring, 1)

Fix: Break into multiple lines:

new_source_code = source_without_shebangs_and_docstring.replace(
    raw_docstring, new_docstring, 1
)

2. Makefile Target Error Handling πŸ›

The Makefile targets use -@ prefix which suppresses errors:

check-headers:
    -@python3 .github/tools/fix_file_headers.py --check

Issue: This will always return success even if the script fails.

Fix: Remove the - prefix to properly propagate exit codes:

check-headers:
    @python3 .github/tools/fix_file_headers.py --check

3. Missing Validation for Authors Field πŸ”

The script doesn't validate the authors string format. Consider adding validation to ensure proper comma-separated format.

Suggested addition:

def validate_authors(authors: str) -> bool:
    """Validate authors string format."""
    # Basic validation - non-empty and reasonable format
    return bool(authors and authors.strip())

4. Path Handling Edge Case πŸ›‘οΈ

When using --path with absolute paths outside the project:

if not target_path.is_absolute():
    target_path = PROJECT_ROOT / target_path

This could lead to unexpected behavior if an absolute path is provided.

Fix: Add validation:

if target_path.is_absolute() and not target_path.is_relative_to(PROJECT_ROOT):
    print(f"Error: Path '{args.path}' is outside project root.", file=sys.stderr)
    sys.exit(1)

πŸ’‘ Suggestions for Enhancement

  1. Dry-run output enhancement: In check mode, show a diff preview of what would be changed
  2. Configuration file: Consider supporting a .fix-headers.yaml config file for project-specific settings
  3. Pre-commit hook: The script is ready for pre-commit integration, but no hook is defined yet

πŸ“Š Testing Recommendations

Before merging, ensure these test scenarios are covered:

  1. Empty file handling: Test with completely empty .py files
  2. Unicode handling: Test with files containing non-ASCII characters
  3. Symbolic links: Ensure symlinked files are handled correctly
  4. Read-only files: Test behavior with files that lack write permissions
  5. Malformed Python: Test with syntactically invalid Python files

πŸš€ Merge Readiness

Can we merge? Almost!

Address these items first:

  1. Fix the Makefile error suppression issue
  2. Add path validation for absolute paths
  3. Fix the long line issue
  4. Test the script on a few sample files to ensure it works as expected

The implementation is solid and well-thought-out. Once the minor issues above are addressed, this PR will be ready to merge. Great work on using AST for safe code manipulation and providing multiple usage modes!


🎯 Test Summary

# Test the script before merging:
python3 .github/tools/fix_file_headers.py --check
python3 .github/tools/fix_file_headers.py --fix --path mcpgateway/main.py --authors "Test Author"

# Verify Makefile targets work correctly:
make check-headers
make fix-header path=mcpgateway/main.py authors="Test Author"

Copy link
Member

@crivetimihai crivetimihai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to merge! Also added (inactive) github actions, closes #315

πŸ“‹ Summary of All Changes

1. Script Behavior Changes πŸ”„

  • Default mode is now CHECK - Running without flags won't modify any files
  • Explicit --fix flag required - Must use --fix --path <path> to modify specific files
  • --fix validates --path - Script now errors if --fix is used without --path
  • Clear mode indicators - Script prints which mode it's running in at startup

2. New Script Features ✨

  • Shebang control: --require-shebang auto|always|never (auto = only for executable files)
  • Encoding control: --no-encoding to skip encoding line requirement
  • Debug mode: --debug shows file permissions, shebang status, etc.
  • Diff preview: --show-diff shows what would change in check mode
  • Custom config: --copyright-year, --license for project-specific needs
  • Better path validation: Ensures paths are within project boundaries
  • Author validation: Ensures non-empty author strings
  • Executable detection: Automatically determines if files need shebang based on permissions

3. Makefile Enhancements πŸ”§

  • Dynamic help integration - Follows your # help: pattern, no separate help target
  • New targets added:
    • check-headers-debug - Check with debug info
    • check-headers-diff - Check with diff preview
    • fix-all-headers-no-encoding - Fix without encoding requirement
    • fix-all-headers-custom - Fix with custom year/license/shebang
    • pre-commit-check-headers - For pre-commit integration
    • pre-commit-fix-headers - For pre-commit fixing
  • Enhanced parameters:
    • shebang=auto|always|never for shebang control
    • encoding=no to skip encoding line
    • debug=1 and diff=1 for check commands
    • year=YYYY and license=... for custom configs

4. Safety Improvements πŸ›‘οΈ

  • No accidental modifications - Check mode by default
  • Clear warnings when running fix commands
  • Error suppression removed from Makefile (proper exit codes)
  • Path validation prevents processing files outside project
  • Better error messages throughout

5. Documentation Updates πŸ“š

  • Reorganized by safety - Check modes (safe) vs Fix modes (modifies)
  • All new features documented with examples
  • Configuration options explained in detail
  • Best practices section added
  • Pre-commit integration documented
  • Header format example included

6. Code Quality Improvements 🎯

  • Full Google-style docstrings with examples
  • Complete type annotations using Python's typing module
  • Extensive doctests for all functions
  • Better error handling with specific exceptions
  • Cleaner code structure with modular functions
  • Constants for configuration

Signed-off-by: Mihai Criveti <[email protected]>
@crivetimihai crivetimihai merged commit bbde8b2 into IBM:main Aug 5, 2025
30 of 31 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CHORE]: Script to add relative file path header to each file and verify top level docstring
2 participants