find_software_name_for_patch can fail when non UTF8 files exist #3781

@Micket

Whilst trying to run

eb --update-pr 13453 Brotli-1.0.9_pc_link_flags.patch --pr-commit-msg="Add patch"

the code that tries to figure out where the patch belongs fails hard:

Traceback (most recent call last):
  File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/apps/Common/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/main.py", line 557, in <module>
    main()
  File "/apps/Common/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/main.py", line 476, in main
    update_pr(options.update_pr, categorized_paths, ordered_ecs)
  File "/apps/Common/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/tools/github.py", line 1876, in update_pr
    update_branch(branch_name, paths, ecs, github_account=github_account, commit_msg=commit_msg)
  File "/apps/Common/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/tools/github.py", line 1848, in update_branch
    commit_msg=commit_msg)
  File "/apps/Common/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/tools/github.py", line 861, in _easyconfigs_pr_common
    patch_specs = det_patch_specs(paths['patch_files'], file_info, [target_dir])
  File "/apps/Common/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/tools/github.py", line 1051, in det_patch_specs
    soft_name = find_software_name_for_patch(patch_file, ec_dirs)
  File "/apps/Common/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/tools/github.py", line 1079, in find_software_name_for_patch
    rawtxt = read_file(path)
  File "/apps/Common/software/EasyBuild/4.4.1/lib/python3.6/site-packages/easybuild/tools/filetools.py", line 209, in read_file
    txt = handle.read()
  File "/usr/lib64/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb4 in position 241240: invalid start byte

The file in question was

easybuild-easyconfigs/easybuild/easyconfigs/g/Grace/Grace-5.1.25-5build1.patch

which seems to be ISO-8859 encoded.
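
To illustrate the failure mode, here is a minimal sketch using made-up bytes (not the actual contents of the Grace patch): byte 0xb4 can only appear as a continuation byte in UTF-8, so the decoder rejects it at the start of a character, while ISO-8859-1 maps it to the acute accent character.

```python
# Hypothetical sample data; only the 0xb4 byte is taken from the traceback above.
raw = b"caf\xb4"

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as err:
    utf8_ok = False
    reason = err.reason  # the decoder's explanation, as seen in the traceback

# ISO-8859-1 assigns a character to every byte value, so this never raises.
latin1_text = raw.decode("iso-8859-1")
```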

We have some options:

  1. I think this function should skip all patches; it seems like a bug that it doesn't.
  2. We could consider forcing UTF-8 encoding, but that might be hard to enforce for patches.
  3. We can just handle the decoding errors here and proceed.
  4. Make read_file encoding-aware by some means?
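
Options 1 and 3 could be sketched roughly as follows. This is not EasyBuild code: `read_text_tolerant` and `skip_patch_files` are hypothetical helpers, and the sketch assumes the search only needs to spot a patch's file name in each file's text, so replacing undecodable bytes with U+FFFD is harmless.

```python
import os
import tempfile


def read_text_tolerant(path, encoding="utf-8"):
    """Hypothetical helper: read text, replacing undecodable bytes with U+FFFD."""
    with open(path, "r", encoding=encoding, errors="replace") as handle:
        return handle.read()


def skip_patch_files(paths):
    """Hypothetical filter for option 1: drop paths that end in .patch."""
    return [p for p in paths if not p.endswith(".patch")]


# Small demonstration with a temporary file that contains a non-UTF-8 byte.
with tempfile.TemporaryDirectory() as tmpdir:
    ec_path = os.path.join(tmpdir, "example.eb")
    with open(ec_path, "wb") as handle:
        handle.write(b"patches = ['foo.patch']\n# stray byte: \xb4\n")

    txt = read_text_tolerant(ec_path)                          # does not raise
    kept = skip_patch_files([ec_path, os.path.join(tmpdir, "foo.patch")])
    kept_names = [os.path.basename(p) for p in kept]
```

Whether replacing bytes is acceptable depends on what the caller does with the text; for a substring search on an ASCII file name it should not change the outcome.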
