@muravvv
Contributor

@muravvv muravvv commented Jun 1, 2025

Fix two issues that prevent aider from being used on repositories with single-byte encodings:

  1. Fix the inability to use the --llm-history-file option when a non-Unicode encoding is set with --encoding. Before this fix, any action in aider with these options resulted in the following error:
Traceback (most recent call last):
  File "C:\Users\Vitya\AppData\Roaming\uv\tools\aider-chat\Lib\site-packages\aider\coders\base_coder.py", line 1454, in send_message
    yield from self.send(messages, functions=self.functions)
  File "C:\Users\Vitya\AppData\Roaming\uv\tools\aider-chat\Lib\site-packages\aider\coders\base_coder.py", line 1788, in send
    self.io.log_llm_history("TO LLM", format_messages(messages))
  File "C:\Users\Vitya\AppData\Roaming\uv\tools\aider-chat\Lib\site-packages\aider\io.py", line 754, in log_llm_history
    log_file.write(content + "\n")
  File "C:\Program Files\Python312\Lib\encodings\cp1251.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\u22ee' in position 8769: character maps to <undefined>

Aider uses the symbol ⋮ (\u22ee) to mark skipped lines in the repo map. To fix this bug, I made the .aider.llm.history log always use UTF-8 encoding, regardless of the --encoding option.
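The failure and the fix can be reproduced outside aider. The following is a minimal sketch, not the code changed in this PR: `append_history` and `demo` are hypothetical names, and the only assumption taken from the traceback above is that cp1251 has no mapping for ⋮ (\u22ee).

```python
import os
import tempfile

REPO_MAP_ELLIPSIS = "\u22ee"  # the ⋮ character aider puts in the repo map


def append_history(path, content, encoding):
    """Hypothetical stand-in for a history logger: append one entry to a log file."""
    with open(path, "a", encoding=encoding) as log_file:
        log_file.write(content + "\n")


def demo():
    """Try a single-byte encoding first, then UTF-8; return (cp1251_ok, file_text)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "history.log")
        try:
            # Mirrors the pre-fix behavior: the log inherits the --encoding value.
            append_history(path, "TO LLM " + REPO_MAP_ELLIPSIS, "cp1251")
            cp1251_ok = True
        except UnicodeEncodeError:
            cp1251_ok = False
        # Mirrors the fix: the log is always written as UTF-8.
        append_history(path, "TO LLM " + REPO_MAP_ELLIPSIS, "utf-8")
        with open(path, encoding="utf-8") as f:
            text = f.read()
    return cp1251_ok, text


if __name__ == "__main__":
    print(demo())
```

Pinning the log to UTF-8 keeps the history file independent of the repository's source encoding, since the log can contain characters that the repository encoding cannot represent.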

  2. Fix the failure to generate a commit message when changed lines (or diff context lines) contain bytes that are not valid UTF-8 (non-ASCII characters in a single-byte encoding). I have seen this bug with Ollama models, but the problem most likely arises with most other models too. In this case the error message looks like this:
litellm.APIConnectionError: 'utf-8' codec can't encode characters in position 3618-3626: surrogates not allowed
Traceback (most recent call last):
  File "C:\Users\user\AppData\Roaming\uv\tools\aider-chat\Lib\site-packages\litellm\main.py", line 2937, in completion
    generator = ollama_chat.get_ollama_response(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Roaming\uv\tools\aider-chat\Lib\site-packages\litellm\llms\ollama_chat.py", line 313, in get_ollama_response
    response = sync_client.post(
               ^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Roaming\uv\tools\aider-chat\Lib\site-packages\litellm\llms\custom_httpx\http_handler.py", line 578, in post
    raise e
  File "C:\Users\user\AppData\Roaming\uv\tools\aider-chat\Lib\site-packages\litellm\llms\custom_httpx\http_handler.py", line 554, in post
    req = self.client.build_request(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Roaming\uv\tools\aider-chat\Lib\site-packages\httpx\_client.py", line 378, in build_request
    return Request(
           ^^^^^^^^
  File "C:\Users\user\AppData\Roaming\uv\tools\aider-chat\Lib\site-packages\httpx\_models.py", line 408, in __init__
    headers, stream = encode_request(
                      ^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Roaming\uv\tools\aider-chat\Lib\site-packages\httpx\_content.py", line 216, in encode_request
    return encode_json(json)
           ^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Roaming\uv\tools\aider-chat\Lib\site-packages\httpx\_content.py", line 179, in encode_json
    ).encode("utf-8")
      ^^^^^^^^^^^^^^^
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 3618-3626: surrogates not allowed

Retrying in 0.2 seconds...

The problem arose from wrong encoding settings when parsing the output of the git diff command: the output was decoded with the default encoding for filenames (UTF-8 on most systems) instead of the encoding set by the --encoding option. In addition to the wrong encoding, GitPython also used the surrogateescape error-handling mode, which produces lone Unicode surrogates that cannot be converted to UTF-8 when the text is sent to models.
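The mechanism can be shown with plain Python string operations. This is an illustrative sketch only: it does not call GitPython or aider, the sample text and the helper name `can_send_as_utf8` are made up for the example, and cp1251 stands in for whatever value is passed to --encoding.

```python
# Bytes as they would appear in diff output for a file saved in cp1251.
raw = "привет".encode("cp1251")

# Decoding with the wrong codec plus surrogateescape never fails, but every
# undecodable byte becomes a lone surrogate in the range U+DC80..U+DCFF.
bad = raw.decode("utf-8", errors="surrogateescape")


def can_send_as_utf8(text):
    """Return True if text can be strictly encoded as UTF-8 (as an HTTP JSON body must be)."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# Decoding with the repository's real encoding yields ordinary characters.
good = raw.decode("cp1251", errors="replace")

if __name__ == "__main__":
    print(repr(bad), can_send_as_utf8(bad))
    print(repr(good), can_send_as_utf8(good))
```

The surrogate-escaped string still round-trips to the original bytes with `bad.encode("utf-8", "surrogateescape")`, which is why the problem only surfaces later, when a strict UTF-8 encoder such as the one in the httpx traceback above sees it.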

@CLAassistant

CLAassistant commented Jun 1, 2025

CLA assistant check
All committers have signed the CLA.

@paul-gauthier paul-gauthier merged commit 3266eac into Aider-AI:main Jun 1, 2025
9 checks passed
@paul-gauthier
Collaborator

Thanks!

@muravvv muravvv deleted the fix_encoding branch June 1, 2025 20:27