Coders should not rely on default encoding from operating system. #14

@valerie-autumn-skye

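On many Windows systems the default code page is "Windows-1252" (cp1252), and Python falls back to it when a file is opened without an explicit encoding. Goose sessions are stored as UTF-8, so reading them with the default code page crashes with a UnicodeDecodeError as soon as the content contains a byte that cp1252 cannot decode.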
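A minimal sketch of the kind of fix this suggests: pass `encoding="utf-8"` explicitly whenever session content is read. The helper name `load_session_messages`, the file name `session.jsonl`, and the JSON Lines layout are assumptions for illustration only; they are not taken from the metacoder source.

```python
import json
import tempfile
from pathlib import Path


def load_session_messages(path: Path) -> list[dict]:
    """Hypothetical helper: parse a JSON Lines file, always decoding as UTF-8."""
    # An explicit encoding means the result no longer depends on the OS code page.
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Demo with a throwaway file containing curly quotes (non-ASCII characters).
with tempfile.TemporaryDirectory() as tmp:
    session_file = Path(tmp) / "session.jsonl"  # hypothetical file name
    session_file.write_text('{"content": "\u201cquoted\u201d"}\n', encoding="utf-8")
    messages = load_session_messages(session_file)
```

The same applies to writes: passing `encoding="utf-8"` to `open()` or `Path.write_text()` keeps files consistent across platforms.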

Error:

(metacoder) PS C:\Users\CTParker\PycharmProjects\metacoder> uv run metacoder eval .\tests\input\goose_eval_test.yaml                 
🔬 Running evaluations from: tests\input\goose_eval_test.yaml
📊 Loaded dataset: pubmed tools evals
   Models: gpt-4o
   Coders: goose, dummy (all available)
   Cases: 1
   Total evaluations: 2

🚀 Starting evaluations...
Progress: 1/1 - goose/gpt-4o/disease with servers: mcp-simple-pubmed, ols-mcp
Running goose with gpt-4o on case 'disease'
📁 Preparing workdir: eval_workdir\gpt-4o_goose_disease_mcp-simple-pubmed_ols-mcp\gpt-4o_goose_disease
🔒 Obtaining lock for eval_workdir\gpt-4o_goose_disease_mcp-simple-pubmed_ols-mcp\gpt-4o_goose_disease; current_dir=C:\Users\CTParker\PycharmProjects\metacoder
🔧 Writing config object: .config/goose/config.yaml type=yaml
🔓 Releasing lock for eval_workdir\gpt-4o_goose_disease_mcp-simple-pubmed_ols-mcp\gpt-4o_goose_disease; current_dir=C:\Users\CTParker\PycharmProjects\metacoder
🔒 Obtaining lock for eval_workdir\gpt-4o_goose_disease_mcp-simple-pubmed_ols-mcp\gpt-4o_goose_disease; current_dir=C:\Users\CTParker\PycharmProjects\metacoder
🦆 Running command: goose run -t According to PMID:35743164, What 3 diseases are associated with ITPR1 mutations? Give me disease names and MONDO IDs
🦆 Command took 31.43575930595398 seconds
🔓 Releasing lock for eval_workdir\gpt-4o_goose_disease_mcp-simple-pubmed_ols-mcp\gpt-4o_goose_disease; current_dir=C:\Users\CTParker\PycharmProjects\metacoder
Traceback (most recent call last):
  File "C:\Users\CTParker\AppData\Roaming\uv\python\cpython-3.10.16-windows-x86_64-none\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\CTParker\AppData\Roaming\uv\python\cpython-3.10.16-windows-x86_64-none\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\CTParker\PycharmProjects\metacoder\.venv\Scripts\metacoder.exe\__main__.py", line 10, in <module>
    sys.exit(main())
  File "C:\Users\CTParker\PycharmProjects\metacoder\.venv\lib\site-packages\click\core.py", line 1442, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\CTParker\PycharmProjects\metacoder\.venv\lib\site-packages\click\core.py", line 1363, in main
    rv = self.invoke(ctx)
  File "C:\Users\CTParker\PycharmProjects\metacoder\.venv\lib\site-packages\click\core.py", line 1830, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Users\CTParker\PycharmProjects\metacoder\.venv\lib\site-packages\click\core.py", line 1226, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\CTParker\PycharmProjects\metacoder\.venv\lib\site-packages\click\core.py", line 794, in invoke
    return callback(*args, **kwargs)
  File "C:\Users\CTParker\PycharmProjects\metacoder\src\metacoder\metacoder.py", line 587, in eval_command
    results = runner.run_all_evals(dataset, workdir_path, coders_list)
  File "C:\Users\CTParker\PycharmProjects\metacoder\src\metacoder\evals\runner.py", line 386, in run_all_evals
    results = self.run_single_eval(
  File "C:\Users\CTParker\PycharmProjects\metacoder\src\metacoder\evals\runner.py", line 220, in run_single_eval
    output: CoderOutput = coder.run(case.input)
  File "C:\Users\CTParker\PycharmProjects\metacoder\src\metacoder\coders\goose.py", line 169, in run
    ao.structured_messages = [
  File "C:\Users\CTParker\PycharmProjects\metacoder\src\metacoder\coders\goose.py", line 169, in <listcomp>
    ao.structured_messages = [
  File "C:\Users\CTParker\AppData\Roaming\uv\python\cpython-3.10.16-windows-x86_64-none\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 6476: character maps to <undefined>
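
The failure can be reproduced without goose. The sketch below assumes the offending character is a right double quotation mark (U+201D), whose UTF-8 encoding ends in byte 0x9d; the actual character in the session file is not known from the traceback, and other characters also contain that byte.

```python
text = "\u201d"  # RIGHT DOUBLE QUOTATION MARK (assumed example character)
raw = text.encode("utf-8")  # three bytes: 0xe2 0x80 0x9d

try:
    raw.decode("cp1252")  # what a default-encoded read does on cp1252 systems
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    bad_byte = raw[exc.start]  # the byte cp1252 has no mapping for

decoded = raw.decode("utf-8")  # decoding with the correct codec round-trips
```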

Labels: bug (Something isn't working)
