Skip to content

Conversation

@fungjj92
Copy link

@fungjj92 fungjj92 commented Jan 22, 2020

Overview

This PR is a first go at fixing the failing tests from the Py2->3 upgrade.
The tests should be fixed for the exporter app 🎉 One down.

I touched tests / code across the django apps api, exporter, importer, treemap. Except for exporter, they continue to have errors. I'm not sure how the test runner selects the order of tests to run because it skipped around these 4 apps when running ./scripts/manage.py test --failfast

Most of the changes thus far have been about unicode/byte - string translation.
The celery and Django-tinsel upgrades were to fix test-related errors, see commit df782b1

Notes

In retrospect, I would have worked on one app's tests at a time. My advice to parallelize this work moving forward is just that:
./scripts/manage.py test <app_name> --failfast

It's impossible to linearly compare progress for various reasons. But anyway, we're making progress!

Before:

Ran 1115 tests in 275.613s

FAILED (failures=19, errors=305, skipped=23)

And this is what remains after related-PRs 😳

Ran 1115 tests in 331.760s

FAILED (failures=26, errors=125, skipped=23)

Testing

In the app VM, pip install the new python dependencies:
vagrant ssh app -c 'cd /opt/app/core && envdir /etc/otm.d/env pip3 install -r ./requirements.txt'

Checkout the otm-addons companion branch https://github.com/OpenTreeMap/otm-addons/pull/1566

I suppose you'll want to test each of the apps that was touched and check if any still failling tests are a result of lines of codes touched in this PR.
./scripts/manage.py test <app_name>

Also ensure that the changes don't chip away important context just to get tests to work.

Copy link
Contributor

@jwalgran jwalgran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The majority of these changes are clear and straightforward. I made a few comments and suggestions.

@fungjj92 fungjj92 force-pushed the jf/fix-tests-2to3 branch 3 times, most recently from 99bb614 to 301695d Compare January 27, 2020 16:33
@fungjj92
Copy link
Author

Standby. I am in the middle of fixing auth test issues in this branch, I realized there were some errors I was making in encoding/decoding.

@fungjj92
Copy link
Author

Alright this is ready to be looked at.

Ran 1115 tests in 297.171s

FAILED (failures=25, errors=125, skipped=23)

I had been clobbering the basic auth tests with unwanted layers of encoding & decoding. I'm not sure how these failures didn't arise earlier, but alas. I was tipped off by the lack of output at the regex parsers; no surprise but regex is great at preserving and communicating its desired data format, which was helpful in figuring out how the auth data should look.

Copy link
Contributor

@jwalgran jwalgran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got 264 errors instead of 125

Ran 1115 tests in 322.333s

FAILED (failures=25, errors=264, skipped=23)
Destroying test database for alias 'default'...

I verified that I had the correct branches and SHAs checked out.

~/Projects/otm-cloud (python3)
> kj git cb
===== otm-cloud =====
python3
===== otm-core =====
jf/fix-tests-2to3
===== otm-addons =====
jf/fix-tests-2to3
===== otm-tiler =====
develop
===== otm-ecoservice =====
develop

~/Projects/otm-cloud (python3)
> kj git status
===== otm-cloud =====
On branch python3
nothing to commit, working tree clean
===== otm-core =====
On branch jf/fix-tests-2to3
Your branch is up to date with 'origin/jf/fix-tests-2to3'.

nothing to commit, working tree clean
===== otm-addons =====
On branch jf/fix-tests-2to3
Your branch is up to date with 'origin/jf/fix-tests-2to3'.

nothing to commit, working tree clean
===== otm-tiler =====
On branch develop
Your branch is up to date with 'origin/develop'.

nothing to commit, working tree clean
===== otm-ecoservice =====
On branch develop
Your branch is up to date with 'origin/develop'.

nothing to commit, working tree clean

~/Projects/otm-cloud (python3)
> kj git rev-parse HEAD
===== otm-cloud =====
25eda864651968a4f67298b5c1acada7d437d3a8
===== otm-core =====
82c79a2991aa2bd407706880ae734d859ff773c9
===== otm-addons =====
4b94756207566b68736e9e6fe3fe089cb743d11a
===== otm-tiler =====
65328ada3d6303b59b60d31d16dbc2909e3c2816
===== otm-ecoservice =====
8aeb53b2d858a3227e443c7f912d13bca77bacb0

I will reprovision my VM from scratch to make sure something strange isn't going on.

I also commented on a concern I have about removing base64 encoding from the auth process.

ret = get_signed(self.client, "%s/user" % API_PFX, **withauth)
self.assertEqual(ret.status_code, 401)

auth = base64.b64encode("foobar")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider updating the commit message to explain why we are removing this base 64 encoding. It is my understanding that basic auth requires base 64.

In basic HTTP authentication, a request contains a header field in the form of Authorization: Basic <credentials>, where credentials is the base64 encoding of id and password joined by a single colon :.

from https://en.wikipedia.org/wiki/Basic_access_authentication

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for this detail. I took another dive in with this better baseline understanding, and get what is going on 100% now. In py3, there are ripple effect from having to have the initial credentials start as bytes. This complication appears on these lines: https://github.com/OpenTreeMap/otm-core/pull/3288/files#diff-c7aac248bb6ed915db70251295a116a3R96-R103. I pushed up changes to this effect.


data = [dict(list(zip(header, [x.decode('utf8') for x in row])))
for row in reader]
reader_rows = list(reader)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a comment explaining why it is necessary to convert the reader to a list. It looks like the only use of reader_rows in in the list comprehension below, which I would expect to be able to iterate over reader as-is.

@jwalgran
Copy link
Contributor

I did provision a fresh app VM and got the same results

FAILED (failures=25, errors=264, skipped=23)

@fungjj92 fungjj92 force-pushed the jf/fix-tests-2to3 branch 3 times, most recently from 2bfb4ab to 6d97f28 Compare February 3, 2020 16:06
Copy link
Contributor

@jwalgran jwalgran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your diligent work on this PR. I compared the summary output of ./scripts/manage.sh test.

On the python3 branch

FAILED (failures=19, errors=392, skipped=23)

On this branch

FAILED (failures=25, errors=264, skipped=23)

👍

I also made a few nonessential suggestions.

@jwalgran jwalgran removed their assignment Feb 5, 2020
One major change in Python3 is a clean divorce of unicode and binary data. Strings are unicode and bytes are binary. In python2 they could sometimes be handled interchangeably. Now one must explicitly convert between the two data formats, encoding to strings to bytes or decoding bytes to strings.

- Django's HttpResponse.content is explicitly a bytestring
- Use BytesIO instead of StringIO to prepare request data
- Basic auth requires b64 encoding of byte credentials. The result of this is several new layers of de/encoding and unpacking are needed to verify the Basic auth credentials.
Specifically, Celery and Django-tinsel required version bumps. See: celery/celery#5049 and azavea/django-tinsel#9
The message attribute (and all attributes, other than args) were removed off the Exception base class between Py2 and Py3.
D/encoding worked differently and more loosely in py2. Now decoding converts bytes to unicode, vice versa for encoding. Also, fix a duplicate conditional expression that captures strings to capture integers too.
Some breaking changes in usage in modgrammar v.9 => v.10. Whitespace cleaning is invoked through a different class property (bool: grammar_whitespace => str: grammar_whitespace_mode) and  stripping is no longer done by default. The parse_string method has strictly defined behavior, prefer to use parse_text with args set.

Modgrammar is no longer maintained (or so it seems) and the latest / most helpful documentation is this google group post: https://groups.google.com/forum/#!msg/modgrammar/3CsPBb_UCck/kKr-0YmwtJ8J
fungjj92 and others added 6 commits February 6, 2020 14:04
A CSV by python definition is an iterator object that returns a string. A python3 string is defined as unicode and not binary, explicitly, so we have to be explicit about passing string data to csv generators.
A downside to this solution is we may now have 2 copies of the entire file in memory while it is being sliced into individual database rows but this should probably be ok given the expected use cases and available computing resources.
Strings are implicitly unicode in py3
Fix remaining broken tests in manage_treemap Django app
@fungjj92
Copy link
Author

fungjj92 commented Feb 6, 2020

Thank you for the well-considered rounds of reviews. Onward! Merging.

@fungjj92 fungjj92 merged commit a6e5a8b into python3 Feb 6, 2020
@fungjj92 fungjj92 deleted the jf/fix-tests-2to3 branch February 6, 2020 19:16
tzinckgraf added a commit to sustainablejc/otm-core that referenced this pull request Apr 13, 2021
commit 85e0494
Author: Tom Z <[email protected]>
Date:   Mon Apr 12 00:06:19 2021 -0400

    feature/api-python-3 - additional final features

commit 9f3bd4e
Author: Tom Z <[email protected]>
Date:   Sun Apr 11 23:43:41 2021 -0400

    feature/api-python-3 - final pieces of this feature

commit f4866de
Author: Tom Z <[email protected]>
Date:   Fri Apr 9 20:18:09 2021 -0400

    feature/api-python-3 - some additional fixes

commit 7c3de56
Author: Tom Z <[email protected]>
Date:   Sat Apr 3 23:15:48 2021 -0400

    feature/api-python-3 - update the map page with new layers and finishing
    touches on other pieces

commit bdb8183
Author: Tom Z <[email protected]>
Date:   Sat Mar 27 10:34:51 2021 -0400

    feature/api-python-3 - validation errors are working

commit 412ea36
Author: Tom Z <[email protected]>
Date:   Sun Mar 21 21:35:57 2021 -0400

    feature/api-python-3 - fixed up the creation. various fixes to the UI.

commit ed4d9bc
Author: Tom Z <[email protected]>
Date:   Fri Mar 19 17:14:04 2021 -0400

    feature/api-python-3 - more fixes

commit d1dd629
Author: Tom Z <[email protected]>
Date:   Sat Mar 13 13:34:59 2021 -0500

    feature/api-python-3 - fixes to the pagination and a sample for creation

commit c451010
Merge: fc1765d 13d6c14
Author: Tom Z <[email protected]>
Date:   Sun Mar 7 21:48:28 2021 -0500

    Merge branch 'feature/api-python-3' of github.com:sustainablejc/otm-core into feature/api-python-3

commit fc1765d
Author: Tom Z <[email protected]>
Date:   Sun Mar 7 21:34:50 2021 -0500

    feature/api-python-3 - updated bootstrap and everything associated with
    it

commit 13d6c14
Author: Tom Z <[email protected]>
Date:   Sun Feb 28 21:48:17 2021 -0500

    fixed the Dockerfile

commit e0b4e92
Author: Tom Z <[email protected]>
Date:   Sun Feb 28 18:43:18 2021 -0500

    feature/api-python-3 - added common trees. refactored some code to move
    it to easier places

commit 7c88c3c
Author: Tom Z <[email protected]>
Date:   Sat Feb 27 11:05:27 2021 -0500

    feature/api-python-3 - the api features combined with python 3

commit c7e20ca
Merge: e06bae0 8ee5891
Author: Tom Z <[email protected]>
Date:   Tue Feb 23 20:43:41 2021 -0500

    Merge branch 'python3' into feature/api

commit e06bae0
Author: Tom Z <[email protected]>
Date:   Sat Feb 20 17:05:37 2021 -0500

    feature/api - fixed the styling to match the current OTM. the create
    tree feature almost works

commit 8935e23
Author: Tom Z <[email protected]>
Date:   Mon Feb 15 22:40:30 2021 -0500

    feature/api - frontend working with grids and the sidebar

commit 8ee5891
Author: Tom Z <[email protected]>
Date:   Sun Jan 24 19:49:43 2021 -0500

    python3 - fixed minor issues, getting  closer to all test cases

commit 15e5b8f
Author: Tom Z <[email protected]>
Date:   Mon Jan 18 15:21:51 2021 -0500

    feature/api - included the build_all script

commit 9d5c0b1
Author: Tom Z <[email protected]>
Date:   Sun Jan 17 17:36:27 2021 -0500

    feature/api - added the tree popup

commit 4c2691c
Author: Tom Z <[email protected]>
Date:   Sat Jan 16 12:47:53 2021 -0500

    feature/api - a working instance of a UTF Grid

commit ad1fc56
Author: Tom Z <[email protected]>
Date:   Sat Jan 9 10:45:05 2021 -0500

    feature/api - added a bunch of fixes to Webpack to upgrade everything.

    Webpack is now v4

    This required some changes to the reverse library, as I could not get
    the export to work, so instead I updated the calls using reverse.

commit 4282184
Author: Tom Z <[email protected]>
Date:   Fri Jan 1 14:24:51 2021 -0500

    feature/api - upgrade to webpack

commit fbd1299
Author: Tom Z <[email protected]>
Date:   Sun Dec 27 16:05:54 2020 -0500

    feature/api - routes and users page

commit f745822
Author: Tom Z <[email protected]>
Date:   Sun Nov 15 00:02:39 2020 -0500

    python3 - more test cases working

commit a6e5a8b
Merge: c1fb9fb d3b4fb9
Author: Jenny Fung <[email protected]>
Date:   Thu Feb 6 14:16:07 2020 -0500

    Merge pull request OpenTreeMap#3288 from OpenTreeMap/jf/fix-tests-2to3

    First pass at fixing tests from Py2->3 upgrade

commit d3b4fb9
Merge: d85cd29 b8a632a
Author: Jenny Fung <[email protected]>
Date:   Thu Feb 6 14:14:23 2020 -0500

    Merge pull request OpenTreeMap#3290 from OpenTreeMap/jf/fix-tests-manage-treemap

    Fix remaining broken tests in manage_treemap Django app

commit b8a632a
Author: Jenny Fung <[email protected]>
Date:   Tue Feb 4 13:22:14 2020 -0500

    Remove obsolete unicode literal

    Strings are implicitly unicode in py3

commit db54d58
Author: Jenny Fung <[email protected]>
Date:   Tue Feb 4 13:21:11 2020 -0500

    Repackage bytestream into string stream for csv

    A CSV by python definition is an iterator object that returns a string. A python3 string is defined as unicode and not binary, explicitly, so we have to be explicit about passing string data to csv generators.
    A downside to this solution is we may now have 2 copies of the entire file in memory while it is being sliced into individual database rows but this should probably be ok given the expected use cases and available computing resources.

commit 0f2c3f0
Author: Jenny Fung <[email protected]>
Date:   Mon Feb 3 17:01:34 2020 -0500

    Format request params as unicode strings

commit c011701
Author: Jenny Fung <[email protected]>
Date:   Fri Jan 24 12:44:19 2020 -0500

    Encode all strings to bytes before hashing

commit ed3282c
Author: Jenny Fung <[email protected]>
Date:   Fri Jan 24 12:30:27 2020 -0500

    Handle image data as bytes not unicode

commit e6a731e
Author: Jenny Fung <[email protected]>
Date:   Fri Jan 24 09:25:02 2020 -0500

    Upgrade modgrammar behavior for v.10

    Some breaking changes in usage in modgrammar v.9 => v.10. Whitespace cleaning is invoked through a different class property (bool: grammar_whitespace => str: grammar_whitespace_mode) and  stripping is no longer done by default. The parse_string method has strictly defined behavior, prefer to use parse_text with args set.

    Modgrammar is no longer maintained (or so it seems) and the latest / most helpful documentation is this google group post: https://groups.google.com/forum/#!msg/modgrammar/3CsPBb_UCck/kKr-0YmwtJ8J

commit b79a409
Author: Jenny Fung <[email protected]>
Date:   Thu Jan 23 13:01:12 2020 -0500

    Encode image data in bytes not string

    Refs https://github.com/OpenTreeMap/otm-cloud/issues/505

commit d85cd29
Author: Jenny Fung <[email protected]>
Date:   Wed Jan 22 09:54:03 2020 -0500

    Fix incorrect coercion to string method

    D/encoding worked differently and more loosely in py2. Now decoding converts bytes to unicode, vice versa for encoding. Also, fix a duplicate conditional expression that captures strings to capture integers too.

commit b386116
Author: Jenny Fung <[email protected]>
Date:   Wed Jan 22 09:21:04 2020 -0500

    Handle CSV data de/encoding

commit 6fa64bf
Author: Jenny Fung <[email protected]>
Date:   Tue Jan 21 18:42:00 2020 -0500

    Comply with PEP 352

    The message attribute (and all attributes, other than args) were removed off the Exception base class between Py2 and Py3.

commit bc961a8
Author: Jenny Fung <[email protected]>
Date:   Tue Jan 21 15:22:25 2020 -0500

    Update deps to Py3.7-compatible versions

    Specifically, Celery and Django-tinsel required version bumps. See: celery/celery#5049 and azavea/django-tinsel#9

commit a1a3b6d
Author: Jenny Fung <[email protected]>
Date:   Fri Jan 10 09:55:35 2020 -0500

    Clarify usage of bytes vs. string

    One major change in Python3 is a clean divorce of unicode and binary data. Strings are unicode and bytes are binary. In python2 they could sometimes be handled interchangeably. Now one must explicitly convert between the two data formats, encoding to strings to bytes or decoding bytes to strings.

    - Django's HttpResponse.content is explicitly a bytestring
    - Use BytesIO instead of StringIO to prepare request data
    - Basic auth requires b64 encoding of byte credentials. The result of this is several new layers of de/encoding and unpacking are needed to verify the Basic auth credentials.

commit c1fb9fb
Merge: f63cfb9 ccef808
Author: Justin Walgran <[email protected]>
Date:   Fri Jan 10 14:40:03 2020 -0700

    First stage of the application-level Python 2 to Python 3 conve… (OpenTreeMap#3285)

    First stage of the application-level Python 2 to Python 3 conversion

commit f63cfb9
Author: Hector Castro <[email protected]>
Date:   Tue Jan 7 15:44:52 2020 -0500

    Adjust hashbang for manage.py to use Python 3

    Ensure that manage.py makes use of the Python 3.7 binary, which is
    distinct from the python (Python 2.7) and python3 (Python 3.6)
    binaries.

commit c27a86b
Author: Justin Walgran <[email protected]>
Date:   Fri Jan 3 11:53:55 2020 -0700

    Update requirements for Python 3 compatibility

    Django 1.11.17 is the minimum version required to run under Python 3.
    https://docs.djangoproject.com/en/1.11/faq/install/#what-python-version-can-i-use-with-django

    `functools32` was a backport of a Python 3 included library.

    The 4.2.0 version of kombu fixes a syntax error under Python 3.
    celery/kombu#841

    wsgiref is part of the Python 3 standard library.

commit ccef808
Author: Justin Walgran <[email protected]>
Date:   Wed Jan 8 12:36:05 2020 -0700

    Add migrations as a result of 2to3 converting choices to a list

    After running 2to3 the choices defined for the `ExportJob.status` and
    `Instance.itree_region_default` fields were wrapped with a call to `list()`.
    This resulted in the order of the choices changing. Any change to the choices
    defined for a field requires a migration.

commit 545e383
Author: Justin Walgran <[email protected]>
Date:   Mon Jan 6 16:35:05 2020 -0700

    Fix flake8 indentation errors after 2to3

commit b96f11d
Author: Justin Walgran <[email protected]>
Date:   Mon Jan 6 16:26:11 2020 -0700

    Replace reference to file type with a call to open

    Python 3 removed the built in `file` type, which was semantically equivalent to
    `open` and what we are required to use in Python 3.

    https://stackoverflow.com/a/32131309

commit e5a7b47
Author: Justin Walgran <[email protected]>
Date:   Mon Jan 6 16:04:28 2020 -0700

    Fix flake8 line length errors after 2to3

    We shorten and rearrange statements as needed to return to having all lines
    under 80 characters.

commit 0afa0fb
Author: Justin Walgran <[email protected]>
Date:   Mon Jan 6 15:38:18 2020 -0700

    Fix multiple imports on one line flake 8 error

    Programmatically fixed with the following command

    autopep8 --in-place --recursive . --select E401

    https://github.com/hhatto/autopep8/

    We also removed a duplicate set of imports from ecobackend.py

commit ef8a6f2
Author: Justin Walgran <[email protected]>
Date:   Mon Jan 6 15:02:30 2020 -0700

    Fix undefined variable flake8 errors

    It appears that these lines may have been functioning properly by accident,
    relying on the fact that in Python 2 list comprehensions did not have their own
    function scope.

commit 335f284
Author: Justin Walgran <[email protected]>
Date:   Mon Jan 6 14:21:58 2020 -0700

    Avoid DotDict exceptions due to Python 3 change in hasattr handling

    Django 1.11.17 calls `hasattr(a_dotdict_field, 'resolve_expression')`.
    https://github.com/django/django/blob/121115d2c291b3969ac00ca62253f23513481739/django/db/models/sql/compiler.py#L990

    The Python 3 documentation says that this ends up calling `getattr` and
    returning False if an `AttributeError` is raised
    https://docs.python.org/3/library/functions.html#hasattr

    For dictionaries `getattr` ends up calling `__getitem__`. Our implementation of
    `__getitem__` correctly raises KeyError, and we were relying on a bug in Python
    2 where any exception raised during a `hasattr` check would result in False. Now
    that Python 3 explicitly checks for `AttributeError` we need this workaround

    The tradeoff is that we now return `AttributeError` instead of `KeyError` when
    attempting to access a non-existent key. If we rely on catching `KeyError`
    somewhere in our application we can fix it there.

commit adf4233
Author: Justin Walgran <[email protected]>
Date:   Mon Jan 6 09:47:10 2020 -0700

    Replace bytes literal with strings when using modgrammer in Python 3

    "cannot use a string pattern on a bytes-like object" exceptions were being
    raised by code using modgrammer which expects string objects under Python 3.

commit 506306f
Author: Justin Walgran <[email protected]>
Date:   Mon Jan 6 09:43:05 2020 -0700

    Fix variable references in list comprehension for Python 3

    List comprehensions in Python 3 create a new functional scope. As a result in
    the case where we are using a list comprehension when defining a class-level
    member we do not have access to the other class-level members within the
    comprehension. We have implemented a workaround that uses an immediately
    executed lambda to bind the member variables in the new scope.

commit bb20099
Author: Justin Walgran <[email protected]>
Date:   Mon Jan 6 14:51:51 2020 -0700

    Remove extra blank lines created by 2to3

    Programmatically removed with the following command

    autopep8 --in-place --recursive . --select E303

    https://github.com/hhatto/autopep8/

commit fe25f7b
Author: Justin Walgran <[email protected]>
Date:   Fri Jan 3 14:12:16 2020 -0700

    Use 2to3 to upgrade to Python 3 compatible syntax

    These changes were made by running the following command from the project root:

    ```
    2to3 -w -n --no-diffs .
    ```

    The resulting changes have lint errors and broken functionality but we are
    committing them separately to allow distinguishing between changes made by the
    automated upgrade tooling and manual fixes that were required after the fact.

commit 2040c21
Author: Hector Castro <[email protected]>
Date:   Tue Jan 7 15:44:52 2020 -0500

    Adjust hashbang for manage.py to use Python 3

commit fcd6fb6
Author: Justin Walgran <[email protected]>
Date:   Mon Jan 6 09:35:05 2020 -0700

    FIXUP replace py2 backport of modgrammer with latest py3 version

commit 6a29266
Author: Justin Walgran <[email protected]>
Date:   Fri Jan 3 11:53:55 2020 -0700

    Update requirements for Python 3 compatibility

    Django 1.11.17 is the minimum version required to run under Python 3.
    https://docs.djangoproject.com/en/1.11/faq/install/#what-python-version-can-i-use-with-django

    `functools32` was a backport of a Python 3 included library.

    The 4.2.0 version of kombu fixes a syntax error under Python 3.
    celery/kombu#841

    wsgiref is part of the Python 3 standard library.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants