-
-
Notifications
You must be signed in to change notification settings - Fork 19.3k
CLN: remove unnecessary check needs_i8_conversion if Index subclass does not support any or all
#58006
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLN: remove unnecessary check needs_i8_conversion if Index subclass does not support any or all
#58006
Conversation
pandas/core/indexes/base.py
Outdated
| ): | ||
| # This call will raise | ||
| make_invalid_op(opname)(self) | ||
| if isinstance(self, ABCMultiIndex) or self.dtype.kind != "m": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the != "m" bit still needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comment. I think we don't need != "m" here. This check worked only together with the check needs_i8_conversion(self.dtype).
I removed != "m" and corrected tests.
I have a question: shouldn't we raise for DatetimeIndex? So far we show FutureWarning:
"'any' with datetime64 dtypes is deprecated and will raise in a future version. Use (obj != pd.Timestamp(0)).any() instead."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a deprecation in nanany and nanall that i think is intended to be enforced before this needs_i8_conversion check is removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, it was my thought too.
Should I open new PR to enforce the deprecation in nanany and nanall, or can I do it in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I enforced deprecation any/all with datetime64 in #58029
Co-authored-by: Matthew Roeschke <[email protected]>
| r"'IntervalArray' with dtype interval\[.*\] does " | ||
| "not support reduction '(any|all)'" | ||
| ) | ||
| if isinstance(idx, PeriodIndex): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can combine this branch with the IntervalArray branch directly preceding it and reduce the msg check to does not support reduction '(any|all)'
It is generally a good idea to try and reduce the amount of branching in tests. Having a bunch of if...else statements makes it unnecessarily harder to manage consistent behavior expectations
…//github.com/natmokval/pandas into remove-unnecessary-check-needs_i8_conversion
|
thank you @WillAyd , I combined these two branches and reduced the message checks as you suggested. I think if we change the error message in pandas/pandas/core/arrays/base.py Lines 1898 to 1901 in 4241ba5
to we can reduce the amount of branching in tests. Think that's a good idea? |
| assert idx.any() == idx._values.any() | ||
| assert idx.any() == idx.to_series().any() | ||
| else: | ||
| msg = "cannot perform (any|all)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can delete L225-L228? i.e. the only thing needed here to pass tests is msg = "does not support reduction '(any|all)'"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, I deleted the branch with msg = "datetime64 type does not support operation: '(any|all)'", ci- green.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh OK I see the issue now. So you are branching because DatetimeIndex says does not support operation: '(any|all)' but the other cases say 'does not support reduction '(any|all)'
I don't want to go down a rabbit hole but having a message that differs between operation: and reduction doesn't seem that important. How hard is it to align them so you can just assign msg = "does not support operation: '(any|all)' and have the tests pass?
|
|
||
| def test_logical_compat(self, simple_index): | ||
| if simple_index.dtype in (object, "string"): | ||
| if simple_index.dtype in (object, "string", "datetime64[ns]"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the datetime skip required here? I am not overly familiar with this part of the code base but judging by the deleted comment I didn't think this needed to change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we shouldn't skip datetime64 here. We need the separate branch for datetime64 because we get different error message: "datetime64 type does not support operation: '(any|all)'"
If we replace the error message
Lines 521 to 523 in 6f39c4f
| if values.dtype.kind == "M": | |
| # GH#34479 | |
| raise TypeError("datetime64 type does not support operation: 'any'") |
with the msg
"does not support reduction '(any|all)'" we can combine these two checks. Can I do it in this PR or a new one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to align the error messages and assign msg = “.* does not support operation .*“ . I replaced reduction with operation and operation: with operation to unify error messages. Tests passed.
WillAyd
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for aligning those - looks pretty good one minor thing
| assert idx.any() == idx._values.any() | ||
| assert idx.any() == idx.to_series().any() | ||
| else: | ||
| msg = "cannot perform (any|all)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
L225 looks like it can be safely deleted
WillAyd
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm when green
|
Thanks @natmokval |
… does not support `any` or `all` (pandas-dev#58006) * cln-remove-unnecessary-check-needs_i8_conversion * fix tests * remove the check self.dtype.kind, fix tests * Update pandas/core/indexes/base.py Co-authored-by: Matthew Roeschke <[email protected]> * combine two branches in test_logical_compat * correct test_logical_compat * correct test_logical_compat * correct test_logical_compat * roll back the check for datetime64 in test_logical_compat * replace reduction with operation to unify err msg * delete unused line --------- Co-authored-by: Matthew Roeschke <[email protected]>
xref #54566
removed unnecessary check needs_i8_conversion if Index subclass does not support any or all.