-
Notifications
You must be signed in to change notification settings - Fork 1.8k
CPP: Model strdup #2635
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CPP: Model strdup #2635
Conversation
| outModel.isReturnValue() | ||
| or | ||
| inModel.isParameterDeref(iIn) and | ||
| outModel.isReturnValueDeref() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue that there should not be data flow from x to strdup(x), only taint flow. The distinction between data and taint flow remains a bit blurry, and in the end it's about what people might use the libraries for. I can imagine using data flow to follow a pointer from malloc to free. Then I'd not want to follow it across strdup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved it to taint tracking - and then realized that it's the same as one of the cases in exprToDefinitionByReferenceStep only to a ReturnValueDeref rather than a ParameterDeref ... and in fact all of the cases in exprToDefinitionByReferenceStep should be duplicated for return values in the same way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems right to me, but I'd like to hear the opinion of @rdmarsh2 before merging.
@rdmarsh2, I think you were involved the last time we made the taint-tracking library use the "deref" data-flow models to conflate pointers with their objects. We did it only for reference arguments, and this PR now extends it it to cover return values as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's the same caveats about false positives, and I'd like to see some testing on real code, but I'm for it.
| outModel.isReturnValue() | ||
| or | ||
| inModel.isParameterDeref(iIn) and | ||
| outModel.isReturnValueDeref() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems right to me, but I'd like to hear the opinion of @rdmarsh2 before merging.
@rdmarsh2, I think you were involved the last time we made the taint-tracking library use the "deref" data-flow models to conflate pointers with their objects. We did it only for reference arguments, and this PR now extends it it to cover return values as well.
|
When the change note is fixed, I'm happy to merge this PR. Then we can see in the nightly CPP-Differences whether there are any surprising FPs. |
jbj
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll merge this. @rdmarsh2, if you test and find a problem with this PR, it should be easy enough to revert the offending code.
|
I plan to have a thorough look through the CPP-diffs on Monday (as suggested above). |
|
You'll have to wait for the submodule bump in https://git.semmle.com/Semmle/code/pull/36032 before the results show up. |
Yep, I noticed this change is not reflected in the latest CPP-diffs run (https://jenkins.internal.semmle.com/job/Changes/job/CPP-Differences/742/). |
|
The corresponding diff job is https://jenkins.internal.semmle.com/job/Changes/job/CPP-Differences/743/ and it contains no changes to query results. I'm not sure how surprising this is as I'm not sure how many queries are (just now) still using the old security library taint anyway. |
We already modelled
strdupas an allocation function. This PR addsArrayFunctionandDataFlowFunctionmodels for it as well (for https://jira.semmle.com/browse/CPP-466).Notably
DataFlowUtil.qllhas extremely sparse support for possible data flows through modelled functions. I suspect I will be making many changes to it in the coming days / weeks so please be super critical and/or constructive about that part.