-
Notifications
You must be signed in to change notification settings - Fork 194
Fix race in TraceProviderSDK #864
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
func withLockVoid(_ body: () throws -> Void) rethrows { | ||
try withLock(body) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me, please run this through swift-format, looks like you're using 4-space tabs, instead of 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You bet, thanks for the quick turnaround on the review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
@bryce-b do you need to land this or is there a trick for me to land it? |
Is anyone able to merge this branch, I don't have that access. Thanks! |
I think it should be implemented using a more granular read/write lock, as it is shown in the sanitizer callstack that many threads can access it for reading. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure this is the approach needed here.
@nachoBonafonte what implementation would you propose? As it was before with the r/w lock there was a race condition. As proposed, the lock will do just fine and perform very well as it is very constrained to a small area of concern which is processed quickly. I am not worried about performance by adding this lock. |
There is already a ReadWriteLock class in OpenTelemetrySdk/Internals/Locks.swift that could be used instead |
@nachoBonafonte that what was there before but the use of it in the provider was flawed as it caused race conditions. This PR fixes that issue. What I'm wondering is what you're thinking of as a better implementation than what I propose in this PR. |
What caused the race condition (I imagine because you did not post the whole callback and you changed the locks) is that with Swift using pthread synchronization objects directly can cause race conditions because of the way the memory is handled, and they must be wrapped in classes for them to work properly. Not because it was using a read/write lock for that. Being said that, in that method there is another source of thread race conditions that comes from the access to the shared state. |
It's true that the rw lock wasn't correctly implemented, but it doesn't need to be in a class. That being said, the race condition I'm pointing out is that the lock that was there previously held a read in order to check for the existence of the shared provider. Then it released the lock. Afterwards, did some work and took the write lock to set that new tracer. The time between the read and the write holds o the lock are the race condition I'm talking about. So basically, this PR fixes the issue you mention and the issue I mention. |
And what is the memory being accessed concurrently there? I would say none, and it there were something being written, it should be inside the write lock. |
And even if we used a standard lock (which I don't think is the proper solution), we should use the one existing in the Locks.swift file I mentioned, and not reimplement it. |
It's not concurrent access. The race condition in this case is that you can read and it's nil before another write happens, then the first write happens and that previous write will write a second time, overwriting the previous one leading to multiple instances of the provider with no clear owner.
I wanted to, but it would create a circular dep. We could if we moved it around a bit, but I thought it would be problematic to do that in this PR, but once we fix this issue, we can definitely do it. |
@nachoBonafonte In the spirit of speeding things up and coming to a compromise, I've updated the lock to the read/write lock class so it actually works and eliminates one of the race conditions. Sound good to you? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It works for me, thanks for the changes
Race condition: https://gist.github.com/naftaly/f5cda479adf1451da63a38e8672b3fb4