Skip to content

Conversation

@pitr-ch
Copy link
Member

@pitr-ch pitr-ch commented Oct 5, 2014

  • introduce #bind method to avoid value leaks
  • warn on using just #value=

@pitr-ch pitr-ch force-pushed the thread-local-var branch 2 times, most recently from 4be7c50 to 37cc349 Compare October 5, 2014 15:03
@elspethsoup
Copy link
Contributor

The part I don't understand is what is the harm in using the mechanism provided by ruby as the underlying store outside of jruby like the previous implementation? Switching from using symbols to the object_id of the ThreadLocalVar instance as the key seems like it would do that trick nicely with Thread#[] and Thread#thread_variable_get.

@chrisseaton
Copy link
Member

I've been looking for the original discussions, but can't find them at the moment. There was a lot of back-and-forth about this and it's changed several times, but I don't think we've arrived at the right solution yet. Perhaps we all need to have a think, arrive at a consensus carefully, and then code it up.

@chrisseaton
Copy link
Member

This was part of the discussion #22.

I'm thinking maybe we should use Thread#[], use the ThreadLocalVar as the key, but via WeakRef, and occasionally prune Thread#[] for collected ThreadLocalVars.

@pitr-ch
Copy link
Member Author

pitr-ch commented Oct 9, 2014

@soupmatt I wanted to do something like that too, but:

[1] pry(main)> Thread.current[123] = 'asd'
TypeError: 123 is not a symbol

It accepts only un-collectable Symbols as a key :/ (Strings are converted to symbols silently)

@chrisseaton
Copy link
Member

Ah yes - that was the key problem. Maybe we should just give up and get the programmer to specify their own symbol in the constructor?

@pitr-ch
Copy link
Member Author

pitr-ch commented Oct 9, 2014

I somewhat liked the current approach because then user is warned. (Hm, I should have probably put there a better warning message with better explanation, how and when the value leaks.) But the more I am thinking about this the more I believe I should find time to implement a WeakMap where only keys are weak, then a ThreadLocalVar would hold have this WeakMap with the values and Threads as keys. That should fix it.

@elspethsoup
Copy link
Contributor

@pitr-ch, the ref gem has a WeakKeyMap that might inform a possible implementation using weak references. You could use the Threads themselves as the keys, and then the values will get released when the keys are finalized.

- introduce #bind method to avoid value leaks
- warn on using just #value=
@pitr-ch
Copy link
Member Author

pitr-ch commented Oct 10, 2014

@soupmatt thanks for the link! I've looked at the implementation and it already does what I was about to write. (It verifies that the object_id was not reallocated for different object as well which I saw being an issue in similar lib.)

We are very carefully about adding dependencies, so far we have none. So we need to decide what to do next?

  • include dependency on ref
  • reimplement same weak_key_map in concurrent-ruby
  • do no add dependency into spec, just failing when ThreadLocalVar is required warning that ref needs to be installed, implies that ThreadLocalVar will become optional part of concurrent-ruby

cc @jdantonio, @mighe, @chrisseaton, @lucasallan

@jdantonio
Copy link
Member

@pitr-ch Thanks for looking into this and for the great suggestions. As you can probably guess I'm not a big fan of including other gems as dependencies. I'm not sure how long that position will remain pragmatic, but it is a slippery slope. Without that rule we would have already added dependencies on ruby-atomic and redcard at least. That being said, we want our code to be reliable. I'm comfortable with the third suggestion: optional requirement. But rather than failing if ref hasn't been required could we instead display a warning and use the current implementation? If I understand the situation (and I'll admit I'm not 100% sure I do) then it seems that the leak is very minor and may have no appreciable effect in most cases.

@pitr-ch
Copy link
Member Author

pitr-ch commented Oct 20, 2014

I think the leak is not that minor, let me provide an example: User will have local thread variable (e.g. current_user) and his application will be created Threads per request assigning a current user within the thread, then all the values assigned in the threads will leak. I've intentionally chosen storage on variable so it leaks value per thread not value per variable which would be happening for storage on thread. This comes from assumption that creating of many threads is bad, threads should be reused.

Therefore I would rather fix this or disable ThreadLocalVariable when ref is not loaded. Even with warning it could lead to unpleasant surprises.

@pitr-ch pitr-ch force-pushed the thread-local-var branch 3 times, most recently from 35fd4a8 to 096a10c Compare November 8, 2014 16:22
@pitr-ch
Copy link
Member Author

pitr-ch commented Nov 8, 2014

Fixed to use ref gem to avoid any leaks. It is not required by default. It fails on MRI where there is no ref gem installed. I did not include a fallback version to avoid any nasty surprise all together. Please review.

ThreadLocalVar needs to be required manually.
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.05%) when pulling 096a10c on thread-local-var into 9513562 on master.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of raising an error and refusing to work, could we instead output a warning and fall back to the original, degraded implementation? Leave it to the individual user to decide if they want to use the ref gem.

This is the approach the elasticsearch gem uses to enable persistent connections. It detects the presence of either typhoeus or patron and uses that gem when available. Without either of those optional dependencies the elasicesearch gem still works, it just doesn't perform as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me a memory leak means broken not just degraded implementation I would rather not provide it at all. As an user when I miss the warning about memory leak ending up debugging it for hours I would be much more frustrated that adding one dependency at the beginning. From my experience I really like libraries which are failing early explaining the problem. If you really want to avoid the dependency I could reimplement the weak_key_map for concurrent-ruby but than we'll have to maintain it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a bug and not just degraded functionality then we need to fix it. I have mixed feelings about the Ref gem. For pragmatism and expedience I'll probably suggest that for now we use Ref, but let me explain in more detail my concerns about adding dependencies to gems such as our. Adding dependencies to utility gems can lead to four problems, all of which I have experienced in production systems:

  1. Downstream incompatibilities: Once gem depends on other gems which depend on other gems and so on. Adding one gem dependency results in a tree of several gem dependencies being added. Incompatibilities then occur downstream. (This is the exact problem Bundler was created to solve).
  2. Code bloat: A gem dependency is added for one class/function, but the dependency gem has thousands of lines of code (internally and in its dependencies). That one class/function unnecessarily bloats the code base.
  3. Compilation errors: I work in a cross-platform shop and we regularly have problems with C extension which don't compile. This is especially problematic when I require a pure-Ruby gem that has a downstream dependency which uses C extensions. I'be only recently started working with Java native extensions but I presume this can be a problem for JRuby, too. (This is why I pre-compile various builds of this gem.)
  4. Abandonment: A dependency is added but the maintainers stop maintaining it.

The Ref gem passes two of these four metrics. It doesn't have any downstream dependencies and it is a very lean gem, so it's good for items 1 and 2 above. Not so much for items 3 and 4. With respect to item 3, Ref includes Java native extensions but it does not pre-compile those extensions. Therefore we cannot guarantee that any particular user of ours will be able to successfully install our gem on JRuby. With respect to item 4, Ref has not been updated since May of 2013. There are six open issues dating back to April 2012 and the maintainer has only responded to one of them. There is a PR that has been open since August of 2013 and the maintainer has not responded. As far as I can tell Ref has been abandoned by its maintainer.

I appreciate that it would be a bunch of extra work for us to implement our own weak key map, but we need to maintain the integrity of our own gem.

So I'm OK if we want to include Ref for now, but the best long-term solution is probably to build our own weak key map. Perhaps we can make that a goal for 1.0.

I may also reach out to the gem author. If Ref is no longer being actively maintained we may have an opportunity to bring it into the ruby-concurrency organization.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a very good point the 4th one. I agree we should either reimplement or pull the gem under concurrent-ruby in long-term. Technically it passes 3 because on JRuby the gem is not being loaded so gem 'ref', platform: :mri will avoid any problems with compilation on JRuby, that should make the short-term solution even more viable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point about item 3. Even though we won't be using Ref in our JRuby gem, it will still be installed if we list the dependency in the gemspec. Compilation would fail on install even if we never load Ref at runtime. But now that I think it through, we can easily solve this. Since it isn't being used under JRuby I can added a guard in our gemspec to not require Ref when we create our pre-compiled JRuby build. Then Ref will never be installed.

I'll make the update on this branch later this evening.

This commit is only viable because we create a JRuby-specific gem build.
The conditional statement in the gemspec file is not processed at
runtime. It is only processed when the gem is built. Therefore it
*should* prevent the `Ref` gem from being installed when using the
JRuby-specific build. This has not been tested, however.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit is only viable because we create a JRuby-specific gem build. The conditional statement in the gemspec file is not processed at runtime. It is only processed when the gem is built. Therefore it should prevent the Ref gem from being installed when using the JRuby-specific build. This has not been tested, however.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me 👍 I think we can merge this PR after you test the build.

jdantonio added a commit that referenced this pull request Nov 14, 2014
Fix Thread leaking in ThreadLocalVariables

Tested the exclusion of the `Ref` gem in the pre-compiled Java build (using our automated build process) and it worked as intended. The `Ref` gem will not be installed under JRuby when using the precompiled Java build.
@jdantonio jdantonio merged commit 8f15a3e into master Nov 14, 2014
@jdantonio jdantonio deleted the thread-local-var branch November 14, 2014 12:41
@jdantonio
Copy link
Member

@pitr-ch Ever since we've merged this PR, our builds have consistently failed on MRI 1.9.3. The test Concurrent::ThreadLocalVar GC does not leave values behind when bind is not used always fails with the same result, ["expected: == 1 got: 101"](Concurrent::ThreadLocalVar GC does not leave values behind when bind is not used). I just discovered this today so I haven't had a chance to look at the code. Any idea what might be happening?

@pitr-ch
Copy link
Member Author

pitr-ch commented Nov 26, 2014

@jdantonio yeah this line 7d33d7e#diff-1de73cb618a4dedaa5d744e1dba7efc9R60 is not reliably triggering GC run. I'll try to figure something out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants