-
Notifications
You must be signed in to change notification settings - Fork 9.1k
HADOOP-19574. Restore Subject propagation semantics for Java 22+ #7892
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: trunk
Are you sure you want to change the base?
Conversation
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
This is final piece for Java 24 / 25 support. Please take a look @pan3793 @slfan1989 @jojochuang @cnauroth @steveloughran |
I have a few experienced members to recommend for the review. @Hexiaoqiao @ayushtkn @szetszwo I need your help to collaboratively review this PR. Thank you very much! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @stoty . It looks like Yetus is saying this won't apply to trunk. Does it need a rebase? I also just committed MAPREDUCE-7502, so if you rebase, the test fix won't need to be applied manually anymore.
Unfortunately Yetus cannot handle patches this big, I don't think we can get CI run via GitHub. |
@cnauroth @szetszwo Thank you for your responses! Regarding the issue with Yetus failing to compile, @ayushtkn proposed a solution to handle large PRs. I suggest we first run this PR to test it, and if it’s suitable for splitting, we can submit it in parts. @stoty Thank you for your contribution! Please review the details of this commit. During development, we need to replace Yetus with Ayush's branch. If this PR is merged, we should revert this commit. |
The problem with this is that it would make it impossible to get a clean JDK24/25 test run until all the changes are committed, hence all but the last commit would be flying blind WRT testing.
That sounds better, but I personally don't have the permissions for that. |
use ayushtkn's Yetus branch
Thanks, I've changed the Yetus version. |
@stoty I support creating a separate dev branch, +1, and I'm willing to help follow up and create the new branch. |
Thanks @slfan1989 . If the purpose of the new branch would be only getting Yetus work, and the new Yetus branch fixes the issue, then the feature branch may not be necessary. |
Thank you for your feedback. Let's wait for Yetus's results. |
For reference, here are my notes on the tests that failed on my JDK24 run: <style type="text/css"></style>
|
Thanks @stoty and @slfan1989 for your work. LGTM (with checked in PR #7886). One concerns, Yetus is saying not good to check in. Agree to work on dev branch, then merge to trunk when it is ready. Thanks all again. |
@stoty Yetus does have some issue applying changes when the changes are beyond a certain point. I reported it some time back: You can temporarily hack it by c-picking this commit & revert it once you get the result from Yetus: See, if it helps :-) |
@stoty Most tests can pass in Java 25 even without this PR. Is it possible to have a PR only include changes that fix affected tests? |
That would take some time @pan3793 . The problem is that very few of the tests are using Kerberos, and the subject issue mostly breaks Kerberized clusters. Making only enough changes to fix the tests would probably lead to a lot of real world breakages. We could remove the "unneccessary" changes from the test code. I'm not sure if Hadoop has a mechanism to run tests on a real cluster instead of miniclusters. If it does, then many of those changes will be needed for running tests on kerberized real clusters. I'm going to run the test sute on JDK24 without the patch, and report the breakages here. |
I general, I think for robustness reasons Hadoop should stop directly using Thread. Using a naked Thread by mistake in the code can lead to hard to debug bugs which only manifest in kerberized (i.e real production) clusters. It is much safer to use HadoopThread everywhere, the overhead is negligible compared to the cost of starting a new thread. (basically reading then setting a ThreadLocal or equivalent structure) |
* Runnables can be specified normally, but the work() method has to be | ||
* overridden instead of run() when subclassing. | ||
*/ | ||
public class HadoopThread extends Thread { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hadoop has MiniKDC in the test infrastructure, is it possible to have a dedicated UT for HadoopThread
, so that we can test against different JDK versions to ensure the consistent semantics of Subject propagation?
and downstream projects that require Subject propagation also need to change their code, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need miniKDC for that, testing for the Subject propagation is trivial. Kerberos need Subject propagation, but we don't Kerberos to test that.
Yes, downstream projects that depend on the original Subject propagation semantics will need also to deal with the changes separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So HadoopThread
will be a public API used widely by downstream projects, then it needs more detailed javadocs (better to include examples), and a dedicated UT will also be a good example for developers to learn and understand what it does.
I think it's worth a dedicated PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a unit test.
It's really up to downstream projects.
For example, in HBase we support multiple Hadoop versions, so we will have to clone HadoopThread, otherwise we'd have to drop support for all earlier Hadoop versions.
Downstream users who are fine with supporting only the latest Hadoop release can use it directly.
If we consider this a public API, then maybe we could try to find a better name for it, as HadoopThread is not very descriptive, I just haven't been able to find one that is not super long or awkward.
SubjectInheritingThread would be a candidate, but it's too long IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I support naming it SubjectInheritingThread
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed
I got confirmation that Hadoop cannot run the test suite on external cluters, so in theory we don't need to replace Thread everywhere in the tests. I still think that it's more robust to replace every Thread instance. |
|
||
@Override | ||
public final void run() { | ||
SubjectUtil.doAs(startSubject, new PrivilegedAction<Void>() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: indention is incorrect here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reformatted the class
* Override this instead of run() | ||
*/ | ||
public void work() { | ||
throw new IllegalArgumentException(""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have a meaningful message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the logic to match the the one in Thread.
Subject startSubject; | ||
|
||
@Override | ||
public final void start() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indention
|
||
@Override | ||
public final void run() { | ||
SubjectUtil.doAs(startSubject, new PrivilegedAction<Void>() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indention
} | ||
}; | ||
|
||
HadoopThread t = new HadoopThread(r); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add another case that uses Thread
? and it should have different behaviors on different JDK versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stoty , thanks for working on this! I suggest to add a Builder; see the inlined comment.
public HadoopThread() { | ||
super(); | ||
} | ||
|
||
public HadoopThread(Runnable target) { | ||
super(); | ||
this.hadoopTarget = target; | ||
} | ||
|
||
public HadoopThread(ThreadGroup group, Runnable target) { | ||
// The target passed to Thread has no effect, we only pass it | ||
// because there is no super(group) constructor. | ||
super(group, target); | ||
this.hadoopTarget = target; | ||
} | ||
|
||
public HadoopThread(Runnable target, String name) { | ||
super(name); | ||
this.hadoopTarget = target; | ||
} | ||
|
||
public HadoopThread(String name) { | ||
super(name); | ||
} | ||
|
||
public HadoopThread(ThreadGroup group, String name) { | ||
super(group, name); | ||
} | ||
|
||
public HadoopThread(ThreadGroup group, Runnable target, String name) { | ||
super(group, name); | ||
this.hadoopTarget = target; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should add a Builder class instead of having public constructor. Then, we can make it configurable to return Thread
or HadoopThread
or something else in the future.
diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/concurrent/HadoopThread.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/concurrent/HadoopThread.java
index 5ce2c75f4f4..fedd4e3a107 100644
--- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/concurrent/HadoopThread.java
+++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/concurrent/HadoopThread.java
@@ -32,6 +32,22 @@
*/
public class HadoopThread extends Thread {
+ public static Builder newBuilder() {
+ return new Builder();
+ }
+
+ public static class Builder {
+ private Runnable runnable;
+
+ public Builder setRunnable(Runnable runnable) {
+ this.runnable = runnable;
+ return this;
+ }
+ public HadoopThread build() {
+ return new HadoopThread(runnable);
+ }
+ }
+
Subject startSubject;
Runnable hadoopTarget;
diff --git a/hadoop-hdfs-project/hadoop-hdfs-client/src/test/java/org/apache/hadoop/hdfs/util/TestByteArrayManager.java b/hadoop-hdfs-project/hadoop-hdfs-client/src/test/java/org/apache/hadoop/hdfs/util/TestByteArrayManager.java
index 80d2859684f..35e7b71fd2e 100644
--- a/hadoop-hdfs-project/hadoop-hdfs-client/src/test/java/org/apache/hadoop/hdfs/util/TestByteArrayManager.java
+++ b/hadoop-hdfs-project/hadoop-hdfs-client/src/test/java/org/apache/hadoop/hdfs/util/TestByteArrayManager.java
@@ -178,7 +178,9 @@ public void testAllocateRecycle() throws Exception {
waitForAll(allocator.futures);
// allocate one more should be blocked
- final AllocatorThread t = new AllocatorThread(arrayLength, bam);
+ final Thread t = HadoopThread.newBuilder()
+ .setRunnable(() -> new AllocatorMethod().run(arrayLength, bam))
+ .build();
t.start();
// check if the thread is waiting, timed wait or runnable.
@@ -227,18 +229,10 @@ static <T> void waitForAll(List<Future<T>> furtures) throws Exception {
}
}
- static class AllocatorThread extends HadoopThread {
- private final ByteArrayManager bam;
- private final int arrayLength;
+ static class AllocatorMethod {
private byte[] array;
- AllocatorThread(int arrayLength, ByteArrayManager bam) {
- this.bam = bam;
- this.arrayLength = arrayLength;
- }
-
- @Override
- public void work() {
+ void run(int arrayLength, ByteArrayManager bam) {
try {
array = bam.newByteArray(arrayLength);
} catch (InterruptedException e) {
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java
index a9ecdd46bcb..8b4883beb96 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java
@@ -508,8 +508,7 @@ void addBlockPool(final String bpid, final Configuration conf) throws IOExceptio
new ConcurrentHashMap<FsVolumeSpi, IOException>();
List<Thread> blockPoolAddingThreads = new ArrayList<Thread>();
for (final FsVolumeImpl v : volumes) {
- Thread t = new HadoopThread() {
- public void work() {
+ final Thread t = HadoopThread.newBuilder().setRunnable(() -> {
try (FsVolumeReference ref = v.obtainReference()) {
FsDatasetImpl.LOG.info("Scanning block pool " + bpid +
" on volume " + v + "...");
@@ -523,8 +522,7 @@ public void work() {
". Will throw later.", ioe);
unhealthyDataDirs.put(v, ioe);
}
- }
- };
+ }).build();
blockPoolAddingThreads.add(t);
t.start();
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO that's out of scope for this ticket, @szetszwo .
The current HadoopThread (now renamed to SubjectInheritingThread) intentionally follows the Thread syntax as close as possible for minimum changes in the callers.
Adding a Builder and rewriting everything to use that in this patch would increase the patch size and would be error-prone IMO, while it wouldn't add anything to the Java 25 compatibility.
I think that both a ThreadBuilder (in a separate class to avoid conflicts with the JDK21 native Thread.Builder) and rewriting the run()/work() overriding threads to use Runnables are good improvments, but they should be handled in a separate JIRA (or maybe even two separate JIRAs).
The change LGTM. @stoty thank you for such great works! The current diff is huge, my only suggestion is to extract the |
* Hadoop security heavily relies on the original behavior, as Subject is at the | ||
* core of JAAS. This class wraps thread. It overrides start() and saves the | ||
* Subject of the current thread, and wraps the payload in a | ||
* Subject.doAs()/callAs() call to restorere it in the newly created Thread. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: should be 'restore'
} | ||
|
||
/** | ||
* Behaves similar to {@link Thread#Thread(ThreadGroup, Runnable)} constructor. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think 'Behaves similarly' might be more correct. https://english.stackexchange.com/questions/189825/behaves-similar-to-or-behaves-similarly-to
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💔 -1 overall
This message was automatically generated. |
Conflict fixed on #7919 |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stoty , thanks for the update!
The current HadoopThread (now renamed to SubjectInheritingThread) intentionally follows the Thread syntax as close as possible for minimum changes in the callers.
In this patch, we are changing
new Thread(..)
->new SubjectInheritingThread(..)
My suggestion is to change
new Thread(..)
->new SubjectInheritingThread.newBuilder()...build()
If we do it in separated JIRAs, we will need to rewrite the almost all code this PR.
*/ | ||
@Override | ||
public final void run() { | ||
SubjectUtil.doAs(startSubject, new PrivilegedAction<Void>() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a performance problem in this PR since it makes all the threads calling doAs
. For pre-Java 22, it will hurt the performance.
Description of PR
HADOOP-19574. Restore Subject propagation semantics for Java 22+.
JDK 22 breaks subject propagation into new Threads.
This patch adds a new HadoopThread class which restores the pre JDK22 semantics, and replaces
most Thread objects in the code with HadoopThread.
The Daemon class is silimarly changed.
This is the -hopefully- final patch for full Java 24/25 support.
I went with the approach of replacing almost all Thread objects with HadoopThread, as most of the tests do not set any specific Principal, and it would be had to determine for every Thread usage whether Subject propagation is required for that instance.
While the patch is huge, 99% of it is just mechanically replacing Thread with HadoopThread, and the run() method with the work() method.
There are a few tests which check specific stack traces, which also needed updating.
I'm open to renaming anything if someone can come up with better names.
How was this patch tested?
There are a few failing tests, but those are either flakey or also fail on trunk and / or with Java 8 / 17.
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?