aswinmprabhu
Contributor

Description of PR

HADOOP-13738 enhanced DiskChecker to perform a small amount of disk I/O as part of its check. That change was later partially rolled back because of the fsync issue seen in HADOOP-15450 and because a full disk was being flagged as a check failure (HDFS-13538).

This PR aims to address HDFS-13538 and enables I/O-based disk checking for HDFS only, behind a flag that can be turned on in the dfs configs.
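To make the intent concrete, here is a minimal sketch of an I/O-based directory check that reports a full disk separately from a real failure. It is not the code in this patch: the class name `DiskIoCheckSketch`, the `Result` enum, the `ioCheckEnabled` parameter, and the assumption that a full disk surfaces as an `IOException` whose message contains "No space left on device" are all illustrative. The sketch also skips fsync entirely, since the description cites an fsync issue without saying how the patch handles it.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical sketch; names and behavior are assumptions, not the PR's DiskChecker code.
final class DiskIoCheckSketch {
  /** Outcome of a single probe of one directory. */
  enum Result { HEALTHY, DISK_FULL, FAILED }

  // Assumption: the OS reports a full disk with this message text.
  static final String NO_SPACE = "No space left on device";

  static boolean isDiskFull(IOException e) {
    String msg = e.getMessage();
    return msg != null && msg.contains(NO_SPACE);
  }

  /** Permission checks always run; the write probe runs only when the flag is on. */
  static Result probe(File dir, boolean ioCheckEnabled) {
    if (!dir.isDirectory() || !dir.canRead() || !dir.canWrite() || !dir.canExecute()) {
      return Result.FAILED;
    }
    if (!ioCheckEnabled) {
      return Result.HEALTHY;
    }
    File probeFile = null;
    try {
      probeFile = File.createTempFile("disk-check-", ".tmp", dir);
      try (FileOutputStream out = new FileOutputStream(probeFile)) {
        out.write(1); // one byte is enough to exercise the write path
      }
      return Result.HEALTHY;
    } catch (IOException e) {
      // A full disk is reported separately so callers need not mark the volume as failed.
      return isDiskFull(e) ? Result.DISK_FULL : Result.FAILED;
    } finally {
      if (probeFile != null) {
        probeFile.delete();
      }
    }
  }
}
```

A caller could treat `DISK_FULL` the same as `HEALTHY` for volume-failure accounting while still logging it, which is one way to read the goal of HDFS-13538.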

How was this patch tested?

[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.util.TestDiskCheckerWithDiskIo
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.129 s -- in org.apache.hadoop.util.TestDiskCheckerWithDiskIo
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0


[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  35:00 min
[INFO] Finished at: 2025-08-29T01:17:06+05:30
[INFO] ------------------------------------------------------------------------

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 9m 5s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 10m 31s Maven dependency ordering for branch
+1 💚 mvninstall 20m 53s trunk passed
+1 💚 compile 8m 28s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 compile 7m 19s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 2m 3s trunk passed
+1 💚 mvnsite 1m 56s trunk passed
+1 💚 javadoc 1m 38s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 53s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 3m 24s trunk passed
+1 💚 shadedclient 22m 23s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 21s Maven dependency ordering for patch
+1 💚 mvninstall 1m 10s the patch passed
+1 💚 compile 8m 8s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javac 8m 8s the patch passed
+1 💚 compile 7m 29s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 javac 7m 29s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 2m 2s the patch passed
+1 💚 mvnsite 1m 53s the patch passed
+1 💚 javadoc 1m 32s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 49s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 spotbugs 3m 27s the patch passed
+1 💚 shadedclient 22m 13s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 18m 42s hadoop-common in the patch passed.
-1 ❌ unit 143m 38s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 47s The patch does not generate ASF License warnings.
304m 0s
Reason Tests
Failed junit tests hadoop.hdfs.tools.TestDFSAdmin, hadoop.tools.TestHdfsConfigFields
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7915/1/artifact/out/Dockerfile
GITHUB PR #7915
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 2e0741537d45 5.15.0-142-generic #152-Ubuntu SMP Mon May 19 10:54:31 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / d66bf7a
Default Java Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7915/1/testReport/
Max. process+thread count 3390 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7915/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@violetnspct

@aswinmprabhu

Suggested Unit Test Scenarios

File: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/DiskChecker.java

Method: doDiskIo
Recommended Test Scenarios:

  1. Test handling of actual disk failure error message - should be classified as disk failure
  2. Test handling of mixed error case where disk is both full and has other issues
  3. Test with null/empty error message - should handle gracefully
  4. Test with different variations of 'No space left' error messages (case sensitivity, different formats)
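The four doDiskIo scenarios above can be sketched against a small stand-alone classifier. This is a hypothetical helper, not the logic in DiskChecker.java: the class `DiskFullClassifierSketch`, case-insensitive matching on "no space left", and the choice to treat a mixed error (disk full plus a different suppressed error) as a disk failure are all assumptions made for illustration.

```java
import java.io.IOException;
import java.util.Locale;

// Hypothetical classifier; the real doDiskIo error handling may differ.
final class DiskFullClassifierSketch {
  /** True when the message looks like a disk-full error, ignoring case. */
  static boolean isDiskFull(Throwable t) {
    String msg = (t == null) ? null : t.getMessage();
    return msg != null && msg.toLowerCase(Locale.ROOT).contains("no space left");
  }

  /** True when the error should count as a disk failure. */
  static boolean isDiskFailure(IOException e) {
    if (!isDiskFull(e)) {
      // Scenarios 1 and 3: real errors, and null or empty messages, count as failures.
      return true;
    }
    for (Throwable suppressed : e.getSuppressed()) {
      if (!isDiskFull(suppressed)) {
        // Scenario 2 (assumption): any other error alongside disk-full is a failure.
        return true;
      }
    }
    return false;
  }
}
```

In a real test class each scenario would become its own JUnit method; the checks would pass errors such as `new IOException("Input/output error")`, a disk-full exception with a suppressed I/O error, `new IOException((String) null)`, and upper-case or prefixed variants of the disk-full message.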

File: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java

Method: checkDirs
Recommended Test Scenarios:

  1. Test that when checkDirWithDiskIo flag is disabled, the method calls checkDir()
  2. Test that the checkDirWithDiskIo configuration flag properly reads from the configuration settings
  3. Test that the method handles null configuration gracefully
  4. Test that changing the configuration flag value at runtime properly switches the checking method

Edge Cases to Cover:

  1. Configuration flag is false and performs regular checks without disk I/O
  2. Same directory path checked with both methods produces consistent results
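The checkDirs scenarios and edge cases above reduce to one question: which check runs for a given configuration state. The sketch below models that decision in isolation. It is an assumption-laden stand-in: a plain `Map<String, String>` replaces Hadoop's `Configuration`, the key `example.datanode.check-dir-with-disk-io` is invented because the real key name does not appear in this conversation, and defaulting to the basic check when the configuration is null is a guess at "handles gracefully".

```java
import java.util.Map;

// Hypothetical model of the flag dispatch; not the BlockPoolSlice code.
final class CheckDirsFlagSketch {
  // Invented key name for illustration only.
  static final String KEY = "example.datanode.check-dir-with-disk-io";
  static final boolean DEFAULT_ENABLED = false;

  enum Mode { BASIC, WITH_DISK_IO }

  /** Picks the check to run; a null config or missing key falls back to the default. */
  static Mode selectMode(Map<String, String> conf) {
    if (conf == null) {
      return Mode.BASIC;
    }
    String raw = conf.get(KEY);
    boolean enabled = (raw == null) ? DEFAULT_ENABLED : Boolean.parseBoolean(raw.trim());
    return enabled ? Mode.WITH_DISK_IO : Mode.BASIC;
  }
}
```

Because `selectMode` reads the map on every call, flipping the value between calls changes the result, which mirrors the runtime-switch scenario. Whether the actual patch re-reads the flag or caches it at construction time is not stated here.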
