@NullifiedSec

Hi Maintainers,

This PR enhances the anew uniqueness tool by adding several commonly requested features to improve its flexibility and usability in various scenarios. The goal was to add practical options without significantly increasing the tool's complexity.

Motivation:

The base anew tool is useful for ensuring unique lines, but real-world use cases often require more nuanced handling:

  • Ignoring case differences (e.g., 'apple' vs 'Apple').
  • Skipping blank lines from input.
  • Directing unique output to a separate file instead of modifying the input file.
  • Getting feedback on how many lines were processed/added.
  • Safeguarding the original file when modifying it in place.

Changes Introduced:

This PR adds the following command-line options to anew:

  1. -i (Ignore Case): Performs case-insensitive comparisons when checking for existing lines and duplicates from stdin.
    # 'Apple' from stdin won't be added if 'apple' exists in existing.txt
    echo "Apple" | anew -i existing.txt
  2. -B (Ignore Blank Lines): Skips blank lines received from standard input, so they are never treated as new lines or written to the output.
    printf "line1\n\nline2\n" | anew -B existing.txt
  3. -o <outfile> (Output File): Specifies a different file to append the new unique lines to. If omitted, behavior remains the same (appends to the [input_filename] if provided). This allows merging unique lines into a new destination.
    # Read existing lines from check.txt, append new unique lines from stdin to unique_lines.txt
    cat new_stuff.txt | anew -o unique_lines.txt check.txt
    
    # Read only stdin, append unique lines to a new file
    cat new_stuff.txt | anew -o unique_only_from_stdin.txt
  4. -c (Counts): Prints statistics (lines read, duplicates found, blanks skipped, lines output/written) to stderr upon completion.
    cat new_stuff.txt | anew -c existing.txt
  5. --backup[=<SUFFIX>] (Backup): Creates a backup copy of the [input_filename] before modification. This only takes effect if output is being written back to the same file specified as [input_filename] (i.e., -o is not used or -o points to the same file).
    • If --backup is used without a value, the suffix .bak is used.
    • If --backup=<SUFFIX> is used, the specified SUFFIX is appended to the filename (e.g., --backup=.orig).
    # Creates existing.txt.bak before appending
    cat new_stuff.txt | anew --backup existing.txt
    
    # Creates existing.txt.timestamp before appending
    cat new_stuff.txt | anew --backup=.timestamp existing.txt
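
For reviewers, here is a minimal sketch of one way an optional-value flag such as --backup[=<SUFFIX>] could be wired up with Go's standard flag package. It is not copied from the PR diff: the names backupFlag, parseBackup, and backupFile are hypothetical, and treating a missing input file as "nothing to back up" is an assumption. The sketch relies on the flag package's documented behavior that a flag.Value with an IsBoolFlag method may be given without a value.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
	"os"
)

// backupFlag is a hypothetical flag.Value (not taken from the PR).
// Because IsBoolFlag returns true, the flag package accepts both
// --backup and --backup=<SUFFIX>.
type backupFlag struct {
	enabled bool
	suffix  string
}

func (b *backupFlag) String() string {
	if b == nil {
		return ""
	}
	return b.suffix
}

func (b *backupFlag) IsBoolFlag() bool { return true }

// Set receives "true" for a bare --backup and the literal value otherwise.
// Limitation of this sketch: --backup=true is indistinguishable from --backup.
func (b *backupFlag) Set(v string) error {
	b.enabled = true
	if v == "true" {
		b.suffix = ".bak"
	} else {
		b.suffix = v
	}
	return nil
}

// parseBackup parses args and returns the flag state plus positional args.
func parseBackup(args []string) (backupFlag, []string, error) {
	var b backupFlag
	fs := flag.NewFlagSet("anew", flag.ContinueOnError)
	fs.Var(&b, "backup", "back up the input file before appending (default suffix .bak)")
	if err := fs.Parse(args); err != nil {
		return b, nil, err
	}
	return b, fs.Args(), nil
}

// backupFile copies path to path+suffix. A missing source file is not
// treated as an error here (assumption: there is nothing to protect yet).
func backupFile(path, suffix string) error {
	src, err := os.Open(path)
	if errors.Is(err, os.ErrNotExist) {
		return nil
	}
	if err != nil {
		return err
	}
	defer src.Close()

	dst, err := os.Create(path + suffix)
	if err != nil {
		return err
	}
	if _, err := io.Copy(dst, src); err != nil {
		dst.Close()
		return err
	}
	return dst.Close()
}

func main() {
	// Only parsing is demonstrated; a real tool would call
	// backupFile(rest[0], b.suffix) when b.enabled is set.
	for _, args := range [][]string{
		{"--backup", "existing.txt"},
		{"--backup=.orig", "existing.txt"},
		{"existing.txt"},
	} {
		b, rest, err := parseBackup(args)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			continue
		}
		fmt.Printf("enabled=%t suffix=%q rest=%v\n", b.enabled, b.suffix, rest)
	}
}
```

Note that the standard flag package stops parsing at the first positional argument, so in a sketch like this the options have to come before the filename.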

Internal Improvements:

  • Refactored flag handling into a Config struct.
  • Added a Stats struct for collecting counts.
  • Introduced a normalizeLine helper function to handle trimming and case-folding consistently.
  • Improved error handling around file operations (distinguishing ErrNotExist, checking scanner errors).
  • Used bufio.Writer for potentially more efficient file appends.
  • Added basic argument count validation.
  • Updated usage information.

Testing:

Manual testing was performed with various combinations of flags, input files (existing, non-existing), stdin content (with duplicates, blanks, case variations), and output scenarios (in-place, -o, dry-run).

Request for Review:

Please review the changes for correctness, adherence to project style, and potential edge cases. Particular attention to the logic for -o and --backup, and to the interaction between the new -i and -B options and the existing -t (trim) option, would be appreciated.

Thanks for considering this contribution!

@rasheedmhd

good stuff. thanks. hoping it gets merged soon.

@NullifiedSec
Author

NullifiedSec commented May 26, 2025

> good stuff. thanks. hoping it gets merged soon.

thanks!

yeah, I also hope it gets merged. Maybe he is busy. I'm thinking of continuing this as a separate fork if it doesn't get merged within 4 months.

@noob6t5

noob6t5 commented Jul 30, 2025

@NullifiedSec are you planning a separate fork?

@NullifiedSec
Author

@noob6t5, I'm thinking of forking this project since tomnomnom seems inactive. I plan to keep it updated and add new features. Would you like to contribute? I'd also appreciate your opinion on something: given that the code is now almost unrecognizable from the original due to a complete restructure, should I create a new repository or stick with a fork?

@noob6t5

noob6t5 commented Aug 10, 2025

@NullifiedSec I have some updated features for it that I use personally. If that's okay with everyone, I'll open a PR against your fork or your new repo. I think either is fine, but forking this one carries more weight, since it shows respect to the original author rather than creating a new tool.

But fully dedicating myself to maintaining it is pretty much impossible right now, as I'm swamped with some tools of my own 😅

@NullifiedSec
Author

@noob6t5 I continued this as a separate repository, since maintaining a fork is kind of complex for me, though I did mention the original repo.

Here is my repo if you want to contribute:

https://github.com/NullifiedSec/onew/
