- 
                Notifications
    You must be signed in to change notification settings 
- Fork 1.5k
ENH: Add AES support for encrypting PDF files (merged via other PRs) #1816
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| Codecov ReportPatch coverage:  
 
 Additional details and impacted files@@            Coverage Diff             @@
##             main    #1816      +/-   ##
==========================================
- Coverage   93.89%   93.89%   -0.01%     
==========================================
  Files          33       33              
  Lines        6983     6946      -37     
  Branches     1382     1374       -8     
==========================================
- Hits         6557     6522      -35     
  Misses        274      274              
+ Partials      152      150       -2     
 ☔ View full report in Codecov by Sentry. | 
eliminate redundant `Encrypt` entries
| user_pwd: Optional[str] = None, # deprecated | ||
| owner_pwd: Optional[str] = None, # deprecated | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those two need to stay as long as we don't make a major version bump. As still many users are migrating from PyPDF2 (sometimes even PyPDF2 < 2.0.0), I want to keep the warnings for quite a while.
        
          
                pypdf/_writer.py
              
                Outdated
          
        
      | warnings.warn( | ||
| "pypdf only implements RC4 encryption so far. " | ||
| "The RC4 algorithm is insecure. Either use a library that supports " | ||
| "AES for encryption or put the PDF in an encrypted container, " | ||
| "for example an encrypted ZIP file." | ||
| ) | ||
| if user_pwd is not None: | ||
| if user_password is not None: | ||
| raise ValueError( | ||
| "Please only set 'user_password'. " | ||
| "The 'user_pwd' argument is deprecated." | ||
| ) | ||
| else: | ||
| warnings.warn( | ||
| "Please use 'user_password' instead of 'user_pwd'. " | ||
| "The 'user_pwd' argument is deprecated and " | ||
| "will be removed in pypdf 4.0.0." | ||
| ) | ||
| user_password = user_pwd | ||
| if user_password is None: # deprecated | ||
| # user_password is only Optional for due to the deprecated user_pwd | ||
| raise ValueError("user_password may not be None") | ||
|  | ||
| if owner_pwd is not None: # deprecated | ||
| if owner_password is not None: | ||
| raise ValueError( | ||
| "The argument owner_pwd of encrypt is deprecated. " | ||
| "Use owner_password only." | ||
| ) | ||
| else: | ||
| old_term = "owner_pwd" | ||
| new_term = "owner_password" | ||
| warnings.warn( | ||
| message=( | ||
| f"{old_term} is deprecated as an argument and will be " | ||
| f"removed in pypdf 4.0.0. Use {new_term} instead" | ||
| ), | ||
| category=DeprecationWarning, | ||
| ) | ||
| owner_password = owner_pwd | ||
|  | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sadly, those need to stay for the same reason: Keeping helpful deprection messages for users
| @exiledkingcc Thank you so much for picking this up 🙏 I'm currently quite busy with my private live, but I try to review everything tomorrow :-) | 
        
          
                pypdf/xmp.py
              
                Outdated
          
        
      | def writeToStream( | ||
| self, stream: StreamType, encryption_key: Union[None, str, bytes] | ||
| ) -> None: # deprecated | ||
| """ | ||
| Use :meth:`write_to_stream` instead. | ||
| .. deprecated:: 1.28.0 | ||
| """ | ||
| deprecation_with_replacement("writeToStream", "write_to_stream", "3.0.0") | ||
| self.write_to_stream(stream, encryption_key) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to stay to help migrating users
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MartinThoma
this depreciation is quite old and it also applies to internal objects. I think we could try to delete it.  It will be possible to restore it later.
Also write_to_stream() arguments have changed and for the same reasons I think the modification can be kept
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to follow our deprecation process:
- Deprecation warning was added in 1.28.0via DEP: Class, variable, and module names #867 in May 2022
- Deprecation error was added in 3.0.0<--- it should have been removed in 2.0.0; keep also in mind that we migratedPyPDF2topypdf. To thepypdfusers it's just fair to have a little bit more time.
- Removal: Originally it should be removed in 3.0.0, but now we're going to remove it in 4.0.0
It has "only" been one year since the warning was added. I want to make a pypdf==4.0.0 release with a lot of deprecated code removals in 2024 (probably early January). Until then, I want to keep those methods to allow people to have an easier time to transition.
Is there a reason to have it in this PR?
To me it seems as if this PR would become a lot smaller and thus easier to review without those changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as @pubpub-zz said, write_to_stream() arguments have changed, so writeToStream() should change, or be deleted. and it's not used in current code, and users do not use it, it's safe to delete it.
This makes PR #1816 smaller Co-authored-by: exiledkingcc <[email protected]>
This makes PR #1816 smaller Co-authored-by: exiledkingcc <[email protected]>
| @exiledkingcc Just a short heads-up: I have not forgotten this PR and I'm excited to get it into pypdf! However, I also want to review it thoroughly. Recently I had a lot going on in my private live (being sick / buying my own house) which meant that I didn't have the time for it so far. I will do it eventually :-) (I promise to do it this year, I hope to do it in July) | 
| @MartinThoma it's ok, just take your time. and take care. 😊 | 
| I've just added the "breaking change" label so that I don't forget that the PR currently contains breaking changes. I want to undo them for the moment, but I also want to collect those things so that I can do several breaking changes next year. The idea is to keep pypdf stable for a while to give the community time to move from PyPDF2 to pypdf. | 
Full credit goes to exiledkingcc This PR was only made to make it easier to merge the other changes / to avoid merge conflicts. Co-authored-by: exiledkingcc <[email protected]>
* `CryptFilter.encrypt_object` implemented * `AlgV5.generate_values` now crops the user_password / owner_password to 127 bytes * The `EncryptAlgorithm` Enum was added. It contains the parameter V (version), R (revision), and length * `Encryption.encrypt_object` was added * `Encryption.write_entry` was added * The static method `Encryption.make` was added This PR was only made to make it easier to merge the other changes / to avoid merge conflicts of other changes with #1816 . Full credit goes to exiledkingcc. The PR is marked as "MAINT" as it doesn't add a new feature that an end-user could use. Co-authored-by: exiledkingcc <[email protected]>
* PdfWriter.encrypt: Add 'algorithm' parameter * PdfWriter: Add _encryption property * PdfWriter: Add _encrypt_entry property This change was made in another PR to avoid merge conflicts / get it merged soon. Full credit for the work goes to exiledkingcc who did all of the work in #1816 Co-authored-by: exiledkingcc <[email protected]>
* PdfWriter.encrypt: Add 'algorithm' parameter * PdfWriter: Add _encryption property * PdfWriter: Add _encrypt_entry property This change was made in another PR to avoid merge conflicts / get it merged soon. Full credit for the work goes to exiledkingcc who did all of the work in #1816 Co-authored-by: exiledkingcc <[email protected]>
* PdfWriter.encrypt: Add 'algorithm' parameter * PdfWriter: Add _encryption property * PdfWriter: Add _encrypt_entry property This change was made in #1816 to avoid merge conflicts / get it merged soon. Full credit for the work goes to exiledkingcc who did all of the work in #1816 Co-authored-by: exiledkingcc <[email protected]>
This PR was made to avoid breaking changes the original PR introduced. We want to remove the outdated parameters, but not at the moment. See https://pypdf.readthedocs.io/en/latest/dev/deprecations.html Full credit for the work goes to exiledkingcc via PR #1816 Co-authored-by: exiledkingcc <[email protected]>
This PR was made to avoid breaking changes the original PR introduced. We want to remove the outdated parameters, but not at the moment. See https://pypdf.readthedocs.io/en/latest/dev/deprecations.html Full credit for the work goes to exiledkingcc via PR #1816 Co-authored-by: exiledkingcc <[email protected]>
Partially taken from #1816 Co-authored-by: exiledkingcc <[email protected]>
* MAINT: Deprecate the encryption_key parameter of write_to_stream Partially taken from #1816 Co-authored-by: exiledkingcc <[email protected]>
| @exiledkingcc I finally managed to merge all of the changes 🎉 It was a bit difficult because there were a lot of changes and I needed to get an overview. Hence I made several small PRs. I also didn't want to introduce breaking changes now. I gave you full credit via the co-authored-by feature + every commit message. I also made a shout-out on Twitter + in the release notes to ensure that you get all the credit for the awesome work you did 🙏 Thank you 🤗 | 
Closes #1754