-
Couldn't load subscription status.
- Fork 751
Add utility to automatically reference rules #6908
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hello @cipherboy! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2021-04-29 20:18:00 UTC |
|
Changes identified: Show detailsOthers: Recommended tests to execute: |
|
Note that, if at the end you want a section of rules with multiple identifiers, you can do it like so: ### 5.3.1,5.3.2 Multi-use rules
- some_other_ruleThis wouldn't quite fit in the existing benchmark structure I suppose, but could be inserted in a useful place. I didn't experiment with what happened with one rule that is placed multiple times in a profile; it would probably fail the build too. |
Nicely laid out profiles have a structure matching their corresponding
benchmarks. I'm thinking the CIS and STIG profiles here.
Let's formalize that structure a bit.
Given a profile with format:
```yaml
selection:
# <identifier> description
- rule_choice
```
Where one or more comments precede one or more rules, and the closest
comment that matches the given format for a reference identifier wins.
Let's take this as an example:
```yaml
## 5.3 Configure PAM ##
### 5.3.1 Ensure password creation requirements are configured (Automated)
- var_password_pam_minlen=14
- accounts_password_pam_minlen
- var_password_pam_minclass=4
- accounts_password_pam_minclass
```
Here both accounts_password_pam_minlen and
accounts_password_pam_minclass should get CIS reference value 5.3.1. The
other two entries are vars, and since 5.3 is further away than 5.3.1,
5.3.1 should win.
Some ground rules:
- We should avoid guessing when possible.
- We should create minimal diffs.
- Some rules lack a references section; we should add them in that
case.
- If we're not sure, ignore the rule and print info telling the caller
about it.
- Rules that don't belong to a section shouldn't be in the profile!
- If we're adding a reference, don't clutter other products! Only do
our current product.
Mostly this is built on top of the ssg.rule_yaml helpers I've added
previously during my internship.
The structure roughly follows the previous utils/refchecker.py. Walk all
rules in a profile, load their YAML, load their actual lines, use a bit
of magic to auto-detect the rule identifier, and write the results out.
Most of the heavy lifting was done in rule_yaml.py to add a new helper
method, add_or_modify_nested_section_key, that adds a new key in a
nested section (and optionally, the section itself) or -- if said key is
already there -- modifies the value of that key.
Signed-off-by: Alexander Scheel <[email protected]>
|
I don't like the approach for following reasons:
I am sorry that you put quite some energy into this proposal, but I think that its elements could be leveraged when references are added to rule metadata as a step of the build process. |
|
Hi @matejak :-) Thanks for taking the time to look! I think this fits best in conjunction with my thoughts on #6889 and #6915 (the off-topic ones) if you haven't read them.
I definitely hear and understand that. I'll start thinking about how to solve that. :-) Do you have ideas yet?
I'd like to push back here. While the information is duplicated in the strictest since (and in conjunction with #6889 and #6915) the intent is to have the source of truth ultimately be the profile. This utility is a way to make that profile push references back to the rules themselves. The duplication is necessary for the build system but this tooling makes the profile the canonical reference location, allowing automatic update of the rules in question. Where I'm coming from is that it is really helpful to have content about a rule centralized in a single place. That's why we created the rule directory (remember the old structure? I barely do!) in the first place: everything about the XCCDF rule entry (including references, ...) are in one place. I'd be fine splitting out a rule into a separate The other intent with this is to make changes hmm auditable. You mentioned elsewhere about making sure changes don't unintentionally impact the generated content. I'm 100% on board with that. By checking in the resulting profiles and references, I feel it becomes easier to audit changes. You know any changes in content must reflect back into existing rules that are checked in, else they'll be ignored. For large changes (consider an upgrade from CIS benchmark at v1.0.0 to v3.0.x) --- being able to do so incrementally across multiple PRs (but having a large portion being automated such as references & prodtypes) would be a big help. With things not happening until the actual build stage, you're stuck auditing build artifacts. The resulting XCCDF still isn't completely You could even go so far as to add some of these steps to the test system if you prefer. The goal of my tooling is to be minimal and reproducible -- this makes review simple: run the same tools locally and compare the generated output or see if they make any new changes. They shouldn't if the PR is valid (is the assertion I'm trying to bake in with these tool changes). We're just beginning our CaC journey at Canonical, but already this tool has been very helpful for importing our existing CIS tooling into the CaC build system. The goal with this PR is to provide something that works for us, now. IIUC this is a capability that the controls file currently lacks. Over time, as we discuss and reach consensus over what changes we all agree to make, we can update our tooling and work towards that goal. This is, IMO, an important step towards making controls file (whether directly or via an intermediate, checked-in profile representation) automatically generate references. I'm in no way saying that this tool is perfect or fully complete. Yeah, there's still work to be done in the ultimate goal, but having some starting code that we can then change, reuse, extend, whatever to fit the eventual end goal is a lot better than having none. :-) (Or who knows, maybe it is better to throw it all away, and start over, but at least there's some existing code that solved a very similar problem already). Perhaps I should combine all my various discussions on what I'm envisioning into a single document for our call later, would that help? Hope this helps :-) |
I understand the thought that there would the primary source in the profile file, but as long as the build system looks into rules, then there is duplication. Moreover, some references are currently present only in rules, so a human being can't tell just by looking whether a reference in a rule YAML is derived from a profile file comment, or it exists just in the rule yaml. Obviously, one could enforce that every reference in rules has to be derived from a reference in a profile file, but we would be then talking about a backward-incompatible change of input formats in the project, which can't be introduced as a PR, but it needs public discussion in the ML at least.
I agree that there would have to be another build step. But I don't see that as something bad and scary - just take a look how a compiled rule differs from a rule in the source tree - there would be one more key s.a. control-based references, and that would be it.
prodtypes are another pain, yes. And I agree with the emphasis on the auditable behavior. Maybe I get it wrong, but I don't see the difference between auditing source rule files that have partial machine-generated content, and auditing compiled rule files that are a build artifact, but they are otherwise the same. We already have some kind of auditing of profile changes: https://complianceascode.github.io/template/2020/03/06/profile_guard.html And we already have a compiler of rule references to an ordered json: https://github.com/ComplianceAsCode/content/blob/master/utils/gen_tables.py that could be used in a similar way.
I think that you can benefit from the tooling without having it merged upstream. As I have outlined above, your proposal has big consequences, as basically changes the content input workflow and interpretation of the source yamls, and such proposal is much bigger than a PR. While it may improve workflows, I believe that the steps that the whole project makes should, among other things, play nice with backward compatibility, and they should introduce workflow changes that are likely not to change soon again, which I see as weaknesses of your proposal.
As I have already said, I believe that some of the code should be reused in the bigger-picture task to enhance the build system in a way that is not clear yet. We think that something needs to be done wrt reference handling, but parsing comments in profile files is not something that we would like. Regarding your proposal to accumulate ideas into a document - that's a great idea, go for it. |
|
Just two quick follow up points:
Happy to organize my thoughts into a document form if that benefits the community. I was hoping that GH Issue/PR with something concrete would be easy enough to see and respond to, but I guess not. :-) |
|
/retest |
|
Closing this and carrying this downstream. |
This pull request (ComplianceAsCode#6908) was rejected upstream; remove its corresponding documentation. Signed-off-by: Alexander Scheel <[email protected]>
This pull request (ComplianceAsCode#6908) was rejected upstream; remove its corresponding documentation. Signed-off-by: Alexander Scheel <[email protected]>
Description
Nicely laid out profiles have a structure matching their corresponding
benchmarks. I'm thinking the CIS and STIG profiles here.
Let's formalize that structure a bit.
Given a profile with format:
Where one or more comments precede one or more rules, and the closest
rule that matches the given format for a reference identifier wins.
Let's take this as an example:
Here both
accounts_password_pam_minlenandaccounts_password_pam_minclassshould get CIS reference value 5.3.1. Theother two entries are vars, and since 5.3 is further away than 5.3.1,
5.3.1 should win.
Some ground rules:
case.
about it.
our current product.
Mostly this is built on top of the
ssg.rule_yamlhelpers I've addedpreviously during my internship.
The structure roughly follows the previous
utils/refchecker.py. Walk allrules in a profile, load their YAML, load their actual lines, use a bit
of magic to auto-detect the rule identifier, and write the results out.
Most of the heavy lifting was done in
rule_yaml.pyto add a new helpermethod, add_or_modify_nested_section_key, that adds a new key in a
nested section (and optionally, the section itself) or -- if said key is
already there -- modifies the value of that key.
Signed-off-by: Alexander Scheel <[email protected]>Rationale
I was talking with Jan and Matej on
#openscapabout the new controls file. We're working on porting our current Ubuntu CIS benchmark into CaC; this is just one step that will help with it.I'm opening this in the interest of working together on a common solution to product-specific controls and benchmark-based profiles. Knowing what we're currently using should help show where we're coming from and make a common design possible.
See also: #6906 and #6907