Skip to content

Conversation

@zvigrinberg
Copy link
Contributor

@zvigrinberg zvigrinberg commented Feb 21, 2024

Implements Feature depicted in JIRA-2228
a new readiness check was added to endpoint :9000/q/health/ready, External Providers Readiness Check, and off course, it affects the readiness status ( up or down) - if there is no even one provider that returns valid http code, it will be DOWN:

{
    "status": "UP",
    "checks": [
        {
            "name": "camel-routes",
            "status": "UP"
        },
        {
            "name": "External Providers Readiness Check",
            "status": "UP",
            "data": {
                "check.kind": "READINESS"
            }
        },
        {
            "name": "context",
            "status": "UP",
            "data": {
                "context.phase": "5",
                "context.name": "camel-2",
                "context.version": "4.1.0",
                "context.status": "Started",
                "check.kind": "ALL"
            }
        },
        {
            "name": "camel-consumers",
            "status": "UP"
        }
    ]
}

Tested successfully when deployed exhort with 2 replicas in DEV enironment Openshift cluster.
Deployed 2 exhort instances ( deployment + service) in the same project, each deployment with 2 pods replicas, and demonstrated the difference after "sabotaging on purpose" exactly one of the pods of each of the deployments.
Procedure in the following Repo:

https://github.com/zvigrinberg/exhort-service-readiness-experiment/blob/main/README.md#simulating-by-blocking-external-access-from-1-pod-to-real-snyk-address-using-firewall

@zvigrinberg zvigrinberg force-pushed the feature/rediness-probe-external-services branch 2 times, most recently from 8b72bfa to 93eb36b Compare February 21, 2024 12:16
@zvigrinberg zvigrinberg requested a review from ruromero February 21, 2024 12:31
@zvigrinberg zvigrinberg force-pushed the feature/rediness-probe-external-services branch 3 times, most recently from e7d35cd to 223e072 Compare February 26, 2024 13:28
Copy link
Collaborator

@ruromero ruromero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The probe is missing some details to give more information about the integration with all the providers.
The approach is good.

One thing that is missing is to check if the provider is enabled or not. In case it is disabled, it should also be shown in the health check.

Don't forget to add the check for OssIndex even though it is currently disabled.

@Override
public Map<Integer, String> getValue(Exchange exchange) {
return Map.of(
exchange.getMessage().getHeader(Exchange.HTTP_RESPONSE_CODE, Integer.class),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can create a POJO to hold the providerName, the status and message to show it in the check details.

 "data": {
                "check.kind": "READINESS",
                "osv-nvd": "500 - Unable to validate osv-nvd Token: onguard: nodename nor servname provided, or not known",
                "snyk": "200 - OK"
            },

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruromero i can , but then i'll have to add @RegisterForReflection annotation for this new POJO...
in the meantime, i'll add all the missing details using a map ( like i did in previous implementation), and then maybe we'll discuss it so i'll incorporate such a POJO with required fields as members...
WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using my approach, now it looks like this:

{
    "status": "UP",
    "checks": [
        {
            "name": "camel-routes",
            "status": "UP"
        },
        {
            "name": "External Providers Readiness Check",
            "status": "UP",
            "data": {
                "oss-index": "{responseBody=Invalid token provided. Unauthorized, isEnabled=false, StatusCode=401}",
                "osv-nvd": "{responseBody=Exception occurred during execution on the exchange: Exchange[A36E02E84B42677-0000000000000002], isEnabled=true, StatusCode=500}",
                "snyk": "{responseBody=Token validated successfully, isEnabled=true, StatusCode=200}",
                "check.kind": "READINESS"
            }
        },
        {
            "name": "context",
            "status": "UP",
            "data": {
                "context.phase": "5",
                "context.name": "camel-4",
                "context.version": "4.1.0",
                "context.status": "Started",
                "check.kind": "ALL"
            }
        },
        {
            "name": "camel-consumers",
            "status": "UP"
        }
    ]
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why creating a POJO and adding the annotation is an impediment?
The data shown now is much more descriptive now, the only downside is that you should do some transformation so that information is clearer. I don't think showing the Object reference of the Exchange is helpful.

I would suggest to show a proper error message and if the body is empty just show the Exchange.HTTP_RESPONSE_TEXT. Maybe look at how the ProviderResponseHandler#processResponseError retrieves the error from the Exchange.
Finally, when disabled, only show disabled and not even make the request.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you're insisting, i can use a POJO and an annotation.
I Think using Exchange.HTTP_RESPONSE_TEXT as raw http response body is enough, using what you did in ProviderResponseHandler#processResponseError is an overkill :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pointed out this method as reference. Sometimes the response_text is empty and you must retrieve the details from the exception and sometimes the exception is a wrapper of an httpclientException.

@zvigrinberg zvigrinberg force-pushed the feature/rediness-probe-external-services branch from 9bcc789 to a7961da Compare February 29, 2024 08:13
@zvigrinberg
Copy link
Contributor Author

zvigrinberg commented Feb 29, 2024

The probe is missing some details to give more information about the integration with all the providers. The approach is good.

One thing that is missing is to check if the provider is enabled or not. In case it is disabled, it should also be shown in the health check.

Don't forget to add the check for OssIndex even though it is currently disabled.

@ruromero I Added the check for ossIndex, but currently i put only empty token (" ") for the sake of the readiness check, so it will fail on 401, once we'll activate/enable oss-index, we'll define properties for user and token/password for ossindex, and will use them in ossindex credentials validations as default credentials ( if user didn't pass these headers in the request), both for request and for credentials validation routes.

@zvigrinberg zvigrinberg requested a review from ruromero February 29, 2024 08:26
Copy link
Collaborator

@ruromero ruromero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zvigrinberg thanks for the changes but I consider the api.<provider>.disabled check a requirement for this feature. When a provider is disabled, it must be shown in the check as you already do but it should not perform a request.

@Override
public Map<Integer, String> getValue(Exchange exchange) {
return Map.of(
exchange.getMessage().getHeader(Exchange.HTTP_RESPONSE_CODE, Integer.class),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why creating a POJO and adding the annotation is an impediment?
The data shown now is much more descriptive now, the only downside is that you should do some transformation so that information is clearer. I don't think showing the Object reference of the Exchange is helpful.

I would suggest to show a proper error message and if the body is empty just show the Exchange.HTTP_RESPONSE_TEXT. Maybe look at how the ProviderResponseHandler#processResponseError retrieves the error from the Exchange.
Finally, when disabled, only show disabled and not even make the request.

@zvigrinberg
Copy link
Contributor Author

@zvigrinberg thanks for the changes but I consider the api.<provider>.disabled check a requirement for this feature. When a provider is disabled, it must be shown in the check as you already do but it should not perform a request.

@ruromero OK , So only perform the call if the provider is enabled right?, and show nothing in httpstatus in this case?

@ruromero
Copy link
Collaborator

@ruromero OK , So only perform the call if the provider is enabled right?, and show nothing in httpstatus in this case?

It is also interesting to show disabled in the status but nobody will see it as the management api is internal

@zvigrinberg
Copy link
Contributor Author

@ruromero OK , So only perform the call if the provider is enabled right?, and show nothing in httpstatus in this case?

It is also interesting to show disabled in the status but nobody will see it as the management api is internal

Yeah it's only for debugging purposes when there are problems with readiness of pods, you invoking that endpoint on the management interface, and can learn from the response body about the providers' status and why it's not ready, this is the only purpose and usage, there is no applicative usage indeed.

@zvigrinberg
Copy link
Contributor Author

zvigrinberg commented Feb 29, 2024

@zvigrinberg thanks for the changes but I consider the api.<provider>.disabled check a requirement for this feature. When a provider is disabled, it must be shown in the check as you already do but it should not perform a request.

@ruromero OK , So only perform the call if the provider is enabled right?, and show nothing in httpstatus in this case?

@ruromero maybe i can just re-use com.redhat.exhort.api.v4.ProviderStatus instead of creating a new POJO in vain. and just to give up on the isEnabled field

@zvigrinberg
Copy link
Contributor Author

@zvigrinberg thanks for the changes but I consider the api.<provider>.disabled check a requirement for this feature. When a provider is disabled, it must be shown in the check as you already do but it should not perform a request.

@ruromero OK , So only perform the call if the provider is enabled right?, and show nothing in httpstatus in this case?

@ruromero maybe i can just re-use com.redhat.exhort.api.v4.ProviderStatus instead of creating a new POJO in vain. and just to give up on the isEnabled field

@ruromero Using this POJO (ProviderStatus) ,it's working good, and with a bit of string manipulation and skipping its generated toString implementation, this is what eventually returned

{
    "status": "UP",
    "checks": [
        {
            "name": "camel-routes",
            "status": "UP"
        },
        {
            "name": "External Providers Readiness Check",
            "status": "UP",
            "data": {
                "oss-index": "providerName=oss-index, isEnabled=false, message=Provider oss-index is disabled",
                "osv-nvd": "providerName=osv-nvd, isEnabled=true, statusCode=503, message=Service Unavailable",
                "snyk": "providerName=snyk, isEnabled=true, statusCode=200, message=OK",
                "check.kind": "READINESS"
            }
        },
        {
            "name": "context",
            "status": "UP",
            "data": {
                "context.phase": "5",
                "context.name": "camel-15",
                "context.version": "4.1.0",
                "context.status": "Started",
                "check.kind": "ALL"
            }
        },
        {
            "name": "camel-consumers",
            "status": "UP"
        }
    ]
}

@zvigrinberg zvigrinberg requested a review from ruromero March 1, 2024 05:48
Copy link
Collaborator

@ruromero ruromero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @zvigrinberg
I agree with your comments and the solution. I have added a few comments to simplify the code.

@zvigrinberg zvigrinberg force-pushed the feature/rediness-probe-external-services branch from 07a223a to 251102a Compare March 6, 2024 10:36
@zvigrinberg zvigrinberg requested a review from ruromero March 6, 2024 10:42
@zvigrinberg zvigrinberg force-pushed the feature/rediness-probe-external-services branch from ad2e2b2 to 1835ab6 Compare March 6, 2024 10:50
@zvigrinberg
Copy link
Contributor Author

Thanks @zvigrinberg I agree with your comments and the solution. I have added a few comments to simplify the code.

Hi @ruromero ,
applied all suggestions.

Thanks.

Copy link
Collaborator

@ruromero ruromero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thanks for all the work. Just fix the commits to match conventional semantic commits and when the build is green we're ready to merge.

, add trusted content to healthcheck, and add more required adjustments

Signed-off-by: Zvi Grinberg <[email protected]>

style: indent routes for clarification and readability

Signed-off-by: Zvi Grinberg <[email protected]>

fix: add latest changes

Signed-off-by: Zvi Grinberg <[email protected]>
fix: unify failure and success message returned from providers

Signed-off-by: Zvi Grinberg <[email protected]>
@zvigrinberg zvigrinberg force-pushed the feature/rediness-probe-external-services branch from 1835ab6 to 9cebba8 Compare March 6, 2024 14:45
zvigrinberg and others added 4 commits March 6, 2024 16:48
Co-authored-by: Ruben Romero Montes <[email protected]>
after interactive rebasing with conflicts resolutions

Signed-off-by: Zvi Grinberg <[email protected]>
@zvigrinberg zvigrinberg force-pushed the feature/rediness-probe-external-services branch from 9cebba8 to ce64267 Compare March 6, 2024 14:49
@zvigrinberg zvigrinberg merged commit 6685d0f into guacsec:main Mar 6, 2024
@zvigrinberg zvigrinberg deleted the feature/rediness-probe-external-services branch March 6, 2024 15:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants