Skip to content

Conversation

Mebsina
Copy link
Contributor

@Mebsina Mebsina commented Jul 23, 2025

Description

Updating the current CLI to allow users to test their existing queries against newer OpenSearch SQL plug-in versions before upgrading their OpenSearch clusters.

For example, a user is currently using version 2.19 may want to validate query compatibility with version 3.1 first.

By using this CLI tool, they can:

  • Load and run SQL 3.1 FAT JAR locally, without upgrading their OpenSearch cluster.
  • Verify that their current queries execute as expected under the new SQL engine.
  • Avoid potential breaking changes and reduce the need for rollback in production.

After users verify their queries are working as intended then they can updating their OpenSearch cluster version. This CLI can act as a safe testing environment, allowing smooth transitions between versions with confidence.

Moreover, developers can use this to test their own SQL plug-in implementation.

Command--version

  • CLI will use the provided SQL plug-in version on maven repository
  • If no command provided then it will use from the Config file
  • If there is no FAT JAR file then shadowJar in the background

Command --local

  • CLI will use the provided directory to find the SQL plug-in jar to build with
  • If not found, ./gradlew assemble
  • Then shadowJar in the background

Command --remote

  • CLI will try to git clone the given branch name and url
  • git clone --branch <branch_name> --single-branch <url/sql.git>
  • Extract the repo name from sql.git . In this case, repo name is sql . All git clone repo will be save in remote directory
  • If repo already exist then it will ask if user want to replace it and reclone or use the current repo
  • Then when it gets the repo, it will use the same logic as --local

Command --rebuild

  • Will rebuild the FAT JAR file to update

Command -s

  • CLI will save the latest query into saved.txt file
  • It can be re-executed for later

Unit tests are done

TODO:

  • Publishing datasources submodule
  • Integration test

Issues Resolved

N/A

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Mebsina added 2 commits July 23, 2025 09:37
@Mebsina Mebsina changed the title New cli CLI with SQL plug-in version selection Jul 23, 2025
@dai-chen
Copy link
Collaborator

dai-chen commented Aug 4, 2025

@Mebsina CI is failing. Could you fix it first?

Mebsina added 5 commits August 4, 2025 16:00
Signed-off-by: cnoramut <[email protected]>
Signed-off-by: cnoramut <[email protected]>
Signed-off-by: cnoramut <[email protected]>
Signed-off-by: cnoramut <[email protected]>
Copy link
Collaborator

@dai-chen dai-chen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand, the main use cases is when users run the CLI with a newer SQL/PPL version against an older domain. In this case, do new functions and commands (especially when pushdown to DSL) still work? If not, should we update the documentation?

@@ -0,0 +1,81 @@
{
"keywords": [
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has to be maintained manually and binding to certain version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, right now, it's manually added and not bind to any version. I will see if I can pull them from Lexer instead. OpenSearchPPLLexer and OpenSearchSQLLexer

Copy link
Contributor

@noCharger noCharger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Highlevel comments on java part:

  1. can we remove System.out.println and printStackTrace in this PR and use log4j logger instead?
  2. for the constant values, can we make them as variables?
    Example constant:
String path = isExplainQuery ? "/_explain" : "/_plugins/_ppl";
PPLQueryRequest pplRequest = new PPLQueryRequest(query, new JSONObject(), path, "");

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's remove this module from cli codebase

};

// Check if this is an explain query
boolean isExplainQuery = query.trim().toLowerCase().startsWith("explain");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works for PPL because it has explain statement but there is nothing for explain in the SQL Parser so I had to manually remove the explain prefix out from the query before sending it and set the path to _explain. If I send it as is then it will fail. Same thing as in curl

For example SQL:
This works

curl -X POST "http://localhost:9200/_plugins/_sql/_explain" -H "Content-Type: application/json" -d '
{
"query": "select * from sample"
}'

This fails

curl -X POST "http://localhost:9200/_plugins/_sql" -H "Content-Type: application/json" -d '
{
"query": "explain select * from sample"
}'

but for PPL it works for both _plugins/_ppl and _plugins/_ppl/_explain endpoints

curl -X POST "http://localhost:9200/_plugins/_ppl" -H "Content-Type: application/json" -d '
{
"query": "explain source=sample"
}'

Comment on lines 91 to 125
if (isPPL) {
System.out.println("Executing PPL query...");
// For explain queries, set the path to "/_explain"
String path = isExplainQuery ? "/_explain" : "/_plugins/_ppl";
PPLQueryRequest pplRequest = new PPLQueryRequest(query, new JSONObject(), path, "");

if (isExplainQuery) {
System.out.println("Calling pplService.explain()");
pplService.explain(pplRequest, explainListener);
} else {
System.out.println("Calling pplService.execute()");
pplService.execute(pplRequest, queryListener, explainListener);
}
} else {
System.out.println("Executing SQL query...");

if (isExplainQuery) {
// Remove "explain" prefix
String actualQuery = query.substring(7).trim();
System.out.println("SQL explain query for: " + actualQuery);

String path = "/_explain";
SQLQueryRequest sqlRequest = new SQLQueryRequest(new JSONObject(), actualQuery, path, "");

System.out.println("Calling sqlService.execute() with explain path");
sqlService.execute(sqlRequest, queryListener, explainListener);
} else {
// Regular SQL query
String path = "/_plugins/_sql";
SQLQueryRequest sqlRequest = new SQLQueryRequest(new JSONObject(), query, path, "");

System.out.println("Calling sqlService.execute()");
sqlService.execute(sqlRequest, queryListener, explainListener);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's refactor this code with design patterns, by learning Strategy Pattern and Factory Pattern.

System.out.println("Query type: " + (isPPL ? "PPL" : "SQL"));

try {
CountDownLatch latch = new CountDownLatch(1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why introduce CountDownLatch?

return "Query execution fails because " + errorRef.get();
}

// Handle the response based on the query type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's refactor this section:

  1. use design patterns same as above
  2. result processing logic should be seperated from the query execution class

import software.amazon.awssdk.identity.spi.AwsCredentialsIdentity;
import software.amazon.awssdk.regions.Region;

// AWS Request Signing Interceptor by acm19
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this class copied from somewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, using this from AwsRequestSigningApacheV5Interceptor but need to modify it because RestHighLevelClient generates BasicHttpRequest and it doesn't have its body content also the interceptor sign ClassicHttpRequest instead. I had to attach the body content manually. For Http4, it works fine. I will update the documentation regarding this modification.

* Client class for creating OpenSearch clients with different authentication methods using HTTP5
* for OpenSearch SQL plug-in version 3
*/
public class Client5 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's rename client4 and client5

* Client class for creating OpenSearch clients with different authentication methods using HTTP4
* for OpenSearch SQL plug-in version 2
*/
public class Client4 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's refactor both client4 and client5 with better organization, separation of concerns, and improved readability.

  1. consider use builder pattern to build OpenSearchClient
  2. offload common logic into a function

Comment on lines 104 to 131
try {
// Determine which client class to use based on useHttp5 flag
String clientClassName = useHttp5 ? "client.http5.Client5" : "client.http4.Client4";
Class<?> clientClass = Class.forName(clientClassName);

if (useAwsAuth) {
// Call createAwsClient(awsEndpoint)
Method method = clientClass.getMethod("createAwsClient", String.class);
return (OpenSearchClient) method.invoke(null, awsEndpoint);
} else if (protocol != null && protocol.equalsIgnoreCase("https")) {
// Call createHttpsClient(host, port, username, password, ignoreSSL)
Method method =
clientClass.getMethod(
"createHttpsClient",
String.class,
int.class,
String.class,
String.class,
boolean.class);
return (OpenSearchClient) method.invoke(null, host, port, username, password, ignoreSSL);
} else {
// Call createHttpClient(host, port)
Method method = clientClass.getMethod("createHttpClient", String.class, int.class);
return (OpenSearchClient) method.invoke(null, host, port);
}
} catch (Exception e) {
throw new RuntimeException("Failed to create OpenSearchClient: " + e.getMessage(), e);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this logic can be refactored with factory build pattern

Comment on lines +310 to +338
public ResponseFormatter<QueryResult> getFormatter(
String formatName,
@Named("pretty") SimpleJsonResponseFormatter prettyFormatter,
@Named("compact") SimpleJsonResponseFormatter compactFormatter,
CsvResponseFormatter csvFormatter,
JdbcResponseFormatter jdbcFormatter,
RawResponseFormatter rawFormatter) {
if (formatName == null || formatName.isEmpty()) {
// Default to JSON
return prettyFormatter;
}

switch (formatName.toLowerCase()) {
case "csv":
return csvFormatter;
case "json":
return prettyFormatter;
case "compact_json":
return compactFormatter;
case "jdbc":
return jdbcFormatter;
case "raw":
return rawFormatter;
case "table":
return jdbcFormatter;
default:
return prettyFormatter;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

@Mebsina
Copy link
Contributor Author

Mebsina commented Aug 6, 2025

Highlevel comments on java part:

  1. can we remove System.out.println and printStackTrace in this PR and use log4j logger instead?
  2. for the constant values, can we make them as variables?

I removed the System.out.println and printStackTrace. I cannot use log4j because it's conflicted with SQL plug-in so using logback instead. Then updated to use variables instead of constant

Signed-off-by: cnoramut <[email protected]>
Copy link
Collaborator

@dai-chen dai-chen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot what's the branching strategy of CLI. After this, should we merge 2.0.0-dev to main or a new release branch?

console = Console()


class Config:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is there a config class in both java and python? can we consolidate them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because Python config is for CLI but Java config is for SQL Plug-in settings in the GatewayModule to use. They both use the same config.yaml file, though.

Comment on lines +78 to +82
self.language_mode = "ppl"
self.is_ppl_mode = True
self.format = "table"
self.is_vertical = False
self.latest_query = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's define the constants separately instead of hardcoding them here

console = Console()


class VerifyCluster:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename

@Swiddis Swiddis merged commit cd84203 into opensearch-project:2.0.0-dev Aug 18, 2025
2 checks passed
Swiddis added a commit that referenced this pull request Sep 12, 2025
* cli with sql version selection



* update README.md



* update .gitignore for saved.txt



* update README.md and default 3.1 sql version



* update README.md for build.gradle instruction



* get config settings from previous CLI



* use opensearchpy like the previous CLI for indices and checking connection



* update usage.gif



* update config handling empty string



* get version from sonatype and dynamically build fat jar



* http5 for v3+



* add http4 for v2 and update build.gradle to pick dependency more dynamically



* use datasources submodule jar locally



* update README.md



* add necessary packages



* add --local and --remote commands for local jars dev/test



* add --local and --remote commands for local jars dev/test



* remove -v in version to avoid confusion



* allow user to clone to their chosen directory



* update artifact v3 to v4



* update remote to use --branch, default main



* update dev guide



* no longer need local datasources jar



* update rebuild logic



* will need to fix datasources in maven snapshot



* not using wrapper for sigv4 interceptors



* Python 3.8 to 3.9



* Python 3.8 to 3.12



* add packages



* fix error handling



* use logback instead of sysout



* simplified python



* refactoring queryexecution and formatter



* use builder for client and separate functions



* update gradle for default build



* parse calcite plain dynamically



* refactored with dictionary maps



* create logs dir if doesnt exist



* fix test fail



---------

Signed-off-by: cnoramut <[email protected]>
Co-authored-by: Mebsina <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants