-
Notifications
You must be signed in to change notification settings - Fork 952
fix(athena): always generate string literal from FileFormatProperty #5125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
326bbf7
to
1535ac3
Compare
dialect = None | ||
dialect: Optional[str] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did this change? The rest of the file is not typed.
@@ -337,3 +337,22 @@ def test_parse_partitioned_by_returns_iceberg_transforms(self): | |||
assert isinstance(parsed.this, exp.Schema) | |||
assert next(n for n in parsed.this.expressions if isinstance(n, exp.PartitionedByBucket)) | |||
assert next(n for n in parsed.this.expressions if isinstance(n, exp.PartitionByTruncate)) | |||
|
|||
def test_ctas_uses_string_for_format(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a validate_all
test as well to demonstrate that Spark -> Athena is now covered?
def _file_format_property_sql(self: Athena.Generator, e: exp.FileFormatProperty) -> str: | ||
this = e.args.get("this") | ||
if not this: | ||
return "format=''" | ||
return f"format={exp.Literal.string(this.name)}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a bit more verbose than necessary, let's inline it in TRANSFORMS
:
exp.FileFormatProperty: lambda self, e: f"format={self.sql(exp.Literal.string(e.name)}",
Thanks for the contribution @aguynamedryan, LGTM, just a few comments to clean this up a little. |
@erindru can you take a look as well? I could reproduce the issue with a non-string: >>> from sqlmesh import Context
>>> ctx = Context()
>>> ctx.engine_adapter
<sqlmesh.core.engine_adapter.athena.AthenaEngineAdapter object at 0x121765400>
>>> ctx.engine_adapter.execute('CREATE TABLE "db"."tab" WITH (format=parquet) AS SELECT 1 AS "col"')
botocore.errorfactory.InvalidRequestException: An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: expression parquet is not a string literal For context, last modification was made here, and before that Vaggelis had made a relevant PR here. FWIW Trino and Presto do support string literals and we generate that syntax in Presto here, so I wonder if the override is needed now. I think if we got rid of that |
Err, how is that a non-string? You pass a string to My understanding is that the problem is this:
This results in So the issue is transpilation.
Almost. Unfortunately this results in the property name getting converted to uppercase, but the casing matters for Athena.
but In researching the right thing to do here and adjusting the tests to check my theory, I ended up implementing this: #5133 |
Sorry, I wasn't clear enough. I intentionally used a var in that Do Presto and Trino accept a lowercase |
Closing in favor of e9b3156. |
As described in #5124, Athena requires the WITH(format='parquet') to use a string literal on the right hand side of the the format=