-
Couldn't load subscription status.
- Fork 322
feat: microgen - adds code generation logic #2294
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Migrates the empty __init__.py file to the microgenerator package.
Introduces the CodeAnalyzer class and helper functions for parsing Python code using the ast module. This provides the foundation for understanding service client structures.
Implements functions to analyze Python source files, including:
- Filtering classes and methods based on configuration.
- Building a schema of request classes and their arguments.
- Processing service client files to extract relevant information.
scripts/microgenerator/generate.py
Outdated
| template_path = os.path.join(config_dir, item["template"]) | ||
| output_path = os.path.join(project_root, item["output"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit, optional] The pathlib.Path's / operator is a little less verbose and seems to be the preferred for new code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We added Path to this section, but:
- right now the function in
utils.pythat ends up using this wants astringso... we convert the whole concatenatedPathobject to astr() - Why not just update the
utils.pyfile to take aPathOR a {string,Path}? - We can't run tests without multiple files and edits that are in two or three PRs that have not been merged yet, so I have no confidence that all the stars will align AND I did not want to try and do temporary workarounds to let me test this update. PR #2307 includes some, but not all the necessary changes include tests that are specific to
utils.py
I will add an item to the TODO list hosted internally at b/445158219 to ensure that we circle back and clean up the os vs Path situation. I feel like there are prolly a couple other nooks and crannies where Path would be a better long-term solution.
| return f"from {path} import (\n {names_str}\n)" | ||
|
|
||
|
|
||
| def generate_code(config: Dict[str, Any], analysis_results: tuple) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A tuple input is a bit difficult to review to determine if the order of the fields is correct. Have you considered using a frozen data class? Or if positional access is required a named tuple?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the code first started, we were only passing one item, which became two in a tuple and then three and is now four items.
I agree, it is time to move it to a more robust solution. Not all the parts that will end up being affected by this move are in this PR, so I would much prefer to merge all the outstanding PRs before doing too many changes to logic, etc.
This is all microgenerator code so no customers are gonna see this OR interact with it, just us devs, but there are better approaches that will make our lives easier in the long run.
I will defer this to the TODO list hosted internally at b/445158219 for now.
| for class_name, methods in data.items(): | ||
| for method_name, method_info in methods.items(): | ||
| context = { | ||
| "name": method_name, | ||
| "class_name": class_name, | ||
| "return_type": method_info["return_type"], | ||
| } | ||
|
|
||
| # Infer the request class and find its schema. | ||
| inferred_request_name = name_utils.method_to_request_class_name( | ||
| method_name | ||
| ) | ||
|
|
||
| # Check for a request class name override in the config. | ||
| method_overrides = ( | ||
| config.get("filter", {}).get("methods", {}).get("overrides", {}) | ||
| ) | ||
| if method_name in method_overrides: | ||
| inferred_request_name = method_overrides[method_name].get( | ||
| "request_class_name", inferred_request_name | ||
| ) | ||
|
|
||
| fq_request_name = "" | ||
| for key in request_arg_schema.keys(): | ||
| if key.endswith(f".{inferred_request_name}"): | ||
| fq_request_name = key | ||
| break | ||
|
|
||
| # If found, augment the method context. | ||
| if fq_request_name: | ||
| context["request_class_full_name"] = fq_request_name | ||
| context["request_id_args"] = request_arg_schema[fq_request_name] | ||
|
|
||
| methods_context.append(context) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With several nested loops and if statements, I'm having some trouble following along today. Maybe worth adding some private helper methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We pulled out two chunks of processing and created two helper functions. which definitely makes the code a bit easier to parse.
I think we might be able to a bit more, but gonna hold off until all the things are merged and working before pushing my luck.
| context = { | ||
| "name": method_name, | ||
| "class_name": class_name, | ||
| "return_type": method_info["return_type"], | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thoughts on using a data class for this instead of a dictionary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will look this over and consider whether it should be modified in a future PR. Right now, for an alpha release to see what works and what doesn't, a very small dict is probably a reasonable conveyance in a microgenerator. Also added this to the TODO list for tracking.
| for key in request_arg_schema.keys(): | ||
| if key.endswith(f".{request_name}"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[optional] This looks like it'd be a good fit for a trie data structure. https://en.wikipedia.org/wiki/Trie That said, the current dictionary is probably small enough and this is part of code generation, not the user-visible path, so maybe not worth it.
Alternatively, it may be worth it to create a separate dictionary from request_name to fully-qualified name, since this method will be called more than once. That would take us from O(n^2) to O(n) (or possibly O(n log n) since I think Python dictionaries are actually trees not hashmaps.


Adds code generation logic
Implements functions for generating Centralized Client code, including: