Skip to content

Conversation

@BurnzZ
Copy link
Contributor

@BurnzZ BurnzZ commented Aug 17, 2020

This test need not be merged into master but instead provides the necessary stimuli to see some pain points in shublang's usage.

In particular we can see that:

  1. The current sanitize functionality returns empty strings in its iterable. This presents the need to update it to prune out the empty strings, otherwise it would evaluate our test example as ['', '', '', 'price: $123,823.00', '']

  2. We need to do a double first, since the 1st one transforms [('123823.00',)] into ('123823.00',) and the 2nd one transforms (123823.00,) into 123823.00.

  3. The float functionality needs to be in between the double first since it only works on iterables.

As we can see, we need to jump on a lot of hoops just to properly extract this type of data.

Ideally, we should have a way to extract the data in a very concise manner like: re_search("(\d+\.\d{2}) | first_match

@renancunha
Copy link
Contributor

I agree with the need for simplicity/concision at the necessary pipes to extract the data in the provided example above. Given the current grammar, from the "logical" point of view makes sense to use the first twice, but analyzing it from the user perspective, it could be weirdy, or at least, verbose.

Another thing that I found exploring this example is that the first will fail if the re_search returns None. Also, if we try to apply the float to an empty value we will get an exception too because it expects an iterable. But I think that in these cases we can avoid breaking things evaluating the expressions inside a try/catch (if an exception is thrown we return None), in the same way that the universal parser extractor does.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants