Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow json caster to parse boolean correctly #978

Open
wants to merge 16 commits into
base: master
Choose a base branch
from

Conversation

EdwardCuiPeacock
Copy link
Contributor

About a year ago, I made a contribution to add a feature to allow type casting (#704). In my more recent use case, I wanted to use @json to get a dictionary that contains boolean values, like the following:

my_field: 
  key: 
    attribute: True 

field_name: my_field

value: "@json @jinja {{ (this|attr(this.field_name)).get('key') or {} }}"

However, this throws the following error:

lambda x: json.loads(x.replace("'", '"'))
  File "xxxx/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "xxxx/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "xxxx/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 129 (char 128)

Which suggests to me that the json string parsing was not able to handle boolean values correctly. The json string to be parsed are in the form of

"{'attribute': True}"

which json.loads cannot handle. It necessarily needs to be the lower case true for json.loads to recognize this variable as a boolean.

The change I made allows @json caster to correctly parse the boolean variables by looking for strings "True" and "False" (without the quote; if it is quoted, then it stays as strings), and replace them with the lower case form.

Comment on lines 231 to 239
def _parse_json_strings(x):
x = x.replace("'", '"') # replace single with double quotes
# replace unquoted True / False with lower case true / false
if "True" in x:
x = re.sub(r'(?<!")\bTrue\b', "true", x)
if "False" in x:
x = re.sub(r'(?<!")\bFalse\b', "false", x)

return json.loads(x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would happen in this corner case ?

VALUE = '@json {"TrueKey": False}'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"TrueKey" won't be replaced, but I think it is worth adding this case to the tests

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really need regex here?

what if a simple equality check?

if value in ["True", "False"]:
    value = value.lower()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't work because value is the whole serialized '{"TrueKey": False}' string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know, the simple equality makes more sense. I tested this a little bit. The string to parse is probably almost always converted from a dict after parsed by jinja. For example,

value = str({"field": True})

gives me

"{'field': True}"

So like what you have above, I can potentially just match the substring ": True, and ": False, and replace them with ": true, and ": false. I will work on this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case I think the regex may be a good idea

@rochacbruno
Copy link
Member

This is related to #976

Copy link
Member

@pedro-psb pedro-psb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestions are:

  • rename _parse_json_string param x to value
  • add a test to assert the replace won't affect undesired True and False occurrences. e.g
'{"KeyTrue": True}' # ok, will keep "KeyTrue"
'{"Key-True": True}' # fail, will change to "Key-true"

Comment on lines 231 to 239
def _parse_json_strings(x):
x = x.replace("'", '"') # replace single with double quotes
# replace unquoted True / False with lower case true / false
if "True" in x:
x = re.sub(r'(?<!")\bTrue\b', "true", x)
if "False" in x:
x = re.sub(r'(?<!")\bFalse\b', "false", x)

return json.loads(x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"TrueKey" won't be replaced, but I think it is worth adding this case to the tests

@EdwardCuiPeacock
Copy link
Contributor Author

My suggestions are:

  • rename _parse_json_string param x to value
  • add a test to assert the replace won't affect undesired True and False occurrences. e.g
'{"KeyTrue": True}' # ok, will keep "KeyTrue"
'{"Key-True": True}' # fail, will change to "Key-true"

Will do. Will add the test cases.

@@ -239,6 +239,17 @@ def evaluate(settings, *args, **kwargs):
return evaluate


def _parse_json_strings(value):
value = value.replace("'", '"') # replace single with double quotes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩

What will happen with a JSON like:

{"somekey": "This is a 'value' containing single' quotes"}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly, I can't come up with a good regex to replace all the patterns I have considered,

input_strings =[
    '''{"somekey": "This is a 'value' containing single' quotes", "someotherkey": "Another 'value'"}''',
    """{'somekey': "This is a 'value' containing single' quotes", 'someotherkey': "Another 'value'"}""",
    #"{'somekey': 'This is a 'value' containing single' quotes'}", # this should never happend when converting from dict to str 
    """{'somekey': "This is a 'value' containing single' quotes", 'someotherkey': 'Another value'}""",
    """{'somekey': 'This is a value not containing single quotes', 'someotherkey': 'Another value'}""",
    # """{"somekey": 'This is a 'value',  'containing single' quotes'}""",
]

Then how about using ast.literal_eval? This would also fix the issue with True / False parsing.

@codecov-commenter
Copy link

Codecov Report

Merging #978 (65109cd) into master (9ca687a) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

@@           Coverage Diff           @@
##           master     #978   +/-   ##
=======================================
  Coverage   98.95%   98.95%           
=======================================
  Files          23       23           
  Lines        2196     2197    +1     
=======================================
+ Hits         2173     2174    +1     
  Misses         23       23           
Files Changed Coverage Δ
dynaconf/utils/parse_conf.py 98.91% <100.00%> (+<0.01%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

if isinstance(value, Lazy)
else json.loads(value),
else ast.literal_eval(value),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this will address all the cases, look:

Case 1 OK

>>> ast.literal_eval("{'foo': True}")
{'foo': True}

Case 2 - NOK

>>> ast.literal_eval("{'foo': true}")
ValueError: malformed node or string on line 1: <ast.Name object at 0x7f4e111cb6d0>

It is really trick that we need to make both true (valid json) and True (valid python) working.

I am rethinking this issue

Should we allow invalid json as '{"foo": True}' to be passed to @json ?

Maybe we want to make it strict, and add a new marker @dict to accept python dict literals.

Another option is adding a filter to Jinja to address your use case @EdwardCuiPeacock

"@json @jinja {{this.FOO | as_bool }}"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can instead of dict add a @py_literal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I didn't consider that case. The tricky part is, when first rendered with @jinja, the dictionary will be converted to strings, probably by str(value), before being further casted by @json. I guess this is why we have True instead of true, or single quotes ' instead of double quotes ". I am not familiar with the code enough to pinpoint where this string conversion is (maybe under Lazy._dynaconf_encode?). But if we can make special cases when converting Python dict to strings using json.dumps instead of str, that may also solve the problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EdwardCuiPeacock there is another option, custom JSONDEcoder

In [37]: class BoolDecoder(json.JSONDecoder):
    ...:     def raw_decode(self, s, idx=0):
    ...:         try:
    ...:             return super().raw_decode(s, idx=idx)
    ...:         except json.JSONDecodeError:
    ...:             # Handle the replacement here
    ...:             # find :True : True, :False, : False
    ...:             # pattern and replace with lowercase
                     # BETTER USE A REGEX HERE
    ...:             if "True" in s:
    ...:                 s = s.replace("True", "true")
    ...:             if "False" in s:
    ...:                 s = s.replace("False", "false")
    ...:             return super().raw_decode(s, idx=idx)
    ...: 

In [38]: json.loads('{"foo": True, "a": 1}', cls=BoolDecoder)
Out[38]: {'foo': True, 'a': 1}

In [39]: json.loads('{"foo": False, "a": 1}', cls=BoolDecoder)
Out[39]: {'foo': False, 'a': 1}

In [40]: json.loads('{"foo": false, "a": 1}', cls=BoolDecoder)
Out[40]: {'foo': False, 'a': 1}

In [41]: json.loads('{"foo": true, "a": 1}', cls=BoolDecoder)
Out[41]: {'foo': True, 'a': 1}

@rochacbruno rochacbruno added this to the 3.3.0 milestone Aug 21, 2023
@EdwardCuiPeacock
Copy link
Contributor Author

EdwardCuiPeacock commented Aug 21, 2023

@rochacbruno I think the casting idea | to_bool solved my current issue. In fact, i can use | tojson within the jinja template to convert to a json string first, then @json will correctly parse the rendered string.

y_field: 
  key: 
    attribute: True 

field_name: my_field

value: "@json @jinja {{ (this|attr(this.field_name)).get('key') | tojson or {} }}"

Here is what I have so far after the update:

  • Adding @py_literal casting to handle non-json cases
  • Added some unit tests to test the new @py_literal caster
  • For @json casting, remove the hack to replace single quote ' with double quotes, ", as this won't work for cases like {"somekey": "This is a 'value' containing single' quotes"}.
  • For @json casting, adding unit testing which will correctly raise json.decoder.JSONDecodeError if the input is not strictly json

@rochacbruno rochacbruno modified the milestones: 3.3.0, 4.0.0 Aug 22, 2023
@netlify
Copy link

netlify bot commented Aug 22, 2023

Deploy Preview for dynaconf ready!

Name Link
🔨 Latest commit 834e974
🔍 Latest deploy log https://app.netlify.com/sites/dynaconf/deploys/64e7832a49d3af000809afbe
😎 Deploy Preview https://deploy-preview-978--dynaconf.netlify.app/dynamic
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

Copy link
Member

@rochacbruno rochacbruno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM,

However we might keep this PR unmerged until we are ready for 4.0.0 as it includes a breking change.

@@ -254,11 +255,12 @@ def evaluate(settings, *args, **kwargs):
)
if isinstance(value, Lazy)
else str(value).lower() in true_values,
"@json": lambda value: value.set_casting(
lambda x: json.loads(x.replace("'", '"'))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this change, to be more strict, however this is a breaking change and we will merge this only for 4.0.0

if isinstance(value, Lazy)
else json.loads(value),
"@py_literal": lambda value: value.set_casting(ast.literal_eval)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can just short it to @pyliteral

# Testing list
res = parse_conf_data("""@py_literal ["a", "b", 'c', 1]""")
assert isinstance(res, list)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to add a test case for @pyliteral @jinja ... combination

@EdwardCuiPeacock
Copy link
Contributor Author

@rochacbruno Sounds good to me. Let me know when that happens.

@rochacbruno
Copy link
Member

BTW we need to document this

casting idea | to_bool solved my current issue. In fact, i can use | tojson within the jinja template to convert to a json string first, then @json will correctly parse the rendered string.

y_field: 
  key: 
    attribute: True 

field_name: my_field

value: "@json @jinja {{ (this|attr(this.field_name)).get('key') | tojson or {} }}"

@EdwardCuiPeacock
Copy link
Contributor Author

Added the new features in docs with examples.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants