Opened 6 weeks ago

Closed 6 weeks ago

#35946 closed Bug (wontfix)

Postgres ArrayField should convert values when given a list

Reported by: Zerq Owned by: Tanish Yelgoe
Component: contrib.postgres Version:
Severity: Normal Keywords: ArrayField to_python
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: yes UI/UX: no

Description

Currently ArrayField.to_python only calls base_field.to_python when given string. In my opinion it should call it always. Currently using this field is inconsistent. Simple example:

    f = MyModel._meta.get_field("int_array")
    f.clean('[1, "2", 3]', None) # ok
    f.clean([1, "2", 3], None) # error

Proposed ArrayField.to_python, witch always calls base_field.to_python(val):

    def to_python(self, value):
        if isinstance(value, str):
            value = json.loads(value)
        return [self.base_field.to_python(val) for val in value]

Change History (5)

comment:1 by Tanish Yelgoe, 6 weeks ago

Owner: set to Tanish Yelgoe
Status: newassigned

comment:2 by Sarah Boyce, 6 weeks ago

Resolution: invalid
Status: assignedclosed

Hi Zerq, thank you for the ticket!

The string case is commented with # Assume we're deserializing and that's why the to_python is called (hence the inconsistency, there's different contexts)
Your suggested change is making us accept more values that we would previously reject (which could be a regression if folks rely on this behavior)
In short, I believe this by design and not a bug

in reply to:  2 comment:3 by Zerq, 6 weeks ago

Replying to Sarah Boyce:

Hi Zerq, thank you for the ticket!

The string case is commented with # Assume we're deserializing and that's why the to_python is called (hence the inconsistency, there's different contexts)
Your suggested change is making us accept more values that we would previously reject (which could be a regression if folks rely on this behavior)
In short, I believe this by design and not a bug

Thanks for your replay.

I agree that it would be breaking change. But currently it is inconsistent with how fields operate and that makes it a trap. This is why I think it should be patched in new release as breaking change or at least warning added to documentation. to_python is not only called during deserializing, but as a normal cleaning process (https://docs.djangoproject.com/en/5.1/howto/custom-model-fields/#converting-values-to-python-objects). Users can expect that values will be coerced to proper type when assigned to model and cleaned. Custom fields can also preprocess values in this method (example: https://github.com/django/django-localflavor/blob/085fb2be02cae1c95594c2a358b8df8d62fa5577/localflavor/es/models.py#L43). This field will work fine alone, but not in ArrayField.

Example with different base_field:

class MyModel(models.Model):
    boolean = models.BooleanField()
    bool_array = ArrayField(models.BooleanField())

def test_my_model():
    m = MyModel()
    value = "false"  # value read from network
    m.boolean = value  # ok
    m.bool_array = [value]  # error
    m.full_clean()

I see no design reason why ArrayField should decide what values are acceptable or skip preprocess on its own.

I do not try to force this change, but ask to give it second thought. For me it is a bug, breaking how fields work within ArrayField.

comment:4 by Zerq, 6 weeks ago

Resolution: invalid
Status: closednew

comment:5 by Sarah Boyce, 6 weeks ago

Resolution: wontfix
Status: newclosed

Looking at the linked docs: (https://docs.djangoproject.com/en/5.1/howto/custom-model-fields/#converting-values-to-python-objects

to_python() is called by deserialization and during the clean() method used from forms.
As a general rule, to_python() should deal gracefully with any of the following arguments:

  • An instance of the correct type (e.g., Hand in our ongoing example).
  • A string

Does it handle an instance of the correct type?
Yes, [1, 2, 3] would be correct in terms of a ArrayField with an IntegerField base_field, and [1, "2", 3] is not a valid array field with integer base

Does it handle a deserialization (the string case) correctly?
Yes, when we serialize an ArrayField with [1, 2, 3] we get '["1", "2", "3"]' which is handled to be valid.

To me, the odd case is that '[1, "2", 3]' valid. That's because we get '[1, "2", 3]' apply json.load and get [1, "2", 3] and then the to_python of the base fields handle both strings (as the deserializer case) but also the case that it just received a value of the correct instance. We _possibly_ could handle this. I am not sure the cost benefit.

So I am adjusting this to "wontfix". This means you can start a new conversation on the Django Forum, where you'll reach a broader audience and receive additional feedback, if there is consensus that we should change this. You can reopen the ticket.

Note: See TracTickets for help on using tickets.
Back to Top