Allow strings in `field/2` #4384

greg-rychlewski · 2024-02-24T04:34:20Z

I'm worried I'm missing something obvious. But I think with these 2 simple changes to the schema functions we can allow string names in field/2.

The __schema__(:type, field)__ functions will now catch string names and get the right type. And the __schema__(:field_source, field)__ functions will ensure the field is converted back into an atom during normalization so that the adapters don't have to change.

josevalim · 2024-02-24T08:36:22Z

Interesting. What happens if there is no schema?

greg-rychlewski · 2024-02-24T08:42:08Z

Ah good catch, I knew I forgot something. Then it fails but it can be fixed by changing the guard in the adapter here: https://github.com/elixir-ecto/ecto_sql/blob/master/lib/ecto/adapters/postgres/connection.ex#L835 and here https://github.com/elixir-ecto/ecto_sql/blob/master/lib/ecto/adapters/postgres/connection.ex#L841

josevalim · 2024-02-24T08:49:04Z

I think if we want to go down this route, we should rather go in the other direction. Have field_source return strings and then send it all the way down to the adapter as strings. If there is no field source, we normalize it to strings on the spot?

As long as it is a small change on the adapter side, we should be good. :)

josevalim · 2024-02-24T08:49:39Z

Although converting to strings on adapters such as etso and ecto_mnesia will likely cause other problems.

greg-rychlewski · 2024-02-24T08:58:56Z

It makes the inspection a bit weird too if it's normalized to string. I had to make a special condition to make the inspect look normal. I just pushed a change

josevalim · 2024-02-24T15:22:54Z

Thanks @greg-rychlewski. I am honestly a bit unsure. It feels we are at the worst of both words: sometimes they are atom, sometimes they are strings. What do you think about:

If there is a schema, we always normalize them to atoms in the planner (as long as the schema module has been loaded because we called a previous operation in it, String.to_existing_atom should be enough).
For schemaless queries, we always keep them as strings.

Perhaps we can start with step 2 for now (and raise if a string is given to a schema source)?

greg-rychlewski · 2024-02-24T16:29:33Z

That sounds like a good compromise. I'll update the PR.

There was one thing I was having trouble figuring out for inspect. I believe the dot notation for the fields does not play nice with strings. For example, atoms produce this

query = from p in Post, select: field(p, :visit)

IO.inspect query
# "from p0 in Inspect.Post, select: p0.visit"

But strings produce this

query = from p in Post, select: field(p, "visit")

IO.inspect query
# "from p0 in Inspect.Post, select: p0 . \"visit\"()"

I was previously handling it by transforming it to field inside of the inspect. But if we will normalize to string then it could be confusing if there is an error during planning and the user didn't use field/2. The only other thing I could think is to not use the dot notation, but then the inconsistency is probably confusing.

Do you know if there is anything I can do to make the string play nice with the dot notation?

josevalim · 2024-02-24T17:51:12Z

My only suggestion is to indeed convert it back to field(..., ...) when pretty printing/inspecting.

greg-rychlewski · 2024-02-24T18:23:48Z

I had one idea. Would you be against tagging the field/2 stuff all the way down to the adapter and them have the source resolved in the adapter? That way we could normalize everything given to field/2 to string and all the non field/2 stuff will stay as atom.

The reason I'm suggesting this is because when I tried normalizing schemaless sources to string it started to affect a lot of things I didn't expect like subqueries and joins. Doing it this way would minimize the blast radius and still have some kind of internal consistency. And either way we are asking adapters to change something.

josevalim · 2024-02-24T19:58:15Z

To be clear, you are saying this:

field/2 with a string arrives as a field in the AST all the way down to the adapter? That sounds good to me.

But we still need to decide what to do with strings and schemas. We have to options:

We say string fields are not checked, typed, or converted in any way (akin to a fragment)
We say string fields are validated and normalized accordingly

greg-rychlewski · 2024-02-25T02:32:09Z

I'm starting to feel like I completely misunderstood one of your earlier replies. When you said this

I am honestly a bit unsure. It feels we are at the worst of both words: sometimes they are atom, sometimes they are strings. What do you think about:

If there is a schema, we always normalize them to atoms in the planner (as long as the schema module has been loaded because we called a previous operation in it, String.to_existing_atom should be enough).

For schemaless queries, we always keep them as strings.

Were you talking about field/2 specifically? I read it as any field notation, for example p.title as well but now I think I was mistaken.

josevalim · 2024-02-25T09:14:21Z

At the time I did not mean field/2 but later on I certainly thought it should be only about field/2. :)

greg-rychlewski · 2024-02-27T18:58:12Z

Let me know what you think of this:

field/2 with string name

always normalizes its name as an atom when dealing with schemas
keeps its name as string when not dealing with schemas
`has its type validated when dealing with schemas. it's handled by adding guards to the reflection functions here:

ecto/lib/ecto/schema.ex

Line 698 in 4876779

{args, when_expr, body} ->

ecto/lib/ecto/schema.ex

Line 2327 in 4876779

types_quoted =

field/2 with atom name

no changes. though i'm not sure if you'd prefer to normalize its name to string when dealing with schemaless sources

greg-rychlewski · 2024-02-27T19:37:33Z

we could also make it so that field/2 with a string always emits a string and field/2 with an atom always emits an atom. for both schema/schemaless.

josevalim · 2024-02-28T10:46:45Z

My thinking would be that, if you pass a string, then it is kept as is and it would always be the field source. We never transform or manipulate it in any case.

greg-rychlewski · 2024-02-28T11:00:35Z

Do you think it would still be ok to check its type against the schema? For instance, if someone wanted to do this

field = get_user_input_as_string(...)
query = from p in Post, where field(p, ^field) == ^"some_string"

To work properly the parameter would have to be cast to the type of field(p, ^field).

Or alternatively they could wrap field(p, ^field) inside of type/2 but then it's kind of the asme problem where they have to dynamically find the type they want given a string field name.

josevalim · 2024-02-28T11:03:41Z

It really depends which problem we want to solve. Do we want to make it easier for people receiving parameters to create custom queries or do we want to make it possible for you to dynamically query databases without generating tons of atoms?

greg-rychlewski · 2024-02-28T11:21:39Z

I would say this is the bigger issue right now:

make it possible for you to dynamically query databases without generating tons of atoms

In which case I think you are saying this is not an issue currently for schemas. Because if you are using schemas then you know up front which strings are allowed and can handle it without this change?

If my above statement is not mistaken, then I think even though it's a bigger problem, it might be a bit confusing why we are not solving this problem at the same time if it's not too hard to do so:

make it easier for people receiving parameters to create custom queries

greg-rychlewski · 2024-02-28T11:28:08Z

I think I get it though. Basically it's about staying consistent using atoms when referring to schema fields right? If so it's ok to me. I could change the PR to raise if string is used for schemas.

josevalim · 2024-02-28T11:28:49Z

The problem I think is that we have two acceptable semantics. One is to make it a direct field source access and the other is to make it automatically cast. When we pick one, we exclude the other. For schemaless fields they are the same, however.

josevalim · 2024-02-28T11:31:11Z

In which case I think you are saying this is not an issue currently for schemas. Because if you are using schemas then you know up front which strings are allowed and can handle it without this change?

Yes, you could do String.to_existing_atom, althogh we can argue that we could have a more convenient API. The question is if it should be field or something else.

josevalim · 2024-02-28T11:43:59Z

For example, just for brainstorming purposes, it could be called field! or cast_field or something. Given the field may not exist and the query my fail.

greg-rychlewski · 2024-02-28T13:02:28Z

The question is if it should be field or something else.

One reason I can think to keep it as field/2 is because I'm seeing other areas that users might want opened up. For example, if someone has a need for field/2 with string then probably someone would also have a need for map/2 with a string list.

If we go down the route of making a new function for each one it might be cumbersome for us to maintain and for users to use. So it might be worthwhile to lax the definition to include stings in the existing functions and make them have the same semantics as atoms.

josevalim · 2024-12-23T20:18:07Z

I think we can go ahead with this one? Perhaps we start by simply allowing strings on field/2 which is then sent as is to the adapter?

Schultzer · 2024-12-23T20:40:10Z

I think we can go ahead with this one? Perhaps we start by simply allowing strings on field/2 which is then sent as is to the adapter?

I think this makes sense, the columns received from the database is strings as well.

greg-rychlewski · 2024-12-23T23:14:59Z

I forgot a lot about this but please let me know if my understanding below is correct.

When someone uses a string in field/2 we pass it as is to the adapter. That's not to say we pass the entire AST of field/2, but that the field name itself does not do anything like this:

Does not get changed to an atom (if using a schema)
Does not get changed to the value of :source (if using a schema)
Does not have any type associated with it. That means the auto-casting in situations like this where: field(p, ^field) == ^"some_string" will consider the left side to have type :any. A consequence of this is that query parameters might not be casted to the right type unless special care is taken.

If ^ is correct then this PR is good to review. I will just need to update the docs for field/2 to especially callout (3) above. It will also require this ecto_sql PR to work: https://github.com/elixir-ecto/ecto_sql/pull/596/files

lib/ecto/query/builder.ex

Co-authored-by: felipe stival <[email protected]>

lib/ecto/query/builder.ex

test/ecto/query/builder_test.exs

greg-rychlewski · 2024-12-24T02:18:21Z

It feels kind of weird to lose all the type behaviour and other things when querying a schema. Maybe raising if used with schema is better?

Schultzer · 2024-12-24T02:38:27Z

It feels kind of weird to lose all the type behaviour and other things when querying a schema. Maybe raising if used with schema is better?

Why would we lose type behaviour, would it be to expensive to convert the atoms to strings, or even keep them around, a-kind to how Ecto Enum works?

greg-rychlewski · 2024-12-24T04:02:41Z

My understanding is the debate is between these two choices:

Keep strings as strings and treat them like the field source. They are essentially bypassing everything done to schema fields in the planner like transforming to :source, checking for virtual fields, type behaviour, etc.
Normalize strings to atoms when dealing with schemas and don't let them bypass the schema field behaviour. Keep it as (1) for schemaless queries.

I understand Jose's last response to mean he's in favour of (1). So if you keep the type behaviour but not other things then the semantics are not consistent.

josevalim · 2024-12-24T08:38:02Z

Agreed. I prefer 1. We should not try to guess the user intent and this way they also behave consistently. For example, you can imagine that a string field could be used to access a source (or field_source) of a schema directly.

greg-rychlewski · 2024-12-25T01:14:14Z

@josevalim That makes sense to me. I have updated the docs so this PR and the Ecto SQL one are both ready for review whenever you have the time.

As a side discussion:

I think the use case getting the biggest benefit from this change is when you are dynamically managing a lot of schemaless queries at run time. i.e. there's not a tonne of structure to your queries that you know up front.

There is probably another use case, which is you have a lot of structure up front but the query is being dictated by outside input. It would probably be good to have a way for users to intentionally try to use an outside string as a schema field.

If you also think the above is a valid use case, would you be open to a follow-up PR or would you rather see how people use the change in this PR and what kind of feedback they give before going down that route?

josevalim · 2024-12-25T10:41:56Z

I would say that those later use cases are not blocked by this PR, right? They can use String.to_existing_atom and the existing reflection API to convert fields from strings to atoms. Perhaps we can add functionality to make their lives easier? Make a function like Ecto.cast_fields(schema, list_of_fields) or similar?

josevalim

Beautiful 😍

greg-rychlewski mentioned this pull request Feb 24, 2024

allow string names in field/2 elixir-ecto/ecto_sql#596

Merged

greg-rychlewski force-pushed the string_field_name branch from afa4369 to 4876779 Compare February 27, 2024 18:53

greg-rychlewski added 3 commits December 23, 2024 17:31

allow string field names

5cc65a1

add test

a32fdc6

inspect

3a654e7

greg-rychlewski added 2 commits December 23, 2024 17:31

review comment

9dc36d3

undo

fbac40f

greg-rychlewski force-pushed the string_field_name branch from 4876779 to fbac40f Compare December 23, 2024 22:35

greg-rychlewski added 2 commits December 23, 2024 17:36

undo

2ee8c85

fix type

ee1a571

add space

017a77f

v0idpwn reviewed Dec 24, 2024

View reviewed changes

lib/ecto/query/builder.ex Show resolved Hide resolved

Update lib/ecto/query/builder.ex

278e5f6

Co-authored-by: felipe stival <[email protected]>

Schultzer reviewed Dec 24, 2024

View reviewed changes

lib/ecto/query/builder.ex Outdated Show resolved Hide resolved

test/ecto/query/builder_test.exs Show resolved Hide resolved

fixes

2fd58b1

docs

0116a34

josevalim approved these changes Dec 25, 2024

View reviewed changes

greg-rychlewski merged commit 4990abd into elixir-ecto:master Dec 25, 2024
6 checks passed

greg-rychlewski deleted the string_field_name branch December 25, 2024 14:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow strings in `field/2` #4384

Allow strings in `field/2` #4384

greg-rychlewski commented Feb 24, 2024

josevalim commented Feb 24, 2024

greg-rychlewski commented Feb 24, 2024 •

edited

Loading

josevalim commented Feb 24, 2024

josevalim commented Feb 24, 2024

greg-rychlewski commented Feb 24, 2024

josevalim commented Feb 24, 2024

greg-rychlewski commented Feb 24, 2024

josevalim commented Feb 24, 2024

greg-rychlewski commented Feb 24, 2024

josevalim commented Feb 24, 2024

greg-rychlewski commented Feb 25, 2024

josevalim commented Feb 25, 2024

greg-rychlewski commented Feb 27, 2024 •

edited

Loading

greg-rychlewski commented Feb 27, 2024

josevalim commented Feb 28, 2024

greg-rychlewski commented Feb 28, 2024

josevalim commented Feb 28, 2024 •

edited

Loading

greg-rychlewski commented Feb 28, 2024

greg-rychlewski commented Feb 28, 2024

josevalim commented Feb 28, 2024

josevalim commented Feb 28, 2024

josevalim commented Feb 28, 2024

greg-rychlewski commented Feb 28, 2024

josevalim commented Dec 23, 2024

Schultzer commented Dec 23, 2024

greg-rychlewski commented Dec 23, 2024 •

edited

Loading

greg-rychlewski commented Dec 24, 2024

Schultzer commented Dec 24, 2024

greg-rychlewski commented Dec 24, 2024

josevalim commented Dec 24, 2024

greg-rychlewski commented Dec 25, 2024

josevalim commented Dec 25, 2024

josevalim left a comment

Allow strings in field/2 #4384

Allow strings in field/2 #4384

Conversation

greg-rychlewski commented Feb 24, 2024

josevalim commented Feb 24, 2024

greg-rychlewski commented Feb 24, 2024 • edited Loading

josevalim commented Feb 24, 2024

josevalim commented Feb 24, 2024

greg-rychlewski commented Feb 24, 2024

josevalim commented Feb 24, 2024

greg-rychlewski commented Feb 24, 2024

josevalim commented Feb 24, 2024

greg-rychlewski commented Feb 24, 2024

josevalim commented Feb 24, 2024

greg-rychlewski commented Feb 25, 2024

josevalim commented Feb 25, 2024

greg-rychlewski commented Feb 27, 2024 • edited Loading

greg-rychlewski commented Feb 27, 2024

josevalim commented Feb 28, 2024

greg-rychlewski commented Feb 28, 2024

josevalim commented Feb 28, 2024 • edited Loading

greg-rychlewski commented Feb 28, 2024

greg-rychlewski commented Feb 28, 2024

josevalim commented Feb 28, 2024

josevalim commented Feb 28, 2024

josevalim commented Feb 28, 2024

greg-rychlewski commented Feb 28, 2024

josevalim commented Dec 23, 2024

Schultzer commented Dec 23, 2024

greg-rychlewski commented Dec 23, 2024 • edited Loading

greg-rychlewski commented Dec 24, 2024

Schultzer commented Dec 24, 2024

greg-rychlewski commented Dec 24, 2024

josevalim commented Dec 24, 2024

greg-rychlewski commented Dec 25, 2024

josevalim commented Dec 25, 2024

josevalim left a comment

Choose a reason for hiding this comment

Allow strings in `field/2` #4384

Allow strings in `field/2` #4384

greg-rychlewski commented Feb 24, 2024 •

edited

Loading

greg-rychlewski commented Feb 27, 2024 •

edited

Loading

josevalim commented Feb 28, 2024 •

edited

Loading

greg-rychlewski commented Dec 23, 2024 •

edited

Loading