-
-
Notifications
You must be signed in to change notification settings - Fork 714
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add JSON functions to the query interface #1423
Conversation
🤩 Thank you @myyra! I'm looking forward to looking at your pull request. Please hold on a few days 🙏 |
No rush! I added version requirements (iOS 16, macOS 13, tvOS 16, watchOS 9) based on the comments about SQLite versions, but I still need to confirm the versions on simulators at least. The failing tests look a bit funny to me, since only the operator is failing, although the functions should fail on the older versions as well. And the error message makes me think even more that I failed something in the operator implementation. Or maybe this is the case?
If I understood the docs correctly, the functions were available as opt-in, but not the operators which were introduced in |
You are very welcome :-)
We need to deal with strings at some point, anyway. I'm totally fine with it. My reluctance about key paths is based on the fact that record types do frequently, but not always, exactly reflect their database representation. The reality is that record types are a Swift interface to their inner database representation. And it's the same for "json records" as well. That's how people use them. They'll replace a text database value with a Swift string-based enum value. They'll publicly expose a And that's totally natural. It helps users avoiding an extra layer of "behind the front line" models when they want to hide intimate database details. It's good to be able to deal with a single layer of record types (some of them acting as a facade when needed). And this explains why key paths are not our friends: they only have access to the Swift application-facing side of the record types, not to the inner, private, database-facing side made of columns or json keys. Key paths are not columns or JSON keys. Some people think that a properly-designed app should define two types (one "proper" model that feeds from a "data transfer object" that feeds from the database). Well the library does not prevent those people from doing as they like. The rest of us can have their record types hide intimate database details when they feel like it. People should be free to choose their level of over-engineering, and the library tries to be architecture-agnostic.
The new Swift functions return
That's very good like that. Most GRDB tests are integration tests and even end-to-end tests. They definitely are about checking that we output what the user expects.
Last time I tried including an operator, it quickly conflicted with other libraries. It just makes the life of users miserable 😅 Could we explore subscripts, maybe?
Let's take care about
😬 (see below)
Swift dictionaries and arrays do not conform to those protocols. The choice between one or the other depends on whether or not you want the user to use standard Swift values in your apis. Your experience using the JSON functions should guide your choice, here. For example, the GRDB function Also, On the other hand,
OK. This is not easy because this is not well documented. Sometimes I have to use a conservative system version (maybe too restrictive), especially on macOS. Usually Apple synchronizes SQLite version across iOS/macOS/etc releases. Above all, do not harm: make sure the api is available on the declared systems. Users won't test this for us, and we don't want to have their apps fail at runtime on their users's old devices. Oh, and we'll have to talk about custom SQLite builds and SQLCipher :-)
What error do you get? |
This makes total sense, and this is how I use records as well. And somehow, "We need to deal with strings at some point, anyway" really resonated with me. A bit like I initially thought, anything more wouldn't really be GRDB's job, as it's a database library. If I were to build a separate library/DSL for constructing SQLite JSON paths, it would totally make sense and be usable with GRDB. I think this really confirms who should be responsible for it.
Yeah, I was thinking about the decoding. Since part of the JSON functionality in SQLite is checking that it's valid, I thought it would be nice to indicate that with something like a return type. But
puts it well I think. It wouldn't be nice to hide what the database actually returns; JSON functions are nothing special in that way. As for the actual decoding, I clearly didn't think it through, as of course what I was going after is already supported by the API 🤯 (to be honest I still have to look at how and why exactly does this work). As a test, I put some simple JSON
That is more or less what I assumed :D
I like this a lot. It is definitely worth exploring, and maybe could also be a nice bridge towards a prettier path API like I was talking about (in my head, I see something like
I might have worded that poorly, but to clarify, I was talking about including both or none of the SQL operators as Swift operators. Both should be accessible from Swift, as you said. Subscripts sound like an excellent way to have both. On the subject, I couldn't immediately think of a good way to differentiate between
Thanks, I think I got it now. While I don't see any immediate risk for conflicts with Swift itself,
Makes sense. I've run the JSON functions on iOS 16 and macOS 13 at least, but I'll test the rest of the platforms as well. And as I mentioned, it looks like they might be compiled into the older versions. But that will be harder to test as I currently don't have access to physical devices with lower OS versions.
I'm not sure about everything that this includes :D Conditional compilation for older versions that don't include the functions by default?
The original got fixed after limiting the versions, but I think this is the same one: Link to GitHub Actions. So maybe it was a version/availability issue after all? Here's the actual error from the original:
|
I removed the custom Swift operator and added subscripts for both SQLite operators. Currently |
Thank you @myyra. Please give me some more time. |
@myyra, this pull request is straight to-the-point, and this is very good. At the same time, I do have some critics, api-wise:
Your opinions on above critics are very welcome. Finally, I have questions about some of my intuitions, and whether or not they match your experience. I spent quite some time in order to understand the difference between My intuition is that the use case of extracting values is more frequent than extracting json. The difference becomes clear as soon as one considers strings, that
It's not common that one wants to extract // SELECT info -> 'name' FROM player
let info = JSONColumn("info")
let names = Player // [String]
.select(info["name"], as: String.self)
.fetchAll(db) Does this match your experience? Do you happen to use Those feedbacks come from an experimental branch experimental/json where I experience, just like you, the kind of problems we are trying to solve and the difficulties to overcome. There's something I like about the // CREATE UNIQUE INDEX unique_player_name ON player(info ->> 'name');
try db.create(
index: "unique_player_name",
on: "player",
expressions: [JSONColumn("info")["name"]],
options: .unique) (Thinking out loud - I'm deeply missing a good knowledge of actual use cases) |
Yeah, I'm not entirely happy with it either. Personally, I don't remember ever using the function in a real app, so it's hard for me to design the API for it. But I assumed it would be quite rare for people to want to input JSON text as-is to the function (because of I also agree about the name. You are correct in that I was just following the established naming scheme, but plain It might take some time for me to come up with a good alternative for the name, but since the docs speak about validation and minification, I'm thinking one of those terms could be included in the signature. Something like
Agree, they felt really off after using them a bit. I was actually going to update them to
This is quite in line with my experience. I mostly use
This is excellent. I have to experiment a bit in a real app, but I've always thought of columns containing JSON as a sort of separate thing (since they basically require their own mini sub-query), and a separate type for them really matches that mental model. |
Thanks for your previous insightful answer 🙏 While writing a first reply, it came to me that we need to make a break and identify JSON use cases that people (you, me, other users) might expect from GRDB. This will help clarifying our needs and goals! And first, this requires a little analysis of the What json() is useful for?The
So here is a proposed list of JSON use cases. Each use case groups related features that go well together, in a self-contained and complete unit. I care about completeness because there's few things more frustrating than developing against an api that lets you down after it has lured you into an illusory successful path.
@myyra, does this look sensible to you? Also, I see that your pull request comes with In its current state, the PR addresses completely, or partially, some of these use cases. And there are some use cases that I wish we had 😅
¹ Mentionned here but out of reach until GRDB is able to deal with table-valued functions. This will require a lot of work. I care about VALIDATE_JSON because I see some value in adding check constraints on JSON columns (can be useful when an app stores raw JSON loaded from an untrustable remote server): // CREATE TABLE test (info TEXT CHECK (JSON_VALID(info)))
try db.create(table: "player") { t in
t.column("info", .jsonText) // New type that self-documents the content type
.check { $0.isValidJSON } // There's some value here
} Of course, VALIDATE_JSON can come in another PR. |
So it looks like you don't quite have any need for the BUILD_JSON use case. This is good to know, because we probably can refocus the work and don't design any api around this use case (and probably remove support for
You're probably correct. I was very focused on literals and raw strings due to my own tests, but this might not be a frequent use case, especially given BUILD_JSON is out of scope.
Right. In my alternate branch, I explore
We can do better for
🎉
Then, I'm considering favoring On the other hand, To sum up:
So yeah 🙄 As you can see I'm still in the learning phase... I wonder if GRDB should express those SQLite subtleties, or ignore them altogether. My experimental branch was leaning towards the first choice, but I'm now reconsidering my position. Those SQLite subtleties are hard. Experience tells me that in such case, it's often good to let SQLite experts find in GRDB a faithful companion, so that their SQLite experience can be translated into Swift code without hassle. It's probably important to expose a set of straightforward JSON GRDB apis that match SQLite one-to-one. At the same time, I'd very much like to see a
There are currently two ways to define columns in GRDB: // An enum with cases
extension Player {
enum Columns: String, ColumnExpression {
case id, name, info
}
}
// An empty enum with static members (frequently used for Codable records)
extension Player {
enum Columns {
static let id = Column(CodingKeys.id)
static let name = Column(CodingKeys.name)
static let info = Column(CodingKeys.info)
}
} JSON columns could be defined this way: // An enum with cases
extension Player {
enum Columns: String, ColumnExpression {
case id, name
static let info = JSONColumn("info")
}
}
// An empty enum with static members (frequently used for Codable records)
extension Player {
enum Columns {
static let id = Column(CodingKeys.id)
static let name = Column(CodingKeys.name)
static let info = JSONColumn(CodingKeys.info)
}
} But as you say, we mainly need some way to derive a json column from // NOT GOOD
extension Player {
// 😖 `info` not present in Columns.allCases
enum Columns: String, ColumnExpression, CaseIterable {
case id, name
static let info = JSONColumn("info")
}
}
// BETTER
extension Player {
// 🙂 `info` present in Columns.allCases
enum Columns: String, ColumnExpression, CaseIterable {
case id, name, info
// 🙂 json column for handy `infoJSON["name"]` access to subcomponents
static let infoAsJSON = Self.info.asJSON // Something like that
}
}
// 🙂 Works well with static members as well
extension Player {
enum Columns {
static let id = Column(CodingKeys.id)
static let name = Column(CodingKeys.name)
static let info = Column(CodingKeys.info).asJSON
}
} |
I had written this as a reply to your previous comment before noticing your newest one, so I'll post this here and reply to it separately
Excellent idea. Had I known the extent to which we'd discuss this, I would've started a discussion around the subject first. Also, I didn't initially know what kind of interface for JSON would fit GRDB, so I thought I'd make a few small PRs with the JSON functions since those should sit nicely next to the other function wrappers. And my mindset has still been on just adding the simple wrapper functions. But looking at all of this now, and especially considering the scope of your work on experimental/json (which is pretty much exactly what I would've expected GRDB's JSON interface to look like, but I'll write more of my thoughts on it after a bit more use), I agree it's a good time to step back.
Thanks for the overview (and reminder)! I reread the docs, and remembered that I even had this documentation comment for it locally 😅: /// - Attention: This function is not appropriate for checking the validity of JSON.
/// Use ``isJSONValid(_:)`` instead. I have yet to use such nested JSON handling in SQLite, so I haven't really grasped how it should fit in the API. I agree that minification will probably be a niche use case (I've been going with the assumption that most Swift developers will handle JSON outside of SQLite through Considering that and the recommendation against using it for validation, I almost feel something like "standardizing" would be a better term to describe it (though I'm not sure if that's confusing when comparing GRDB APIs to SQLite). I think the majority of use cases will be to ensure whatever JSON, be it from the DB or as input, would be in a "standard" form understood by SQLite. How do you feel about this?
It does. One idea: COUNT_ARRAY_ELEMENTS and QUERY_JSON_TYPE seem a bit specific; maybe those could go under INSPECT_JSON? But otherwise, it's a good grouping, including BUILD_JSON. I actually realized that many of the remaining JSON functions probably should have been included in the PR for it to make sense, including the rest in BUILD_JSON, and already had them implemented locally but lacking tests. I haven't needed
I was planning to split my PRs into a few separate ones out of habit, but with this list, the boundaries make less sense, in addition to forgetting a few, as mentioned above. So maybe I actually had too many functions implemented? Also, considering the direction of your experimental/json branch, do you think it's still worth continuing this PR? The API in there is pretty much what I'd expect from GRDB for full and native JSON support, while I was merely considering adding a few helper functions, as I mentioned 😄.
I'm looking forward to when I have to use these. It seems like a perfect learning opportunity for API design and a deep understanding of GRDB's internals.
It certainly should be included. I noticed I had missed it when updating the documentation comments earlier, and realizing that
P.S. Thanks for giving me a good laugh with the remarks about SwiftUI and SwiftData when I was glancing your comment yesterday evening 😄. |
Not yet, at least. Removing support for
I might not be the heaviest user of the JSON features, but it certainly felt weird writing all the tests and experimenting with strings, since in all the actual apps I've built, almost all of the interactions are with values in the database.
I liked your approach there, as it type-safely ensures it's used in the correct places, which is really nice. But at the same time, I had trouble understanding when exactly it was getting called. But I think it's too early from both sides to consider such things more deeply yet.
Indeed!
That is certainly a challenging thing to decide. One thing I remembered from the docs, though:
To me, this hints that the So, should we just standardize the APIs to use And with both cases, could we have a type (
I thought of just adding a variable slightly after posting the comment, good thing it was a proper solution :D
I find my own argument less compelling after the above, but I also don't tend to use |
@myyra, I did not forget about this pull request :-) I'm actually close to something satisfying! |
Adds query interface wrappers for a subset of the SQLite JSON functions.
Sometimes it's near impossible to design a nice database schema due to external factors, and I've found myself needing the JSON functions quite a bit. They are a bit awkward to use without wrappers in the query interface (but otherwise work perfectly, which speaks a lot about the API design), so I wanted to start a discussion of how best to include them by making a PR that implements the read-only subset of JSON functions, and the
->>
operator. I'm looking to implement the editing functions as well, but they'll be in another PR for multiple reasons.Some notes/considerations:
Since SQLite has the concept of JSON paths, I thought about making a key path based API for the functions. But I then remembered that they can't be converted to
String
s anyway, so I dropped the idea. And building any other DSL for it, while nice, would probably be quite overkill as it would basically mean building thejq
query language in Swift… And if I remember correctly, you had some good reasons for not using key paths in the query APIs, so I'm assuming that this isn't something you'd want to pursue in the future either. Also, anything lighter would probably just end up wrapping.joined(separator: ".")
in an overcomplicated way, so now all the paths are simplyString
s.This might be out of scope since the PR is only a change to the query interface, but returning JSON
String
s seems a bit funny to me. I don't know of a nice native way to handle a lot of arbitrary JSON in Swift, so handling the returned data is currently left to users and the likes ofAnyCodable
. But I'd like to figure out something for this.The tests are currently a reflection of SQLite's JSON examples since that was the easiest way to confirm that they were working, but since that's testing SQLite in addition to the query interface, should they be changed to just check the generated output? They are also in a separate file, which I don't think is the correct place, so I'd appreciate help organizing them.
Regarding the
->>
operator, I'm not sure whether it should actually be included, but I wanted to try to implement it (with questionable success) and start the discussion of whether custom operators are something that should be in GRDB. I think my biggest issue is that we can't implement (at least to my knowledge) the->
operator, so it feels weird to only have the other one. Especially since it's not technically needed, as the functionality heavily overlaps the JSON functions.JSON support is quite new, so all the functions will likely need OS version constraints. Can this be validated on simulators?
Lastly, despite reading the docs on
SQLSpecificExpressible
andSQLExpressible
, and understanding the point of them from the compiler's point of view, I don't think I really grasped how they should be used in the APIs. So I'm assuming all of that needs to be refactored.Pull Request Checklist
development
branch.make smokeTest
terminal command runs without failure. (couldn't run it fully yet due to a few stupid issues on my side, but the unit tests pass. Will double-check the whole thing before undrafting the PR)