BOOL should accept undef #12
Comments
I think this sounds like a very reasonable thing. (Edit: Obviously, I needed to think about this more) |
I believe in the principle that booleans should be treated the same as numbers/text/the-rest as far as undef goes, either they all include undef or none of them do, and for that matter there should be a version of every one of them that does NOT include undef, whether or not there is a version that does include it. |
@duncand, there's a good reason to consider booleans as separate from numbers and strings with regards to accepting undef.
use warnings;
my $thing = undef;
my $text = "Hello $thing"; # warns
my $sum = 42 + $thing; # warns
do_something() if $thing; # no warning
Perl itself is happy to treat undef as false in boolean contexts with no warning. So from that perspective, undef is "valid" as a boolean, while it's more questionable as a string or number. |
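A minimal sketch of the asymmetry described above: it captures warnings with a local `$SIG{__WARN__}` handler so they can be counted. The helper name `warnings_from` is made up for this illustration and is not part of any module discussed here.

```perl
use strict;
use warnings;

# Hypothetical helper: run a code block and return the warnings it emitted.
sub warnings_from {
    my ($code) = @_;
    my @caught;
    local $SIG{__WARN__} = sub { push @caught, $_[0] };
    $code->();
    return @caught;
}

my $thing = undef;

# undef interpolated into a string, used as a number, and tested as a boolean.
my @string_warnings  = warnings_from(sub { my $text = "Hello $thing" });
my @numeric_warnings = warnings_from(sub { my $sum  = 42 + $thing });
my @boolean_warnings = warnings_from(sub { my $seen = $thing ? 1 : 0 });

printf "string: %d, numeric: %d, boolean: %d\n",
    scalar @string_warnings, scalar @numeric_warnings, scalar @boolean_warnings;
```

Only the string and numeric uses trigger an "uninitialized" warning; the boolean test stays silent.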
The problem with booleans as numbers/text/undef, is that there's no semantic information with them. Internally, that kinda works: was that |
On 2023-07-21 16:34, Toby Inkster wrote:
> @duncand <https://github.com/duncand>, there's a good reason to
> consider booleans as separate from numbers and strings with regards to
> accepting undef.
>
> use warnings;
> my $thing = undef;
> my $text = "Hello $thing"; # warns
> my $sum = 42 + $thing; # warns
> do_something() if $thing; # no warning
>
> Perl itself is happy to treat undef as false in boolean contexts with
> no warning. So from that perspective, undef is "valid" as a boolean,
> while it's more questionable as a string or number.
There is value, and there is state. Those are IMO best seen as different
dimensions.
<undef> could be both a state (represented by an undef-flag) and a value
(similar to the float value NaN).
Perl has many falsy values: undef, 0, "", "0".
A boolean data type adds another falsy value, that is distinguishable
from an integer zero.
<if> tests the state (or boolean casting) of a value.
Any non-zero integer will test as true, so would "from that perspective"
be "valid" as a boolean?
Does that mean that "BOOL should accept" 42?
IMO the BOOL data check should not accept undef.
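To make the distinction between truthiness and a BOOL value concrete, here is a small sketch that runs the four false values listed above, plus a few true ones, through a boolean test. The labels and the `%truthy` hash are illustrative names only.

```perl
use strict;
use warnings;

# The four classic false values, plus some true values for contrast.
my @candidates = (
    [ 'undef' => undef ],
    [ '0'     => 0     ],
    [ '""'    => ""    ],
    [ '"0"'   => "0"   ],
    [ '42'    => 42    ],
    [ '"0.0"' => "0.0" ],   # true: only the exact string "0" is false
    [ '"00"'  => "00"  ],   # true for the same reason
);

# Map each label to 1 (tests true) or 0 (tests false).
my %truthy = map { $_->[0] => ($_->[1] ? 1 : 0) } @candidates;

for my $pair (@candidates) {
    my ($label) = @$pair;
    printf "%-6s tests %s\n", $label, ($truthy{$label} ? 'true' : 'false');
}
```

Every scalar has a truth value under `if`, which is why "passes a boolean test" cannot by itself define what a BOOL check accepts.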
-- Ruud
Just to be funny:
perl -E'local $| = 42; say $|'
1
|
@tobyink Just because Perl doesn't warn when given an undef to a boolean context doesn't mean that undef should be considered a valid boolean when it isn't a valid number or string. An undef logically means unknown or unspecified information in the general sense, so treating it the same as known-to-be-false can produce wrong answers, just as wrong as treating undef as zero or the empty string. While it is reasonable for some programs to treat undef as being equivalent to those default/empty/etc values for a type, users should have the choice to have a bool that explicitly rejects undef, same as they have numbers or strings that reject undef. |
Perhaps something like a |
It's common practice to do an empty return and use that function's value in boolean context. |
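A short sketch of the empty-return idiom mentioned above. The function `find_user` and its data are hypothetical; the point is only that a bare `return;` produces undef in scalar context and an empty list in list context.

```perl
use strict;
use warnings;

# Hypothetical lookup: succeeds only for one known name.
sub find_user {
    my ($name) = @_;
    return if $name ne 'alice';      # empty return signals "not found"
    return { name => $name };
}

my $missing_scalar = find_user('bob');   # scalar context: undef
my @missing_list   = find_user('bob');   # list context: empty list
my $found          = find_user('alice'); # a hash reference, which is true

print "bob not found\n" unless find_user('bob');
print "scalar result is undef\n" unless defined $missing_scalar;
print "list result has ", scalar(@missing_list), " elements\n";
```

Code written this way hands undef to any caller that stores the result in a scalar, which is the case a strict BOOL check would reject.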
This is a non-issue if the course I recommend is followed that 2 versions of the BOOL are provided, one including undef and one without. Given that Oshun has never been in production, there is no existing code using any existing definition of BOOL, so no existing Perl code would have to be refactored one way or the other. Also when it comes to multiple variants of anything including or not including undef, I believe the version that excludes undef should always have the shorter plainer name, eg plain BOOL/INT/etc should always be the version that excludes undef, as the name fits, what you see is what you get, and longer things like Maybe[BOOL] or BOOL? etc would be the versions that include undef. |
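The two-variant idea can be sketched with plain predicates. This is not Oshun's API: the function names `is_strict_bool` and `is_maybe_bool` are invented here, and the definition of a "strict" boolean (a defined non-reference equal to 1, 0 or the empty string) is an assumption made only for the example.

```perl
use strict;
use warnings;

# Assumed definition for this sketch: a strict boolean is a defined,
# non-reference scalar equal to 1, 0 or the empty string.
sub is_strict_bool {
    my ($value) = @_;
    return 0 unless defined $value;
    return 0 if ref $value;
    return ($value eq '1' || $value eq '0' || $value eq '') ? 1 : 0;
}

# The undef-accepting variant is the strict check plus undef,
# in the spirit of Maybe[BOOL].
sub is_maybe_bool {
    my ($value) = @_;
    return 1 unless defined $value;
    return is_strict_bool($value);
}

for my $value (1, 0, '', undef, 42) {
    my $shown = defined $value ? "'$value'" : 'undef';
    printf "%-6s strict=%d maybe=%d\n",
        $shown, is_strict_bool($value), is_maybe_bool($value);
}
```

With this split, the plain name keeps the narrower meaning and the caller opts in to undef explicitly.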
Currently I'm more concerned about the fact that Perl now has native
Either you're producing or consuming that value. If you're consuming:
If you're producing:
You could also argue that the code should explicitly return
I am very sympathetic towards this idea, but since Perl cannot distinguish between undefined and uninitialized, we're kind of stuck because we don't know which is which for Perl. So let the consumer specify So the concerns seem trivial to resolve with the existing spec. Have I missed something? Principle of Parsimony: from there, if we decide to allow |
FWIW, I am with @happy-barney on the question of whether undef is a valid bool. We only introduced "true" booleans to Perl recently, and for most of the existence of Perl Larry defined "false" to be "undef", "0", 0 and the empty string, and everything else to be true (modulo overload). So I think it would be quite counterproductive to not have a way to respect that design intent.
It seems to me that the proposal should be that we have two keywords for booleans, perhaps distinguished by spelling. For instance I could imagine "Bool" and "BOOL". Or I could imagine "Bool" and "Truthy" or whatever, or maybe "PerlBool" and "JsonBool". (I can't remember the spelling convention we chose.) I don't think it is helpful to say that only one model of boolean is supported. I could pass in an object with boolean overloads, and it should pass a bool test.
I understand the invocation of the principle of parsimony here, but I think it is misguided. Perl supports multiple notions of bool, and our checks should as well. I would be very surprised, and consider it counterproductive, to not be able to use "undef" as a "false" value in a place expecting a bool. I also note that people seem to be saying that bools should be a number or a string, but I consider every unoverloaded ref to be a valid true value, and any overloaded reference could act as a boolean as well, so I think there is a conceptual clash there as well.
Relatedly, I think the arguments that undef has a specific meaning closer to an SQL "NULL" tend to fall down under the semantics that Larry chose to provide. undef is == to undef and 0, and undef is eq to undef and "" (albeit with warnings), the mutator operators consider "undef" to be a valid initial value silently mapping to 0 or the empty string, and it is considered a valid boolean value in all internal contexts. So the weight of history does not support that "undef" means "no value" in the same sense that SQL NULL means that.
We need to accommodate that history, especially as it is particularly useful in many contexts. A great deal of code depends on
working the same regardless if |
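The undef semantics described in this comment can be checked directly. The sketch below is illustrative only; the variable names are arbitrary. The comparisons are wrapped in `no warnings 'uninitialized'` because, as noted above, they do warn under `use warnings`, while the mutating operators do not.

```perl
use strict;
use warnings;

my ($count, $label, %seen, $nothing);

# Mutators accept undef as a starting value without any warning.
$count++;            # undef is treated as 0
$label .= 'abc';     # undef is treated as ""
$seen{perl}++;       # a missing hash slot starts from undef

# Comparisons also map undef to 0 or "", but they warn, so the
# warning category is switched off for this block.
my ($equals_zero, $equals_empty);
{
    no warnings 'uninitialized';
    $equals_zero  = ($nothing == 0)  ? 1 : 0;
    $equals_empty = ($nothing eq '') ? 1 : 0;
}

print "count=$count label=$label seen=$seen{perl}\n";
print "undef == 0: $equals_zero, undef eq '': $equals_empty\n";
```

This is quite different from SQL, where a comparison against NULL is neither true nor false.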
@demerphq oh, thanks mentioning
|
I agree that Otherwise, to answer @happy-barney's question, in Data::Checks::Parser, we have this:
So a
So yeah, currently it requires it being defined. (I'm not saying this is how things have to be, I'm saying "that's what the code currently does") Also, we don't allow implicit coercion anywhere (that I recall) because of this:
Currently, it's trivial to do things like this: Any thoughts about applying checks outside of assignment should be post-MVP. |
On 2023-07-22 10:23, Yves Orton wrote:
> FWIW, I am with @happy-barney <https://github.com/happy-barney> on the
> question of whether undef is a valid bool. We only introduce "true"
> booleans to perl recently, and for most of the existence of Perl Larry
> defined "false" to be "undef", "0", 0 and the empty string, and
> everything else to be true (modulo overload). So I think it would be
> quite counter productive to not have a way to respect that design intent.
For me this (again) conflates states ("test results") and BOOL values.
I see no need for a BOOL data element to support undef.
A laxer BOOL-variant can auto-cast undef to false, either always or only
at initialization.
But I prefer BOOL itself to be strict.
What <if> tests, is truthiness, which for a number means "not zero" etc.
so that does not depend on BOOL. Similar for ternary, etc.
P.S. I also like to see (optional) default values, like in C++, where
"int x;" sets x to 0.
|
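@druud We vetoed default values early on because they tend to be arbitrary. Just because something defaults to false doesn't mean it's false, for example. If we default my UINT $count; to zero, what if there's actually a count, but due to poor code, it's not assigned? What would be defaults on GLOB or HANDLE? |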
A better example is with where:
my UINT where { $_ > 1 && $_ % 2 && $_ % 3 && $_ % 4 } $x ...
What should the default be? |
Oh, a test case for the parser implementation (should the where syntax be like this):
|
On 2023-07-22 12:18, Ovid wrote:
> @druud <https://github.com/druud> We vetoed default values early on
> because they tend to be arbitrary. Just because something defaults to
> false doesn't mean it's false, for example. If we default |my UINT
> $count;| to zero, what if there's actually a count, but due to poor
> code, it's not assigned? What would be defaults on |GLOB| or |HANDLE|?
I think it is fine to not have a way to define a default value right now,
but that at some point it will come up again as really useful.
Whether it will then be opt-in or opt-out, is harder to predict.
For example: Java has no native 64-bit unsigned integer.
(in 64-bit context)
For defaults, I stick to the 'empty' approach:
zero for numeric values, empty string for textual values,
false for boolean values, empty for arrays and for hashes,
etc. Not all data types need to support a default.
And maybe not call it :default, but call it :empty.
-- Ruud
|
On 2023-07-22 12:42, Branislav Zahradník wrote:
> better example is with |where|
> |my UINT where { $_ > 1 && $_ % 2 && $_ % 3 && $_ % 4 } $x ... |
> what should be default ?
If the default of UINT (or rather its "empty value") is defined as 0,
and there is no initialization value, then this declaration would error.
If UINT supports undef, then its "empty value" is (always) undef,
and it can't have a (different) "empty value" configured.
If UINT doesn't support undef, then it can have an "empty value".
Without a configured "empty value", it then must be initialized at
declaration.
So it depends on what is in the '...'.
And on how UINT is configured.
-- Ruud
|
@druud I'd say this discussion is irrelevant for the MVP. In later stages (that's why I created the Roadmap discussion ...) there should be a warning / error when reading an uninitialized variable that has a specified contract. |
On 2023-07-22 16:56, Branislav Zahradník wrote:
> @druud <https://github.com/druud> I'd say this discussion is
> irrelevant for MVP. In later stages (that's why I created Roadmap
> discussion ...) there should be warning / error when reading
> uninitialized variable which had specified contract.
Yes, all good.
-- Ruud
|
Maybe call it :unassigned, or some synonym, which is more accurate about what it actually describes. Some types don't have a concept of an "empty" value either. What would be the "empty" value for a day-of-week type, or an odd-integer type? No matter what we call it, in Perl, as in many languages, an object doesn't always represent a plain old value that can be cleanly serialized and deserialized, like a tree whose endpoints are all regular scalars. Sometimes objects represent other resources, such as an open file handle, which are practically impossible to serialize/deserialize; it only makes sense to create them with explicit constructor arguments, so they can't have a default/empty/whatever value. I feel the simplest solution that takes all of the options into account is: if the type includes undef, then that is its default value when not explicitly initialized; otherwise an explicit initializing assignment is required for any variable having a CHECK. I seem to recall something like this, or explicit-assignment-always, was proposed and favoured before. |
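The rule proposed here (undef is the default only when the type admits undef; otherwise an initializer is mandatory) can be sketched as follows. Everything in this block is hypothetical: the `%CHECK` table, the check name `Maybe_INT`, and the `declare` function are stand-ins for illustration and do not reflect the actual declaration syntax.

```perl
use strict;
use warnings;

# Hypothetical checks, written as plain predicates for illustration only.
my %CHECK = (
    INT       => sub { defined $_[0]  && $_[0] =~ /\A-?[0-9]+\z/ },
    Maybe_INT => sub { !defined $_[0] || $_[0] =~ /\A-?[0-9]+\z/ },
);

# declare($check_name) or declare($check_name, $initial_value)
sub declare {
    my ($name, @init) = @_;
    my $check = $CHECK{$name} or die "unknown check $name\n";

    if (!@init) {
        # No initializer: only legal when the check itself admits undef.
        die "$name requires an explicit initial value\n"
            unless $check->(undef);
        return undef;
    }

    die "value does not satisfy $name\n" unless $check->($init[0]);
    return $init[0];
}

my $optional = declare('Maybe_INT');        # allowed, defaults to undef
my $count    = declare('INT', 5);           # allowed, explicit value
my $ok       = eval { declare('INT'); 1 };  # dies: no initializer
my $error    = $@;

print "uninitialized INT rejected: $error" unless $ok;
```

No arbitrary "empty" value is ever chosen on the programmer's behalf under this rule.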
Another odd consideration: If we allow
my BOOL $success; # succeeds because `undef` satisfies BOOL
my INT $count; # fails because `undef` doesn't satisfy INT
Not saying it's a blocker, but it's a tiny thing we might want to consider. |
I wouldn't call that minor, I would call that a MAJOR inconsistency that shouldn't happen. All of the plain-named types/checks like BOOL/INT/etc should be mutually consistent: either they all include undef, or they all exclude it, not some of one and some of the other. Having them differ in this respect violates the principle of least surprise, that things which look similar should behave similarly. So I strongly disagree with the original proposal that BOOL is treated differently than the others like INT/etc. |
That is my position too. And don't conflate a BOOL data value with 'boolean context'. |
Coming back to 'empty': your examples of weekday and odd-int just show why I propose to call it 'empty' and not 'undefined'. Some types are good to get set when "left empty", and others aren't. I don't think we need this 'empty' feature from the start, though I think it will help make things easier to accept, as it would lead to less boilerplate: no need to initialise (almost) every INT to 0, TEXT to "", etc. Such an 'empty' mechanism is best strictly limited to only use values like 0, "", \0, false and such. Such an 'empty' mechanism is much smaller than an 'undefined' mechanism would be, and leads to much less 'action at a distance'. So IMO it shouldn't support a <BOOL $active> with a default of true, but a <BOOL $obsoleted> with a default of false would be OK. For example, have a UINT32Z that is mostly like UINT32, but also auto-initialises to 0. |
Important is that such a mechanism can not be used to auto-initialise values to any truthy, non-empty value. So the \0 that I mentioned earlier, must be blessed and have proper overload logic. |
@druud milestone XYZ:
|
So what I'm seeing here:
Given the first point, I think trying to figure out what people are going to do with this is maybe a bad thing. Let's look at this from a "lean project management" standpoint: we build what we need and nothing more. When we get feedback on real-world use, we can incorporate that. Does that sound good, or have I missed something? |
Sounds good to me. |
Sounds good to me too. |
On 2023-07-25 07:36, Darren Duncan wrote:
> @druud <https://github.com/druud> On a tangent, I might be naive on a
> point, but what does your |\0| example refer to? A data type
> representing exactly one octet or character code point? I didn't
> recall Perl having such a type, and that in practice either an integer
> or a string of a single element would satisfy all the use cases one
> may want to represent this. And in that case the most reasonable empty
> values are covered by the integer zero and the empty string
> respectively. A string with a single element like zero isn't "empty"
> like the other cases are. If the data type is "non-empty string" that
> might make sense as an "empty" version of that, but it would be rather
> contrived much like having a default value for "positive integer" that
> equals 1, and I can see a stronger case for the latter, but I think we
> decided it wasn't desired.
Sorry for the confusion. I had already tried to explain it a bit later,
by mentioning that such a ref-case would need to be blessed and overloaded.
A blessed \0 value, is a "Perl person's practice" to express some base
(or "empty") value of a "type", like false for BOOL.
IIRC, some JSON modules do it that way, and likely some Math modules as
well.
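For readers unfamiliar with the blessed \0 practice, here is a minimal sketch of it. The class name `My::Boolean` is hypothetical; the construction mirrors the general idea of a blessed scalar reference with overloaded boolean behaviour, not the exact code of any particular JSON or Math module.

```perl
use strict;
use warnings;

package My::Boolean;   # hypothetical class name

# The object is a reference to 0 or 1; overloading makes it behave
# like that inner value in boolean, numeric and string contexts.
use overload
    'bool'   => sub { ${ $_[0] } },
    '0+'     => sub { ${ $_[0] } },
    '""'     => sub { ${ $_[0] } },
    fallback => 1;

# Two singletons: a blessed \0 and a blessed \1.
my $false = bless \(my $zero = 0), __PACKAGE__;
my $true  = bless \(my $one  = 1), __PACKAGE__;

sub false { $false }
sub true  { $true }

package main;

my $flag = My::Boolean::false();

print "flag tests false\n" unless $flag;
print "flag is still a defined reference\n" if defined $flag && ref $flag;
print "true + 1 = ", My::Boolean::true() + 1, "\n";
```

Without the overload, the reference itself would test true, which is why the blessing and the overload logic are both required.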
- - - -
We should indeed counter the urge to have a mechanism to make 1 the
default value for POSINT.
To me, POSINT is a "data checkpoint" that should not be allowed to apply
a default.
Even though its value 1 can be seen as "empty" in multiplications, so as
"false" in some sense ...
;)
-- Ruud
|
@druud With regards to your comment about the value 1 and multiplications, I see this as belonging to a completely different feature set, which is about being able to annotate particular properties of typed functions so that higher-level functions can do more intelligent things with them. For example, annotate that multiplication for particular types is a commutative and associative operation with an identity value of 1, and similarly for addition but with an identity value of zero, so that higher-order functions such as reduce/fold know that if they are given the multiplication operator plus an empty list then the result is defined to be the identity value rather than being undefined, or so that the optimizer knows what flexibility it has in the order elements are combined, etc. Identity is very distinct from empty in concept. |
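Perl's core List::Util module already illustrates the identity-value point on a small scale. The sketch below uses only `reduce`, `sum0` and `product` from List::Util; the variable names are arbitrary.

```perl
use strict;
use warnings;
use List::Util qw(reduce sum0 product);

my @empty = ();

# A bare reduce has no identity to fall back on: empty list gives undef.
my $bare = reduce { $a * $b } @empty;

# Seeding the list with the identity makes the empty case well defined.
my $seeded_product = reduce { $a * $b } 1, @empty;   # identity for *
my $seeded_sum     = reduce { $a + $b } 0, @empty;   # identity for +

# sum0 and product have the same identities built in.
my $sum     = sum0(@empty);
my $product = product(@empty);

print "bare reduce: ", (defined $bare ? $bare : 'undef'), "\n";
print "seeded: product=$seeded_product sum=$seeded_sum\n";
print "built in: product=$product sum=$sum\n";
```

The identity here is a property of the operation, which is why it says nothing about what an "empty" or default value of the type should be.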
Yes, and please don't conflate them on my account; it was merely a funny way to show why non-empty defaults should not be supported. Still OK to notice the relation between empty and identity. :) Here at work I block non-empty defaults for (non-dynamic) columns in (MySQL) table definitions. We call that "no business logic on the db-side". The only exceptions are dynamic timestamps, for which we use names like mysql_row_created_at and mysql_row_updated_at, which are not allowed to be touched from the application-side, so are to be treated as row-attributes rather than columns. For application-side timestamps, use a signed bigint and store an application-side time-value. IOW, keep the time worlds of app and db completely separated. The profit from all of this is that the application doesn't always need to read back a row right after its creation, to find out what the database made of it, which saves us millions of roundtrips per minute. |
Unlike other primitives, BOOL should accept undef as well. There is a ton of existing code returning undef as false. IMHO it's not a good idea to force users to change their existing codebase in order to use newer syntax to add some features.