Disallow orphan instances #1247

garyb · 2015-07-09T23:05:49Z

We keep talking about this but don't seem to have an issue for it. I'm quite an enthusiastic proponent of disallowing them and not providing an escape hatch either.

Worth discussing too perhaps, what counts as an orphan in the presence of multi-parameter typeclasses? I guess for an instance to not be an orphan it must satisfy either:

The instance is declared in the same module as its class.
At least one of the types appearing in the instance head is declared in the current module.

paf31 · 2015-07-09T23:06:29Z

👍 for no escape hatch.

puffnfresh · 2015-07-09T23:23:51Z

Hugely important to me 👍

And I'm pretty certain those are the correct rules for enforcing coherency with mutli-parameters 👍

bbarker · 2019-07-12T04:03:28Z

Sorry for reviving an old thread, but I'd like to understand this issue a bit more. I've been thinking about Scala implicits as a model for clashing instances. In my few years of programming Scala, it has never been much of an issue; Scala has manual import/export ability for implicit values as well (iirc, getting a bit rusty with Scala lately).

Existing (non-orphan) instances could continue to be imported implicitly, but orphan instances would require manual import, or give a compile-time failure otherwise. Which seems like a satisfactory outcome, as it is for conflicting implicit values in the same scope in Scala.

I know @puffnfresh has a lot more experience than me in both Scala and PureScript, though, and I haven't thought a lot about this, so I'm sure I'll learn something! But it would be cool to have that power, without sacrificing safety beyond compilation failure at use-sites, which could be avoided by altering imports.

hdgarrood · 2019-07-18T20:31:23Z

This gets asked occasionally, and I'm not aware of a solid writeup, so it would probably be good to have one. Quite a lot has already been written about this in the Haskell world too, so that's probably a good place to look. See e.g. https://www.reddit.com/r/purescript/comments/4y0qwf/why_are_orphan_instances_strictly_disallowed/ and https://stackoverflow.com/questions/3079537/orphaned-instances-in-haskell. I also recommend looking at the GHC users' guide, which gives a flavour of how involved these issues can become: https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#instance-overlap

I think it's basically to do with coherence: the property that any given type can only have one instance of a given class. This is such a natural property that I suspect we often make use of it without even thinking about it. For instance, Map relies pretty heavily on the assumption that a given type can only have one Ord instance: if you were to somehow construct a Map K V in one module with some Ord K instance in scope, and then try to lookup values in that Map in another module where you had a different Ord K instance in scope, you will end up with lookup returning Nothing when it should be returning Just something, because Map depends on coherence of the Ord instance in use.

Another way of looking at it is this: if we allow orphan instances, we have to work out what to do when we have two different candidates for a particular instance in scope.This is a huge can of worms (this can be seen by googling for the numerous threads, blog posts, and stack overflow questions on the subject). One option is to reject the program entirely, but this isn't great: an upstream library adding an instance can now cause your program to stop compiling. Another is to have the compiler pick the most specific instance, but this is difficult to describe and it still suffers from a similar problem: an upstream library might add a more specific instance than the one you were using with slightly different behaviour, so that your program silently starts using that instance as opposed to the more general one which was being used previously.

shmish111 · 2020-08-14T11:09:39Z

@garyb @hdgarrood instances in purescript have names, why not just use the non-orphan by default and give an option to use a named instance (that's what Idris does for example). I can say that not having an escape hatch for orphan instances is the single biggest waste of my time and damage to the codebase when using purescript. As soon as you want to encode and decode things (which is pretty often given purescript is a browser language) you have to add massive amounts of boiler plate, lots of newtypes, wrapping and unwrapping, etc etc etc. I can see all the arguments against orphan instances but it is really, really, really impractical. On a reasonably large codebase, once every month or so I end up wasting a whole day doing this thankless task and I am forced to make my code less readable and more complicated. An extra newtype and the odd unwrap is not a problem but these add up, our codebase is significantly worse because of it. I know I'm not the only person who has had the same experience.

P.S. I'm not as angry as I sound, purescript is an amazing and highly practical tool 😄

garyb · 2020-08-14T13:16:47Z

I wonder what it is in our respective codebases/styles of working that makes this a problem for you/others, but something I almost never encounter. 🤔

I think Arbitrary / Coarbitrary instances for quickcheck is the only time it's occurred for me, outside of situations where a missing instance was just an oversight and could be PR'd to the library. I guess you're using classes and datatypes from libraries that don't have an existing relationship with each other, but I'm not sure what those would be?

puffnfresh · 2020-08-15T04:06:17Z

Lack of orphan instances is one of the best things about PureScript. It's not a problem at all.

shmish111 · 2020-08-15T10:44:58Z

It is intriguing. One problem that hits me is that foreign-generic has no instances for list, set, map and tuple and that makes sense from a library perspective I guess since different encodings make sense in different systems (map and tuple particularly). Newtyping every map and tuple in my domain so that I don't have to write very difficult encode/decode instances means that I then have lots of wrap/unwrap in places that I don't want it. I've just had to do this and there was really a lot of boiler plate, way too much so that it made parts of the code very difficult to read. If my project could have it's own instances for these basic data types and I could specify the instance names then all this would be gone. Since we already have to name instances, from a user perspective having some sort of using syntax would be very comfortable and explicit about the fact that you are overriding the default behaviour.

@garyb what do you do about sending deeply nested maps and tuples over JSON?

garyb · 2020-08-15T17:56:06Z

I don't use any kind of "automated"/derived mechanisms for serializing, we write our codecs explicitly as values using https://github.com/garyb/purescript-codec-argonaut (some reasoning here).

I would suggest instead of using newtypes in your domain, having a serialization model separate from the domain model might be slightly simpler? That way any newtyping you need to do to massage the serialization won't get in the way of program logic at least. This is like a half-way step to the full codec approach too, it at least provides some protection from accidentally breaking the format and such.

shmish111 · 2020-08-16T12:25:58Z

@garyb very often our communication between the front end and back end is developed in sync (even using purescript bridge sometimes). Writing encoding by hand is 100% waste of time in these cases, a large amount of effort for no real-life benefit. Deriving can be a safer often because you don't forget to change it when you change a data type. If you use a specific deriving mechanism then it can certainly be safer than writing a codec yourself (of course not always) that is assuming the deriving mechanism works as expected (maybe that's more likely to work correctly than humans writing codecs though). I think your article presents things as a bit black and white (although maybe that's no problem because that side of the problem probably isn't presented often enough), however as we know, everything has trade-offs and there are many situations where deriving everything is certainly the better (and safer) option. The newtyping I was doing doesn't need to get in the way if I have a way to escape the no orphan instances rule. Or, as is the case now, I decide to fork the library where the classes are defined. I'm doing this now and on balance it is creating less work, higher quality, less to go wrong and safer code. In this case it also happens that foreign-generic is not a large library and doesn't need to change very often so forking it is not a big deal however clearly that's not always the case.

None of this would be a problem if I could use an Idris like using syntax.

garyb · 2020-08-16T13:53:00Z

I think your article presents things as a bit black and white

Yeah, very true! I wrote it from my experience, and I've just always had trouble with derived codecs in the long run (we even have trouble in the compiler with them inadvertently breaking things from time to time). The case of having a frontend/backend pair where they both speak the same types, and where there's no chance of the domain model being persisted to disk, is perhaps the only situation where there are no drawbacks, but that's not a situation I've encountered personally.

hdgarrood · 2020-08-16T18:26:02Z

The problem with allowing you to choose the instance you’re using in a specific part of your program is that assuming instances are globally unique is really very natural; I suspect we are all doing it more often than we realise. The classic example is if a map is used in two places where different Ord instances for its key type are in scope, it totally breaks. I think maintaining your own serialization type classes is really not a bad option at all, especially if you’re relying on Generic deriving.

natefaubion · 2020-08-16T18:34:01Z

I also agree that maintaining your own serialization class is the way to go. I just don't think classes like From/ToJSON currently make for a good library abstraction. There are just going to be too many special cases, assuming that anything is a desired globally unique implementation is going to lead to someone being upset. At work, we codegen (much more efficient dictionaries and implementations) but if I was using Generic or needed to maintain certain representations against a backend API, I would most certainly just use an application internal typeclass. Maybe something like deriving via will make special cases like this easier to override in the future. I don't see us ever relaxing the orphan rule, personally.

shmish111 · 2020-08-16T18:42:06Z

Ah yes, I could argue (to myself) that maintaining a fork of foreign-generic is actually just maintaining a serialization package for my project as a whole. Thanks for allowing me to think about this.

It's certainly a real shame if the orphan rule is never relaxed, it seems from my perspective to go against the difference between Elm and Purescript. If I want my language to be simple and restrictive (often a good choice) then I would go with Elm however if I want more power I'll go with PS. The orphan instances thing is one place where it seems PS is disallowing something "for my own good" but in fact is making perfectly valid use cases that would make my code better, impossible. I can say with certainty that it's the only thing that keeps coming up after almost 5 years of using purescript.

natefaubion · 2020-08-16T18:46:47Z

I'll note that the orphan rule does not have to necessarily be restricted to modules. That is just our compilation unit. Rust for example also has an orphan rule, but because it uses entire crates as compilation units, it is able to enforce uniqueness across the entire crate, and not just to a single module. Maybe in the future with a package-aware compiler, it would be able to provide the same guarantees over a wider scope, but I don't see the current rule being relaxed to the point that you can provide implementations for both types and classes from other packages.

bbarker · 2020-08-16T18:52:47Z

Just another possible use case where this came up for me: purescript/purescript-enums#39 ; I also felt that taking another approach was just too unwieldy here. This resulted me in having to fork purescript-enum, which also meant none of my resulting libraries could benefit from being on pursuit, which is too bad as that is a really wonderful site.

paf31 · 2020-08-17T16:30:28Z

If newtyping or separating types for different domains are really not options, then there are ways to solve this in libraries:

import Prelude
import Data.Array as A
import Data.List
import Data.Generic.Rep
import Foreign.Class
import Foreign.Generic
import Type.Proxy
import Unsafe.Coerce

data Foo = Foo (List Int)

derive instance genericFoo :: Generic Foo _

instance x :: Encode Foo where
  encode = unsafeWithListInstance (Proxy :: _ Int) genericEncode defaultOptions
  
unsafeWithListInstance :: forall a r. Encode a => Proxy a -> (Encode (List a) => r) -> r
unsafeWithListInstance _ f = (coerce f :: ({ encode :: List a -> Foreign } -> r)) { encode: encode <<< A.fromFoldable } where
  coerce :: forall a. (Encode a => r) -> ({ encode :: a -> Foreign } -> r)
  coerce = unsafeCoerce

But this definitely doesn't have to be a compiler change.

bbarker · 2020-09-20T04:57:09Z

Not to attempt an argument from authority, however there is an interesting (historical!) discussion on the topic from 12 years ago, in which SPJ claims there is no implementation-independent reason to stay away from orphan instances: https://mail.haskell.org/pipermail/libraries/2008-August/010399.html As near as I can tell, a lot of the same pros and cons were discussed then as are being discussed in this thread.

The problem with allowing you to choose the instance you’re using in a specific part of your program is that assuming instances are globally unique is really very natural; I suspect we are all doing it more often than we realise. The classic example is if a map is used in two places where different Ord instances for its key type are in scope, it totally breaks. I think maintaining your own serialization type classes is really not a bad option at all, especially if you’re relying on Generic deriving.

This is a great point and a great example. I imagine there could be solutions to this, and I'm probably not the best person to think them up. But I'll give it a few tries:

Course grained approach: Global choice of instance for the entire compilation. Easy to implement, probably, but not very flexible. Still, useful for many cases I imagine..
instance implementation as a hidden parameter of the types that it is used for in the type checker. These could possibly be erased early on if only one instance is in scope in a project (not sure how the compiler works here), but the idea is that in e.g. the Map/Ord case, different Ord impls for a Map would actually be different Map types. Most of the time this wouldn't matter unless we try to use two Map values of the same time together, in which case the type checker would catch this.
Explicit scoping. This probably isn't mutually exclusive with (2) -- or maybe it is -- but the idea would be that a default instance ceases being in scope if an orphan instance is explicitly brought into scope. I'm actually not sure if this is hard to implement - I could see it going either way from the little I know.

I think between 2 and 3, and maybe just 3, this would solve the problems with orphan instances described here. I'd really like to know if I've missed understanding some subtlety here.

Again this comes back to Scala implicits. There's not a lot about Scala I miss, but I actually rather enjoyed that feature of implicits. Of course, if we could pass around typeclass instances as first-class values, it would be similar, as we could thread them through a Reader monad.

bbarker · 2020-09-20T05:13:15Z

I've been a bit curious about Idris2 lately and was wondering how it handled it, and in this thread it appears that both Idris(2) and Agda essentially support orphan instances. As near as I can tell without knowing either of those languages much beyond the hello-world stage, it seems like they do something similar to Scala implicits, or the explicit importing I mentioned as option (3) above, though what they have sounds closer to implicits as it is more granular than file-level, I think.

bbarker · 2021-01-06T15:15:38Z

Here's another example I thought of that is slightly problematic for the current approach. Let's say you want to make a typeclass that others use that supports multiple existing data types of varying degrees of popularity. One example might be a hypothetical ConcatMap typeclass. Do I then depend on every other datatype that impelements concatMap, or make PRs and hope that they will depend on my ConcatMap library?

Neither of these options seems good.

In the first case, you are adding a lot of maintenance overhead and potential bloat to your library. I see this discussed frequently as a concern, e.g.: purescript/purescript-either#38

In the second case ... it simply won't happen, between maintainers that are either MIA or disagree about including your typeclass for whatever reason.

garyb · 2021-01-06T15:21:16Z

I'm not sure about the other maintainers, but just to be upfront about it: there is no argument that can be made on this issue that will make me come around to agreeing that allowing orphan instances again is a good thing.

It's just how PureScript works now and forever more as far as I'm concerned. If there is ever a PureScript 2, maybe it'd be something up for discussion then, as the only way orphans would be agreeable for me is in a system that is fundamentally different from how typeclasses work just now.

bbarker · 2021-01-06T15:38:42Z

Fair enough, I suspected as much but I was primarily posting here as this seems to be the best place to include information on it at the moment, though a discourse thread would probably be better.

Also side point: i realize you probably wouldn't actually want a ConcatMap class - that was just an example.

garyb · 2021-01-06T15:45:44Z

Sure! That's fine if you want to keep posting ideas and things here in case of some possible future version of PS or something. 🙂

I just thought I should make my position on it clear so you don't feel like you're wasting time if nothing ever comes of it again.

bbarker · 2021-01-06T15:49:54Z

Thanks! - and sorry if it sounds like I kept badgering people on this - that's not the intent.

JordanMartinez · 2021-01-06T15:52:06Z

I'm not sure about the other maintainers, but just to be upfront about it: there is no argument that can be made on this issue that will make me come around to agreeing that allowing orphan instances again is a good thing.

I'm in agreement with Gary here, too.

natefaubion · 2021-01-06T16:58:12Z

I agree with that as well. I really don't think there will be a convincing argument for "allow orphans everywhere".

rhendric · 2021-01-06T17:26:16Z

I think there are probably other solutions to these problems worth considering besides fully reverting the orphan ban. Collecting issues caused in part by PureScript's position on orphans seems to me to offer more potential value than just accumulating ideas for a different language.

For example, Coercible definitely gets rid of a lot of the problems I see with using newtypes to get around the orphan ban.

Similarly, the scenario @bbarker just raised about ConcatMap makes me think that if we ever get around to reforming PureScript's module/package system, there might be some opportunity there to reduce some of the costs of, or safely weaken, the orphan ban. Two ideas that come to mind (I have not considered these deeply, I don't recall if anyone else has, and this is far from a proposal for either):

The ConcatMap module could explicitly declare ‘friend’ modules ConcatMap.List, ConcatMap.Array, etc., which the compiler would not require to exist, but if they exist, instances of ConcatMap would be allowed in them. This would significantly narrow the scope of breaking-downstream-libraries-by-adding-orphan-instances problems relative to the 2015 status quo (module name collisions are already a caveat emptor situation), while still allowing the core ConcatMap package to be small.
The compiler and the packager could become more deeply integrated such that a dependency on a kitchen-sink concatmap package would only pull in the dependencies that actually get used by the entry point. This has broader benefits and drawbacks, of course, but it would lessen how bad the concatmap maintainer might feel about adding support for every last esoteric collection type.

I assume/hope that the maintainers' position on the whole class of ideas similar to the above is weaker than can't-imagine-ever-supporting?

natefaubion · 2021-01-06T18:13:56Z

I assume/hope that the maintainers' position on the whole class of ideas similar to the above is weaker than can't-imagine-ever-supporting?

Orphans are a bit of a red-herring. The issue is preserving global coherence, while allowing the solver to reason locally about available instances, in the presence of non-trivial type-level features and separate/incremental compilation. I think there are certainly other avenues that can be discussed, but it has to be in terms of preserving the advantages and correctness of the existing orphan rule. There won't be convincing arguments if it requires us to compromise on that.

bbarker · 2021-01-06T18:23:23Z

I have seen a few mentions of the Idris way of doing things in this thread, but I don't believe I've seen any feedback on that from those opposed to orphan instances. While it would be a major change in the language, I'm not quite sure it would necessitate a "PureScript 2", for instance - but I certainly could be wrong.

Edit: And as per @natefaubion 's recent comment, I think everyone wants to have coherence - so I shouldn't really say "those opposed to orphan instances" but "those championing the merits of global coherence" :-)

natefaubion · 2021-01-06T18:41:56Z

I have seen a few mentions of the Idris way of doing things in this thread, but I don't believe I've seen any feedback on that from those opposed to orphan instances.

I have not seen anyone attempt to phrase the tradeoffs in the terms I just enumerated. I'm not an expert on the mentioned systems, but I believe they just forgo global coherence. This is less of an issue for dependently-typed languages because types can be indexed by values. See Kmett's "Typeclasses vs the World" talk.

shmish111 · 2021-01-06T18:43:38Z

I'm not sure about the other maintainers, but just to be upfront about it: there is no argument that can be made on this issue that will make me come around to agreeing that allowing orphan instances again is a good thing.

It's quite disheartening to hear anyone say this about anything 😞 While I can understand that in the case of "just allow orphan instances everywhere", this thread is providing serious motivating examples and possible solutions where orphan instances could be allowed with certain constraints. It doesn't seem to make much sense to completely shut down the possibility of a mechanism that would solve people's problems in the future in the way that this comment suggests.

I have no authority to comment on this really since I have not contributed to the purescript language and @garyb has done an amazing, massive job however I just found it very sad to see this and had to comment 😢

I hope that the scenarios that are not possible are considered as good challenges in the future and that "just newtype everything" is recognized as a highly imperfect workaround and not a complete solution to the issues.

natefaubion · 2021-01-06T18:58:04Z

I could imagine a backpack-like layer that lets packages depend on a signature module with a class and potentially default implementations, where an application/project can instantiate it while adding any additional implementations they want. This essentially lets you move the definition of this class into your own project without doing it manually, while reusing existing instances defined with the types. There is no downside to this currently, since we don't ship built artifacts as part of package install, but in the future this would just mean modules that depend on a signature module cannot share build products. They incur a dependency on the instantiated module. This does not run afoul the orphan rule, because you are essentially just (automatically) copying the entire source tree of these downstream modules into your project.

Note, this can very likely be prototyped without compiler support as a meta/build layer.

JordanMartinez · 2021-01-07T04:24:28Z

It's quite disheartening to hear anyone say this about anything 😞 .... It doesn't seem to make much sense to completely shut down the possibility of a mechanism that would solve people's problems in the future in the way that this comment suggests.

Good point. Let me clarify. I so strongly doubt that someone would invest the time and energy to think through, design, and implement an "it actually works!" solution to this problem in such a way that the tradeoffs are actually worth the change that my gut reaction is to just say, "No." By all means, prove me wrong! Let logic and truth win! However, I wouldn't see the point in talking about it until you've implemented code that actually works and has been tested by time.

natefaubion · 2021-03-04T23:44:34Z

Further required reading: https://github.com/AndrasKovacs/pny1-assignment/blob/master/DepTC.md

garyb added type: enhancement typechecker labels Jul 9, 2015

garyb added this to the 0.8.0 milestone Jul 9, 2015

paf31 modified the milestones: 0.9.0, 0.8.0 Aug 8, 2015

garyb mentioned this issue Aug 13, 2015

Disallow orphan instances #1376

Merged

garyb closed this as completed in #1376 Aug 13, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Disallow orphan instances #1247

Disallow orphan instances #1247

garyb commented Jul 9, 2015

paf31 commented Jul 9, 2015

puffnfresh commented Jul 9, 2015

bbarker commented Jul 12, 2019 •

edited

Loading

hdgarrood commented Jul 18, 2019

shmish111 commented Aug 14, 2020

garyb commented Aug 14, 2020

puffnfresh commented Aug 15, 2020

shmish111 commented Aug 15, 2020

garyb commented Aug 15, 2020

shmish111 commented Aug 16, 2020

garyb commented Aug 16, 2020

hdgarrood commented Aug 16, 2020

natefaubion commented Aug 16, 2020

shmish111 commented Aug 16, 2020

natefaubion commented Aug 16, 2020 •

edited

Loading

bbarker commented Aug 16, 2020

paf31 commented Aug 17, 2020

bbarker commented Sep 20, 2020

bbarker commented Sep 20, 2020 •

edited

Loading

bbarker commented Jan 6, 2021

garyb commented Jan 6, 2021 •

edited

Loading

bbarker commented Jan 6, 2021 •

edited

Loading

garyb commented Jan 6, 2021

bbarker commented Jan 6, 2021

JordanMartinez commented Jan 6, 2021

natefaubion commented Jan 6, 2021

rhendric commented Jan 6, 2021 •

edited

Loading

natefaubion commented Jan 6, 2021 •

edited

Loading

bbarker commented Jan 6, 2021 •

edited

Loading

natefaubion commented Jan 6, 2021

shmish111 commented Jan 6, 2021

natefaubion commented Jan 6, 2021 •

edited

Loading

JordanMartinez commented Jan 7, 2021

natefaubion commented Mar 4, 2021

Disallow orphan instances #1247

Disallow orphan instances #1247

Comments

garyb commented Jul 9, 2015

paf31 commented Jul 9, 2015

puffnfresh commented Jul 9, 2015

bbarker commented Jul 12, 2019 • edited Loading

hdgarrood commented Jul 18, 2019

shmish111 commented Aug 14, 2020

garyb commented Aug 14, 2020

puffnfresh commented Aug 15, 2020

shmish111 commented Aug 15, 2020

garyb commented Aug 15, 2020

shmish111 commented Aug 16, 2020

garyb commented Aug 16, 2020

hdgarrood commented Aug 16, 2020

natefaubion commented Aug 16, 2020

shmish111 commented Aug 16, 2020

natefaubion commented Aug 16, 2020 • edited Loading

bbarker commented Aug 16, 2020

paf31 commented Aug 17, 2020

bbarker commented Sep 20, 2020

bbarker commented Sep 20, 2020 • edited Loading

bbarker commented Jan 6, 2021

garyb commented Jan 6, 2021 • edited Loading

bbarker commented Jan 6, 2021 • edited Loading

garyb commented Jan 6, 2021

bbarker commented Jan 6, 2021

JordanMartinez commented Jan 6, 2021

natefaubion commented Jan 6, 2021

rhendric commented Jan 6, 2021 • edited Loading

natefaubion commented Jan 6, 2021 • edited Loading

bbarker commented Jan 6, 2021 • edited Loading

natefaubion commented Jan 6, 2021

shmish111 commented Jan 6, 2021

natefaubion commented Jan 6, 2021 • edited Loading

JordanMartinez commented Jan 7, 2021

natefaubion commented Mar 4, 2021

bbarker commented Jul 12, 2019 •

edited

Loading

natefaubion commented Aug 16, 2020 •

edited

Loading

bbarker commented Sep 20, 2020 •

edited

Loading

garyb commented Jan 6, 2021 •

edited

Loading

bbarker commented Jan 6, 2021 •

edited

Loading

rhendric commented Jan 6, 2021 •

edited

Loading

natefaubion commented Jan 6, 2021 •

edited

Loading

bbarker commented Jan 6, 2021 •

edited

Loading

natefaubion commented Jan 6, 2021 •

edited

Loading