Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nano does not roundtrip as part of entity #1304

Open
NorfairKing opened this issue Aug 7, 2021 · 6 comments
Open

Nano does not roundtrip as part of entity #1304

NorfairKing opened this issue Aug 7, 2021 · 6 comments

Comments

@NorfairKing
Copy link
Contributor

NorfairKing commented Aug 7, 2021

Bug Reports

I'm using persistent as part of LTS 16.12 with sqlite.

I have this database definition:

Place sql=place
    query Text -- Address
    lat Nano
    lon Nano

    UniquePlaceQuery query

    deriving Show
    deriving Eq
    deriving Generic

I noticed something really weird when running this in a property test:

                liftIO $ pPrint place
                mPlace <- DB.get placeId
                liftIO $ pPrint mPlace

output

Place
  { placeQuery =
      "\119188\844395\748003\773996\537954\635786\579726\1085648\596327\160180\526694\824331\996801\669722\926453\142441\1050632\260094\410357\477790\829171\521373"
  , placeLat = 6778287944.859847017
  , placeLon = -3037898653.585851994
  }
Just
  Place
    { placeQuery =
        "\119188\844395\748003\773996\537954\635786\579726\1085648\596327\160180\526694\824331\996801\669722\926453\142441\1050632\260094\410357\477790\829171\521373"
    , placeLat = 6778287944.859847068
    , placeLon = -3037898653.585852147
    }

As you can see, the latitude and logitude are different.

I did some digging already. (If you do so, watch out for this issue.)

> 6778287944.859847017 :: Nano
6778287944.859847017 
> fromPersistValue (toPersistValue (6778287944.859847017 :: Nano)) :: Either Text Nano
Right 6778287944.859847017

So far so good, so it looks like the problem is between sqlite and persistent.

Repro script:

#!/usr/bin/env stack
-- stack --resolver lts-16.12 script

{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE UndecidableInstances #-}

import Control.Monad
import Control.Monad.IO.Class (liftIO)
import Data.Fixed
import Data.Text (Text)
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
import System.Exit
import Text.Show.Pretty (ppShow)

share
  [mkPersist sqlSettings, mkMigrate "migrateAll"]
  [persistLowerCase|

Place sql=place
    lat Nano
    lon Nano

    deriving Show
    deriving Eq
|]

main :: IO ()
main = runSqlite ":memory:" $ do
  runMigration migrateAll

  let expected =
        Place
          { placeLat = 6778287944.859847017,
            placeLon = -3037898653.585851994
          }
  placeId <- insert expected
  mActual <- get placeId
  liftIO $
    forM_ mActual $ \actual ->
      if expected == actual
        then putStrLn "Done!"
        else
          die $
            unlines
              [ "Roundtrip failed.",
                unwords ["expected:", ppShow expected],
                unwords ["actual:  ", ppShow actual]
              ]

Output:

Migrating: CREATE TABLE "place"("id" INTEGER PRIMARY KEY,"lat" NUMERIC(19,9) NOT NULL,"lon" NUMERIC(19,9) NOT NULL)
Roundtrip failed.
expected: Place
  { placeLat = 6778287944.859847017
  , placeLon = -3037898653.585851994
  }
actual:   Place
  { placeLat = 6778287944.859847068
  , placeLon = -3037898653.585852147
  }
@parsonsmatt
Copy link
Collaborator

The relevant instance here is:

-- Database/Persist/Sql/Class.hs line 1258
instance (HasResolution a) => PersistFieldSql (Fixed a) where
    sqlType a =
        SqlNumeric long prec
      where
        prec = round $ (log $ fromIntegral $ resolution n) / (log 10 :: Double) --  FIXME: May lead to problems with big numbers
        long = prec + 10                                                        --  FIXME: Is this enough ?
        n = 0
        _mn = return n `asTypeOf` a

So, this explains NUMERIC(19, 9), which should work to accommodate those values.

However, SQLite doesn't support numbers with arbitrary position or size. See #1096 for a related issue, and the SQLite's datatype support documentation.

REAL. The value is a floating point value, stored as an 8-byte IEEE floating point number.

An 8 byte float

A column with NUMERIC affinity may contain values using all five storage classes. When text data is inserted into a NUMERIC column, the storage class of the text is converted to INTEGER or REAL (in order of preference) if the text is a well-formed integer or real literal, respectively. If the TEXT value is a well-formed integer literal that is too large to fit in a 64-bit signed integer, it is converted to REAL. For conversions between TEXT and REAL storage classes, only the first 15 significant decimal digits of the number are preserved. If the TEXT value is not a well-formed integer or real literal, then the value is stored as TEXT. For the purposes of this paragraph, hexadecimal integer literals are not considered well-formed and are stored as TEXT. (This is done for historical compatibility with versions of SQLite prior to version 3.8.6 2014-08-15 where hexadecimal integer literals were first introduced into SQLite.) If a floating point value that can be represented exactly as an integer is inserted into a column with NUMERIC affinity, the value is converted into an integer. No attempt is made to convert NULL or BLOB values.

(emphasis mine)

We can verify this behavior in GHCi:

λ> let x = 6778287944.859847017
λ> :t x
x :: Fractional p => p
λ> import Data.Fixed
λ> x :: Nano
6778287944.859847017
λ> x :: Double
6.778287944859847e9
λ> x :: Float
6.778288e9
λ> let xd = x :: Double
λ> realToFrac xd :: Nano
6778287944.859847068

I'm all ears for ways we can support this better in persistent.

@NorfairKing
Copy link
Contributor Author

NorfairKing commented Aug 10, 2021

I think a database library should be able to roundtrip the data it stores. (I guess this is an opinion .. ?)
If sqlite cannot do this for Numeric(19, 9) then persistent shouldn't have an instance that uses this type.

I'm all ears for ways we can support this better in persistent.

I think the best way to deal with this is either

  1. poison this instance for Fixed or
  2. Store the Fixed data as integer numbers instead of floats.

I'd be happy with either, but we'd need some migration strategy for the second. (Maybe we could have the fromPersistValue also read the old value but have toPersistValue produce the new value.)

What are your thoughts on this, @parsonsmatt ?
In particular, maybe other databases like postgres don't have this problem and would do a fine job for Numeric(19, 9)?
That would complicate things even more.

EDIT: Just so you know why I have this opinion:
The whole reason why I use Nano instead of Double, is so that I don't have NaNs, infinities and -0.0 to deal with (and perhaps to not lose precision in calculations by accident).
If I can't roundtrip Nanos, and instead have to settle for "something close enough", then I have to re-evaluate if I shouldn't just use Double instead.

@parsonsmatt
Copy link
Collaborator

In particular, maybe other databases like postgres don't have this problem and would do a fine job for Numeric(19, 9)?
That would complicate things even more.

Bingo. This is a problem with SQLite's datatype rules.

In any database library that tries to support multiple datastores, you run into tensions like these - given that all databases differ on their features, what functionality should a library offer? One extreme is to only support the minimal functionality that is equally shared among all datastores. Another is to support everything nominally and use slow defaults or error on bad implementations. persistent leans somewhere in the middle - we don't fully support SQL to make NoSQL stores easier to support. But we do support a lot of stuff that's not easily replicable in other data stores (eg upsert is natively supported in Postgres but not in MySQL, and ON DUPLICATE KEY UPDATE was added only recently to Postgres but present in MySQL for a while).

One option is to omit the instance for Fixed from persistent itself, and expose an orphan from persistent-{mysql,postgresql,redis,mongoDB}. Or, write newtypes, like PgNumeric, similar to how we have PgInterval. This would be a pretty big breaking change, though, and for consistency's sake, we'd need to do that for every type that isn't fully supported by SQLite - that's not something that seems great to me.

Switching to the Integer datatype in SQLite doesn't help, it just moves the goalposts (the Word64 PR above uses Integer and runs into the same round trip overflow problem).

@NorfairKing
Copy link
Contributor Author

What a mess. Then the best we can do is write docs with "This doesn't roundtrip" in the right places, I guess?

@NorfairKing
Copy link
Contributor Author

more ideas:
Can we store fixed-point numbers as Int64 instead?
Can we just poison the instance because they contain an Integer, which can never be roundtripped anyway?

@parsonsmatt
Copy link
Collaborator

Can we store fixed-point numbers as Int64 instead?

I don't think this is a good approach - while it would allow for larger Pico etc values to be stored, it won't actually fix the root problem. Additionally, this change would have to affect the postgresql and mysql backends as well, since PersistFieldSql is shared among them. Code that currently produces a reasonable NUMERIC type would then require a migration, which would potentially cause other problems with use.

Can we just poison the instance because they contain an Integer, which can never be roundtripped anyway?

I can be convinced this is a good idea. After all, we don't support Integer, and we dropped support for Natural in #1032. There's nothing in the Data.Fixed module documentation that indicates that they are not supposed to be unbounded. And unbounded numbers aren't really a good fit for a database storage format.

Postgresql's NUMERIC type supports "up to 131072 digits before the decimal point; up to 16383 digits after the decimal point." I may not be an ultrafinitist, but that's close enough for me.

MySQL's NUMERIC type isn't as burly, with only 65 digits total. This should be running into similar issues.

OK, I am convinced - let's drop Fixed p instances in persistent itself.

  1. Remove Fixed instances of PersistField and PersistFieldSql from persistent library.
  2. Replace with an OverflowFixed instance that documents the shortcomings and links to this issue.
  3. Supporting Postgres options:
    a. persistent-postgresql gets the Fixed instances as an orphan.
    b. persistent-postgresql gets PgNumeric type, corresponding with the PgInterval type.

This'll be a breaking change, so would need to go under 2.14

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants