The future of Hive #246

simc · 2020-02-28T21:49:50Z

TLDR: Hive 2.0 will be rewritten in Rust to get amazing performance, multithreaded queries, read and write transactions, and super low memory usage. The code will work 1:1 in the browser.

Situation

I have thought for a long time about how to implement queries correctly in Hive, and I have come to the conclusion that it is not possible with the current architecture.
I have reviewed many projects on GitHub which use Hive and most of them have to create their own suboptimal workaround for queries.
Apart from queries, Hive has another problem: Dart objects use a lot of RAM. Since Hive currently keeps at least the keys in RAM, you can hit the OS limit of mobile devices quite quickly.

I also created polls some time ago on the Hive documentation page and there were two very strong takeaways:

Queries are something almost every user wants
An overwhelming majority (86%) of users don't mind breaking changes

Idea

So here is what I have come up with:
I will completely rewrite Hive in Rust. I will use the strengths of the old implementation (like auto migration) and fix the issues.
On the VM, Hive will use LMDB as its backend, and in the browser it will use IndexedDB. The VM implementation will provide the same features as IndexedDB to allow easy code sharing.
The two main goals of Hive will stay the same: Simplicity and Performance.

I have a small prototype and the performance is amazing. LMDB has to be some kind of black magic.

Sample

Here is how it is going to work:

The model definition is very similar to current models:

@HiveType(typeId: 0)
class Person {
  @Primary
  int id;

  @HiveField(fieldId: 0)
  @Index(unique: false)
  String name;

  @HiveField(fieldId: 1)
  int age;
}

Hive will then generate extension methods so you can write the following query:

var box = await Hive.openBox<Person>('persons');
box
  .query()
  .where()
  .nameEquals('David')
  .or()
  .nameStartsWith('Lu')
  .filter()
  .ageBetween(18, 50)
  .sortedByAge();

`where()` vs `filter()`

The difference between where() and filter() is that where() uses an index, whereas filter() has to test each of the resulting elements. Normally a database decides for itself when to use an index, but Hive will let you make that choice.

There are multiple reasons for that:

This code will work 1:1 with IndexedDB
You know your data best and can choose the perfect index
The database code will be significantly easier
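
To make the distinction concrete, here is a minimal in-memory sketch. It is not Hive's implementation: `PersonStore`, `whereNameEquals` and `filterAgeBetween` are hypothetical names, and the inclusive bounds of the age check are an assumption. The name lookup goes through a map that plays the role of the index, while the age check has to test every candidate it is handed.

```dart
// Hypothetical in-memory model; not Hive's real storage or API.
class Person {
  final int id;
  final String name;
  final int age;
  const Person(this.id, this.name, this.age);
}

class PersonStore {
  final Map<int, Person> _byId = {};

  // Secondary, non-unique index: name -> ids of the persons with that name.
  final Map<String, List<int>> _nameIndex = {};

  void put(Person p) {
    // Drop the stale index entry when an existing id is overwritten.
    final old = _byId[p.id];
    if (old != null) {
      _nameIndex[old.name]?.remove(old.id);
    }
    _byId[p.id] = p;
    _nameIndex.putIfAbsent(p.name, () => []).add(p.id);
  }

  // "where": answered from the index, so only matching entries are touched.
  Iterable<Person> whereNameEquals(String name) =>
      (_nameIndex[name] ?? const <int>[]).map((id) => _byId[id]!);
}

// "filter": there is no index on age, so every candidate is tested.
// Assumption: both bounds are inclusive.
List<Person> filterAgeBetween(
        Iterable<Person> candidates, int lower, int upper) =>
    candidates.where((p) => p.age >= lower && p.age <= upper).toList();
```

Calling `filterAgeBetween(store.whereNameEquals('David'), 18, 50)` first narrows the candidates through the index and only then scans what is left, which is the order the query sample uses.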

Things to figure out

How can auto-updating queries be implemented efficiently?
What are the restrictions for shipping binaries on iOS?
Should there still be a key-value store?

Blocking Issues (pls upvote)

Other issues

Reference types for phase 4? WebAssembly/reference-types#61

For existing apps using Hive 1.x:

I will continue to keep a Hive 1.x branch for bugfixes etc.

What do you think?

kaboc · 2020-03-02T09:12:54Z

Rewriting in Rust sounds so interesting. Better performance and less memory usage are attractive and welcome.

However, I'm seriously worried about compatibility of Box. Can boxes for v1.x still be used in v2 as well? It's dubious as to what other users actually meant in the polls. For me, only breaking changes of APIs are acceptable, not of Box.

I have to decide whether to leave Hive and choose another package for the app I'm currently developing if there is a risk that I need to go through all the trouble to port old boxes to new ones on my own sometime in the future.

Having said that, your plan is exciting all the same. I look forward to seeing the first release of the new major version.

simc · 2020-03-02T17:26:35Z

Yes, I agree. It is bad that old boxes will not be compatible and that existing apps in production cannot upgrade to the new version without losing their data.

Hive is very young and I still think it is the right path. For future breaking changes there will be auto migration. Unfortunately that is not possible for this change because we switch the backend.

shinayser · 2020-03-02T19:26:02Z

A noob question: how will you make Rust work with Dart?

simc · 2020-03-02T19:53:36Z

Using Dart FFI

shinayser · 2020-03-02T19:59:54Z

Using Dart FFI

But DART FFI is only for C language, not Rust, right?

Will it require the user to use FFI, or are you planning to provide a working interface already in Dart?

simc · 2020-03-02T20:24:00Z

But DART FFI is only for C language, not Rust, right?

Rust does provide C interop...

Will it require the user to use FFI, or are you planning to provide a working interface already in Dart?

The user will only use Dart and does not even notice the Rust backend.
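
As a rough illustration of why C interop is enough: a Rust function exported with `#[no_mangle] pub extern "C"` uses the C ABI, so `dart:ffi` can call it exactly like a C function. The sketch below is not Hive code. It binds the C standard library's `abs` instead of a Rust symbol so that it runs without building anything, and it assumes Linux or macOS, where that symbol is visible through `DynamicLibrary.process()`. `absoluteValue` is a hypothetical wrapper name.

```dart
import 'dart:ffi';

// C signature: int abs(int);
typedef AbsNative = Int32 Function(Int32);
typedef AbsDart = int Function(int);

// Looks up `abs` among the symbols already loaded into the current process.
// Assumption: Linux or macOS, where the C standard library is visible here.
final AbsDart cAbs =
    DynamicLibrary.process().lookupFunction<AbsNative, AbsDart>('abs');

// What a package user would see: a plain Dart function with no FFI types.
int absoluteValue(int x) => cAbs(x);
```

A package can keep bindings like `cAbs` private and export only wrappers like `absoluteValue`, which is how the native backend stays invisible to the user.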

Mravuri96 · 2020-03-03T09:35:45Z

You should give https://vlang.io/ a shot 😜

simc · 2020-03-03T10:40:18Z

You should give https://vlang.io/ a shot

V is interesting but in my opinion, there are multiple reasons why it is not a good idea to use V at the moment:

V is still very unstable
V has a very small community and not many packages
V is a very "basic" language. While the creator argues that this is a design choice, I expect languages to provide at least a Set data structure.
I think Rust is superior to V in almost any aspect. There are many zero-cost abstractions like iterators

MarcelGarus · 2020-03-12T15:33:05Z

What initially excited me about Hive is that it's a pure Dart library without external dependencies, so it runs everywhere.
Obviously, the same would be true if the backend were implemented in Rust, but I begin to wonder: There are loads of existing database implementations in Rust that are far more advanced. There are of course the usual SQLite-ish standards, but also document-based databases like MongoDB and truly innovative approaches like this one.
I'm afraid there's nothing about Hive that's fundamentally better than with other database solutions, so rather than reinventing the wheel, why not use some existing database and build a nice Dart wrapper around it? Because developers also use the Rust database on its own, there are more users and more contributors, and all developers from both the Rust and the Flutter community benefit from the research, optimizations, and bug fixes that are implemented on the Rust side.
This package could simply focus on providing the most intuitive Dart API possible, which would make maintaining the package easier as well.

simc · 2020-03-12T15:53:09Z

What initially excited me about Hive is that it's a pure Dart library without external dependencies

Yes, that was the goal but it turned out that most users don't want a database that is basically an in-memory KV-store. The problem with Dart is that it is kind of slow, its objects are rather memory hungry and it misses essential features to implement a more advanced database.

There are loads of existing database implementations in Rust that are far more advanced

I thought the same thing but the list of candidates is short. In fact, I didn't find a single database that is suitable for mobile devices and our requirements.

Also, to my knowledge, there is no database that is built as a counterpart to IndexedDB. It is not trivial to write a database that works exactly the same in the browser. IndexedDB is very different from most other databases.

I'm afraid there's nothing about Hive that's fundamentally better than with other database solutions

As I said, I don't think there exists a single cross-platform database that also works in the browser and I don't think existing databases can be easily used with Dart and still have great performance. Realm, for example, will never work with Dart because it relies on proxy objects.

So I'm basically writing an abstraction around IndexedDB and LMDB in Rust which can be compiled to a binary or WASM.

And then there will be the Dart wrapper around this "backend".

It should be easy to use only the Rust side, for example with React Native.

Edit: I already have a fully working prototype of the LMDB part of the wrapper and not much Rust code is required. The performance is exceptional.

If you have an alternative approach that allows us to have a "real" database which also works in the browser, I'd love to discuss it.

Edit2: Another advantage of this approach is that breaking changes of the binary format will no longer be required and bugs that corrupt the database will not happen anymore because the storing of the data will be handled by LMDB and IndexedDB respectively.

Edit3: Like most other databases, Noria, the one you linked, is for backends and thus not really suitable for mobile devices.

MarcelGarus · 2020-03-12T17:12:05Z

Okay, I see. I was really expecting more lightweight Rust databases to exist.
Then I take back what I said before — the Rust-Dart-architecture seems to be a great fit 😊
I'm looking forward to using this

simc · 2020-03-12T17:50:24Z

There is one topic where I still need input: Since we are rebuilding Hive anyway, I'd like to make it ready for synchronization from the beginning.

What do you guys think about CouchDB as a backend?
Do you know good articles or papers on sync without conflicts? I need an easy to use (for the user) mechanism to avoid or resolve conflicts.

ashim-kr-saha · 2020-03-12T18:00:47Z

Syncing with CouchDB is exactly what I am looking for in my next project.

A PouchDB implementation in Dart would be a great solution.

frank06 · 2020-03-12T18:02:24Z

Used CouchDB a long time ago, and while the conflict resolution mechanism was neat, queries were a pain. That might have changed, or not. I thought one of the major drivers of this Hive rewrite was query support.

simc · 2020-03-12T18:11:44Z

I thought one of the major drivers of this Hive rewrite was query support.

Yes, the queries you use should be more or less independent of CouchDB.

CouchDB is just an idea and nothing I have decided yet. I just want to figure this out before the first stable release of the new version.

It would be great if someone knows a backend which fits our use case.

jamesdixon · 2020-03-12T18:20:29Z

Unsolicited advice / thought:

I realize you're planning for the future ahead of time by factoring in sync support, but it's a complex topic and something I personally would leave till after the rewrite 😄

I also say this because I'm not using CouchDB and while I'd love conflict resolution, I haven't found any really easy way to sync CouchDB changes to Postgres. This seems to be a problem for many who may use another database. Limited research, but I just wanted to throw it out there given that my anecdotal evidence suggests that more people are using Postgres/MySQL/etc and if CouchDB support for those is weak, it may not be worth the additional effort upfront.

All that said, killer job with Hive thus far. Excited to see what's next.

simc · 2020-03-12T18:43:10Z

Thanks for your opinion. Yes it would be cool to have a conflict resolution which is independent of the backend database but I have to do a lot of research because I have no idea how to do it 😆

jonataslaw · 2020-03-14T00:38:30Z

I think it would be prudent to create a second project, "Hive FFI" or something like that. I have 9 applications in production using Hive, and it makes me very afraid to think that users with 300 MB/500 MB of data in Hive may lose everything after a library update.
I am very enthusiastic about testing Hive with Rust, it must be incredible. However, in my opinion, changing the backend is not OK for a stable library. If it were a pre-release it would be justified, but there are many people who use Hive as a KV store for many scripts that SP does not handle so well. Queries are cool, but maintaining compatibility would be even cooler.
I'm following the thread because if Hive changes, maybe I will need to fork this project, but if there is a risk-free way of migration, I fully support the idea.

simc · 2020-03-14T14:33:32Z

I don't think it will be possible to automatically migrate the data because the two models are not entirely compatible. But I will maintain a branch that contains the current version so you can just continue to use it.

algodave · 2020-03-18T22:34:23Z

@leisim In your vision, will the new Hive still allow creating adapters manually? I'm not using code generation in my project, I'm just defining my own class MyModelHiveAdapter extends TypeAdapter<MyModel>

pishguy · 2020-03-21T12:13:59Z

When can this version be released so that we can use it? 💃 💃 💃

simc · 2020-03-21T13:59:28Z

In your vision, will the new Hive still allow creating adapters manually?

@algodave I don't think it will be possible in the same way as it is currently because in order to query your data, Hive needs to understand its structure. Probably there will still be adapters that map objects to Map<int, dynamic>. The keys of this map will be the field ids and the values are the primitive values of the fields (int, double, bool, String, List<int>, List<double>, List<bool>, List<String>). You can customize these adapters.
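
A minimal sketch of what such an adapter could look like, purely as an illustration. `FieldMapAdapter`, `toFields` and `fromFields` are hypothetical names, and passing the primary key separately from the field map is an assumption. The field ids 0 and 1 match the `Person` sample from the first post.

```dart
// Hypothetical adapter shape; the names below are illustrative, not a Hive API.
class Person {
  int id;
  String name;
  int age;
  Person(this.id, this.name, this.age);
}

abstract class FieldMapAdapter<T> {
  // Keys are field ids, values are primitives or lists of primitives.
  Map<int, dynamic> toFields(T object);

  // Assumption: the primary key travels separately from the field map.
  T fromFields(int id, Map<int, dynamic> fields);
}

class PersonAdapter implements FieldMapAdapter<Person> {
  const PersonAdapter();

  @override
  Map<int, dynamic> toFields(Person p) => {0: p.name, 1: p.age};

  @override
  Person fromFields(int id, Map<int, dynamic> fields) =>
      Person(id, fields[0] as String, fields[1] as int);
}
```

Because every value in the map is a primitive with a known field id, a backend could index or compare a field without knowing anything about the Dart class itself.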

When can this version be released so that we can use it?

@MahdiPishguy It will probably take another month until I have the first test version.

xylobol · 2020-04-09T20:59:58Z

@jonataslaw

I'm following the thread because if Hive changes, maybe I will need to fork this project, but if there is a risk-free way of migration, I fully support the idea.

I've been working on a mission-critical project with Hive, and a major pull was that it's completely written in Dart, so I may need to fork as well. If you're interested, I can keep you posted.

stefanrusek · 2020-04-10T13:48:35Z

Might I request you create a new library? A complete rewrite, with different behavior, and large api changes is not a new version but a new library. Going down this path means there will be numerous forks of Hive 1, and people would just be better served if you started a new project and let Hive continue to evolve.

algodave · 2020-04-10T16:10:06Z

@xylobol @jonataslaw I am one of those who would be interested in a fork

listepo · 2020-04-19T20:04:28Z

@leisim any news?

stevenspiel · 2020-04-20T16:40:41Z

@leisim I'm also interested in the progress on this.

simc · 2021-02-14T10:07:40Z

@yringler Yes I'm quite happy with the progress. According to my tests and benchmarks, Isar will probably solve all problems people have with Hive currently.

dgandhi17 · 2021-02-15T04:20:05Z

@leisim Can we expect a stable version by the end of this month?

simc · 2021-02-15T07:17:16Z

@dgandhi17 stable probably not because that implies that it has been battle tested. You can expect a beta version within the next few weeks.

michalisioak · 2021-02-20T17:12:34Z

A noob opinion:
This may sound silly, but what if we make hivedb a system service that talks to Dart clients (and maybe other languages) through IPC (interprocess communication)?

Advantages (I thought)

less storage taken
later support for having the same database stored on all user devices
modularization

Disadvantages (I also thought)

more complicated (client must be different from the service)
don't know how to implement this
security (in my head, every application would have a key hardcoded in the app, something like containerization but for the db)

Feel free to correct me and point if this is possible

ryanheise · 2021-02-21T00:58:09Z

Apart from queries, Hive has another problem: Dart objects use a lot of RAM. Since Hive currently keeps at least the keys in RAM, you can hit the OS limit of mobile devices quite quickly.

Casual observer here, but I wonder whether Dart compiler improvements are having/will have any impact.

The new compiler in SDK 2.12 contains memory and performance optimisations when code is compiled with sound null safety.

Some other enhancements on the roadmap:

https://github.com/dart-lang/sdk/projects/23#card-55259467

Hassico · 2021-02-28T16:40:20Z

Humble Hive user here. Is it possible to make data posts or hive files have max age, stale ... options like we find in caching properties?

erf · 2021-03-15T22:31:11Z

I only need a db solution for my Dart server project. Have you thought about making a pure Dart wrapper around LMDB as a side project, to avoid the bloat of including IndexedDB etc.?

I'm using Redis now, but I suspect LMDB would be faster.

erf · 2021-03-15T23:48:48Z

Maybe some benchmarks for isar would be interesting. Comparing with some other popular db solutions, including Hive, Redis, sembast, sqflite etc.

simc · 2021-03-16T07:03:05Z

@erf Dart is very good at tree shaking unused code so the indexeddb wrapper will never be included in your mobile app build.

Redis is an in-memory database so it's likely faster than anything else. I'm not sure benchmarks are very useful here because the databases are completely different. That being said I expect Isar to be faster than all other databases currently available for Flutter.

Currently I have more important tasks (unit tests 🙄) than writing a benchmark but maybe someone from the community will write one.

wesleytoshio · 2021-03-21T01:23:24Z

Hello, is it already possible to use custom queries? or sort by date as you said at the beginning of the post?

fzyzcjy · 2021-10-06T14:23:12Z

Hi friends! As is suggested here, maybe this package is helpful: https://github.com/fzyzcjy/flutter_rust_bridge

bsutton · 2021-12-09T22:28:57Z

I have to say that if this makes hive harder to deploy then I would be against it.
A multi-language project will also make it harder to recruit contributors.

Hive performance is OK but it's missing decent indexing, and of course memory usage is a problem.

These could be solved by Hive implementing its own disk-based B-tree indexes in Dart.
This would fix most of the memory problems (no longer need to load all keys/objects into memory).
This could be done as a new type of box: IndexedBox.
Existing users can still use an in-memory Box, a LazyBox or an IndexedBox.

Use metadata to define the indexes and, as you have suggested, the 'where' methods to access the indexes.
It would be nice if the filter command could automatically use indexes, but that would be a lot trickier. For an IndexedBox, ideally the filter would page keys/objects into memory rather than loading the whole set.

I found this implementation of a btree in java which would port easily across to dart.

https://github.com/myui/xbird/blob/master/xbird-open/main/src/java/xbird/storage/index/BIndexFile.java

I would suggest that this would also be much less work than the proposed rust path.

If I had to do additional work to ship with hive (i.e. deploying a binary) then I wouldn't have chosen hive.

FWIW: I build and deploy cli tooling that needs to work across linux/macos/windows.

simc · 2021-12-10T13:27:23Z

@bsutton The additional work to deploy the Isar binaries is to add:

dependencies:
  isar_flutter_libs: any

Dart misses mmap and threads and is therefore not a good language for a database. Please read the first post for a more detailed explanation.

themisir · 2021-12-10T13:39:58Z

Dart misses mmap and threads and is therefore not a good language for a database. Please read the first post for a more detailed explanation.

Just want to add that, from what I see, there's no interest in implementing proper threading support in Dart. The team prefers to improve their Isolate implementation instead, which limits what can be done compared with threads: it lacks shared memory and involves serialization to move static typed data between isolates.
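
A minimal sketch of the missing shared memory, assuming Dart 2.19 or later for `Isolate.run`. `appendInWorker` and `runInWorker` are hypothetical helper names. The list captured by the closure is copied into the new isolate, so the worker's mutation never reaches the caller's list.

```dart
import 'dart:isolate';

// Runs in a separate isolate. The list it receives is a copy, so this
// mutation is not visible to the isolate that started the work.
List<int> appendInWorker(List<int> data) {
  data.add(4);
  return data;
}

// Spawns a short-lived isolate, runs the computation there and completes
// with its result. `original` is copied when the closure is sent over.
Future<List<int>> runInWorker(List<int> original) =>
    Isolate.run(() => appendInWorker(original));
```

After `await runInWorker(list)`, the returned list has the extra element while `list` itself is unchanged. A database that wants several workers over one memory-mapped file cannot be written in this model.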

bsutton · 2021-12-10T20:03:07Z

@leisim I'm not actually doing flutter Dev but rather server side.

Does the Flutter lib work for non-Flutter apps?

Are all the noted platforms supported?

simc · 2021-12-11T13:14:37Z

@bsutton I don't recommend using any embedded database on the server. You're almost always better off using a dedicated database.

bsutton · 2021-12-12T00:00:42Z

For my use case an embedded database is the correct answer. This is a CLI script, not a server.

om-ha · 2021-12-18T09:58:31Z

@bsutton The additional work to deploy the Isar binaries is to add:
dependencies:
  isar_flutter_libs: any
Dart misses mmap and threads and is therefore not a good language for a database. Please read the first post for a more detailed explanation.

Your work is hugely appreciated Simon.

I think Isar complements Hive. Hive could be used instead of shared preferences for simple stuff and Isar could be used for large queryable data.

EDIT: One might ask, why not use shared_preferences instead of hive? Well my obviously non-affiliated answer to that is:

performance benchmarks
multiple separate boxes (for separation of concern)
boxes run on a separate isolate (internally, IsolatedBox)
encryption
etc...

simc · 2021-12-18T10:26:34Z

I think Isar complements Hive

Fully agree! It's not intended to be a replacement.

simc · 2022-01-01T14:43:45Z

Isar is stable now. Thanks everyone for your help and participation!

simc pinned this issue Feb 28, 2020

simc mentioned this issue Feb 28, 2020

Hive 2.0 #216

Closed

14 tasks

AKushWarrior mentioned this issue Mar 21, 2020

Use AES-GCM instead of AES-CBC #259

Open

vaind mentioned this issue May 18, 2021

[docs] question about hive deprecation in your docs objectbox/objectbox-dart#244

Closed

This was referenced May 18, 2021

[proposal] benchmark against "objectbox" isar/isar#107

Closed

Benchmarks isar/isar#22

Closed

fallaciousreasoning mentioned this issue Jun 18, 2021

Reliance on the generic security origin specification limits the web file-handlings APIs ability to compete with traditional applications. WICG/file-handling#64

Open

simc closed this as completed Jan 1, 2022

ac130kz mentioned this issue Jan 1, 2022

feat(hydrated_bloc): Isar instead of Hive for storage felangel/bloc#3107

Open

baumths mentioned this issue Jan 3, 2022

Questions about Type ID #525

Closed

lokingwei mentioned this issue Feb 14, 2022

Documentation - and what are your plans? feedmepos/foodb#1

Open

baumths mentioned this issue Sep 9, 2022

Hive is Abandoned #1068

Open

tycebrown mentioned this issue Feb 13, 2023

List Persistance Pertempto/Lists#6

Closed
