-
-
Notifications
You must be signed in to change notification settings - Fork 204
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for deserialization "into" existing collection (instead of overwrite) #337
Comments
I managed to find a workaround for now: // Input
ReadOnlySpan<byte> serialized = ... // This is the serialized data
int length = ... // length of T[] before serialization
// Actual work
var state = MemoryPackReaderOptionalStatePool.Rent(null);
var reader = new MemoryPackReader(data, state);
var arr = new T[length + 1];
Span<T?> span = arr.AsSpan(0, length)!;
reader.ReadSpan(ref span);
arr[length] = new T() // the added item
// Then proceed to serialize after addition
var newSerialized = MemoryPackSerializer.Serialize(arr); This works, and definitely better than allocating a new array just for deserialization. However, it is still suboptimal as:
By the way - BUG: as the documentation explains, the overwrite method, if receives a collection that has a public class ICollectionWrappedArray<T> : ICollection<T> {
private readonly T[] _arr;
private int _index;
public ICollectionWrapperArray(T[] arr) {
_arr = arr;
}
// Empty ctor required for MemoryPack for some reason - otherwise runtime error
public ICollectionWrapperArray() { _arr =[]; }
public void Add(T item) {
_arr[_index] = item;
_index++;
}
public void Clear() { _index = 0; }
// Rest of implementation doesn't really matter for this use case
}
// The instead of the span workaround above
T[] newArr = new[length + 1];
var collection = new ICollectionWrappedArray(newArr);
int written = MemoryPackSerializer.Deserialize(serializedData, ref collection); This should have worked according to the documentation, as the collection has But, my tests failed, and after debugging I saw that it always went to the empty ctor, causing it to allocate a new array instead of using the one that was provided. |
In order to reduce allocations of collections when deserializing, MemoryPackSerializer has a method that overwrites an existing instance. For 1 item this is fine, but for it would have been very helpful for collections that instead of overwriting or replacing them, it would serialize into the available space.
For instance, if a
T[]
is used there, the entire instance is replaced, regardless if the deserialized items could have been inserted from some index to some index. In some cases, the opposite would have been very efficient.For the sake of the example, consider that I use MemoryPack to serialize a
User[]
in some API. In the crud operation of adding a new user, right now I have to do this:So I am forced to allocate a secondary array with larger length just to add items.
I want to be able to do this:
This essentially uses a destination span for the required items, it could be a part of a larger buffer (which also enables usage with BufferWriter(s), Arrays rented from pools etc...
And using this functionality I completely avoid allocating an array of size n. (this is big, especially if the usage is in an API like in the example, which calls it lots of times, using an array from ArrayPool.Shared would increase the benefit immensely, and the same API would support it).
Currently, it is impossible to do from my tries, firstly because the overload of
Deserialize
that overwrites the instance uses aref object? obj
for the instance, and obviouslyspan<t>
cannot be cast toobject
. I have tried to implement a workaround using Memory, like this:Which logically should produce similar results as
Memory.Slice
is also very efficient and doesn't actually change the backing array. But even after theDeserialize
call, my debugger showed that the backing array was unchanged (remained only initialized without the deserialized data). And my unit tests failed.Btw, as it currently stands,
written
doesn't actually tell the count of deserialized items, it just returns the length ofbinaryData
which means if you are expecting to get 3User
s, from abyte[]
of length 80 for instance,written
would be 80, and trying to use it to advance indexes will throw you out of bounds. (So this is also a bug)I know that implementing such a change would lead to thinking that it would mean a lot of overloads and code to add, but you don't really need much, supporting
Span<T>
would cover 99% of cases, as you can get the span out of arrays and lists and they are the most common data types for such operations.The text was updated successfully, but these errors were encountered: