Improve database model conversion performance by caching extra data and keeping row cache #3534

Merged (2 commits) on Dec 20, 2024
4 changes: 3 additions & 1 deletion CHANGELOG.md
@@ -3,7 +3,9 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

# Upcoming

### 🔄 Changed
### ⚡ Performance
- Improve performance of accessing database model properties [#3534](https://github.com/GetStream/stream-chat-swift/pull/3534)
- Improve performance of model conversions with large extra data [#3534](https://github.com/GetStream/stream-chat-swift/pull/3534)

# [4.69.0](https://github.com/GetStream/stream-chat-swift/releases/tag/4.69.0)
_December 18, 2024_
2 changes: 1 addition & 1 deletion Sources/StreamChat/Database/DTOs/ChannelDTO.swift
@@ -477,7 +477,7 @@ extension ChatChannel {

let extraData: [String: RawJSON]
do {
extraData = try JSONDecoder.default.decode([String: RawJSON].self, from: dto.extraData)
extraData = try JSONDecoder.stream.decodeCachedRawJSON(from: dto.extraData)
} catch {
log.error(
"Failed to decode extra data for Channel with cid: <\(dto.cid)>, using default value instead. "
16 changes: 6 additions & 10 deletions Sources/StreamChat/Database/DTOs/CurrentUserDTO.swift
@@ -224,16 +224,12 @@ extension CurrentChatUser {

let user = dto.user

var extraData = [String: RawJSON]()
if !dto.user.extraData.isEmpty {
do {
extraData = try JSONDecoder.default.decode([String: RawJSON].self, from: dto.user.extraData)
} catch {
log.error(
"Failed to decode extra data for user with id: <\(dto.user.id)>, using default value instead. "
+ "Error: \(error)"
)
}
let extraData: [String: RawJSON]
do {
extraData = try JSONDecoder.stream.decodeCachedRawJSON(from: dto.user.extraData)
} catch {
log.error("Failed to decode extra data for user with id: <\(dto.user.id)>, using default value instead. Error: \(error)")
extraData = [:]
}

let mutedUsers: [ChatUser] = try dto.mutedUsers.map { try $0.asModel() }
18 changes: 10 additions & 8 deletions Sources/StreamChat/Database/DTOs/MemberModelDTO.swift
@@ -189,7 +189,7 @@ extension ChatChannelMember {

let extraData: [String: RawJSON]
do {
extraData = try JSONDecoder.default.decode([String: RawJSON].self, from: dto.user.extraData)
extraData = try JSONDecoder.stream.decodeCachedRawJSON(from: dto.user.extraData)
} catch {
log.error(
"Failed to decode extra data for user with id: <\(dto.user.id)>, using default value instead. "
@@ -198,13 +198,15 @@
extraData = [:]
}

var memberExtraData: [String: RawJSON] = [:]
if let dtoMemberExtraData = dto.extraData {
do {
memberExtraData = try JSONDecoder.default.decode([String: RawJSON].self, from: dtoMemberExtraData)
} catch {
memberExtraData = [:]
}
let memberExtraData: [String: RawJSON]
Review comments:

Member: Another interesting thing that we could try to improve is that ChatMember and ChatUser are both classes. I think we should be able to convert them to structs without API changes. Could be interesting to see if there are any performance improvements by doing so 🤔

Contributor: Why do you think structs would improve performance? Probably only the SDK size would increase.

Contributor (author): ChatUser.init itself is visible in the time profiler, but it is about 1 ms, which is nothing compared to the other things happening. At some point it would be nice to use only structs for models, for consistency.

Member (@nuno-vieira, Dec 17, 2024): Usually structs are lightweight to create and destroy, so it should improve things, but maybe not significantly. @laevandus We not only need to check the init, but also how much time is spent destroying existing objects 🤔

Contributor (author): Based on the time profiler I don't see this popping up, really: a couple of ms for ChatUser.__deallocating_deinit and ChatChannelMember.__deallocating_deinit. We should do that for consistency, but in a separate ticket. No real performance improvement here.

Member: Okay nice 👍
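As a rough sketch of the class-to-struct idea discussed above (hypothetical type names, not part of this PR; the real models carry many more properties), the conversion could keep the public surface intact:

```swift
import Foundation

// Simplified illustration only. `RawJSON` is StreamChat's existing JSON value type.

// Today: a reference type, allocated and deallocated on every DTO-to-model conversion.
public final class ChatUserClass {
    public let id: String
    public let name: String?
    public let extraData: [String: RawJSON]

    init(id: String, name: String?, extraData: [String: RawJSON]) {
        self.id = id
        self.name = name
        self.extraData = extraData
    }
}

// Possible struct version: same stored properties and a matching initializer,
// so call sites would not need to change, and there is no deinit to pay for.
public struct ChatUserStruct {
    public let id: String
    public let name: String?
    public let extraData: [String: RawJSON]

    init(id: String, name: String?, extraData: [String: RawJSON]) {
        self.id = id
        self.name = name
        self.extraData = extraData
    }
}
```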

do {
memberExtraData = try JSONDecoder.stream.decodeCachedRawJSON(from: dto.extraData)
} catch {
log.error(
"Failed to decode extra data for channel member with id: <\(dto.user.id)>, using default value instead. "
+ "Error: \(error)"
)
memberExtraData = [:]
}

let role = dto.channelRoleRaw.flatMap { MemberRole(rawValue: $0) } ?? .member
39 changes: 13 additions & 26 deletions Sources/StreamChat/Database/DTOs/MessageDTO.swift
@@ -1275,20 +1275,12 @@

/// Snapshots the current state of `MessageDTO` and returns its representation for the use in API calls.
func asRequestBody() -> MessageRequestBody {
var decodedExtraData: [String: RawJSON]

if let extraData = self.extraData {
do {
decodedExtraData = try JSONDecoder.default.decode([String: RawJSON].self, from: extraData)
} catch {
log.assertionFailure(
"Failed decoding saved extra data with error: \(error). This should never happen because"
+ "the extra data must be a valid JSON to be saved."
)
decodedExtraData = [:]
}
} else {
decodedExtraData = [:]
let extraData: [String: RawJSON]
do {
extraData = try JSONDecoder.stream.decodeCachedRawJSON(from: self.extraData)
} catch {
log.assertionFailure("Failed decoding saved extra data with error: \(error). This should never happen because the extra data must be a valid JSON to be saved.")
extraData = [:]
}

let uploadedAttachments: [MessageAttachmentPayload] = attachments
@@ -1315,7 +1307,7 @@
pinned: pinned,
pinExpires: pinExpires?.bridgeDate,
pollId: poll?.id,
extraData: decodedExtraData
extraData: extraData
)
}

@@ -1365,17 +1357,12 @@ private extension ChatMessage {
moderationDetails = dto.moderationDetails.map { MessageModerationDetails(fromDTO: $0) }
textUpdatedAt = dto.textUpdatedAt?.bridgeDate

if let extraData = dto.extraData, !extraData.isEmpty {
do {
self.extraData = try JSONDecoder.default.decode([String: RawJSON].self, from: extraData)
} catch {
log
.error(
"Failed to decode extra data for Message with id: <\(dto.id)>, using default value instead. Error: \(error)"
)
self.extraData = [:]
}
} else {
do {
extraData = try JSONDecoder.stream.decodeCachedRawJSON(from: dto.extraData)
} catch {
log.error(
"Failed to decode extra data for Message with id: <\(dto.id)>, using default value instead. Error: \(error)"
)
extraData = [:]
}

19 changes: 7 additions & 12 deletions Sources/StreamChat/Database/DTOs/MessageReactionDTO.swift
@@ -207,17 +207,12 @@ extension MessageReactionDTO {
func asModel() throws -> ChatMessageReaction {
try isNotDeleted()

let decodedExtraData: [String: RawJSON]

if let extraData = self.extraData, !extraData.isEmpty {
do {
decodedExtraData = try JSONDecoder.default.decode([String: RawJSON].self, from: extraData)
} catch {
log.error("Failed decoding saved extra data with error: \(error)")
decodedExtraData = [:]
}
} else {
decodedExtraData = [:]
let extraData: [String: RawJSON]
do {
extraData = try JSONDecoder.stream.decodeCachedRawJSON(from: self.extraData)
} catch {
log.error("Failed decoding saved extra data with error: \(error)")
extraData = [:]
}

return try .init(
@@ -227,7 +222,7 @@
createdAt: createdAt?.bridgeDate ?? .init(),
updatedAt: updatedAt?.bridgeDate ?? .init(),
author: user.asModel(),
extraData: decodedExtraData
extraData: extraData
)
}
}
13 changes: 8 additions & 5 deletions Sources/StreamChat/Database/DTOs/PollDTO.swift
@@ -88,11 +88,14 @@ extension PollDTO {
func asModel() throws -> Poll {
try isNotDeleted()

var extraData: [String: RawJSON] = [:]
if let custom,
!custom.isEmpty,
let decoded = try? JSONDecoder.default.decode([String: RawJSON].self, from: custom) {
extraData = decoded
let extraData: [String: RawJSON]
do {
extraData = try JSONDecoder.stream.decodeCachedRawJSON(from: custom)
} catch {
log.error(
"Failed to decode extra data for poll with id: <\(id)>, using default value instead. Error: \(error)"
)
extraData = [:]
}

let optionsArray = (options.array as? [PollOptionDTO]) ?? []
13 changes: 8 additions & 5 deletions Sources/StreamChat/Database/DTOs/PollOptionDTO.swift
@@ -60,11 +60,14 @@ extension PollOptionDTO {
func asModel() throws -> PollOption {
try isNotDeleted()

var extraData: [String: RawJSON] = [:]
if let custom,
!custom.isEmpty,
let decoded = try? JSONDecoder.default.decode([String: RawJSON].self, from: custom) {
extraData = decoded
let extraData: [String: RawJSON]
do {
extraData = try JSONDecoder.stream.decodeCachedRawJSON(from: custom)
} catch {
log.error(
"Failed to decode extra data for poll option with id: <\(id)>, using default value instead. Error: \(error)"
)
extraData = [:]
}
return PollOption(
id: id,
5 changes: 4 additions & 1 deletion Sources/StreamChat/Database/DTOs/ThreadDTO.swift
@@ -147,8 +147,11 @@ extension ThreadDTO {

let extraData: [String: RawJSON]
do {
extraData = try JSONDecoder.default.decode([String: RawJSON].self, from: self.extraData)
extraData = try JSONDecoder.stream.decodeCachedRawJSON(from: self.extraData)
} catch {
log.error(
"Failed to decode extra data for thread with id: <\(parentMessageId)>, using default value instead. Error: \(error)"
)
extraData = [:]
}

4 changes: 2 additions & 2 deletions Sources/StreamChat/Database/DTOs/UserDTO.swift
@@ -191,7 +191,7 @@ extension UserDTO {
func asRequestBody() -> UserRequestBody {
let extraData: [String: RawJSON]
do {
extraData = try JSONDecoder.default.decode([String: RawJSON].self, from: self.extraData)
extraData = try JSONDecoder.stream.decodeCachedRawJSON(from: self.extraData)
} catch {
log.assertionFailure(
"Failed decoding saved extra data with error: \(error). This should never happen because"
@@ -235,7 +235,7 @@ extension ChatUser {

let extraData: [String: RawJSON]
do {
extraData = try JSONDecoder.default.decode([String: RawJSON].self, from: dto.extraData)
extraData = try JSONDecoder.stream.decodeCachedRawJSON(from: dto.extraData)
} catch {
log.error(
"Failed to decode extra data for user with id: <\(dto.id)>, using default value instead. "
5 changes: 0 additions & 5 deletions Sources/StreamChat/Database/DatabaseContainer.swift
@@ -222,11 +222,6 @@ class DatabaseContainer: NSPersistentContainer, @unchecked Sendable {
try actions(self.writableContext)
FetchCache.clear()

// Refresh the state by merging persistent state and local state for avoiding optimistic locking failure
for object in self.writableContext.updatedObjects {
self.writableContext.refresh(object, mergeChanges: true)
}

Comment on lines -225 to -229
Member: Can you explain why this was needed before, and why it is not needed now? 🤔 What are the chances of this producing some regressions? 🤔

Member: Isn't there a chance that Data will be missing or the FRC won't trigger updates?

Contributor: Yes, we need more details on why this was removed.

Contributor (author): I looked into the history of this, and it goes like this:

1. In 2021 it started as an optimisation for reducing context changes:

   // If you touch a ManagedObject and update one of its properties to the same value,
   // the object will be marked as `updated` even though it hasn't changed.
   // By resetting such objects we remove updates that are not real updates.
   for object in self.writableContext.updatedObjects {
       if object.changedValues().isEmpty {
           self.writableContext.refresh(object, mergeChanges: false)
       }
   }

   refresh(_:mergeChanges:) with false: "Updates the persistent properties of a managed object to use the latest values from the persistent store. If flag is false, the context discards pending changes and the managed object becomes a fault. Upon next access, the context reloads the object's values from the persistent store or last cached state."

2. In early 2023 it was changed to:

   self.writableContext.refresh(object, mergeChanges: true)

   "If flag is true, the context reloads the object's property values from the store or the cache. Then the context applies local changes over the newly loaded values. Merging the local values into object always succeeds, and never results in a merge conflict."

   I think this was done because change 1 was making the FRC lose track of changes. Reasoning: we use DTOs' willSave to force FRC updates by accessing the "parent" DTO and setting one of its properties back to the same value. For example, when a UserDTO is saved and it happens to be CurrentUserDTO.user, UserDTO.willSave forces the CurrentUserDTO to change as well, and the FRC observing the CurrentUserDTO picks up the change. Discarding changes in these cases breaks the FRC observation, which change 2 tackled: it made sure these updates go through.

   From UserDTO:

   if let currentUser = currentUser, !currentUser.hasChanges {
       let fakeNewUnread = currentUser.unreadChannelsCount
       currentUser.unreadChannelsCount = fakeNewUnread
   }

3. I removed the if check while testing whether it fixed some crashes in Core Data. It didn't.

Since change 2, some things have changed: we do not use viewContext anymore (except for reading state in DataStore; the DB observers no longer use it), so all DB mutations go through the writableContext. Therefore, there should never be a case where the writableContext itself needs to refresh a DTO from the persistent store, because it is the only context changing it. That is the main reason I believe we can remove it.
An interesting side effect is that refreshing an object seems to discard cached data in Core Data (data cached for other contexts), which leads to performance degradation.
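For illustration, the write path before and after this change boils down to roughly the following sketch (simplified from DatabaseContainer.write; logging, error propagation, and completion handling are omitted, and the performAndWait wrapper and function signature here are assumptions, not the actual implementation):

```swift
import CoreData

// Sketch only: mirrors the relevant part of the write path after this PR.
// `FetchCache` is StreamChat's internal fetch-request cache seen in the diff above.
func write(
    _ actions: (NSManagedObjectContext) throws -> Void,
    on writableContext: NSManagedObjectContext
) {
    writableContext.performAndWait {
        do {
            try actions(writableContext)
            FetchCache.clear()

            // Removed in this PR: refreshing every updated object re-read it from the
            // store and dropped Core Data's cached row data (including data cached for
            // other contexts), which later showed up as slower DTO-to-model conversions.
            // for object in writableContext.updatedObjects {
            //     writableContext.refresh(object, mergeChanges: true)
            // }

            if writableContext.hasChanges {
                try writableContext.save()
            }
        } catch {
            // The real implementation logs the error and reports it to the caller.
        }
    }
}
```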

Contributor: Didn't we also introduce a state layer context? Will this have an impact on it? Have you checked that?

Contributor (author, @laevandus, Dec 19, 2024): For the state layer, with a similar test, the change is from 1.60 s to 0.77 s (as expected).
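Timings like the ones quoted above come from profiling the full conversion path; the decoding part on its own can be exercised with a micro-benchmark along these lines (only a sketch; the payload shape and iteration counts are arbitrary assumptions, not the test behind the numbers above):

```swift
import XCTest
@testable import StreamChat

final class ExtraDataDecodingBenchmark: XCTestCase {
    func test_repeatedExtraDataDecoding() throws {
        // Hypothetical payload: one large extra-data blob decoded over and over,
        // mimicking many DTO-to-model conversions that share identical extra data.
        let extraData: [String: RawJSON] = Dictionary(
            uniqueKeysWithValues: (0..<200).map {
                ("key_\($0)", RawJSON.string(String(repeating: "x", count: 100)))
            }
        )
        let data = try JSONEncoder().encode(extraData)

        measure {
            for _ in 0..<1_000 {
                // With the new cache, only the first iteration pays the JSON decoding
                // cost; the rest are served from the NSCache keyed by the data's hash.
                _ = try? JSONDecoder.stream.decodeCachedRawJSON(from: data)
            }
        }
    }
}
```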

if self.writableContext.hasChanges {
log.debug("Context has changes. Saving.", subsystems: .database)
try self.writableContext.save()
57 changes: 55 additions & 2 deletions Sources/StreamChat/Utils/Codable+Extensions.swift
@@ -9,6 +9,7 @@ import Foundation
final class StreamJSONDecoder: JSONDecoder, @unchecked Sendable {
let iso8601formatter: ISO8601DateFormatter
let dateCache: NSCache<NSString, NSDate>
let rawJSONCache: RawJSONCache

override convenience init() {
let iso8601formatter = ISO8601DateFormatter()
@@ -17,12 +18,21 @@ final class StreamJSONDecoder: JSONDecoder, @unchecked Sendable {
let dateCache = NSCache<NSString, NSDate>()
dateCache.countLimit = 5000 // We cache at most 5000 dates, which gives good enough performance

self.init(dateFormatter: iso8601formatter, dateCache: dateCache)
self.init(
dateFormatter: iso8601formatter,
dateCache: dateCache,
rawJSONCache: RawJSONCache(countLimit: 500)
Review comments:

Member: In the performance benchmark table, we are not checking the memory usage. Is there any significant increase in memory from caching all the extra data?

Contributor: I don't expect it, but maybe we clean it up on memory warning notifications, just in case?

Contributor (author): Internally it is an NSCache, which clears itself on a low-memory warning. I used a wrapping class because of unit tests: NSCache can't be mocked from the Swift side (it is an Objective-C generic class, which brings restrictions).

Contributor (author): There is a significant decrease in memory usage for RawJSON. (Screenshots: memory graphs before, "NoCache", and after, "Cache".)

)
}

init(dateFormatter: ISO8601DateFormatter, dateCache: NSCache<NSString, NSDate>) {
init(
dateFormatter: ISO8601DateFormatter,
dateCache: NSCache<NSString, NSDate>,
rawJSONCache: RawJSONCache
) {
iso8601formatter = dateFormatter
self.dateCache = dateCache
self.rawJSONCache = rawJSONCache

super.init()

@@ -50,6 +60,49 @@ final class StreamJSONDecoder: JSONDecoder, @unchecked Sendable {
}
}

extension StreamJSONDecoder {
class RawJSONCache {
private let storage: NSCache<NSNumber, BoxedRawJSON>

init(countLimit: Int) {
storage = NSCache()
storage.countLimit = countLimit
}

func rawJSON(forKey key: Int) -> [String: RawJSON]? {
storage.object(forKey: key as NSNumber)?.value
}

func setRawJSON(_ value: [String: RawJSON], forKey key: Int) {
storage.setObject(BoxedRawJSON(value: value), forKey: key as NSNumber)
}

final class BoxedRawJSON {
let value: [String: RawJSON]

init(value: [String: RawJSON]) {
self.value = value
}
}
}

/// A convenience method that returns a decoded RawJSON dictionary, with caching enabled.
///
/// Extra data stored in models can be large, which can significantly slow
/// down DTO-to-model conversions. This method is a convenient way of
/// caching some of the data used in DTO-to-model conversions.
func decodeCachedRawJSON(from data: Data?) throws -> [String: RawJSON] {
guard let data, !data.isEmpty else { return [:] }
let key = data.hashValue
if let value = rawJSONCache.rawJSON(forKey: key) {
return value
}
let rawJSON = try decode([String: RawJSON].self, from: data)
rawJSONCache.setRawJSON(rawJSON, forKey: key)
return rawJSON
}
}

extension JSONDecoder {
/// A default `JSONDecoder`.
static let `default`: JSONDecoder = stream
Expand Down
47 changes: 46 additions & 1 deletion Tests/StreamChatTests/Utils/JSONDecoder_Tests.swift
@@ -148,7 +148,7 @@ final class JSONDecoder_Tests: XCTestCase {
let dateFormatter = ISO8601DateFormatter_Spy()
dateFormatter.formatOptions = [.withFractionalSeconds, .withInternetDateTime]
let dateCache = NSCache_Spy()
let decoder = StreamJSONDecoder(dateFormatter: dateFormatter, dateCache: dateCache)
let decoder = StreamJSONDecoder(dateFormatter: dateFormatter, dateCache: dateCache, rawJSONCache: RawJSONCache_Spy())

// When we decode a payload with repeated dates
let repeatedDate = "2020-06-09T08:10:40.800912Z" // If you change this, make sure to change `actualDecodedDate` below
@@ -178,6 +178,33 @@
XCTAssertEqual(value, actualDecodedDate)
}
}

func test_extraDataIsCached() throws {
let extraData1: [String: RawJSON] = [
"key": .string("value")
]
let extraData2: [String: RawJSON] = [
"key2": .string("value2")
]
let data1 = try JSONEncoder().encode(extraData1)
let data2 = try JSONEncoder().encode(extraData2)

let cacheSpy = RawJSONCache_Spy()
let decoder = StreamJSONDecoder(
dateFormatter: ISO8601DateFormatter(),
dateCache: NSCache(),
rawJSONCache: cacheSpy
)
let decoded1 = try decoder.decodeCachedRawJSON(from: data1)
let decoded2 = try decoder.decodeCachedRawJSON(from: data2)
let decoded1Again = try decoder.decodeCachedRawJSON(from: data1)

XCTAssertEqual(extraData1, decoded1)
XCTAssertEqual(extraData2, decoded2)
XCTAssertEqual(extraData1, decoded1Again)
XCTAssertEqual(3, cacheSpy.numberOfCalls(on: "rawJSON(forKey:)"), cacheSpy.recordedFunctions.joined(separator: " "))
XCTAssertEqual(2, cacheSpy.numberOfCalls(on: "setRawJSON(_:forKey:)"), cacheSpy.recordedFunctions.joined(separator: " "))
}

// MARK: Helpers

@@ -251,3 +278,21 @@
}
}
}

private final class RawJSONCache_Spy: StreamJSONDecoder.RawJSONCache, Spy {
let spyState = SpyState()

init() {
super.init(countLimit: 3)
}

override func rawJSON(forKey key: Int) -> [String: RawJSON]? {
record()
return super.rawJSON(forKey: key)
}

override func setRawJSON(_ value: [String: RawJSON], forKey key: Int) {
record()
super.setRawJSON(value, forKey: key)
}
}