Problem
Many models, such as LLMs, are large even after being converted into potentially on-device executable versions using CoreML. Shipping these models with a mobile application is impractical. Downloading them on demand instead requires UI and progress indicators that communicate the implications to the user.
Solution
All building blocks for a good integration into SpeziML are in place.
We can use SwiftUI to create a nice download-progress API that tracks the progress of downloading the model and preparing it for execution.
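As a rough sketch of what such a download-progress API could look like (all type and property names here are hypothetical, not existing SpeziML API), an `ObservableObject` wrapping a `URLSession` download task could publish fractional progress that a SwiftUI `ProgressView` renders:

```swift
import SwiftUI

// Hypothetical sketch, not part of SpeziML: tracks a model download's
// progress so a SwiftUI view can surface it to the user.
@MainActor
final class ModelDownloadManager: NSObject, ObservableObject, URLSessionDownloadDelegate {
    @Published var progress: Double = 0.0   // fraction in 0.0 ... 1.0
    @Published var localModelURL: URL?

    private lazy var session = URLSession(
        configuration: .default, delegate: self, delegateQueue: nil
    )

    func download(from remoteURL: URL) {
        session.downloadTask(with: remoteURL).resume()
    }

    // Called on a background queue as bytes arrive; hop to the main actor to publish.
    nonisolated func urlSession(
        _ session: URLSession,
        downloadTask: URLSessionDownloadTask,
        didWriteData bytesWritten: Int64,
        totalBytesWritten: Int64,
        totalBytesExpectedToWrite: Int64
    ) {
        let fraction = Double(totalBytesWritten) / Double(totalBytesExpectedToWrite)
        Task { @MainActor in self.progress = fraction }
    }

    nonisolated func urlSession(
        _ session: URLSession,
        downloadTask: URLSessionDownloadTask,
        didFinishDownloadingTo location: URL
    ) {
        // The temporary file must be moved before this delegate method returns.
        let destination = FileManager.default
            .urls(for: .applicationSupportDirectory, in: .userDomainMask)[0]
            .appendingPathComponent("model.bin")   // illustrative file name
        try? FileManager.default.moveItem(at: location, to: destination)
        Task { @MainActor in self.localModelURL = destination }
    }
}

struct ModelDownloadView: View {
    @StateObject private var manager = ModelDownloadManager()

    var body: some View {
        ProgressView("Downloading model…", value: manager.progress)
            .padding()
    }
}
```

`ProgressView(value:)` renders a determinate bar, which is what users need for multi-gigabyte model downloads; an indeterminate spinner would not communicate how long the wait will be.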
Similar to #18, we should add an abstraction layer to the API to enable reuse across different models, maybe initially focusing on the Hugging Face and LLM use case.
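One way the abstraction layer could be cut (again purely illustrative; the protocol and type names below are assumptions, not SpeziML or Hugging Face API) is a small protocol that any downloadable model conforms to, with a Hugging Face-hosted LLM as one conforming type:

```swift
import Foundation

// Hypothetical abstraction layer; names are illustrative only.
protocol DownloadableModel {
    /// Remote location the model is fetched from.
    var remoteURL: URL { get }
    /// On-device location once the download has completed; nil otherwise.
    var localURL: URL? { get }
    /// Downloads the model, reporting fractional progress in 0...1.
    func download(progress: @escaping (Double) -> Void) async throws
}

// A model hosted in a Hugging Face repository would be one conforming type.
struct HuggingFaceModel: DownloadableModel {
    let repository: String   // e.g. an "owner/repo" identifier
    let file: String

    var remoteURL: URL {
        // Assumed URL scheme for illustration purposes.
        URL(string: "https://huggingface.co/\(repository)/resolve/main/\(file)")!
    }
    var localURL: URL? { nil }   // resolved after the download completes

    func download(progress: @escaping (Double) -> Void) async throws {
        // Actual transfer omitted; would delegate to a shared download manager.
    }
}
```

Keeping the progress callback in the protocol means the SwiftUI layer can stay identical regardless of which model source sits behind it.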
Testing this functionality is probably best done on a macOS machine. This might require some smaller changes to the framework.
Additional context
No response
Code of Conduct
I agree to follow this project's Code of Conduct and Contributing Guidelines
Sadly, in its current state, CoreML is not optimized to run LLMs and therefore is way too slow for local LLM execution.
Currently, SpeziLLM provides local inference functionality via llama.cpp, but that may change if Apple updates CoreML in this year's WWDC.