Artificial intelligence (AI) techniques are becoming popular and are utilized in various products and services. While cloud-based AI techniques have been used to perform compute/memory-intensive inference on powerful cloud servers, on-device AI technologies have recently been drawing attention from the mobile industry for reduced response time, privacy protection, and connection-less AI services. Big mobile players are investing research effort in on-device AI technologies and have already announced on-device AI hardware and software solutions. We are not leading this trend currently, but since the on-device AI area has only just started and is still in its initial state, there are still opportunities to close the gap between the pioneers and us. We believe on-device AI will become a key differentiator for mobile phones, TVs, and other home appliances, and thus developing an on-device AI software stack is of paramount importance in order to take leadership in on-device AI technology.
Although the vision of on-device AI is promising, enabling on-device AI involves unique technical challenges compared to the traditional cloud-based approach. This is because on-device AI conducts inference tasks solely on the device, without connecting to cloud resources. Specifically, hardware resources on the device, such as processor performance, memory capacity, and power budget, are very scarce and limit the compute capability typically required to execute complicated neural network (NN) models. For example, in one product requirement, a mobile device should consume less than 1.2W and may use at most 2W for only 10 minutes due to thermal issues. Next, an on-device AI software stack needs to support diverse device environments, since embedded platforms may consist of heterogeneous compute devices, such as a CPU, GPU, DSP, or neural processing unit (NPU), and may run different OS platforms, such as Tizen, Android, or various Linux distributions. To illustrate the heterogeneity problem, a minimal sketch follows this paragraph.
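The sketch below shows one possible way an inference framework could cope with heterogeneous compute devices: probe which accelerators the platform actually provides and fall back in a fixed preference order. It is illustrative only; the backend list, the is_available probe, and select_backend are hypothetical names, not part of any existing product or SDK.

    // Illustrative sketch: runtime backend selection with graceful fallback.
    // All identifiers here are hypothetical, for explanation only.
    #include <iostream>
    #include <vector>

    enum class Backend { NPU, DSP, GPU, CPU };

    // Hypothetical probe; a real framework would query drivers or vendor SDKs.
    bool is_available(Backend b) {
        return b == Backend::CPU;  // assume only the CPU is always present
    }

    // Pick the first (most preferred) backend that the device actually provides.
    Backend select_backend(const std::vector<Backend>& preference) {
        for (Backend b : preference) {
            if (is_available(b)) return b;
        }
        return Backend::CPU;  // last-resort fallback
    }

    int main() {
        Backend chosen = select_backend(
            {Backend::NPU, Backend::DSP, Backend::GPU, Backend::CPU});
        std::cout << "Running inference on backend #"
                  << static_cast<int>(chosen) << '\n';
        return 0;
    }

Such an abstraction layer is one way a single software stack could cover devices ranging from NPU-equipped flagships to CPU-only appliances, under the assumption that per-backend kernels are provided behind a common interface.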
To tackle the challenges above and to take leadership in on-device AI technology, this project, as a first step, aims to develop a neural network inference framework specialized and optimized for on-device AI.