🐛 Describe the bug

Device Info:
Device: Google Pixel 9
Android Version: 15
API Level: 35

Steps to reproduce the bug:
Run the Torchchat Android app with Llama-3.2-3b-instruct and submit a second prompt in the same session.

Expected:
The Llama model should produce output for every prompt.

What happened:
The app crashes on the second prompt. It seems the tokens are miscounted on the second call; this isn't the case for the iOS application. I haven't looked at the Android version of the demo in the executorch repo, and I haven't tested other models yet, but I can start moving other Llama models onto the Android device to see whether I can reproduce this tokenizer bug.

Versions

Here's the info on my MBP.
Collecting environment information...
PyTorch version: 2.6.0.dev20241002
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: macOS 14.6.1 (arm64)
GCC version: Could not collect
Clang version: 16.0.0 (clang-1600.0.26.4)
CMake version: version 3.30.4
Libc version: N/A
Python version: 3.10.15 | packaged by conda-forge | (main, Sep 30 2024, 17:48:38) [Clang 17.0.6 ] (64-bit runtime)
Python platform: macOS-14.6.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Apple M1 Max
Versions of relevant libraries:
[pip3] executorch==0.5.0a0+72b3bb3
[pip3] numpy==1.26.4
[pip3] torch==2.6.0.dev20241002
[pip3] torchao==0.5.0
[pip3] torchaudio==2.5.0.dev20241007
[pip3] torchsr==1.0.4
[pip3] torchtune==0.4.0.dev20241010+cpu
[pip3] torchvision==0.20.0.dev20241002
[conda] numpy 1.26.4 py312h7f4fdc5_0
[conda] numpy-base 1.26.4 py312he047099_0
[conda] numpydoc 1.7.0 py312hca03da5_0
I'm seeing the same issue. I get this crash log on the device:
Abort message: 'In function generate(), assert failed (num_prompt_tokens < metadata_.at(kMaxSeqLen)): num_prompt_tokens 138 >= max_seq_len_ 128, Max seq length exceeded - please increase max seq len value in .../llama2/model.py'
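For what it's worth, the abort is consistent with the app re-sending the whole conversation (system prompt plus prior turns) on the second call without trimming it to the model's max sequence length. A minimal Kotlin sketch of that suspected failure mode follows; `history`, `countTokens`, and `MAX_SEQ_LEN` are illustrative placeholders, not the actual torchchat/ExecuTorch Android API:

```kotlin
// Hypothetical sketch of the suspected failure mode: the prompt grows
// monotonically across turns, so the second call can exceed max_seq_len.
const val MAX_SEQ_LEN = 128

val history = StringBuilder()

fun buildPrompt(userMessage: String): String {
    // If the app re-sends the entire conversation on every call,
    // the prompt only ever gets longer.
    history.append(userMessage)
    return history.toString()
}

fun guardedGenerate(userMessage: String, countTokens: (String) -> Int) {
    val prompt = buildPrompt(userMessage)
    val numPromptTokens = countTokens(prompt)
    // Without a check like this on the Java/Kotlin side, the C++ runner's
    // `num_prompt_tokens < max_seq_len` assert aborts the whole process
    // instead of failing gracefully.
    require(numPromptTokens < MAX_SEQ_LEN) {
        "Prompt has $numPromptTokens tokens; model max_seq_len is $MAX_SEQ_LEN"
    }
    // ... hand `prompt` to the ExecuTorch runner here ...
}
```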
infil00p changed the title from "Torchchat on Android crashes on second prompt with Llama-3.2-3b-instruct" to "Torchchat on Android crashes on second prompt" on Nov 26, 2024.
After testing the Software Mansion React Native code and adding ExecuTorch to my own project, I believe the issue is in the Android app written for Torchchat, which is shared with the example app in the ExecuTorch repo.
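Until the shared app is fixed, one possible app-side workaround is to keep a sliding window of turns and drop the oldest ones before the rendered prompt exceeds the model's max sequence length. This is a hedged sketch only, assuming the app keeps a mutable list of turns; `ChatSession`, `countTokens`, and `maxSeqLen` are placeholders for whatever the shared demo code actually exposes:

```kotlin
// Sliding-window truncation of conversation history so the native
// runner's max_seq_len assert is never tripped. All names are
// illustrative, not the real torchchat/ExecuTorch API.
class ChatSession(
    private val maxSeqLen: Int,
    private val countTokens: (String) -> Int
) {
    private val turns = ArrayDeque<String>()

    fun promptFor(userMessage: String): String {
        turns.addLast(userMessage)
        // Drop the oldest turns until the rendered prompt fits under the
        // model's max sequence length, keeping at least the newest turn.
        while (turns.size > 1 && countTokens(turns.joinToString("\n")) >= maxSeqLen) {
            turns.removeFirst()
        }
        return turns.joinToString("\n")
    }
}
```

This trades away long-range context for stability; the real fix is presumably to count tokens the same way on the second call as on the first.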