I have an RTX 4060 8GB in my laptop, with 16 GB RAM and an Intel i7-12700H CPU. When I run build-llama.sh or build-mistral.sh, the process gets killed automatically with the output below, and I found that the CPU is 100% utilized while either script is running. I am attaching a screenshot of the same. Kindly help me with this.
(trtllm) vishwajeet@vishwa:~/Desktop/MYGPT/trt-llm-rag-linux$ bash build-mistral.sh
You are using a model of type mistral to instantiate a model of type llama. This is not supported for all configurations of models and can yield errors.
[03/22/2024-20:50:37] [TRT-LLM] [I] Serially build TensorRT engines.
[03/22/2024-20:50:39] [TRT] [I] [MemUsageChange] Init CUDA: CPU +2991, GPU +0, now: CPU 4121, GPU 1039 (MiB)
[03/22/2024-20:50:41] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +1798, GPU +314, now: CPU 6055, GPU 1353 (MiB)
[03/22/2024-20:50:41] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading
[03/22/2024-20:50:41] [TRT-LLM] [W] Invalid timing cache, using freshly created one
[03/22/2024-20:50:41] [TRT-LLM] [I] [MemUsage] Rank 0 Engine build starts - Allocated Memory: Host 7.1113 (GiB) Device 1.3216 (GiB)
build-mistral.sh: line 1: 6084 Killed python build.py --model_dir './model/mistral/mistral7b_hf' --quant_ckpt_path './model/mistral/mistral7b_int4_quant_weights/mistral_tp1_rank0.npz' --dtype float16 --remove_input_padding --use_gpt_attention_plugin float16 --enable_context_fmha --use_gemm_plugin float16 --use_weight_only --weight_only_precision int4_awq --per_group --output_dir './model/mistral/mistral7b_int4_engine' --world_size 1 --tp_size 1 --parallel_build --max_input_len 1024 --max_batch_size 1 --max_output_len 1024
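From what I understand, a bare "Killed" with no Python traceback usually means the Linux OOM killer terminated the build because the machine ran out of host RAM; the log above already shows about 7.1 GiB of host memory allocated before the kill, on a 16 GB system. A sketch of how to confirm this and work around it, assuming a systemd-based distro and that /swapfile is an unused path (both are assumptions, not from the original report):

# Confirm the kernel OOM-killed the build process
sudo dmesg | grep -i -E 'killed process|out of memory'

# Workaround: add temporary swap so the engine build can page instead of being killed
sudo fallocate -l 32G /swapfile    # hypothetical size/path; needs ~32 GB free disk
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

# Optional: enable CUDA lazy loading, as the TRT warning above suggests,
# to reduce device memory usage during TensorRT initialization
export CUDA_MODULE_LOADING=LAZY

The swap can be removed afterwards with sudo swapoff /swapfile; the build will be slow while paging, but it should no longer be killed outright.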
Vishwa0703 changed the title from "Not able to generate engine on RTX4060 Laptop 8GB" to "Not able to generate engine on RTX4060 Laptop 8GB 100% CPU being utilised" on Mar 22, 2024.