Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Not able to generate engine on RTX4060 Laptop 8GB 100% CPU being utilised #5

Open
Vishwa0703 opened this issue Mar 22, 2024 · 1 comment

Comments

@Vishwa0703
Copy link

I have RTX4060 8GB in my laptop with 16gb ram and intel i7-12700H cpu when i run build-llama.sh or build-mistral.sh it gets killed automatically with below output and I found that my cpu gets 100% utilised when running build-llama.sh or build-mistral.sh attaching the ss of the same kindly Help me with that

(trtllm) vishwajeet@vishwa:~/Desktop/MYGPT/trt-llm-rag-linux$ bash build-mistral.sh
You are using a model of type mistral to instantiate a model of type llama. This is not supported for all configurations of models and can yield errors.
[03/22/2024-20:50:37] [TRT-LLM] [I] Serially build TensorRT engines.
[03/22/2024-20:50:39] [TRT] [I] [MemUsageChange] Init CUDA: CPU +2991, GPU +0, now: CPU 4121, GPU 1039 (MiB)
[03/22/2024-20:50:41] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +1798, GPU +314, now: CPU 6055, GPU 1353 (MiB)
[03/22/2024-20:50:41] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading
[03/22/2024-20:50:41] [TRT-LLM] [W] Invalid timing cache, using freshly created one
[03/22/2024-20:50:41] [TRT-LLM] [I] [MemUsage] Rank 0 Engine build starts - Allocated Memory: Host 7.1113 (GiB) Device 1.3216 (GiB)
build-mistral.sh: line 1: 6084 Killed python build.py --model_dir './model/mistral/mistral7b_hf' --quant_ckpt_path './model/mistral/mistral7b_int4_quant_weights/mistral_tp1_rank0.npz' --dtype float16 --remove_input_padding --use_gpt_attention_plugin float16 --enable_context_fmha --use_gemm_plugin float16 --use_weight_only --weight_only_precision int4_awq --per_group --output_dir './model/mistral/mistral7b_int4_engine' --world_size 1 --tp_size 1 --parallel_build --max_input_len 1024 --max_batch_size 1 --max_output_len 1024

Screenshot from 2024-03-22 20-50-43

@Vishwa0703 Vishwa0703 changed the title Not able to generate engine on RTX4060 Laptop 8GB Not able to generate engine on RTX4060 Laptop 8GB 100% CPU being utilised Mar 22, 2024
@Vishwa0703
Copy link
Author

Please explain how to generate engine

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant