fix: more robust check whether the HF model is quantized (#11891)
Removes the check of `model.is_quantized` and adds a more robust way of
checking for 4-bit and 8-bit quantization in the `huggingface_pipeline.py`
script. The original change was written against an outdated version of
`transformers`, where models still exposed the `is_quantized` property;
that check is redundant now.

Fixes: #11809 and #11759
eryk-dsai authored Oct 16, 2023
1 parent efa9ef7 commit 5019f59
Showing 1 changed file with 2 additions and 3 deletions.

libs/langchain/langchain/llms/huggingface_pipeline.py
@@ -109,9 +109,8 @@ def from_model_id(
             ) from e
 
         if (
-            model.is_quantized
-            or model.model.is_loaded_in_4bit
-            or model.model.is_loaded_in_8bit
+            getattr(model, "is_loaded_in_4bit", False)
+            or getattr(model, "is_loaded_in_8bit", False)
         ) and device is not None:
             logger.warning(
                 f"Setting the `device` argument to None from {device} to avoid "
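For illustration, here is a minimal, self-contained sketch of why the `getattr` form is more robust. The `PlainModel` and `FourBitModel` classes below are hypothetical stand-ins for `transformers` models, not real library classes:

```python
# Hypothetical stand-ins for HF models: depending on the `transformers`
# version and how the model was loaded, quantization attributes such as
# `is_loaded_in_4bit` may or may not exist on the model object.
class PlainModel:
    pass  # no quantization attributes at all


class FourBitModel:
    is_loaded_in_4bit = True


def is_quantized(model) -> bool:
    # `getattr` with a False default never raises AttributeError, unlike
    # direct access (`model.is_quantized`, `model.model.is_loaded_in_4bit`),
    # which fails on models that lack the attribute or the `.model` wrapper.
    return getattr(model, "is_loaded_in_4bit", False) or getattr(
        model, "is_loaded_in_8bit", False
    )


assert not is_quantized(PlainModel())  # no attributes -> not quantized
assert is_quantized(FourBitModel())    # 4-bit flag present -> quantized
```

This mirrors the pattern in the diff above: a missing attribute is treated the same as the attribute being `False`.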
