Qwen-Image-Edit-2511 is a capable image-editing model, but deploying it in ComfyUI is constrained by VRAM. This article targets the RTX 4090 (24 GB VRAM) scenario and covers how to deploy the quantized models, the key pitfalls to avoid, and a comparison of output quality at different sampling step counts, so you can get it running quickly.

1. Prerequisites: ComfyUI Installation

The basic ComfyUI installation is not repeated here; the official Chinese guide is recommended, since its steps are clear and cover Linux: ComfyUI Linux Installation Official Guide.
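For completeness, here is a minimal sketch of the usual Linux setup, assuming a Python 3.12 virtual environment and the install paths that appear in my traceback later (adjust them to your machine); the official guide remains the authoritative reference:

# Create and activate a virtual environment (path mirrors /root/comfy-env in the traceback below)
python3 -m venv /root/comfy-env
source /root/comfy-env/bin/activate

# Clone ComfyUI and install its dependencies
git clone https://github.com/comfyanonymous/ComfyUI.git /root/comfy/ComfyUI
cd /root/comfy/ComfyUI
pip install -r requirements.txt

# Start the server; --listen 0.0.0.0 makes it reachable from other machines
python main.py --listen 0.0.0.0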

2. The Core Problem: VRAM Limits and the Solution

The 4090's 24 GB of VRAM cannot hold the original Qwen-Image-Edit-2511 model (it overflows VRAM), so quantized models are required. Because access to and downloads from overseas hosts can be restricted, this article provides links reachable from within China via the Hugging Face mirror (hf-mirror.com) and ModelScope, and specifies where each model must be placed (this directly determines whether the model loads correctly).
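As a quick sanity check while testing, you can watch how close you are to the 24 GB ceiling from a second terminal (assuming the NVIDIA driver utilities are installed):

# Print used/total VRAM once per second
nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 1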

2.1 Quantized Model Download List (Paths + Commands)

All models must be downloaded into the corresponding ComfyUI directories. The full path notes and wget commands are below (copy them into a terminal and run them directly); a consolidated one-shot version is sketched after the list:

1. LoRA model (path: ComfyUI/models/loras)
wget https://hf-mirror.com/lightx2v/Qwen-Image-Edit-2511-Lightning/resolve/main/Qwen-Image-Edit-2511-Lightning-4steps-V1.0-bf16.safetensors
2. VAE model (path: ComfyUI/models/vae)
wget https://hf-mirror.com/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/vae/qwen_image_vae.safetensors
3. UNet model (path: ComfyUI/models/unet)
wget "https://modelscope.cn/api/v1/models/unsloth/Qwen-Image-Edit-2511-GGUF/repo?Revision=master&FilePath=qwen-image-edit-2511-Q4_K_M.gguf" -O qwen-image-edit-2511-Q4_K_M.gguf
4. CLIP model (path: ComfyUI/models/clip)
# Main model file
wget -c "https://modelscope.cn/api/v1/models/unsloth/Qwen2.5-VL-7B-Instruct-GGUF/repo?Revision=master&FilePath=Qwen2.5-VL-7B-Instruct-Q4_K_M.gguf" -O Qwen2.5-VL-7B-Instruct-Q4_K_M.gguf

# Critical dependency file (must be downloaded!)
wget -c "https://modelscope.cn/api/v1/models/unsloth/Qwen2.5-VL-7B-Instruct-GGUF/repo?Revision=master&FilePath=mmproj-F16.gguf" -O Qwen2.5-VL-7B-Instruct-mmproj-BF16.gguf

2.2 Critical Pitfall: the Error Caused by a Missing mmproj File

⚠️ Important: the mmproj file that accompanies the CLIP model is a required download! Missing it directly causes a fatal "matrix shapes cannot be multiplied" error during image editing, and tracking it down is time-consuming.

I missed this file on my first deployment and hit the following error (key message: mat1 and mat2 shapes cannot be multiplied):

Prompt executed in 10.11 seconds
got prompt
!!! Exception during processing !!! mat1 and mat2 shapes cannot be multiplied (748x1280 and 3840x1280)
Traceback (most recent call last):
  File "/root/comfy/ComfyUI/execution.py", line 516, in execute
    output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
                                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy/ComfyUI/execution.py", line 330, in get_output_data
    return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy/ComfyUI/execution.py", line 304, in _async_map_node_over_list
    await process_inputs(input_dict, i)
  File "/root/comfy/ComfyUI/execution.py", line 292, in process_inputs
    result = f(**inputs)
             ^^^^^^^^^^^
  File "/root/comfy/ComfyUI/comfy_api/internal/__init__.py", line 149, in wrapped_func
    return method(locked_class, **inputs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy/ComfyUI/comfy_api/latest/_io.py", line 1520, in EXECUTE_NORMALIZED
    to_return = cls.execute(*args, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy/ComfyUI/comfy_extras/nodes_qwen.py", line 103, in execute
    conditioning = clip.encode_from_tokens_scheduled(tokens)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy/ComfyUI/comfy/sd.py", line 207, in encode_from_tokens_scheduled
    pooled_dict = self.encode_from_tokens(tokens, return_pooled=return_pooled, return_dict=True)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy/ComfyUI/comfy/sd.py", line 271, in encode_from_tokens
    o = self.cond_stage_model.encode_token_weights(tokens)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy/ComfyUI/comfy/text_encoders/qwen_image.py", line 62, in encode_token_weights
    out, pooled, extra = super().encode_token_weights(token_weight_pairs)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy/ComfyUI/comfy/sd1_clip.py", line 704, in encode_token_weights
    out = getattr(self, self.clip).encode_token_weights(token_weight_pairs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy/ComfyUI/comfy/sd1_clip.py", line 45, in encode_token_weights
    o = self.encode(to_encode)
        ^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy/ComfyUI/comfy/sd1_clip.py", line 297, in encode
    return self(tokens)
           ^^^^^^^^^^^^
  File "/root/comfy-env/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy-env/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy/ComfyUI/comfy/sd1_clip.py", line 257, in forward
    embeds, attention_mask, num_tokens, embeds_info = self.process_tokens(tokens, device)
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy/ComfyUI/comfy/sd1_clip.py", line 219, in process_tokens
    emb, extra = self.transformer.preprocess_embed(emb, device=device)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy/ComfyUI/comfy/text_encoders/llama.py", line 593, in preprocess_embed
    return self.visual(image.to(device, dtype=torch.float32), grid), grid
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy-env/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy-env/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy/ComfyUI/comfy/text_encoders/qwen_vl.py", line 425, in forward
    hidden_states = block(hidden_states, position_embeddings, cu_seqlens_now, optimized_attention=optimized_attention)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy-env/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy-env/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy/ComfyUI/comfy/text_encoders/qwen_vl.py", line 252, in forward
    hidden_states = self.attn(hidden_states, position_embeddings, cu_seqlens, optimized_attention)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy-env/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy-env/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy/ComfyUI/comfy/text_encoders/qwen_vl.py", line 195, in forward
    qkv = self.qkv(hidden_states)
          ^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy-env/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy-env/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy/ComfyUI/comfy/ops.py", line 164, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 217, in forward_comfy_cast_weights
    out = super().forward_comfy_cast_weights(input, *args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/comfy/ComfyUI/comfy/ops.py", line 157, in forward_comfy_cast_weights
    x = torch.nn.functional.linear(input, weight, bias)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: mat1 and mat2 shapes cannot be multiplied (748x1280 and 3840x1280)

I eventually found the fix in a GitHub issue (thanks to the open-source community): "TextEncodeQwenImageEdit mat1 and mat2 shapes cannot be multiplied"; the fix is simply to add the missing mmproj file. I recommend downloading everything from the list above to avoid repeating this pitfall.
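With the file in place, a quick check confirms both CLIP-side files exist before re-running the workflow (the path is assumed from my install; adjust it to yours):

# Both the main GGUF and the mmproj file should show up here
ls -lh /root/comfy/ComfyUI/models/clip/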

3. Workflow Configuration and Effect Testing

With the models in place, the next step is to configure the workflow. Below is a screenshot of the workflow I used for testing (you can replicate it directly):

[Screenshot: test workflow]

This test uses a "three-image edit" scenario and focuses on how the KSampler step count affects the output. The test environment is a 4090 GPU on Linux. Results are as follows:

3.1 20 Sampling Steps: Fast but Poor Quality

  • Runtime: 1 min 40 s
  • Issues: the figure's arm shows an obvious seam; the face is badly distorted (e.g. the "Jack Ma" face is completely unrecognizable)
  • Result screenshot:

[Screenshot: 20-step result]

3.2 40 Sampling Steps: Slightly Better, Still Flawed

  • Runtime: 4 min 37 s
  • Issues: the seam between the hand and arm is not fully resolved, and obvious blending artifacts remain
  • Result screenshot:

[Screenshot: 40-step result]

3.3 60 Sampling Steps: Acceptable Quality, Longer Runtime

  • Runtime: 6 min 57 s
  • Results: the arm seam is essentially fixed; however, the face still differs noticeably from the original character, and there is an unintended clothing-color change (light gray clothing turned black)
  • Result screenshot:

4. Summary and Next Steps

  1. Running Qwen-Image-Edit-2511 on a 4090 requires quantized models; downloading via the mirror links above avoids network issues, and the mmproj file must not be omitted.
  2. More sampling steps improve quality at the cost of speed: 20 steps is fine for a quick preview, while 60 steps fixes the major artifacts but requires a much longer runtime and still alters the face.
  3. Next steps: refine the prompts and workflow parameters, or test other quantization levels (a higher-precision variant for quality, or a lighter one such as Q2_K for speed) to balance quality against runtime. If you run into other problems during deployment, feel free to discuss in the comments.