Things I want to do
I’ll try running Alibaba’s Z-Image-Turbo image generation model in an environment without CUDA.
Environment setup
Create a working folder.
Create a venv environment (optional)
If necessary, run the following commands in the command prompt to create and activate a venv environment.
python -m venv venv
venv\Scripts\activate.bat
Library Installation
Execute the following command to install the necessary libraries.
pip install git+https://github.com/huggingface/diffusers
pip install torch torchvision
pip install transformers
pip install accelerate
Save the following content as a file named run.py.
import torch
from diffusers import ZImagePipeline
pipe = ZImagePipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=False,
)
pipe.to("cpu")
prompt = "Young Chinese woman in red Hanfu, intricate embroidery. Impeccable makeup, red floral forehead pattern. Elaborate high bun, golden phoenix headdress, red flowers, beads. Holds round folding fan with lady, trees, bird. Neon lightning-bolt lamp (⚡️), bright yellow glow, above extended left palm. Soft-lit outdoor night background, silhouetted tiered pagoda (西安大雁塔), blurred colorful distant lights."
# 2. Generate Image
image = pipe(
    prompt=prompt,
    height=256,
    width=256,
    num_inference_steps=9,
    guidance_scale=0.0,
    generator=torch.Generator().manual_seed(42),
).images[0]
image.save("example.png")
The code is essentially the same as the example on the following page (the model card), but it has been modified to run on the CPU. The size of the generated image has also been reduced for testing purposes.
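The script above pins the pipeline to the CPU unconditionally. As a minimal sketch (the `pick_device` helper is hypothetical, not part of the original script), the device string could instead be chosen at runtime, so the same file also uses the GPU when a CUDA-enabled torch happens to be installed:

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA-capable torch is available, else "cpu"."""
    try:
        import torch
        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        # torch is not installed at all; fall back to the CPU string.
        pass
    return "cpu"

# The result would then be passed to the pipeline, e.g. pipe.to(pick_device()).
print(pick_device())
```

On a machine without CUDA this resolves to `"cpu"`, matching the behavior of the script above.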

Execution
Execute the following command to run the script.
python run.py
A file named example.png will be created in the folder where you ran the command.
Execution time
(The first run is slow because the model is downloaded first. This is a download of over 20 GB and, depending on your environment, may add an extra 1-2 hours.)
The sampling iterations take about 20 minutes to complete.

However, even after the progress bar shown above reaches 100%, it takes quite a while longer to finish. (I haven’t timed it precisely, but perhaps an extra hour?)
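To pin down where the time actually goes (the roughly 20-minute iteration phase versus the long tail after 100%), a wall-clock wrapper can be placed around the `pipe(...)` call. The `timed` helper below is a hypothetical stdlib-only sketch, shown here with a stand-in workload rather than the pipeline itself:

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Stand-in workload; in run.py this would wrap the pipe(...) call, e.g.
#   output, seconds = timed(pipe, prompt=prompt, height=256, width=256, ...)
result, seconds = timed(sum, range(1000))
print(f"took {seconds:.6f}s")
```

Wrapping the call this way would give a precise number for the post-100% phase instead of a rough "maybe an extra hour".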
Thoughts
Given that Flux.1 took about 20 minutes to output an image, my honest impression is that this is slow.
However, I have never managed to generate a decent 256×256 image with other image generation models, whereas with Z-Image-Turbo I was able to generate images of the same quality as the samples.
If the specifications are sufficient
Bonus
A record of a failed attempt to run it using DirectML.
import torch
from diffusers import ZImagePipeline
import torch_directml
dml = torch_directml.device()
pipe = ZImagePipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo",
    torch_dtype=torch.float,
    low_cpu_mem_usage=False,
)
pipe.to(dml)
prompt = "Young Chinese woman in red Hanfu, intricate embroidery. Impeccable makeup, red floral forehead pattern. Elaborate high bun, golden phoenix headdress, red flowers, beads. Holds round folding fan with lady, trees, bird. Neon lightning-bolt lamp (⚡️), bright yellow glow, above extended left palm. Soft-lit outdoor night background, silhouetted tiered pagoda (西安大雁塔), blurred colorful distant lights."
# 2. Generate Image
image = pipe(
    prompt=prompt,
    height=256,
    width=256,
    num_inference_steps=9,  # This actually results in 8 DiT forwards
    guidance_scale=0.0,  # Guidance should be 0 for the Turbo models
    generator=torch.Generator().manual_seed(42),
).images[0]
image.save("example.png")
Error
The transfer is from float to float, so no dtype conversion should be necessary… is it running out of memory?
Traceback (most recent call last):
  File "F:\projects\python\Qwen-Image\zi.py", line 11, in
    pipe.to(dml)
  File "F:\projects\python\Qwen-Image\venv\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 545, in to
    module.to(device, dtype)
  File "F:\projects\python\Qwen-Image\venv\lib\site-packages\transformers\modeling_utils.py", line 4343, in to
    return super().to(*args, **kwargs)
  File "F:\projects\python\Qwen-Image\venv\lib\site-packages\torch\nn\modules\module.py", line 1174, in to
    return self._apply(convert)
  File "F:\projects\python\Qwen-Image\venv\lib\site-packages\torch\nn\modules\module.py", line 780, in _apply
    module._apply(fn)
  File "F:\projects\python\Qwen-Image\venv\lib\site-packages\torch\nn\modules\module.py", line 780, in _apply
    module._apply(fn)
  File "F:\projects\python\Qwen-Image\venv\lib\site-packages\torch\nn\modules\module.py", line 780, in _apply
    module._apply(fn)
  [Previous line repeated 1 more time]
  File "F:\projects\python\Qwen-Image\venv\lib\site-packages\torch\nn\modules\module.py", line 805, in _apply
    param_applied = fn(param)
  File "F:\projects\python\Qwen-Image\venv\lib\site-packages\torch\nn\modules\module.py", line 1160, in convert
    return t.to(
RuntimeError
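One way to narrow down the failure would be to move each pipeline component to the DirectML device individually instead of calling `pipe.to(dml)` on the whole pipeline; diffusers pipelines expose their named submodules via `pipe.components`. The `move_components` helper below is a hypothetical debugging sketch (not a fix), shown with stand-in objects since the real transfer fails:

```python
def move_components(components, device):
    """Try module.to(device) for each named component; collect failures."""
    failures = {}
    for name, module in components.items():
        try:
            module.to(device)  # e.g. move_components(pipe.components, dml)
        except RuntimeError as exc:
            failures[name] = str(exc)
    return failures

# Stand-in components demonstrating the idea:
class Ok:
    def to(self, device):
        return self

class Bad:
    def to(self, device):
        raise RuntimeError("boom")

print(move_components({"vae": Ok(), "text_encoder": Bad()}, "dml"))
```

Running this against the real pipeline would at least reveal whether the RuntimeError comes from the text encoder, the DiT, or the VAE, which helps decide whether it is a memory limit or a DirectML operator gap.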