Background
Stable Diffusion is a text-to-image diffusion model that generates an image from a given text prompt. I wanted to test it on my M1 MacBook.
Setup
Setup was fairly simple. I followed the instructions on Lincoln Stein’s fork of Stable Diffusion for M1 Macs, and despite all the disclaimers, the process worked like a charm. Stable Diffusion and the PyTorch nightly builds are both under rapid development, so my experience may come down to lucky timing. YMMV.
I do experience some of the bugs already reported as issues: occasionally my generated output is a black square.
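Since the black-square outputs have to be spotted and regenerated by hand, a small check can flag them automatically. The following is only a sketch of one way to do that, not something from the fork itself: the function names, the `max_channel` and `min_fraction` thresholds, and the use of Pillow for loading are all my own assumptions.

```python
from typing import Iterable, List, Tuple

Pixel = Tuple[int, int, int]


def is_mostly_black(
    pixels: Iterable[Pixel],
    max_channel: int = 8,        # hypothetical threshold: a pixel is "dark" if every channel <= 8
    min_fraction: float = 0.99,  # hypothetical threshold: 99% dark pixels counts as a black image
) -> bool:
    """Return True when nearly every RGB pixel is close to pure black."""
    total = 0
    dark = 0
    for r, g, b in pixels:
        total += 1
        if max(r, g, b) <= max_channel:
            dark += 1
    if total == 0:
        # An empty image is not treated as a black square.
        return False
    return dark / total >= min_fraction


def load_pixels(path: str) -> List[Pixel]:
    """Load an image file as a list of RGB tuples (assumes Pillow is installed)."""
    from PIL import Image  # imported lazily so the check above works without Pillow

    with Image.open(path) as img:
        raw = img.convert("RGB").tobytes()
    # The raw buffer is interleaved R, G, B bytes, so group it in threes.
    return [(raw[i], raw[i + 1], raw[i + 2]) for i in range(0, len(raw), 3)]
```

With something like this, `is_mostly_black(load_pixels("some_output.png"))` (the file name is a placeholder) could decide whether an output is worth keeping or should be regenerated with a different seed.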
Generating any image was slow, since I couldn’t take advantage of CUDA cores and was running everything on the CPU. I am lucky enough to have a friend with a beefy Nvidia GPU, though, and I’ve included some of the more interesting images generated on it at the end.
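To see which backend PyTorch can actually use on a given machine, a quick check like the one below helps. It is a minimal sketch rather than what the fork does internally: `pick_device` is a name I made up, and it only relies on `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, falling back to the CPU when neither is present or PyTorch is not installed.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports as usable."""
    try:
        import torch
    except ImportError:
        # No PyTorch at all: nothing to accelerate, so report the CPU.
        return "cpu"

    if torch.cuda.is_available():
        return "cuda"  # Nvidia GPU, e.g. the RTX 3080 on my friend's desktop

    # Older PyTorch builds have no torch.backends.mps, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"  # Apple Silicon GPU via Metal

    return "cpu"


if __name__ == "__main__":
    print(pick_device())
```

On an Nvidia desktop this should report `cuda`; on an M1 Mac the answer depends on the PyTorch build installed.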
Images
"ocean". The first photo was created with only one pass.
data:image/s3,"s3://crabby-images/821c8/821c80dab623d2a09cb5fce1871cfcb963593aad" alt="ocean1"
data:image/s3,"s3://crabby-images/bcd5d/bcd5d48e1adffd0d2ca881ca1639c9f770659474" alt="ocean2"
data:image/s3,"s3://crabby-images/28206/28206665d873ba5aaac4ab439592ed9b8ba87588" alt="ocean3"
"game of thrones anime"
data:image/s3,"s3://crabby-images/f0807/f08077f53a9a18e954d8c79923dde9ec7b742a4b" alt="game-of-throne"
"thanos eating popcorn"
data:image/s3,"s3://crabby-images/0ef21/0ef215fb0c208bbc3fe637beae0f2eba6c4b4d35" alt="thanos-popcorn"
"dog king"
data:image/s3,"s3://crabby-images/01ced/01ced6c3b2322edf02d6ede84a793c6d9b40b13d" alt="dog-king"
"cat in armor high def"
data:image/s3,"s3://crabby-images/3553c/3553c7d1eb93307f8ed337e30318f2a8f82f4d3a" alt="cat-knight"
I also tried out the img2img command, turning the last photo into a “drawing”.
data:image/s3,"s3://crabby-images/b3943/b3943efb7e572397fdb2cec7fdb8ae00256f47d1" alt="cat-knight-drawing"
And a profile photo for myself.
data:image/s3,"s3://crabby-images/ac901/ac901a37790f94183378fd8a6bc8ca7327d4953f" alt="me"
More Images
These photos were generated on my friend’s desktop, which has an AMD Ryzen 9 5950X and an RTX 3080 FE. It is super fast, which makes it much easier to iterate on good prompts and search for better seeds.
data:image/s3,"s3://crabby-images/7eea4/7eea4f61941b4cb4f33cf2b22e6b933a06cacae8" alt="cityscape"
The first image below is the original generated with txt2img; the rest are alternatives and improvements made with img2img.
data:image/s3,"s3://crabby-images/b8d61/b8d61d6394d1e27d09f517cfff53c371e6abb0a5" alt="cyborg"
data:image/s3,"s3://crabby-images/87114/87114fd79f252800ec3b7f4c41e26fa08651922d" alt="cyborg1"
data:image/s3,"s3://crabby-images/11a24/11a24530380b7d6a8acd66eace193c57d6cab0e3" alt="cyborg2"
data:image/s3,"s3://crabby-images/07458/07458193e0db5dba1d136cce9972c6476dc6bb94" alt="nicolas_cage_party"