Midjourney and Stable Diffusion have been two of the biggest names in AI image generation over the past few years. But while Midjourney is improving and going from strength to strength, Stability AI, the company behind Stable Diffusion, has been devolving into chaos, threatening its standing as one of the best AI art generators currently available.
Since Stable Diffusion is open source, it's still widely available and supported—though the original researchers have left Stability AI and released a new model called FLUX.1.
Given all this, let's look at what image generator you should use if you want the most powerful and customizable option: Midjourney, Stable Diffusion, or something else?
Midjourney vs. Stable Diffusion at a glance
Stable Diffusion and Midjourney are both AI image generators, but that's about where the similarities end. Here's a short summary, but read on for the full rundown of how they differ.
| | Midjourney | Stable Diffusion |
|---|---|---|
| Quality of images | ⭐⭐⭐⭐ Some of the best AI-generated images, though it may not fully understand every prompt | ⭐⭐⭐ Different versions produce images of differing quality, but results are still generally great |
| Ease of use | ⭐⭐⭐⭐⭐ An awesome web app makes everything easy | ⭐⭐ Stable Diffusion is available in so many ways that things get complicated |
| Power and control | ⭐⭐⭐⭐⭐ Best-in-class prompting and editing options | ⭐⭐⭐⭐ The ability to train models on your own data sets it apart |
How do Stable Diffusion and Midjourney work?
From a technical standpoint, Stable Diffusion and Midjourney create images in much the same way—the differences lie in the user interface, the features built around the models, and how you access them, not the underlying technology.
Both AI models were trained on millions or billions of text-image pairs, which is how they can understand what concepts like people, Canada, and impressionist paintings mean—and then put it all together to interpret a prompt like "an impressionist oil painting of a Canadian man riding a moose through a forest of maple trees."
Once Stable Diffusion or Midjourney has worked out what it thinks you're asking it to create, it generates an image using a technique called diffusion. The image generator starts with a randomly created field of noise and, in a number of discrete steps, edits the image to get it closer to its interpretation of your prompt.
Every time you generate a new image, the image generator starts with a different field of noise, which is why you can get different results, even if you use the same prompt. In a previous article comparing DALL·E 3 and Stable Diffusion, I described the process as "kind of like looking up at a cloudy sky, finding a cloud that looks kind of like a dog, and then being able to snap your fingers to keep making it more and more dog-like."
While that's not a perfect analogy, you can tell by the fact I'm reusing it here that I think it's a good way to think of it.
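If you want to see the noise-to-image idea in action, here's a deliberately simplified toy sketch in Python. It's an analogy, not the real algorithm—an actual diffusion model uses a trained neural network to predict and remove noise at each step, and the "target" here stands in for the model's interpretation of your prompt:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Toy stand-in for "the model's interpretation of the prompt":
# an 8x8 image with a bright square in the middle.
target = np.zeros((8, 8))
target[2:6, 2:6] = 1.0

# Like a diffusion model, start from a random field of noise.
image = rng.normal(size=(8, 8))

# In a number of discrete steps, nudge the image toward the target,
# removing a little of the remaining "noise" each time.
for step in range(50):
    image = image + 0.1 * (target - image)

# After enough steps, the noise has resolved into the target shape.
error = np.abs(image - target).mean()
```

Because the starting noise is random, a different seed would take a different path to (roughly) the same destination—which mirrors why the same prompt gives you different images each time you run it.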
Even though the underlying tech of both models is pretty similar, both Stable Diffusion and Midjourney can produce completely different results. How a model interprets your prompt and weighs up all the different elements of it, the specific data it was trained on, and the underlying philosophies of the team developing it all have a huge effect on the final output. (Stable Diffusion, in particular, has a large community that has created different versions of the core models tailored to specific things.)
Let's use one of my favorite prompts. Here's what Stable Assistant using Stable Diffusion Ultra created for "an impressionist oil painting of a Canadian man riding a moose through a forest of maple trees":
Here's what NightCafé using SDXL 1.0 created:
And here's what Midjourney made:
While all three sets of images look more or less like impressionist paintings, the fact that there are three of them in this head-to-head comparison should make it clear that things are a bit confusing right now. So, let's dig in.
Stable Diffusion is a bit of a mess right now
What we're calling Stable Diffusion is actually a number of open models available in a few different ways and with differing levels of support. The three most relevant to our discussion are:
- **Stable Diffusion 3 and Stable Diffusion Ultra.** SD 3 was released earlier this year with a newly restrictive license and was largely considered worse than SDXL 1.0. Between the license and the quality complaints, it was banned from some of the leading AI image generation platforms. Stability AI has since updated the license terms, and it's now available through a few services, such as the Stable Assistant chatbot.
- **Stable Diffusion XL.** In 2023, this was probably the best and most popular AI image generation model. It and the models based on it are still widely available on art generation platforms like NightCafe.
- **Stable Diffusion 1.5.** An older, smaller version of Stable Diffusion. Because it's cheap to fine-tune, it's still widely available on art generation platforms.
To make matters more confusing, a company founded by the original Stable Diffusion researchers after they left Stability AI, Black Forest Labs, has released a series of new models called FLUX.1 that are largely replacing Stable Diffusion as the go-to open text-to-image model.
This means that while Stable Diffusion is still relevant (at least for now), SD 3 is nowhere near as popular or widely available as its predecessors, SDXL and SD 1.5. That's a big problem: Stable Diffusion's popularity and widespread availability were a huge part of its appeal. Without them, there's very little to recommend it over Midjourney—especially now that Midjourney has matured and has a real web app.
Midjourney has matured
When I last compared Midjourney and Stable Diffusion, Midjourney was the oddball only accessible through the chat app Discord. Oh how things have changed.
Midjourney now has a shiny new web app that gives you full access to almost all its features. It's still community based, so you see the images other people are generating (and they can see yours), but it's otherwise a mostly normal modern web experience.
From the web app, you can generate anything you want; create countless variations; use images as part of your prompt, to set a style reference, or as a character reference; edit small details; expand your images by zooming out, panning in any direction, or just changing the aspect ratio; and go really weird by diving deep into its parameters and other power user features.
Honestly, it's almost impossible to compare Midjourney's polished app to the complicated and convoluted controls available through Stability AI's mediocre ChatGPT competitor, Stable Assistant.
It doesn't matter that the SD Ultra model can generate some impressive-looking images and has some cool features—the process of using it is pretty poor.
And while this isn't the only way of accessing Stable Diffusion, it's the way that's promoted by Stability AI, and it's by far the simplest option for accessing its latest (and supposedly greatest) model.
Stable Diffusion isn't dead yet
For all that, Stable Diffusion is still available in a few forms I'm happy to recommend.
- Third-party services like Clipdrop (now owned by Jasper) still integrate various Stable Diffusion models. If you like and use one of these services, there's no reason to switch things up.
- Art generator platforms like NightCafe, Tensor.Art, and Civitai still support Stable Diffusion and dozens of other models based on it. These platforms typically have free trials or free image plans, so they're a great place to play around with different image models.
- If you have the technical skills, you can download Stable Diffusion (including SD 3) and get it up and running locally on your own machine or through a cloud provider like Leap AI.
I'm not sure how long this will remain true. The new CEO, additional investment, and James Cameron joining the board may bring Stable Diffusion back from the brink—but even the r/StableDiffusion community on Reddit mostly talks about FLUX.1.
Commercial use continues to be complicated
Both Stable Diffusion and Midjourney allow you to use any images you generate commercially with very few limits, but that doesn't mean that things are simple. We're in a new era, and a lot of lawsuits are still working their way through the courts to determine what exactly is and isn't legal with AI-generated art.
Back in February 2023, the U.S. Copyright Office ruled that images created by Midjourney and other generative AIs can't be copyrighted. This means that even though you're free to use images you generate commercially, you have limited recourse if someone takes them and uses them without your permission. So, it's important to remember: if you're designing logos or otherwise building a brand using Stable Diffusion or Midjourney, you don't have a very strong legal shield.
Until Stability AI scared off much of Stable Diffusion's community by changing its license terms, Stable Diffusion was the obvious choice for commercial use. The company has since rolled back the most onerous terms (they now only affect businesses making more than a million dollars per year), but the whole situation still makes things more complex. The flexibility to train your own models is important, but I'd suggest considering other open models as well and deciding which works best for your needs.
A quick word about price
Given that Stable Diffusion is usable in a few different ways, pricing is a little all over the place. You can use various versions of it for free through different art generation platforms, though there are normally limits on the number of outputs you can generate or the size of the images you can create.
For example: Stable Assistant starts at $9/month with a three-day free trial, but as a chatbot, it isn't very good. If you limit yourself to just using Stable Diffusion through it, you might find it to be a good value, but it just doesn't compete with ChatGPT, Claude, or Gemini. And, of course, you can download it and run it on your own hardware for free—or at least without a subscription.
With Midjourney, things are a little simpler. There's no free option, but the Basic Plan starts at $10/month and entitles you to 200 minutes of GPU time. How many images is that? Well, Midjourney says it's good for roughly 200 prompts, but creating multiple variations or upscaling your images will burn through them faster than creating lots of low-res images. To make things a tad more complicated, starting with the $30/month Standard plan, you get enough fast GPU hours for roughly 900 prompts per month, but you can also generate unlimited images in Relax mode—which only runs when there's free GPU power.
My stance would be that Stable Diffusion is worth checking out for free, but if you're prepared to pay $10/month, Midjourney is by far the better choice. With that said, subscribing to a service like NightCafe can get you access to Stable Diffusion and lots of other models and features, so if you want to explore a range of open models, that's also an option.
Stable Diffusion vs. Midjourney: Which should you use?
Comparing Stable Diffusion and Midjourney now presents a fairly easy choice. If you want a super powerful AI image generator with a great web app and are prepared to pay $10/month, Midjourney is the best option by far.
On the other hand, if you accept the chaos and are happy to find a way to use Stable Diffusion that you like, it still can be a good option—though most people looking for an open AI image generator seem to be transitioning to FLUX.1.
Alternatively, if you want the easiest option going, check out DALL·E 3. It's a bit less powerful than either Stable Diffusion or Midjourney, but it integrates with ChatGPT (a chatbot that actually works), so it's really easy to use.
This article was originally published in January 2024. The most recent update was in October 2024.