Check out my Hugging Face Space for the Image Prompt Generator (AKA Image Fungineer).
Get amazing image generations from simple 1-3 word prompts.
Models like Midjourney, Stable Diffusion, and DALL-E 2 can generate beautiful images, but they require detailed prompts that are time-consuming to imagine and write. The Fungineer takes just a few words and prompt-engineers them into something far more imaginative. The following examples show a simple prompt on the left and a re-imagined prompt on the right:
More examples can be found on the Hugging Face Space.
There are two models: one with 355 million parameters and one with 7 billion parameters. Both were trained on a dataset in which ChatGPT was used to convert boring image prompts into exciting ones.
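To make the dataset idea concrete, here is a minimal sketch of how a simple/exciting prompt pair could be turned into a single training string for a text-generation model. The separator, the end-of-text marker, the function names, and the example pair are all hypothetical and chosen only for illustration; they are not necessarily what the Fungineer models were trained with.

```python
# Hypothetical formatting of (simple prompt, detailed prompt) pairs for fine-tuning.
# SEPARATOR and END_OF_TEXT are assumptions made for this sketch.
SEPARATOR = " => "
END_OF_TEXT = "<|endoftext|>"


def format_pair(simple_prompt: str, detailed_prompt: str) -> str:
    """Join one pair into a single training string."""
    return f"{simple_prompt.strip()}{SEPARATOR}{detailed_prompt.strip()}{END_OF_TEXT}"


def format_inference_input(simple_prompt: str) -> str:
    """Build the text the model is asked to continue at inference time."""
    return f"{simple_prompt.strip()}{SEPARATOR}"


# Made-up example pair, standing in for a ChatGPT-rewritten prompt.
pairs = [
    (
        "red fox",
        "A red fox curled up in fresh snow at dawn, soft golden light, "
        "shallow depth of field",
    ),
]

training_texts = [format_pair(simple, detailed) for simple, detailed in pairs]
print(training_texts[0])
```

With a format like this, the model sees the short prompt, then the separator, and learns to continue with the longer prompt; at inference time only the short prompt and separator are supplied.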
I found that the 355M parameter model tended to produce somewhat incoherent prompts, though they were still coherent enough to make beautiful images. That incoherence often led to far more imaginative and interesting results.
The 7B parameter model consistently produced well-worded prompts, but those prompts resulted in less interesting images from the image generators. Even though the 7B model is larger, I preferred the final images that came from the smaller model.
I tested many different pre-trained base models, along with LoRA and Prefix Tuning. As the number of parameters increased, I was able to reduce the loss, but the cost and time of inference were much higher for only slightly better results. Here are the test results: