
The project is an AI app built on a Stable Diffusion multi-model for tasks such as UI/UX design and dataset generation. It is accessed through the web and accepts natural-language input. The model is intended to improve continuously and to assist users in generating products by creating user-friendly interfaces.


Aryan-Deshpande/Diffusion-AI


Needs fixes

Latent Diffusion Multi-Model with CLIP guidance

This is a web application that generates images from a text prompt. A machine learning model uses the input prompt as guidance for the kind of image it should denoise toward. Generation is iterative: starting from a completely noisy image, the model denoises it step by step until the loop terminates. The application combines multiple deep learning model architectures.
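The iterative denoising loop described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: it assumes a DDPM-style sampler with a linear noise schedule, neither of which this README confirms. `make_schedule`, `toy_noise_predictor`, and `denoise` are hypothetical names, and the predictor is a placeholder for the trained, prompt-conditioned network.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Assumed linear beta schedule and the cumulative products of (1 - beta)."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars


def toy_noise_predictor(x, t, prompt):
    """Hypothetical stand-in for the trained network; ignores t and prompt."""
    return [0.1 * value for value in x]


def denoise(prompt, size=4, num_steps=50, seed=0):
    """Start from pure Gaussian noise and denoise until the loop terminates."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in range(size)]  # completely noisy "image"
    for t in reversed(range(num_steps)):
        eps = toy_noise_predictor(x, t, prompt)  # predicted noise at step t
        alpha = 1.0 - betas[t]
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alpha) for xi, ei in zip(x, eps)]
        if t > 0:
            # Re-inject a small amount of noise on every step except the last.
            sigma = math.sqrt(betas[t])
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean
    return x


sample = denoise("a red chair on a beach")
print(len(sample))
```

In a real system the prompt would first be encoded (for example by CLIP) and passed to the network at every step; here it is carried through only to show where the conditioning enters.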

During Training

1) A Gaussian distribution is used to noise the training image at particular timesteps. Instead of sequentially applying noise to the output of the previous timestep until timestep t is reached, we directly sample the noised image X_t for any timestep t. This is possible because the sum of independent Gaussian random variables is itself Gaussian.

2) The noised image is then fed into a UNet model along with the text label for that image. The goal of the UNet model is to transform the text label and the image into a lower-dimensional space, known as the latent space. The latent space is a compressed representation of the data in which similar items lie closer together (it represents the probability distribution of the data).

3) Contrastive loss and cosine similarity are used as guidance to optimize the generation of these images. The weights in the attention modules are updated accordingly until the loss reaches a minimum.
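Step 1 above, sampling the noised image at an arbitrary timestep in one shot, can be illustrated with the standard closed form X_t = sqrt(ᾱ_t)·X_0 + sqrt(1 − ᾱ_t)·ε, where ᾱ_t is the cumulative product of (1 − β). The sketch below is an assumption-laden example rather than the repository's code: the linear β schedule, its endpoints, and the function names `alpha_bar_schedule` and `noise_image` are all hypothetical, and the "image" is a flat list of pixel values.

```python
import math
import random


def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_image(x0, t, alpha_bars, rng):
    """Sample X_t directly from X_0, without iterating through steps 0..t-1."""
    a_bar = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]  # the noise the network learns to predict
    xt = [math.sqrt(a_bar) * v + math.sqrt(1.0 - a_bar) * e
          for v, e in zip(x0, eps)]
    return xt, eps


rng = random.Random(0)
alpha_bars = alpha_bar_schedule(1000)
x0 = [0.2, -0.5, 0.9, 0.0]  # toy "image" with four pixel values
xt_early, _ = noise_image(x0, 10, alpha_bars, rng)   # still close to x0
xt_late, _ = noise_image(x0, 999, alpha_bars, rng)   # almost pure noise
print(xt_early, xt_late)
```

Because ᾱ_t shrinks toward zero as t grows, early timesteps keep most of the original signal while late timesteps are dominated by noise, which is what lets training pick a random t per example instead of simulating the whole chain.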

Run it Locally

git clone https://github.com/Aryan-Deshpande/Latent-Diffusion-AI
cd Latent-Diffusion-AI
docker compose up

Then open http://localhost:3001 in a browser.

