DragGAN
Interactive Point-based Manipulation on the Generative Image Manifold
DragGAN is an AI tool for editing images by moving the objects in them: users "drag" any points of the image precisely to target positions.
The DragGAN tool consists of two components:
1️⃣ a feature-based motion supervision that drives the handle points towards their target positions.
2️⃣ a point tracking approach that keeps localizing the handle points as the image changes.
With DragGAN, anyone can deform an image with precise control over where pixels go, changing the pose, shape, and position of the objects it contains. The results stay realistic even in challenging cases. DragGAN outperforms prior methods at both image manipulation and point tracking, and the same pipeline can also be used to edit real images.
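To make the two components concrete, here is a minimal, simplified sketch in PyTorch. It is not the official implementation: sample_patch, motion_supervision_loss, and track_handle are our own illustrative names, the feature map stands in for an intermediate StyleGAN feature activation, and the mask term of the real loss is omitted.

import torch
import torch.nn.functional as F

def sample_patch(feat, center, radius):
    # Bilinearly sample a (2*radius+1)^2 patch of feature vectors from
    # feat (shape 1 x C x H x W) around a sub-pixel center (x, y).
    _, _, H, W = feat.shape
    ys, xs = torch.meshgrid(
        torch.arange(-radius, radius + 1, dtype=torch.float32),
        torch.arange(-radius, radius + 1, dtype=torch.float32),
        indexing="ij")
    px, py = center[0] + xs, center[1] + ys
    grid = torch.stack([2 * px / (W - 1) - 1, 2 * py / (H - 1) - 1], dim=-1)
    return F.grid_sample(feat, grid.unsqueeze(0), align_corners=True)

def motion_supervision_loss(feat, handle, target, radius=3):
    # Motion supervision: the patch shifted one unit step towards the
    # target should match the current patch (detached), which nudges the
    # generator to move the content. In DragGAN this loss is
    # backpropagated to the latent code rather than the pixels.
    d = (target - handle) / (target - handle).norm()
    return F.l1_loss(sample_patch(feat, handle + d, radius),
                     sample_patch(feat, handle, radius).detach())

def track_handle(feat, f0, handle, search_radius=5):
    # Point tracking: nearest-neighbour search for the handle's original
    # feature vector f0 (shape C, computed once at the start as
    # sample_patch(initial_feat, initial_handle, 0).reshape(-1)) inside a
    # small window around the current handle estimate.
    window = sample_patch(feat, handle, search_radius)[0]   # C x k x k
    dists = (window - f0.view(-1, 1, 1)).abs().sum(0)       # k x k
    k = 2 * search_radius + 1
    dy, dx = divmod(torch.argmin(dists).item(), k)
    offset = torch.tensor([dx - search_radius, dy - search_radius],
                          dtype=torch.float32)
    return handle + offset

Each editing step alternates the two: one gradient step on the latent code using motion_supervision_loss, then track_handle to re-locate the handle on the updated feature map.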
Requirements
DragGAN builds on NVlabs/stylegan3, which requires a CUDA-enabled graphics card. Check your device's specifications against the software requirements before proceeding.
Alternatively, you can run on macOS with an Apple Silicon (M1/M2) GPU via MPS, or on CPU only, by following these steps:
cat environment.yml | \
grep -v -E 'nvidia|cuda' > environment-no-nvidia.yml && \
conda env create -f environment-no-nvidia.yml
conda activate stylegan3
# On MacOS
export PYTORCH_ENABLE_MPS_FALLBACK=1
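If you are unsure which backend your PyTorch build will actually use, a quick check like the following can help (the MPS check assumes PyTorch 1.12 or newer):

import torch

# Pick the best available backend: CUDA GPU, Apple Silicon GPU (MPS), else CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")  # PYTORCH_ENABLE_MPS_FALLBACK=1 covers unsupported ops
else:
    device = torch.device("cpu")
print(f"Using device: {device}")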
Download pre-trained StyleGAN2 weights
You can download the pre-trained weights by executing the following command:
sh scripts/download_model.sh
To use the StyleGAN-Human and Landscapes HQ (LHQ) models, download the weights from these links: StyleGAN-Human, LHQ, and place them in the ./checkpoints directory.
You can experiment with different pre-trained StyleGAN models as well.
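If you want to inspect a checkpoint outside the GUI, a StyleGAN pickle can be loaded and sampled roughly as follows. This is a sketch, not part of the project's documented workflow: the filename is a placeholder, and the script must be run from the repository root so that pickle can resolve the bundled dnnlib and torch_utils modules.

import pickle
import torch

# Placeholder path: use whichever .pkl you downloaded into ./checkpoints.
with open("./checkpoints/your_model.pkl", "rb") as f:
    G = pickle.load(f)["G_ema"]  # generator with averaged weights

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
G = G.to(device).eval()

z = torch.randn(1, G.z_dim, device=device)  # random latent vector
img = G(z, None)  # None: no class label (unconditional model)
# img has shape (1, 3, H, W) with values roughly in [-1, 1]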
Run DragGAN GUI
To launch the DragGAN GUI, run the following command:
sh scripts/gui.sh
You can use this GUI to edit GAN-generated images. To edit a real image, first invert it into the GAN's latent space with a method such as PTI, then load the resulting latent code and fine-tuned model weights into the GUI.
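As a rough sketch of that last step (the file names are hypothetical; the synthesis call follows the stylegan2/3 network interface), loading an inverted W+ latent and reconstructing the image looks like:

import pickle
import torch

# Hypothetical outputs of a PTI inversion: fine-tuned weights + latent code.
with open("./checkpoints/pti_tuned_generator.pkl", "rb") as f:
    G = pickle.load(f)["G_ema"].eval()

ws = torch.load("./checkpoints/inverted_w_plus.pt")  # shape (1, G.num_ws, w_dim)
img = G.synthesis(ws, noise_mode="const")            # reconstructs the real image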
A Gradio demo of DragGAN is also available for you to try.
python visualizer_drag_gradio.py