RVC AI: Retrieval-based Voice Conversion (RVC) is a technique that uses a deep neural network to convert one speaker's voice into another's. It is based on VITS, an end-to-end text-to-speech model. RVC can produce realistic, expressive voice conversions with relatively little training data and modest computational resources.
✅Minimize tone leakage by replacing each source feature with the closest training-set feature found by top-1 retrieval;
✅Train easily and quickly, even with low-end graphics cards;
✅Achieve decent results with little data (at least 10 minutes of low-noise speech recommended);
✅Support model fusion to alter timbres (use ckpt processing tab->ckpt merge);
✅User-friendly Webui interface;
✅Quickly separate vocals from instrumentals with the UVR5 model.
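The top-1 retrieval idea from the first bullet can be illustrated with a small sketch. This is not RVC's own code: the function name `top1_substitute` and the toy 2-dimensional vectors are hypothetical, and plain Euclidean distance is assumed as the similarity measure. It only shows the principle that each feature extracted from the source voice is swapped for its nearest neighbour among the features seen during training, so that less of the source speaker's tone carries over.

```python
import math

def top1_substitute(source_feats, train_feats):
    """Replace each source feature vector with its nearest training-set vector.

    Hypothetical helper for illustration; Euclidean distance is an assumption.
    """
    if not train_feats:
        raise ValueError("train_feats must not be empty")
    replaced = []
    for src in source_feats:
        # Top-1 retrieval: pick the single closest training feature.
        nearest = min(train_feats, key=lambda t: math.dist(src, t))
        replaced.append(list(nearest))
    return replaced

# Toy data: two source frames and three training-set frames.
source = [[0.9, 0.1], [0.2, 0.8]]
training = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

converted = top1_substitute(source, training)
print(converted)  # [[1.0, 0.0], [0.0, 1.0]]
```

A brute-force search like this is fine for a handful of vectors; a real system working on many high-dimensional features would likely need an indexed nearest-neighbour search instead.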
To begin, install PyTorch and its core dependencies. If you already have them installed, you can skip this step. See the official PyTorch installation instructions for more information.
Use the following command to install the required packages:
pip install torch torchvision torchaudio
On Windows with an Nvidia Ampere-architecture GPU (RTX 30xx), you need to specify the CUDA version that the PyTorch build corresponds to, as users have reported in a GitHub issue.
Use the following command to install PyTorch with the specific CUDA version for Windows + Nvidia Ampere Architecture:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
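After either install command, it can help to confirm that PyTorch is importable and whether it can see a CUDA GPU. The following is a minimal sketch; the helper name `torch_status` is hypothetical, and it relies only on `torch.__version__` and `torch.cuda.is_available()`. It returns a result instead of crashing when PyTorch is not installed.

```python
import importlib.util

def torch_status():
    """Report whether PyTorch is importable and whether it can see a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment.
        return {"installed": False, "version": None, "cuda": False}
    import torch
    return {
        "installed": True,
        "version": torch.__version__,
        "cuda": torch.cuda.is_available(),
    }

status = torch_status()
print(status)
```

If `cuda` is reported as `False` on a machine with an Nvidia GPU, the installed build may be CPU-only or may not match the installed driver, which is the situation the CUDA-specific command above is meant to address.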
Next, you will need to install the Poetry dependency management tool. If you already have it installed, you can skip this step. Please follow the instructions provided in the following link: https://python-poetry.org/docs/#installation
Use the following command to install Poetry:
curl -sSL https://install.python-poetry.org | python3 -
Finally, install the dependencies required for the project. With Poetry, the standard command for this is:
poetry install
RVC AI depends on some pre-trained models for inference and training.
You can download them from the project's Hugging Face space.
These are the pre-trained models and other files that RVC uses:
hubert_base.pt
./pretrained
./uvr5_weights
To use the v2 version model, which takes a 768-dimensional input from a 12-layer Hubert and adds 3 period discriminators (instead of the 256-dimensional input from a 9-layer Hubert+final_proj), you need to download additional files:
./pretrained_v2
If you are using Windows, you may also need this file (skip it if FFmpeg is already installed):
ffmpeg.exe
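Before starting the Webui, a quick check that the downloaded files are in place can save a failed launch. The sketch below is an illustration only: the helper `missing_assets` is hypothetical, and it assumes the names listed above sit directly under the project root. It does not check for ffmpeg.exe, which is only needed on Windows when FFmpeg is not installed.

```python
from pathlib import Path

# Names taken from the list above; their location under the project root is an assumption.
REQUIRED = ["hubert_base.pt", "pretrained", "uvr5_weights"]
OPTIONAL_V2 = ["pretrained_v2"]

def missing_assets(root, want_v2=False):
    """Return the expected files/folders that do not exist under `root`."""
    names = REQUIRED + (OPTIONAL_V2 if want_v2 else [])
    return [name for name in names if not (Path(root) / name).exists()]

# Check the current directory, including the extra v2 folder.
missing = missing_assets(".", want_v2=True)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All expected files are present.")
```

Run it from the project root; pass `want_v2=False` if you only plan to use v1 models.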
Then use this command to start Webui:
For Windows users, RVC-beta.7z is available for download and extraction to run RVC directly. To launch Webui, use