YuE AI Music Generator: Create Full Songs from Lyrics and Genre Prompts

Table of Contents
- How YuE AI Music Generator Works?
- YuE AI Music Generator Overview:
- Examples of Songs Generated by YuE AI
- Pop Genre Example
- Rap Genre Example
- Jazz Genre Example
- Heavy Metal Genre Example
- Multilingual Song Example
- How to Use YuE AI Music Generator?
- How to Use YuE AI Locally Using GitHub?
- Step 1: Install Environment and Dependencies
- Step 2: Download Inference Code and Tokenizer
- Step 3: Run Inference to Generate Music
- Advanced: Music In-Context Learning (ICL)
- 1. Dual-Track ICL (Recommended):
- 2. Single-Track ICL:
- Final Thoughts
In this article, I’ll walk you through the YuE AI Music Generator, an open-source tool that can create full songs from just a few inputs. All you need to do is provide the lyrics and specify the genre, and it will generate a complete song for you. Let’s dive into how it works and explore some examples of what it can do.
How YuE AI Music Generator Works?
The YuE AI Music Generator is a tool that allows you to create full songs by simply inputting lyrics and selecting a genre. It’s similar to other tools like Udio or Suno, but what sets it apart is its open-source nature and the fact that it’s completely free to use.
Here’s how it works:
- Input Lyrics: Write or paste the lyrics you want the song to include.
- Specify Genre: Choose a genre or style for the song, such as pop, rap, jazz, or heavy metal.
- Generate Song: The AI will create a full song, complete with vocals and instrumentation, based on your inputs.
YuE AI Music Generator Overview:
Feature | Details |
---|---|
Model Name | YuE AI Music Generator |
Functionality | Generate full songs from lyrics and genre prompts |
Project Page | map-yue.github.io/ |
GitHub Repository | github.com/multimodal-art-projection/YuE |
Hugging Face Model | huggingface.co/m-a-p/YuE-s1-7B-anneal-en-cot |
Input | Lyrics and genre specifications |
Output | Complete musical compositions with vocals and instrumentation |
Examples of Songs Generated by YuE AI
To give you a better idea of what YuE AI can do, let’s look at some examples of songs it has generated across different genres.
Pop Genre Example
For the first example, I specified the genre as "inspiring female uplifting pop, airy vocal, electronic, bright vocal" and input the following lyrics:
"Staring at the sense colors the sky of you, keep swearing can't deny, I know you down and make me St, but I'm here to man the heart I didn't break, every word you take my the behind every dream you're a Chase, I reach you by the life you can't F this, and now I won't back down, you know you can't deny it now."
Here’s what the AI generated:
"Staring at the sense colors the sky of you, keep swearing can't deny, I know you down and make me St, but I'm here to man the heart I didn't break, every word you take my the behind every dream you're a Chase, I reach you by the life you can't F this, and now I won't back down, you know you can't deny it now."
If I generate the song again with the same inputs, here’s another variation:
"Come beat the sky, that's how you keep sweating, can't to high, I know I let you down, I made mistakes, but I'm here to man The, I didn't break, every you take, I'm eyes you."
As you can see, the AI creates unique variations each time, even with the same lyrics and genre.
Rap Genre Example
Next, let’s try a rap song. For this example, I specified the genre as "rap piano, street tough, piercing vocal, hip-hop synthesizer, clear vocal male" and input the following lyrics:
"This is my journey and I'm running this race, he the Cs about a base, said I couldn't do a, said I'd never rise, but now I'm soaring high, reaching for the skies, lessons that I learned made me who I am, standing T now I don't give a damn, echoes in the alley, music need to me, watch is cish, she be so me like me, let USA and SK get all to ca first, bir certain that b fing the from the world, cing the CH do and the witch and on this fool man on my wing and from my near the end there before the chorus."
Here’s what the AI generated:
"This is my journey and I'm running this race, he the Cs about a base, said I couldn't do a, said I'd never rise, but now I'm soaring high, reaching for the skies, lessons that I learned made me who I am, standing T now I don't give a damn, echoes in the alley, music need to me, watch is cish, she be so me like me, let USA and SK get all to ca first, bir certain that b fing the from the world, cing the CH do and the witch and on this fool man on my wing and from my near the end there before the chorus."
One thing to note is that the AI sometimes skips lines or jumps around with the lyrics. However, it’s impressive how it can continue rapping without any guidance, almost like freestyling.
Jazz Genre Example
Now, let’s explore a jazz example. For this, I specified the genre as "female blues, airy vocal, bright vocal, piano, sad romantic guitar, jazz" and input the following lyrics:
"Sh T to fall up the ne, echo through the hall, bre in the silent, I hear gentle voice, guarding me back Homeward, making my heart Rejoice, don't let this moment, they hold me close to with your he beside me, things around can't the Dem, don't want to let you go, stay with me forever."
Here’s what the AI generated:
"Sh T to fall up the ne, echo through the hall, bre in the silent, I hear gentle voice, guarding me back Homeward, making my heart Rejoice, don't let this moment, they hold me close to with your he beside me, things around can't the Dem, don't want to let you go, stay with me forever."
The result is a smooth, jazzy tune with a romantic vibe. It’s a great example of how versatile the YuE AI Music Generator can be.
Heavy Metal Genre Example
For those who prefer something more intense, YuE AI can also handle heavy metal. Here’s an example with the genre specified as "heavy metal" and the following lyrics:
"Step back all night without a fight, no SC get up with the first fight, up P your hands be light, might step back cuz I hold back F going going the on what thing."
Here’s what the AI generated:
"Step back all night without a fight, no SC get up with the first fight, up P your hands be light, might step back cuz I hold back F going going the on what thing."
The AI has no problem handling screaming and other extreme elements of heavy metal. Just a warning: you might want to turn your speakers down for this one!
Multilingual Song Example
YuE AI can also create songs in different languages. For this example, I input lyrics that include Japanese, English, and Korean:
"The only one I know, know the only know, love you, do you want me, and I want to be your number one, come a little closer."
Here’s what the AI generated:
"The only one I know, know the only know, love you, do you want me, and I want to be your number one, come a little closer."
Near the end of the song, the AI even added some autotune synth voice effects, which is a cool touch.
How to Use YuE AI Music Generator?
If you’re interested in trying out the YuE AI Music Generator for yourself, here’s what you need to know:
- Download the Tool: The GitHub repository listed in the overview table above contains all the instructions on how to download and use the tool.
- System Requirements: The tool requires significant GPU memory. You’ll need at least 16 GB of VRAM, though 24 GB is recommended for optimal performance.
- Open-Source License: The tool is released under the Apache 2.0 license, which means you’re free to use it for any purpose, including commercial projects, as long as you follow the license terms.
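Before installing anything, it can help to check your GPU against the VRAM figures above. The following is a minimal sketch, not part of YuE itself: `vram_tier` and `detect_vram_gb` are hypothetical helper names, the thresholds are simply the 16 GB and 24 GB figures quoted in this post, and the PyTorch query only runs if `torch` is already installed with a CUDA device.

```python
from typing import Optional

MIN_VRAM_GB = 16          # minimum quoted in this post
RECOMMENDED_VRAM_GB = 24  # recommended figure quoted in this post


def vram_tier(total_gb: float) -> str:
    """Classify a GPU's memory against the thresholds quoted in this post."""
    if total_gb >= RECOMMENDED_VRAM_GB:
        return "recommended"
    if total_gb >= MIN_VRAM_GB:
        return "minimum"
    return "insufficient"


def detect_vram_gb(device_index: int = 0) -> Optional[float]:
    """Return the total memory of a CUDA device in GiB, or None if unavailable."""
    try:
        import torch  # optional: only present once PyTorch is installed
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    props = torch.cuda.get_device_properties(device_index)
    return props.total_memory / 1024**3


if __name__ == "__main__":
    gb = detect_vram_gb()
    if gb is None:
        print("No CUDA GPU detected (or PyTorch is not installed).")
    else:
        # Cards sold as "24 GB" usually report slightly less, so round first.
        print(f"{gb:.1f} GiB VRAM: {vram_tier(round(gb))}")
```

Treat the result as a rough guide only: actual memory use also depends on settings such as the batch size and the number of segments you generate.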
How to Use YuE AI Locally Using GitHub?
Step 1: Install Environment and Dependencies
- Create a new Python environment using Conda. Open your terminal and run:

      conda create -n yue python=3.8
      conda activate yue

  Note: Python 3.8 or higher is recommended.

- Install PyTorch and CUDA (for GPU acceleration):

      conda install pytorch torchvision torchaudio cudatoolkit=11.8 -c pytorch -c nvidia

- Install the YuE requirements:

      pip install -r <(curl -sSL https://raw.githubusercontent.com/multimodal-art-projection/YuE/main/requirements.txt)

- Save GPU memory by installing FlashAttention 2, which helps reduce VRAM usage:

      pip install flash-attn --no-build-isolation

  Note: Make sure the FlashAttention version matches your CUDA version to avoid issues.
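Once the installation commands in Step 1 have finished, a quick sanity check can confirm that the environment looks right. This is a small sketch under my own assumptions: `python_ok` and `missing_modules` are hypothetical helpers, and the module names `torch`, `torchaudio`, and `flash_attn` are the import names I expect the packages above to provide.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)  # version recommended in Step 1


def python_ok(version=None) -> bool:
    """True if the given (or current) interpreter meets the recommended version."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def missing_modules(names=("torch", "torchaudio", "flash_attn")) -> list:
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Missing modules:", missing_modules() or "none")
```

Run it inside the activated `yue` environment. An empty "missing" list only shows that the packages can be found, not that the FlashAttention build matches your CUDA version.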
Step 2: Download Inference Code and Tokenizer
- Install Git LFS (Large File Storage). Update your system and install Git LFS:

      sudo apt update
      sudo apt install git-lfs
      git lfs install

- Clone the YuE repository and move into the inference directory:

      git clone https://github.com/multimodal-art-projection/YuE.git
      cd YuE/inference/

- Download the tokenizer by cloning the tokenizer model:

      git clone https://huggingface.co/m-a-p/xcodec_mini_infer
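After Step 2, the files that the later commands rely on should all be in place. The sketch below checks for them. The expected paths are inferred from the commands in this post (the tokenizer cloned inside `YuE/inference/`, the example prompts in `YuE/prompt_egs/`), so treat them as assumptions; the repository layout may have changed since. `missing_paths` is a hypothetical helper name.

```python
from pathlib import Path

# Paths relative to the directory where `git clone` was run in Step 2.
# Inferred from the commands in this post, not from the repository itself.
EXPECTED = (
    "YuE/inference/infer.py",
    "YuE/inference/xcodec_mini_infer",
    "YuE/prompt_egs/genre.txt",
    "YuE/prompt_egs/lyrics.txt",
)


def missing_paths(root=".") -> list:
    """Return the expected paths that do not exist under `root`."""
    base = Path(root)
    return [p for p in EXPECTED if not (base / p).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    print("All files found." if not missing else f"Missing: {missing}")
```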
Step 3: Run Inference to Generate Music
Now you’re ready to generate music using YuE. Follow these steps:
- Basic inference. Use the following command to generate music:

      python infer.py \
          --cuda_idx 0 \
          --stage1_model m-a-p/YuE-s1-7B-anneal-en-cot \
          --stage2_model m-a-p/YuE-s2-1B-general \
          --genre_txt ../prompt_egs/genre.txt \
          --lyrics_txt ../prompt_egs/lyrics.txt \
          --run_n_segments 2 \
          --stage2_batch_size 4 \
          --output_dir ../output \
          --max_new_tokens 3000

  Parameters explained:
  - --run_n_segments: Number of lyric sections to generate.
  - --stage2_batch_size: Adjust based on your GPU memory.
  - --output_dir: Folder where the generated music will be saved.

- Customizing prompts: Edit genre.txt and lyrics.txt to customize the music’s genre and lyrics. Refer to the prompt engineering guide for advanced tips.

- Avoid out-of-memory (OOM) errors: If you encounter OOM errors, reduce --stage2_batch_size.
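If you would rather drive the basic inference command from a script, the sketch below assembles the same flags in Python and retries with a smaller --stage2_batch_size when a run fails. The flag names and values are copied from the command in this step. The helper names are hypothetical, and the retry logic assumes that any non-zero exit code might be an OOM error, which is a simplification of my own.

```python
import subprocess


def build_infer_command(stage2_batch_size=4, run_n_segments=2,
                        output_dir="../output", extra_flags=()):
    """Assemble the basic infer.py command from Step 3 as an argument list."""
    return [
        "python", "infer.py",
        "--cuda_idx", "0",
        "--stage1_model", "m-a-p/YuE-s1-7B-anneal-en-cot",
        "--stage2_model", "m-a-p/YuE-s2-1B-general",
        "--genre_txt", "../prompt_egs/genre.txt",
        "--lyrics_txt", "../prompt_egs/lyrics.txt",
        "--run_n_segments", str(run_n_segments),
        "--stage2_batch_size", str(stage2_batch_size),
        "--output_dir", output_dir,
        "--max_new_tokens", "3000",
        *extra_flags,
    ]


def batch_size_schedule(start=4):
    """Halve the batch size repeatedly down to 1, e.g. 4 gives [4, 2, 1]."""
    sizes = []
    size = start
    while size >= 1:
        sizes.append(size)
        size //= 2
    return sizes


def run_with_fallback(start=4):
    """Retry infer.py with smaller batch sizes until one run succeeds.

    Assumption: any non-zero exit is treated as a possible OOM error;
    this sketch does not inspect the error output.
    Returns the batch size that worked, or None if every attempt failed.
    """
    for size in batch_size_schedule(start):
        result = subprocess.run(build_infer_command(stage2_batch_size=size))
        if result.returncode == 0:
            return size
    return None
```

Call `run_with_fallback()` from inside `YuE/inference/`, since the relative paths in the command are resolved from there.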
Advanced: Music In-Context Learning (ICL)
YuE supports generating music based on reference audio tracks. There are two types of ICL:
1. Dual-Track ICL (Recommended):
- Provide separate vocal and instrumental tracks for better results:

      python infer.py \
          --cuda_idx 0 \
          --stage1_model m-a-p/YuE-s1-7B-anneal-en-icl \
          --stage2_model m-a-p/YuE-s2-1B-general \
          --genre_txt ../prompt_egs/genre.txt \
          --lyrics_txt ../prompt_egs/lyrics.txt \
          --run_n_segments 2 \
          --stage2_batch_size 4 \
          --output_dir ../output \
          --max_new_tokens 3000 \
          --use_dual_tracks_prompt \
          --vocal_track_prompt_path ../prompt_egs/pop.00001.Vocals.mp3 \
          --instrumental_track_prompt_path ../prompt_egs/pop.00001.Instrumental.mp3 \
          --prompt_start_time 0 \
          --prompt_end_time 30
2. Single-Track ICL:
- Provide one audio track (mix, vocal, or instrumental):

      python infer.py \
          --cuda_idx 0 \
          --stage1_model m-a-p/YuE-s1-7B-anneal-en-icl \
          --stage2_model m-a-p/YuE-s2-1B-general \
          --genre_txt ../prompt_egs/genre.txt \
          --lyrics_txt ../prompt_egs/lyrics.txt \
          --run_n_segments 2 \
          --stage2_batch_size 4 \
          --output_dir ../output \
          --max_new_tokens 3000 \
          --use_audio_prompt \
          --audio_prompt_path ../prompt_egs/pop.00001.mp3 \
          --prompt_start_time 0 \
          --prompt_end_time 30

- You can use tools like Ultimate Vocal Remover GUI or python-audio-separator to extract vocals and instrumentals.
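The two ICL modes differ only in a handful of flags, so it is easy to mix them up. The sketch below picks the right set based on which reference files you pass in. The flag names and the ICL stage-1 model name come from the commands above; `icl_flags` is a hypothetical helper, and the validation rules (exactly one mode, end time after start time) are my own assumptions rather than checks performed by YuE.

```python
# Stage-1 model used by both ICL commands in this post.
ICL_STAGE1_MODEL = "m-a-p/YuE-s1-7B-anneal-en-icl"


def icl_flags(vocal=None, instrumental=None, mix=None, start=0, end=30):
    """Return the extra infer.py flags for dual-track or single-track ICL."""
    if end <= start:
        raise ValueError("prompt end time must be greater than start time")

    dual = vocal is not None and instrumental is not None
    single = mix is not None

    if dual and not single:
        flags = [
            "--use_dual_tracks_prompt",
            "--vocal_track_prompt_path", vocal,
            "--instrumental_track_prompt_path", instrumental,
        ]
    elif single and vocal is None and instrumental is None:
        flags = ["--use_audio_prompt", "--audio_prompt_path", mix]
    else:
        raise ValueError(
            "pass either both vocal and instrumental, or a single mix track"
        )

    return flags + [
        "--prompt_start_time", str(start),
        "--prompt_end_time", str(end),
    ]


if __name__ == "__main__":
    print(icl_flags(vocal="../prompt_egs/pop.00001.Vocals.mp3",
                    instrumental="../prompt_egs/pop.00001.Instrumental.mp3"))
```

Append the returned list to the rest of the infer.py arguments, and remember to switch --stage1_model to the ICL model shown in `ICL_STAGE1_MODEL`.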
Final Thoughts
While the quality of the songs generated by YuE AI isn’t quite on par with tools like Udio or Suno AI yet, it’s still an impressive tool, especially considering that it’s completely free and open-source. As more users work on optimizing it, I’m sure we’ll see even better results in the future.