MusicLM: Turn Ideas Into Music

[vc_row][vc_column][vc_headings linewidth=”0″ borderwidth=”1″ borderclr=”#000000″ title=”MusicLM” google_fonts=”font_family:Comfortaa%3A300%2Cregular%2C700|font_style:700%20bold%20regular%3A700%3Anormal” titlesize=”60″ titleclr=”#000000″]

Turn Ideas Into Music

[/vc_headings][vc_single_image image=”3056″ alignment=”center” onclick=”custom_link” img_link_target=”_blank”][vc_column_text]MusicLM is a model that generates music from text descriptions, such as “a calming guitar riff in 6/8 time”. Developed by Google Research, it produces high-fidelity music at 24 kHz that remains consistent over several minutes. MusicLM treats conditional music generation as a hierarchical sequence-to-sequence modeling task, which lets it capture both the structure and the style of music. It can also be conditioned on both text and a melody: given a whistled or hummed tune, it transforms the melody to match the style described in the text caption. MusicLM is an experimental tool that aims to support the creative process of musicians and music enthusiasts.[/vc_column_text][vc_separator][/vc_column][/vc_row][vc_row][vc_column width=”1/3″][vc_btn title=”Visit Project” color=”warning” align=”center” i_align=”right” i_icon_fontawesome=”fas fa-external-link-alt” add_icon=”true” link=”url:https%3A%2F%2Fgoogle-research.github.io%2Fseanet%2Fmusiclm%2Fexamples%2F”][/vc_column][vc_column width=”1/3″][vc_btn title=”Project Paper” color=”mulled-wine” align=”center” i_align=”right” i_icon_fontawesome=”fas fa-external-link-alt” add_icon=”true” link=”url:http%3A%2F%2Farxiv.org%2Fabs%2F2301.11325″][/vc_column][vc_column width=”1/3″][vc_btn title=”Project Dataset” color=”juicy-pink” align=”center” i_align=”right” i_icon_fontawesome=”fas fa-external-link-alt” add_icon=”true” link=”url:https%3A%2F%2Fwww.kaggle.com%2Fdatasets%2Fgoogleai%2Fmusiccaps”][/vc_column][/vc_row][vc_row][vc_column][vc_headings style=”theme4″ borderclr=”#000000″ style2=”image” title=”ScreenShots” google_fonts=”font_family:Comfortaa%3A300%2Cregular%2C700|font_style:700%20bold%20regular%3A700%3Anormal” lineheight=”2″ titlesize=”40″ titleclr=”#000000″ image_id=”2996″][/vc_headings][vc_separator][vc_media_grid element_width=”6″ gap=”30″ item=”masonryMedia_Default” grid_id=”vc_gid:1685401325611-2c12a536ea227d15a078683f5018b06c-3″ 
include=”4000,4001″][/vc_column][/vc_row][vc_row][vc_column][vc_separator border_width=”2″][vc_headings style=”theme4″ borderclr=”#000000″ style2=”image” title=”Features” google_fonts=”font_family:Comfortaa%3A300%2Cregular%2C700|font_style:700%20bold%20regular%3A700%3Anormal” lineheight=”3″ titlesize=”40″ titleclr=”#000000″ image_id=”2871″][/vc_headings][vc_separator color=”sandy_brown” border_width=”3″][vc_column_text]✅ Audio Generation From Rich Captions

✅ Long Generation

✅ Story Mode

✅ Text and Melody Conditioning

✅ Painting Caption Conditioning

✅ 10s Audio Generation From Text

✅ Generation Diversity[/vc_column_text][vc_separator][/vc_column][/vc_row]