ACE-Step Music Generator

Overview

The ACE-Step Music Generator operator integrates the ACE-Step model into TouchDesigner for text-to-music, audio-to-audio, and audio editing workflows. It acts as a client of the SideCar operator, which performs the model loading and inference.

Features

Automatic Repository Cloning: The first time you generate, the operator will automatically prompt you to download and clone the required ACE-Step code repository.
Full ACE-Step Integration: Access all core ACE-Step features, including text-to-music, audio-to-audio, repaint, retake, and extend.
SideCar Architecture: All intensive computation (model loading, inference, dependency management) is handled by the external SideCar process, ensuring TouchDesigner remains responsive.
Real-time Visualization: Includes a real-time audio waveform visualizer for the generated audio.

Requirements

SideCar Environment Setup: The SideCar operator runs in its own Python environment. You are responsible for installing all necessary dependencies for the ACE-Step model within that environment. This includes torch, torchaudio, and all packages listed in the official ACE-Step requirements.txt. This operator does not manage Python packages.
Git: Git must be installed and accessible in your system’s PATH. The operator uses it to clone the ACE-Step repository.
Running SideCar: The SideCar server must be running and connected for this operator to function.
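
The requirements above can be checked before the first generation. The following is a minimal sketch, not part of the operator: find_missing_requirements is a hypothetical helper name, and the module check is only meaningful when it is run with the SideCar environment's Python interpreter, since that is where the ACE-Step dependencies have to be installed.

```python
import importlib.util
import shutil


def find_missing_requirements(modules=("torch", "torchaudio"), executables=("git",)):
    """Return the names of executables not on PATH and modules not importable.

    Hypothetical helper: the module check only sees the interpreter that
    executes it, so run it inside the SideCar Python environment.
    """
    missing = []
    for exe in executables:
        # shutil.which returns None when the executable is not found on PATH.
        if shutil.which(exe) is None:
            missing.append(exe)
    for name in modules:
        # find_spec returns None for a top-level module that is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing
```

An empty list means Git and the listed modules were found. It does not confirm that every package in the ACE-Step requirements.txt is present.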

Input/Output

Input: Text prompts, lyrics, and optional reference audio files.
Output: Generated audio files (WAV format) and real-time audio waveform visualizations.

Parameters

ACE-Step Page

Status (Status) op('acestep').par.Status Str

Displays the current status of the operator.

Default:: -

Active (Active) op('acestep').par.Active Toggle

Indicates if a generation request is currently active.

Default:: Off

Currentaudio (Currentaudio) op('acestep').par.Currentaudio File

Path to the currently loaded audio file. Used by Load Settings.

Default:: "" (Empty String)

Playhead (Playhead) op('acestep').par.Playhead Float

Controls the playback position of the current audio (0.0 to 1.0).

Default:: 0

Autoplay (Autoplay) op('acestep').par.Autoplay Toggle

Automatically plays the audio after generation.

Default:: On

Generate (Generate) op('acestep').par.Generate Pulse

Triggers the music generation process based on current settings.

Default:: None

Core Generation Header

Prompt (Prompt) op('acestep').par.Prompt Str

Descriptive tags, genres, or scene descriptions. Used for text2music, audio2audio, and as a basis for edit/repaint.

Default:: upbeat pop, catchy melody, female singer

Lyrics (Lyrics) op('acestep').par.Lyrics Str

Enter lyrics with structure tags like [verse], [chorus]. Use \n for newlines. Used for text2music, audio2audio, and as a basis for edit/repaint.

Default:: [verse]\nSun is shining bright today\nFeeling happy, come what may
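
Longer lyrics are easier to assemble in a script than to type into the field. This is a small sketch under one assumption: the Lyrics field expects the two-character sequence backslash + n typed literally, as in the default value. build_lyrics is a hypothetical helper, not something the operator provides.

```python
def build_lyrics(sections, newline="\\n"):
    """Join (tag, lines) pairs into a single lyrics string.

    `newline` defaults to a literal backslash followed by n (an assumption
    based on the default value). Pass a real newline character instead if
    the field turns out to expect actual line breaks.
    """
    parts = []
    for tag, lines in sections:
        parts.append(f"[{tag}]")  # structure tag such as [verse] or [chorus]
        parts.extend(lines)
    return newline.join(parts)


lyrics = build_lyrics([
    ("verse", ["Sun is shining bright today", "Feeling happy, come what may"]),
])
```

With the input shown, lyrics reproduces the documented default string.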

Duration (Duration) op('acestep').par.Duration Float

Desired duration of the generated audio in seconds.

Default:: 10

Infersteps (Infersteps) op('acestep').par.Infersteps Int

Number of inference steps. Higher can improve quality but takes longer.

Default:: 60

Manualseed (Manualseed) op('acestep').par.Manualseed Int

Seed for reproducibility. -1 for random. Affects initial generation.

Default:: -1
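
To reproduce a result that was generated with a random seed, one option is to pick the seed yourself and pass it in explicitly. The sketch below illustrates the -1 convention only. How the operator or ACE-Step draws its random seed is not documented here, and the upper bound is an assumption chosen for illustration.

```python
import random


def resolve_seed(manual_seed, rng=None):
    """Return manual_seed unchanged unless it is -1, then draw a random one.

    The range 0..2**32 - 1 is an illustrative assumption, not a documented
    limit of ACE-Step.
    """
    if manual_seed != -1:
        return manual_seed
    rng = rng or random.Random()
    return rng.randint(0, 2**32 - 1)
```

Writing the returned value into Manualseed before generating keeps the run repeatable.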

Guidancescale (Guidancescale) op('acestep').par.Guidancescale Float

Main classifier-free guidance scale. Used if CFG Type is not 'Double Condition'.

Default:: 15

Omegascale (Omegascale) op('acestep').par.Omegascale Float

Omega scale factor for APG guidance type.

Default:: 10

Guidancescaletext (Guidancescaletext) op('acestep').par.Guidancescaletext Float

Guidance scale for text prompt when CFG Type is 'Double Condition'.

Default:: 7.5

Guidancescalelyric (Guidancescalelyric) op('acestep').par.Guidancescalelyric Float

Guidance scale for lyrics when CFG Type is 'Double Condition'.

Default:: 7.5
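
For intuition about what a guidance scale does, the textbook form of classifier-free guidance pushes the prediction away from the unconditional output and toward the conditional one. The sketch below shows that generic formula on plain lists. It is not ACE-Step's implementation: the APG and 'Double Condition' variants mentioned above combine their terms differently and are not reproduced here.

```python
def cfg_combine(uncond, cond, guidance_scale):
    """Textbook classifier-free guidance: uncond + scale * (cond - uncond).

    `uncond` and `cond` stand in for the model outputs without and with
    conditioning; plain lists of floats are used to keep the sketch
    dependency-free.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]
```

A scale of 1 returns the conditional output unchanged, and larger values exaggerate the difference between the two outputs.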

Audio2Audio Mode [ Euler Scheduler Only ] Header

Audio2audioenable (Audio2audioenable) op('acestep').par.Audio2audioenable Toggle

Enable audio-to-audio generation. Uses Prompt & Lyrics as guidance if provided.

Default:: Off

Refaudioinput (Refaudioinput) op('acestep').par.Refaudioinput File

Path to the reference audio file for Audio2Audio mode.

Default:: "" (Empty String)

Refaudiostrength (Refaudiostrength) op('acestep').par.Refaudiostrength Float

Strength of the reference audio influence (0.0 to 1.0).

Default:: 0.6

Output Settings Header

Outputfolder (Outputfolder) op('acestep').par.Outputfolder Folder

Folder to save the generated WAV file. Relative to project or absolute.

Default:: audio_out

Outputfilename (Outputfilename) op('acestep').par.Outputfilename Str

Name of the generated WAV file.

Default:: ace_step_output.wav

Uniquesuffix (Uniquesuffix) op('acestep').par.Uniquesuffix Toggle

When On, appends a timestamp to the filename to prevent overwriting earlier output.

Default:: On
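
The three output parameters combine into one file path. The sketch below shows one plausible way to do that. The timestamp format and the helper name build_output_path are assumptions, and the operator's actual naming may differ.

```python
from datetime import datetime
from pathlib import Path


def build_output_path(project_dir, output_folder, filename, unique_suffix=True, now=None):
    """Resolve the output folder and optionally add a timestamp to the name.

    A relative `output_folder` is taken relative to `project_dir`; an
    absolute one is used as-is. The %Y%m%d_%H%M%S format is illustrative.
    """
    folder = Path(output_folder)
    if not folder.is_absolute():
        folder = Path(project_dir) / folder
    name = Path(filename)
    if unique_suffix:
        stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
        # Insert the stamp between the stem and the extension.
        name = name.with_name(f"{name.stem}_{stamp}{name.suffix}")
    return folder / name
```

With the defaults audio_out and ace_step_output.wav, this yields a path such as audio_out/ace_step_output_<timestamp>.wav inside the project folder.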

Initialize (Initialize) op('acestep').par.Initialize Pulse

Initializes the ACE-Step Model. This parameter is read-only and handled internally.

Default:: None

Unloadmodel (Unloadmodel) op('acestep').par.Unloadmodel Pulse

Releases the model from memory via SideCar.

Default:: None

Loadsettings (Loadsettings) op('acestep').par.Loadsettings Pulse

Load generation parameters from the JSON associated with the Current Audio file.

Default:: None

Edit Page

Editaudio (Editaudio) op('acestep').par.Editaudio Toggle

Master toggle to enable audio editing modes on this page.

Default:: Off

Audio Editing Configuration Header

Srcaudiopath (Srcaudiopath) op('acestep').par.Srcaudiopath File

Path to the source audio file for Edit, Repaint, Retake, Extend tasks.

Default:: "" (Empty String)

Extend / Repaint / Retake Header

Retakeseeds (Retakeseeds) op('acestep').par.Retakeseeds Int

Seed for retake/repaint/extend variations. -1 for random.

Default:: -1

Retakevariance (Retakevariance) op('acestep').par.Retakevariance Float

Amount of variance for retake/repaint (0.0 to 1.0).

Default:: 0

Repaintstart (Repaintstart) op('acestep').par.Repaintstart Float

Start time in seconds for repaint. For extend, negative values pad to the left of the source audio. Use 0 for retake.

Default:: 0

Repaintend (Repaintend) op('acestep').par.Repaintend Float

End time in seconds for repaint. For extend, values beyond the original duration extend to the right. Use the original duration for retake.

Default:: 5

Transitiontime (Transitiontime) op('acestep').par.Transitiontime Float

Duration of the transition/crossfade in seconds for repaint/extend modes. 0 for abrupt change.

Default:: 0
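
The Repaintstart and Repaintend conventions for the three modes can be summarized in a few lines. This is a sketch of the rules as described above. The helper name, the mode strings, and the range validation for repaint are assumptions for illustration.

```python
def edit_range(mode, source_duration, start=0.0, end=0.0, extend_left=0.0, extend_right=0.0):
    """Return (Repaintstart, Repaintend) in seconds for the given mode."""
    if mode == "retake":
        # Retake covers the whole source: 0 to the original duration.
        return 0.0, source_duration
    if mode == "repaint":
        # Assumed sanity check: the repainted region lies inside the source.
        if not 0.0 <= start < end <= source_duration:
            raise ValueError("repaint range must lie within the source audio")
        return start, end
    if mode == "extend":
        # Negative start pads left; an end past the duration extends right.
        return -extend_left, source_duration + extend_right
    raise ValueError(f"unknown mode: {mode}")
```

For a 30-second source, extending by 5 seconds on the left and 10 on the right gives the pair (-5.0, 40.0).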

Edit Audio Content [ Slower ] Header

Editoriginalprompt (Editoriginalprompt) op('acestep').par.Editoriginalprompt Str

The original prompt used to generate the Source Audio. Required for 'Edit Audio Content' mode.

Default:: "" (Empty String)

Editoriginallyrics (Editoriginallyrics) op('acestep').par.Editoriginallyrics Str

The original lyrics used to generate the Source Audio. Required for 'Edit Audio Content' mode.

Default:: "" (Empty String)

Edittargetprompt (Edittargetprompt) op('acestep').par.Edittargetprompt Str

Target prompt for 'Edit Audio Content' mode. If empty, uses main prompt.

Default:: "" (Empty String)

Edittargetlyrics (Edittargetlyrics) op('acestep').par.Edittargetlyrics Str

Target lyrics for 'Edit Audio Content' mode. If empty, uses main lyrics.

Default:: "" (Empty String)

Editnmin (Editnmin) op('acestep').par.Editnmin Float

Min influence for audio editing (0.0 to 1.0).

Default:: 0.65

Editnmax (Editnmax) op('acestep').par.Editnmax Float

Max influence for audio editing (0.0 to 1.0).

Default:: 0.95

Editnavg (Editnavg) op('acestep').par.Editnavg Int

Averaging window size for editing.

Default:: 10

Loadsrccredentials (Loadsrccredentials) op('acestep').par.Loadsrccredentials Pulse

Loads prompt and lyrics from the _input_params.json associated with the Src Audio Path.

Default:: None
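
The same lookup can be done from a script. The sketch below makes two assumptions that the text above does not confirm: the JSON file sits next to the audio file and is named <audio name>_input_params.json, and it stores the values under the keys "prompt" and "lyrics". load_source_credentials is a hypothetical helper.

```python
import json
from pathlib import Path


def load_source_credentials(audio_path):
    """Read prompt and lyrics from the JSON stored beside an audio file.

    Returns None when no matching JSON file exists. File naming and key
    names are assumptions made for this sketch.
    """
    audio = Path(audio_path)
    params_file = audio.with_name(f"{audio.stem}_input_params.json")
    if not params_file.is_file():
        return None
    with params_file.open(encoding="utf-8") as handle:
        params = json.load(handle)
    return {
        "prompt": params.get("prompt", ""),
        "lyrics": params.get("lyrics", ""),
    }
```

The returned values could then be copied into Editoriginalprompt and Editoriginallyrics.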

Advanced Page

Advanced Guidance Control Header

Guidanceinterval (Guidanceinterval) op('acestep').par.Guidanceinterval Float

Guidance interval for CFG.

Default:: 0.98

Guidanceintervaldecay (Guidanceintervaldecay) op('acestep').par.Guidanceintervaldecay Float

Decay rate for guidance interval.

Default:: 1

Minguidancescale (Minguidancescale) op('acestep').par.Minguidancescale Float

Minimum guidance scale.

Default:: 1

ERG Control Header

Usergtag (Usergtag) op('acestep').par.Usergtag Toggle

Enable ERG (Entropy Rectifying Guidance) for prompt/tags.

Default:: Off

Userglyric (Userglyric) op('acestep').par.Userglyric Toggle

Enable ERG for lyrics.

Default:: Off

Usergdiffusion (Usergdiffusion) op('acestep').par.Usergdiffusion Toggle

Enable ERG for diffusion process.

Default:: Off

Other Advanced Parameters Header

Useoss (Useoss) op('acestep').par.Useoss Toggle

Enable Optimal Step Size scheduling. Only effective if Scheduler Type is Euler.

Default:: Off

Osssteps (Osssteps) op('acestep').par.Osssteps Str

Steps for OSS scheduling, comma-separated. Active if 'Use Optimal Step Size' is ON and Scheduler is Euler.

Default:: 50,100,150,200
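
Because Osssteps is a string, a script that reads or validates it has to split it into integers first. A minimal sketch, with parse_oss_steps as a hypothetical helper name:

```python
def parse_oss_steps(text):
    """Turn a comma-separated string such as "50,100,150,200" into ints."""
    steps = []
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate trailing commas and stray spaces
        steps.append(int(token))  # raises ValueError on non-numeric input
    return steps
```

Applied to the default value, this returns the list 50, 100, 150, 200. An empty string yields an empty list.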

Device & Precision Header

Deviceid (Deviceid) op('acestep').par.Deviceid Int

GPU device ID to use (e.g., 0, 1). Requires re-initialize.

Default:: 0

Usebf16 (Usebf16) op('acestep').par.Usebf16 Toggle

Use bfloat16 for faster inference (if supported). Uncheck for macOS or if errors occur. Requires re-initialize.

Default:: On

Torchcompile (Torchcompile) op('acestep').par.Torchcompile Toggle

Optimize model with torch.compile() for faster inference (Not supported on Windows by ACE-Step). Requires re-initialize.

Default:: Off
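
The platform notes for Usebf16 and Torchcompile can be turned into starting values. The sketch below only encodes the two notes above (macOS and Windows). The helper name and the second dictionary key are illustrative, and actual hardware support for bfloat16 is not checked.

```python
import sys


def suggested_device_flags(platform=None):
    """Map a sys.platform string to starting values based on the notes above."""
    platform = platform or sys.platform
    return {
        # bfloat16 is described as something to uncheck on macOS.
        "Usebf16": platform != "darwin",
        # torch.compile() is described as unsupported on Windows. True here
        # only means the option is not ruled out, not that it will work.
        "torchcompile_not_ruled_out": not platform.startswith("win"),
    }
```

Changing either parameter afterwards still requires re-initializing the model, as stated above.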

Model Configuration Header

Modelpath (Modelpath) op('acestep').par.Modelpath Folder

ACE-Step Repository Path. This parameter is read-only and automatically set.

Default:: "" (Empty String)

Checkpointdir (Checkpointdir) op('acestep').par.Checkpointdir Folder

Optional directory for model checkpoints.

Default:: "" (Empty String)

About Page

Bypass (Bypass) op('acestep').par.Bypass Toggle

Bypass the operator's functionality.

Default:: Off

Showbuiltin (Showbuiltin) op('acestep').par.Showbuiltin Toggle

Show built-in TouchDesigner parameters.

Default:: Off

Version (Version) op('acestep').par.Version Str

Version of the operator.

Default:: None

Lastupdated (Lastupdated) op('acestep').par.Lastupdated Str

Date of the last update.

Default:: None

Creator (Creator) op('acestep').par.Creator Str

Creator of the operator.

Default:: None

Website (Website) op('acestep').par.Website Str

Website for more information.

Default:: None

Chattd (Chattd) op('acestep').par.Chattd OP

Reference to the ChatTD operator.

Default:: None

Sidecaroperator (Sidecaroperator) op('acestep').par.Sidecaroperator OP

Reference to the SideCar operator handling requests.

Default:: None

Usage Examples

Quick Start: Generating Music

Set up the SideCar: Ensure the SideCar is running and its Python environment is fully configured with all ACE-Step dependencies.
Press Generate: On the ACE-Step page of the operator’s parameter panel, click the Generate pulse.
Clone the Repo: If this is your first time, a dialog will ask for permission to download the ACE-Step repository. Click Download.
Generate: The request is sent to the SideCar for processing. The generated audio appears in the visualizer and plays automatically when Autoplay is On.
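
The steps above can also be driven from a TouchDesigner script. The following is a minimal sketch, assuming the operator is reachable as op('acestep') as in the parameter list and that SideCar is already running. request_generation is a hypothetical helper. It relies only on the val attribute and pulse() method of TouchDesigner parameters.

```python
def request_generation(acestep_op, prompt, lyrics, duration=10.0, seed=-1):
    """Set the core generation parameters, then pulse Generate.

    `acestep_op` is the ACE-Step operator, e.g. op('acestep') when this
    runs inside TouchDesigner.
    """
    par = acestep_op.par
    par.Prompt.val = prompt
    par.Lyrics.val = lyrics
    par.Duration.val = duration
    par.Manualseed.val = seed  # -1 requests a random seed
    par.Generate.pulse()       # same as clicking the Generate pulse


# Inside TouchDesigner (not runnable elsewhere, since `op` is provided by TD):
# request_generation(op('acestep'), "upbeat pop, catchy melody", "[verse]\\nHello", 20.0)
```

The call returns immediately. The Status and Active parameters indicate when the SideCar has finished.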

Integration Examples

The ACE-Step operator relies on the SideCar operator to offload heavy computation. It also connects to the ChatTD operator, which manages Python environments and asynchronous operations.

Best Practices

Dependency Management: Ensure your SideCar’s Python environment has all necessary ACE-Step dependencies installed. The operator does not manage these.
Git Installation: Have Git installed and in your system’s PATH for automatic repository cloning.
Responsible Use: Be mindful of the ACE-Step model’s disclaimer regarding potential copyright infringement, cultural sensitivity, and harmful content generation. Verify originality and disclose AI involvement.

Troubleshooting

SideCar Not Connected: If generation fails, ensure the SideCar server is running and connected. Check the SideCar Operator parameter on the About page to confirm it’s referencing the correct SideCar instance.
Missing Dependencies: If you encounter errors related to missing Python packages (e.g., torch, librosa), install them manually in your SideCar’s Python environment.
Repository Cloning Issues: If the repository fails to clone, check your internet connection and Git installation. Review the TouchDesigner console for detailed error messages.

Research Citation

The ACE-Step model is a significant contribution to the field of AI music generation. If you use this operator or the underlying model in your research, please consider citing the original work.

Research & Licensing

ACE-Step Project

The ACE-Step project is an open-source initiative focused on advancing AI music generation.

ACE-Step: A Step Towards Music Generation Foundation Model

ACE-Step is a foundation model for music generation designed to overcome limitations of existing approaches by integrating diffusion-based generation with advanced encoding and transformation techniques.

Technical Details

Combines diffusion with DCAE and linear transformer.
Uses MERT and m-hubert for semantic alignment (REPA).
Outperforms LLM-based models in speed and coherence.
Supports various music generation tasks including text-to-music and audio-to-audio.

Research Impact

Overcomes limitations of existing approaches in music generation.
Provides a holistic architectural design for state-of-the-art performance.
Enables original music generation across diverse genres for creative production, education, and entertainment.

Citation

@misc{gong2025acestep,
  title={ACE-Step: A Step Towards Music Generation Foundation Model},
  author={Junmin Gong and Wenxiao Zhao and Sen Wang and Shengyuan Xu and Jing Guo},
  howpublished={\url{https://github.com/ace-step/ACE-Step}},
  year={2025},
  note={GitHub repository}
}

Key Research Contributions

Novel open-source foundation model for music generation.
Integrates diffusion-based generation with Sana's Deep Compression AutoEncoder (DCAE) and a lightweight linear transformer.
Leverages MERT and m-hubert to align semantic representations (REPA) during training for rapid convergence.
Achieves faster synthesis (up to 4 minutes of music in 20 seconds on an A100 GPU) and superior musical coherence compared to LLM-based models.
Preserves fine-grained acoustic details, enabling advanced control mechanisms like voice cloning, lyric editing, remixing, and track generation.

License

Apache License 2.0 - This model is freely available for research and commercial use.

Ace Step Music Generator v2.0.0 [ July 31, 2025 ]
