IRPresswire

NEWS PROVIDED BY

Im A Stock Trader

August 15, 2025

Mureka V7.5 Goes Live: Elevating AI Music Creation to New Heights

Singapore, Aug. 15, 2025 — The SkyWork AI Technology Release Week officially kicked off on August 11. From August 11 to August 15, SkyWork launched one new model each day for five consecutive days, introducing cutting-edge models that advance core multimodal AI scenarios.

During the week SkyWork released SkyReels-A3, Matrix-Game 2.0, Matrix-3D, Skywork UniPic 2.0, and Skywork Deep Research Agent. On August 15 the company launched Mureka V7.5, marking the conclusion of the release week.

Mureka V7.5 makes major gains in interpreting and generating Chinese songs. The model shows significant improvements in vocal timbre, instrumental technique, lyric articulation and emotional expression.

Building on a strong understanding of Chinese musical styles — from traditional folk and opera to classic Mandopop and contemporary folk — Mureka’s comprehension module captures the cultural nuances and artistic subtleties required to reproduce authentic performances.

To produce more authentic and emotionally expressive AI vocals, SkyWork significantly enhanced its automatic speech recognition (ASR) technology and adapted it for musical characteristics. The ASR now analyzes vocal performances at a granular level, going beyond basic lyric recognition to examine breath control, emotional dynamics, articulation, phrase structure and natural breathing points. Combined with precise section recognition, this yields synthesized vocals with much greater structural coherence and perceptual realism.

High-resolution vocal data captured by the ASR is fed back into the generative model, improving naturalness, breath realism and emotional expressiveness while reducing mechanical artifacts. This feedback loop enables AI-generated songs to achieve near-human fluidity, especially when reproducing the rhythmic phrasing and breath control typical of Chinese vocal styles.

SkyWork says the combination of culturally informed comprehension and song-optimized ASR is its competitive advantage in Chinese music generation.

Mureka V7.5 not only understands melodic and rhythmic production requirements but can also interpret and replicate the nuanced emotions and artistic expressions of different cultural contexts, providing a strong technical foundation for culturally authentic, aesthetically compelling music production.

For voice synthesis, SkyWork has introduced MoE-TTS, a Mixture-of-Experts-based character-descriptive text-to-speech framework designed for out-of-domain descriptions. MoE-TTS enables precise control of vocal characteristics via natural-language inputs (for example, “a crystal-clear youthful voice with magnetic vocal fry”). Using only open-source training data, the framework achieves character consistency comparable to or better than proprietary commercial systems.

Descriptive TTS has strong potential for virtual assistants, audio content creation and digital humans, but research has been held back by scarce description datasets and limited generalization to open-domain semantics. These limitations often cause mismatched vocal outputs when models are given figurative or novel descriptive language.

MoE-TTS addresses this by integrating a pre-trained textual large language model (LLM) with specialized speech expert modules. The transformer-based architecture features modality routing that decouples text and voice pathways so each can be optimized independently. By keeping text parameters frozen and aligning modalities efficiently, the framework delivers robust cross-modal generalization without degrading the LLM’s textual knowledge.
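The idea of modality routing with frozen text parameters can be illustrated with a toy sketch. This is not SkyWork's implementation: the class names (`Expert`, `ModalityRoutedLayer`), the elementwise scale-and-shift "experts" and the squared-error update are all hypothetical simplifications chosen only to show that tokens are dispatched by modality and that training updates touch the speech pathway alone.

```python
from dataclasses import dataclass
from typing import Dict, List

Vector = List[float]


@dataclass
class Expert:
    """Hypothetical per-modality expert: an elementwise scale-and-shift layer."""
    scale: Vector
    bias: Vector
    trainable: bool = True

    def forward(self, x: Vector) -> Vector:
        return [s * v + b for s, v, b in zip(self.scale, x, self.bias)]

    def sgd_step(self, x: Vector, target: Vector, lr: float = 0.1) -> None:
        """One gradient step on squared error; skipped entirely when frozen."""
        if not self.trainable:
            return
        y = self.forward(x)
        for i, (yi, ti, xi) in enumerate(zip(y, target, x)):
            err = 2.0 * (yi - ti)          # d(loss)/d(y_i)
            self.scale[i] -= lr * err * xi  # d(y_i)/d(scale_i) = x_i
            self.bias[i] -= lr * err        # d(y_i)/d(bias_i) = 1


class ModalityRoutedLayer:
    """Routes each token to the expert that matches its modality tag."""

    def __init__(self, text_expert: Expert, speech_expert: Expert) -> None:
        self.experts: Dict[str, Expert] = {
            "text": text_expert,
            "speech": speech_expert,
        }

    def forward(self, tokens: List[Vector], modalities: List[str]) -> List[Vector]:
        return [self.experts[m].forward(x) for x, m in zip(tokens, modalities)]

    def train_step(self, tokens: List[Vector], modalities: List[str],
                   targets: List[Vector], lr: float = 0.1) -> None:
        for x, m, t in zip(tokens, modalities, targets):
            self.experts[m].sgd_step(x, t, lr)


# The text expert stands in for the pre-trained LLM weights and is frozen.
text_expert = Expert(scale=[1.0, 1.0], bias=[0.0, 0.0], trainable=False)
speech_expert = Expert(scale=[1.0, 1.0], bias=[0.0, 0.0])
layer = ModalityRoutedLayer(text_expert, speech_expert)

tokens = [[1.0, 2.0], [3.0, 4.0]]
modalities = ["text", "speech"]
targets = [[0.0, 0.0], [0.0, 0.0]]

outputs_before = layer.forward(tokens, modalities)
layer.train_step(tokens, modalities, targets)
outputs_after = layer.forward(tokens, modalities)
```

After the training step, the output for the text token is unchanged while the output for the speech token has moved toward its target, which is the property the frozen-text design is meant to preserve.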

In benchmark evaluations across in-domain and out-of-domain description sets, MoE-TTS outperformed leading proprietary TTS models on several acoustic-control metrics, notably Stylistic Expressiveness Alignment (SEA) and Overall Alignment (OA). These gains mean that synthesized speech matches complex linguistic descriptions more closely.

SkyWork presents MoE-TTS as the first reproducible out-of-domain descriptive TTS solution for the research community and highlights the effectiveness of modality-decoupled architectures with frozen knowledge transfer. The company expects this approach to accelerate the shift from closed-label control to natural-language free-form control across digital humans, virtual assistants and immersive content creation.

MoE-TTS is under active development and will be integrated into the Mureka-Speech platform as a foundational model for character voice synthesis, providing developers and creators with open, efficient and customizable descriptive speech capabilities.

Experience the all-new V7.5 Model — Unlock infinite possibilities in music creation!

Try it now: www.mureka.ai

Source: Skywork AI Pte. Ltd.