VIP VOICES

Voice Creation

When cloning a voice, it’s important to consider what the AI has been trained on: which languages and what type of dataset. In this case, the following are available ...

Style Exaggeration

With the introduction of the newer models, we also added a style exaggeration setting. This setting attempts to amplify the style of the original speaker. It does consume addi ...

Why is my voice monotonous / too chaotic / doesn't sound similar, etc.?

Try changing your voice settings; you'll find them in the "Voice Settings" tab. Each attempt to generate a voice will bring a different result (especially visible at low stability) ...

What audio formats do you support?

We deliver audio only in MP3 format: 44.1 kHz, 16-bit, at 96 kbps.
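As a rough illustration of what the 96 kbps figure means for file sizes, here is a minimal sketch. It assumes a constant bitrate and ignores MP3 headers and metadata tags, so real files may differ slightly; the function name is hypothetical and not part of the platform.

```python
def estimate_mp3_size_bytes(duration_seconds, bitrate_kbps=96):
    """Approximate the size of a constant-bitrate MP3 file.

    Assumption: constant bitrate, no header or tag overhead.
    """
    # kilobits per second -> bits per second -> total bits
    total_bits = duration_seconds * bitrate_kbps * 1000
    # 8 bits per byte
    return int(total_bits / 8)


if __name__ == "__main__":
    # One minute of audio at 96 kbps
    print(estimate_mp3_size_bytes(60))  # 720000 bytes, roughly 0.72 MB
```

By this estimate, each minute of exported audio takes a little under three quarters of a megabyte.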

Similarity

The similarity slider dictates how closely the AI should adhere to the original voice when attempting to replicate it. If the original audio is of poor quality and the similar ...

How do I create a voice?

As we are always proud to say, using our platform is easy for everyone! Here is a quick tutorial on how to create a voice.

How can I add pauses?

We are working on features that will make adding pauses to text easier and more reliable. For now, there are a few ways to introduce pauses into the generated speech. The most r ...

Why does the voice start whispering or changing / audio degradation?

We know that the voices tend to degrade or start whispering during longer audio generations, and our team is working hard to develop the technology to improve this. This issue is m ...

Examples

Audio outputs and their corresponding text prompts. In this part, we’re highlighting what the text to speech AI can do, particularly in expressing a variety of emotions. ...

Models

Multilingual v2   This model has good stability, great language diversity, and fantastic accuracy in cloning voices and accents. Its speed is rather remarkable consider ...

Voice Settings

A guide to using the stability and similarity sliders for tailored voice performances in Voiceover Air. Learn how to strike a balance between emotive and consistent audio outputs. Our u ...

Why are some numbers and words not properly pronounced in the correct language?

Numbers, acronyms, and foreign words sometimes default to English when prompted in a different language. For instance, the number "11" or the word "radio", typed in a Spanish promp ...

Pacing

Based on varying user feedback and test results, it’s been theorised that using a singular long sample for voice cloning has brought more success for some, compared to u ...

How can I force a certain pronunciation of a word or name?

We do not have any integrated solution to force a certain pronunciation. However, we are developing a proper solution and the tools to force and fine-tune pronunciations. But, at t ...

Can I use the same cloned/designed voice across languages?

All created voices are expected to maintain most of their original speech characteristics across all languages, including their original accent.

Stability

The stability slider determines how stable the voice is and the randomness between each generation. Lowering this slider introduces a broader emotional range for the voice. A ...

Pause

There are a few ways to introduce a pause or break and influence the rhythm and cadence of the speaker. The most consistent way is programmatically using the syntax < break ...

Overview

A guide on how to generate voiceovers using your voice in Voiceover Air. Now that you have your voice, it’s time to generate some voiceovers! To convert text to ...

What characters are accepted when generating audio?

Non-textual characters and punctuation such as {, }, <, >, [ and ] will usually result in low-quality speech generated by the model.
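If your script is pasted from another tool, it may help to strip these characters before generating audio. The following is a minimal sketch that removes only the six characters named above; the constant and function names are hypothetical. Note that it would also remove angle brackets used in any intentional markup, such as the pause syntax, so apply it only to plain narration text.

```python
# Characters named above as likely to reduce speech quality.
PROBLEM_CHARACTERS = "{}<>[]"


def strip_problem_characters(text):
    """Remove brackets and braces, then collapse leftover whitespace."""
    # str.maketrans with a third argument maps those characters to None
    table = str.maketrans("", "", PROBLEM_CHARACTERS)
    cleaned = text.translate(table)
    # Collapse runs of whitespace left behind by removed characters
    return " ".join(cleaned.split())


if __name__ == "__main__":
    print(strip_problem_characters("Hello {world} [again] <now>"))
    # Hello world again now
```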

Can I slow down the pace of the voice?

We are working on features that will allow for speed optimization.   

Do you have a list of symbols that have an effect on the output audio?

Unfortunately, we don’t have any such list of symbols.   While the model responds to changes in pronunciation, there isn’t a predefined list of symbols that coul ...

How do you make the voice laugh?

We plan on introducing features that allow emotions such as laughter later in the year.  

Prompting

Effective techniques to guide Voiceover Air AI in adding pauses, conveying emotions, and pacing the speech.

Alternatives

These options are inconsistent and might not always work. We recommend using the syntax above for consistency. One trick that seems to provide the most consistent output - s ...

How does the AI model work?

The AI has been trained on a vast amount of audio. The type of audio varies, but the most prominent is audiobooks. This is the context it understands the best, and it provi ...

Volume drops mid-utterance (stability)

When the voice drops in volume, whispers, or distorts, this is most likely a stability issue. How prevalent this is also depends on the voice used and how wide the dynamic range ...

Speaker Boost

This is another setting that was introduced in the new models. The setting itself is quite self-explanatory: it boosts the similarity to the original speaker. However, ...

How many characters can I use per export?

You have a maximum of 15,000 characters per production, and you can create multiple voices and voice segments with a maximum of 1,500 characters each. The 1,500-character limit is per voice segment, as anything mor ...
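To stay within both limits, a long script can be split into segments before it is pasted into the editor. The sketch below is one possible approach, not a platform feature: it breaks text at sentence boundaries and assumes that every character, including spaces, counts toward the limits. All names in it are hypothetical.

```python
import re

MAX_SEGMENT_CHARS = 1500      # per voice segment, from the limits above
MAX_PRODUCTION_CHARS = 15000  # per production, from the limits above


def split_into_segments(text, max_chars=MAX_SEGMENT_CHARS):
    """Split text into segments no longer than max_chars each."""
    if len(text) > MAX_PRODUCTION_CHARS:
        raise ValueError("text exceeds the per-production character limit")

    # Break after sentence-ending punctuation followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    segments = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit
        while len(sentence) > max_chars:
            if current:
                segments.append(current)
                current = ""
            segments.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip() if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this sentence would overflow, so start a new segment
            segments.append(current)
            current = sentence
    if current:
        segments.append(current)
    return segments


if __name__ == "__main__":
    script = "This is a sentence. " * 200  # about 4,000 characters
    for number, segment in enumerate(split_into_segments(script), start=1):
        print(number, len(segment))
```

Splitting at sentence boundaries keeps each segment a natural unit of speech rather than ending one mid-sentence.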