Generative AI makes developers' lives much easier - but by how much?
I have been learning German for the past year, and one thing I thought would be personally useful was to generate many conversations in German as audio, which would be extremely useful for my learning. The audio I created is linked at the end of this post; here's a rundown of what I learned while doing this.
Generating the conversation text in and of itself had 2 major steps: first the speakers, then the conversation between them.
I used the following prompt to have my LLM generate the "speakers" who will be talking to each other:
For a conversation (that you will write later), only give me some characters for the conversation, there should be
a maximum of 3 female speakers and 4 male speakers in the conversation
The conversation happens in Germany, so try to give German names.
Write down all the speakers in the conversation in the format:
```
---
number of female speakers : <num_female_speakers>
number of male speakers : <num_male_speakers>
<name> : <Male/Female>
<name> : <Male/Female>
<name> : <Male/Female>
....
---
```
My first version of the prompt did not ask for these counts:
For a conversation (that you will write later), only give me some characters for the conversation, there should be
a maximum of 3 female speakers and 4 male speakers in the conversation
write down all the speakers in the conversation in the format
```
---
<name> : <Male/Female>
<name> : <Male/Female>
<name> : <Male/Female>
....
---
```
Unless I explicitly asked the model to write down how many speakers of each gender it would generate before it listed the names and genders, it often produced 4 female speakers even though I had asked for a maximum of 3.
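Since the model does not always respect the limits, it helps to check its reply before moving on. The following is a minimal sketch of such a check, assuming the reply follows the `---` delimited format requested above; `parse_speakers` and `within_limits` are hypothetical helper names, not code from the original project.

```python
import re

# Limits taken from the prompt above.
MAX_FEMALE = 3
MAX_MALE = 4


def parse_speakers(reply: str) -> dict[str, str]:
    """Return {name: gender} from the block between the two '---' lines."""
    parts = reply.split("---")
    if len(parts) < 3:
        raise ValueError("reply has no '---' delimited block")
    speakers = {}
    for line in parts[1].splitlines():
        # Only "<name> : <Male/Female>" lines match; the count lines and
        # the "...." filler are skipped.
        match = re.fullmatch(r"\s*(.+?)\s*:\s*(Male|Female)\s*", line)
        if match:
            speakers[match.group(1)] = match.group(2)
    return speakers


def within_limits(speakers: dict[str, str]) -> bool:
    """Check the parsed speakers against the limits stated in the prompt."""
    genders = list(speakers.values())
    return (genders.count("Female") <= MAX_FEMALE
            and genders.count("Male") <= MAX_MALE)
```

If `within_limits` returns `False`, the simplest option is probably to ask the model again rather than to repair the list by hand.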
I used the following prompt template to create a chat transcript from the list of speakers:
with the following speakers
{speakers_raw}
write a conversation in the format
```
---
[DE] <speaker name> : <dialogue>
[EN] <speaker name> : <dialogue>
[DE] <speaker name> : <dialogue>
[EN] <speaker name> : <dialogue>
[DE] <speaker name> : <dialogue>
[EN] <speaker name> : <dialogue>
...
---
```
Ensure the English translation is always in the directly next line,
and dialogues between two participants have a empty line between them (as shown in the example) where the conversation is first given in german and then English.
Ensure you start and end the main part of the output with 3 minuses (---), as displayed above, which in this case will be the entire conversation.
The conversation should be about '{conversation_theme}'
Ensure the conversation gets into complex themes and narratives, and include a discussions of the problems people face, and what they like about the industry.
{speakers_raw} was replaced with the characters generated in the previous step, and {conversation_theme} with a theme from a list of conversation themes that I asked the LLM to generate.
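The substitution itself can be sketched with Python's `str.format`. This is only an illustration: `PROMPT_TEMPLATE` is an abbreviated stand-in for the full prompt above, and `build_prompt` is a hypothetical helper name.

```python
# Abbreviated stand-in for the full prompt above; only the two
# placeholders matter for this sketch.
PROMPT_TEMPLATE = (
    "with the following speakers\n"
    "{speakers_raw}\n"
    "write a conversation in the format\n"
    "...\n"
    "The conversation should be about '{conversation_theme}'\n"
)


def build_prompt(speakers_raw: str, conversation_theme: str) -> str:
    """Fill both placeholders of the template with concrete values."""
    return PROMPT_TEMPLATE.format(
        speakers_raw=speakers_raw,
        conversation_theme=conversation_theme,
    )
```

Using named placeholders keeps the template readable, and `str.format` raises a `KeyError` if one of them is left unfilled.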
I used Bark's conversational code to generate the audio; you can find it in the bottom part of this notebook: https://github.com/suno-ai/bark/blob/main/notebooks/long_form_generation.ipynb
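Before a transcript in the `[DE]`/`[EN]` format can be handed to a text-to-speech model, the German lines have to be separated from their English translations. The sketch below shows one way to do that, assuming the model followed the requested line format; `german_lines` is a hypothetical helper and is not part of Bark or the linked notebook.

```python
import re

# Matches "[DE] <speaker name> : <dialogue>" and the "[EN]" variant.
LINE_RE = re.compile(r"\[(DE|EN)\]\s*(.+?)\s*:\s*(.*)")


def german_lines(reply: str) -> list[tuple[str, str]]:
    """Return (speaker, dialogue) pairs for the German lines only."""
    result = []
    for line in reply.splitlines():
        match = LINE_RE.fullmatch(line.strip())
        # Delimiters, blank lines, and English lines are skipped.
        if match and match.group(1) == "DE":
            result.append((match.group(2), match.group(3)))
    return result
```

Each pair can then be mapped to a voice per speaker and passed to whichever speech model is used.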
After generating all the audio, I still found that some clips had major issues, ranging from random screams or "tape scratches" to the speaker saying completely unexpected phrases.
Neither the generated text nor the audio was ever 100% reliable, so I needed a way to separate good audio from bad audio. Keeping this in mind from the start, instead of making assumptions, and checking the audio constantly would have saved me a lot of time.
I wasn't able to clean up the audio; however, I found it good enough for my learning purposes. You can find all the generated audio here: https://german-audio-stuff.dreamymagic.art