this post was submitted on 26 Nov 2023

I have a collection of audio files from comedy skits, and I’m looking to train a neural network to autonomously decide when to trigger a “laughing” sound effect. The catch? I want to avoid manually setting cue points for laughter. Instead, I’m aiming for the neural network to determine the right moments to insert laughter, based on the content of the skit.

saintshing@alien.top 1 points 9 months ago

Does this need to work in real time, or does your model have access to the entire sequence, so that it can use context from both before and after the current time point?
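
To make the distinction concrete, here is a rough sketch of how the model input differs between the two settings. It assumes the audio has already been turned into a per-frame feature matrix; the function name `context_window` and its parameters are made up for illustration. With `future=0` the model only sees the past (usable in real time), while `future>0` requires the whole recording up front.

```python
import numpy as np

def context_window(features, t, past, future=0):
    """Return the frames [t - past, t + future] around frame t.

    features: array of shape (n_frames, n_dims), one row per audio frame.
    future=0 gives a causal (real-time) window; future>0 looks ahead.
    Positions outside the recording are zero-padded.
    """
    n, d = features.shape
    out = np.zeros((past + future + 1, d), dtype=features.dtype)
    lo, hi = t - past, t + future + 1
    src_lo, src_hi = max(lo, 0), min(hi, n)
    # Copy the part of the window that actually exists in the recording.
    out[src_lo - lo:src_hi - lo] = features[src_lo:src_hi]
    return out
```

An offline model can usually afford the look-ahead, which probably helps, since the audience reaction depends on the punchline having finished.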

You have to be careful about leakage when you preprocess the training data: if you remove the laughter and leave a silent interval in its place, the model can learn to detect the silence instead of the joke.
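
One way to reduce that kind of leakage, sketched under some assumptions: cut the laughter out entirely and splice the remaining audio together, then label the frame where the cut happened. This assumes you already have laughter timestamps for the training recordings (from existing laugh tracks or annotations); `excise_laughter` and its arguments are hypothetical names, not from any library.

```python
import numpy as np

def excise_laughter(audio, sr, laugh_intervals, frame_len):
    """Remove laughter from a waveform and build per-frame labels.

    audio: 1-D array of samples; sr: sample rate in Hz.
    laugh_intervals: sorted, non-overlapping (start_sec, end_sec) pairs.
    frame_len: frame size in samples.
    Returns (clean_audio, labels), where labels[i] == 1 if a laugh
    originally started inside frame i of the cleaned audio.
    """
    pieces, onsets = [], []
    cursor = 0   # read position in the original audio
    kept = 0     # number of samples kept so far (cleaned timeline)
    for start_s, end_s in laugh_intervals:
        start, end = int(round(start_s * sr)), int(round(end_s * sr))
        pieces.append(audio[cursor:start])
        kept += start - cursor
        onsets.append(kept)   # laugh onset, expressed in cleaned time
        cursor = end          # skip the laughter, leave no silent gap
    pieces.append(audio[cursor:])
    clean = np.concatenate(pieces)
    n_frames = max(1, int(np.ceil(len(clean) / frame_len)))
    labels = np.zeros(n_frames, dtype=np.int64)
    for o in onsets:
        labels[min(o // frame_len, n_frames - 1)] = 1
    return clean, labels
```

A hard splice can leave its own discontinuity at the cut point, which is another cue the model might pick up on, so it is probably worth checking what the model fires on before trusting the results.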

The text-based approach may work, but it may not give you precise timing unless the transcript carries word-level timestamps.