Speaker diarisation explained: how AI knows who said what

What is speaker diarisation?

Speaker diarisation is the process by which an AI model analyses audio to work out who is speaking, not just what is being said. In a transcript it looks like this:

Speaker A: Good morning, how are you?
Speaker B: I'm well, thank you. And you?

It sounds simple, but it is one of the hardest tasks in speech technology.

How does it work?

The model listens for voice characteristics such as:

Pitch
Speaking pace
Timbre (tone colour)
Breathing patterns

It then splits the audio into segments and groups together the segments that appear to belong to the same voice. The model does not recognise names or identities; it only knows that "this voice is different from that voice".
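To make the grouping step concrete, here is a minimal sketch in Python. It is not our production model: it assumes each audio segment has already been reduced to a small list of numbers describing the voice (the three-number vectors below are invented for illustration), and it uses a simple greedy rule with a hypothetical similarity threshold of 0.9. Real systems use learned voice embeddings with many more dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_segments(embeddings, threshold=0.9):
    """Greedy clustering: attach each segment to the most similar known voice,
    or open a new voice when nothing is similar enough.

    `threshold` is an illustrative value, not a recommended setting.
    """
    centroids = []  # running mean vector per voice
    counts = []     # number of segments assigned to each voice
    labels = []
    for vec in embeddings:
        best, best_sim = None, threshold
        for i, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            # No known voice is close enough: treat this as a new speaker.
            centroids.append(list(vec))
            counts.append(1)
            best = len(centroids) - 1
        else:
            # Fold the segment into the running mean of the matched voice.
            n = counts[best]
            centroids[best] = [(c * n + x) / (n + 1) for c, x in zip(centroids[best], vec)]
            counts[best] = n + 1
        labels.append("Speaker " + chr(ord("A") + best))
    return labels


# Invented feature vectors for four consecutive segments.
segments = [
    [0.90, 0.10, 0.20],
    [0.10, 0.80, 0.70],
    [0.88, 0.12, 0.25],
    [0.15, 0.82, 0.65],
]
labels = cluster_segments(segments)
print(labels)  # ['Speaker A', 'Speaker B', 'Speaker A', 'Speaker B']
```

Note that the labels are anonymous: the sketch only decides that the first and third segments sound alike, never who is actually speaking.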

When does it work well?

Under ideal conditions, our diarisation reaches over 95% accuracy. Those conditions are:

2 to 4 speakers
Clearly distinct voices (for example male and female)
Good recording quality without background noise
Speakers do not switch too quickly (no interruptions)
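If you want to check accuracy on your own test recording, one simple way to score it is sketched below. This is an assumption about how to measure, not a description of how the figure above was obtained: it compares a hand-made reference with the model's output in equal time slices and, because the model's labels are anonymous, tries every way of matching model labels to real speakers and keeps the best one. The names `anna` and `ben` are made up.

```python
from itertools import permutations


def diarisation_accuracy(reference, predicted):
    """Fraction of time slices attributed to the right speaker under the
    best possible mapping from predicted labels to reference speakers.

    Both arguments are lists with one speaker label per time slice.
    Overlapping speech is not modelled in this sketch.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must cover the same slices")
    if not reference:
        return 0.0
    pred_labels = sorted(set(predicted))
    targets = sorted(set(reference))
    # If the model found more voices than really exist, the extra labels
    # map to None and therefore never count as correct.
    targets += [None] * max(0, len(pred_labels) - len(targets))
    best = 0
    for perm in permutations(targets, len(pred_labels)):
        mapping = dict(zip(pred_labels, perm))
        correct = sum(mapping[p] == r for r, p in zip(reference, predicted))
        best = max(best, correct)
    return best / len(reference)


# Ten one-second slices: the model mislabels the final slice.
reference = ["anna"] * 5 + ["ben"] * 5
predicted = ["B"] * 5 + ["A"] * 4 + ["B"]
score = diarisation_accuracy(reference, predicted)
print(score)  # 0.9
```

Trying every mapping is only practical for a handful of speakers, which is fine for a quick check on a short test recording.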

When does it fail?

Difficult situations include:

Two people talking over each other
Poor recording quality (distant microphone, noise)
Speakers with very similar voices
Many speakers (things already get tricky with more than 5)

In those cases the AI may swap speakers or split them incorrectly. That is exactly why we always provide an edit mode where you can correct speaker labels.
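A label correction is conceptually simple, as the sketch below shows. It is an illustration only: the `Segment` structure and the `swap_speakers` helper are hypothetical and do not describe how our edit mode is built. It covers the common case where two speakers are swapped from a certain point in the recording onwards.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Segment:
    start: float   # start time in seconds
    speaker: str   # anonymous label such as "Speaker A"
    text: str


def swap_speakers(segments, a, b, from_time=0.0):
    """Return a new transcript in which labels a and b are exchanged for
    every segment starting at or after from_time. The input is not modified."""
    swap = {a: b, b: a}
    return [
        replace(s, speaker=swap.get(s.speaker, s.speaker)) if s.start >= from_time else s
        for s in segments
    ]


transcript = [
    Segment(0.0, "Speaker A", "Good morning, how are you?"),
    Segment(3.2, "Speaker B", "I'm well, thank you. And you?"),
    Segment(6.5, "Speaker B", "Fine, thanks."),    # actually Speaker A
    Segment(8.0, "Speaker A", "Shall we start?"),  # actually Speaker B
]

# The model swapped the two voices from 6.5 seconds onwards.
fixed = swap_speakers(transcript, "Speaker A", "Speaker B", from_time=6.5)
print([s.speaker for s in fixed])
# ['Speaker A', 'Speaker B', 'Speaker A', 'Speaker B']
```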

Tips for better results

Use a good microphone: preferably a lavalier or headset per speaker
Avoid cross-talk: ask people to let each other finish
For online meetings: use the "per-speaker recording" option in Zoom or Teams when available; we combine the separate recordings automatically
Test first: for important work, do a short test recording to check the quality
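Separate recordings per speaker help because the question of who is speaking is then answered by the recording itself. As a rough illustration of what combining them can mean, the sketch below interleaves per-speaker transcripts by start time. The data layout and the `merge_tracks` function are hypothetical, not our actual pipeline.

```python
import heapq


def merge_tracks(tracks):
    """Interleave per-speaker transcripts into one conversation.

    `tracks` maps a speaker name to a list of (start_seconds, text) pairs,
    each list already sorted by start time.
    Returns a list of (start_seconds, speaker, text) sorted by start time.
    """
    labelled = [
        [(start, speaker, text) for start, text in segments]
        for speaker, segments in tracks.items()
    ]
    return list(heapq.merge(*labelled))


tracks = {
    "Speaker A": [(0.0, "Good morning, how are you?"), (4.1, "Fine, thanks.")],
    "Speaker B": [(2.0, "I'm well, thank you. And you?")],
}
merged = merge_tracks(tracks)
for start, speaker, text in merged:
    print(f"{start:5.1f}  {speaker}: {text}")
```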

New in ForgetLess: speaker colour coding

Starting this week you will see coloured speaker labels in our transcript view. Each speaker gets their own colour, so you can grasp the structure of the conversation at a glance. Especially handy for focus groups or panel discussions.

Give it a try on your next transcript!