AI Visual Speech Recognition

Video Lip Reading

Upload a video, select a time range, and get a text transcript from lip movements.

Tips for best results

For the best accuracy, make sure your videos meet the following criteria:

  • The speaker's face should be well lit and clearly visible, so record in good lighting whenever possible.
  • Both frontal and profile views work. For a profile view, at least half of the lips must be visible.
  • Avoid videos where the speaker's mouth is obscured (by masks, hands, or objects).
  • Place the camera as close to the speaker's face as possible while keeping the full face in frame.
Mandatory: only one person's face may be visible in the frame at any time.