Human Action Recognition (HAR)

We use state-of-the-art human action recognition models to identify human actions in videos.

Human action recognition is a relatively new and active field of research in deep learning. The goal is to identify human actions in video from various input streams (e.g. visual frames or audio). We have applied human action recognition to the pornography domain, which is interesting from a technical perspective because of its inherent difficulties: lighting variations, occlusions, and a tremendous variety of camera angles and filming techniques (POV, dedicated camera operator) make position (action) recognition hard. Two identical positions (actions) can be captured from such different camera perspectives that the model is entirely confused in its predictions.

We use three different input streams to get the best possible results: RGB frames, human skeleton, and audio. A separate model is trained on each input stream, and their predictions are merged through late fusion. Using these techniques and our massive library of tagged content, we have been able to achieve 90% accuracy. We'll be offering a commercially available API soon!
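To illustrate the late-fusion step, here is a minimal sketch of how per-stream predictions can be combined. It assumes each model outputs a softmax probability vector over the same set of action classes; the class names, stream weights, and example probabilities below are hypothetical, not the production values.

```python
import numpy as np

# Hypothetical action classes shared by all three models.
ACTIONS = ["action_a", "action_b", "action_c"]

def late_fusion(rgb_probs, skeleton_probs, audio_probs,
                weights=(0.5, 0.3, 0.2)):
    """Merge per-stream softmax outputs via a weighted average.

    The weights are illustrative; in practice they would be tuned
    on a validation set.
    """
    stacked = np.stack([rgb_probs, skeleton_probs, audio_probs])
    w = np.asarray(weights)[:, None]
    fused = (stacked * w).sum(axis=0)
    return fused / fused.sum()  # renormalize to a probability distribution

# Example per-stream predictions for a single clip (made-up numbers).
rgb = np.array([0.7, 0.2, 0.1])
skeleton = np.array([0.4, 0.5, 0.1])
audio = np.array([0.3, 0.3, 0.4])

fused = late_fusion(rgb, skeleton, audio)
print(ACTIONS[int(np.argmax(fused))])  # prints "action_a"
```

Because fusion happens on final probabilities rather than intermediate features, each stream's model can be trained, retrained, or replaced independently, which is the usual appeal of late fusion over early (feature-level) fusion.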