Helping you become a computer vision expert one term at a time!📖 🤖

March 30, 2021
3 min read
By Trisha

Computer vision started as an MIT undergraduate summer project in 1966 that was supposed to be done and dusted in that one summer. Although this did not pan out as planned, the technology has grown rapidly since then and is now applied in several industries.

Here we take a look at the basic (and sometimes more complex) terms related to this field.

Computer Vision
A branch of tech which makes sense of visual content (images, videos, graphics). All visual content is basically a collection of pixel values; computer vision considers these pixel values and tries to understand what they represent.

Machine Learning
A method in which computers are taught by example.
In traditional programming, the “rules” have to be written; the program then converts inputs to outputs by following these rules. In machine learning, you give the program varied examples of inputs and desired outputs, and the program uses trial and error to write and learn the rules by itself.
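To make the contrast concrete, here is a minimal sketch in Python. It assumes a toy task (deciding whether a photo was taken in daytime from its average brightness) and made-up example data; neither comes from a real system. The first function is a rule written by hand, the second finds its own rule by trial and error over the examples.

```python
# Traditional programming: a person writes the rule.
def is_daytime_handwritten(brightness):
    return brightness > 128  # threshold chosen by the programmer


# Machine learning: the program searches for the rule using examples.
def learn_threshold(examples):
    """Try every threshold from 0 to 255 and keep the one with the fewest mistakes."""
    best_threshold, best_errors = 0, len(examples) + 1
    for threshold in range(256):
        errors = sum((brightness > threshold) != label for brightness, label in examples)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


# Hypothetical training examples: (average brightness, is it daytime?)
examples = [(30, False), (60, False), (90, False), (150, True), (200, True), (240, True)]
learned = learn_threshold(examples)


def is_daytime_learned(brightness):
    return brightness > learned
```

Real models learn millions of values instead of one threshold, but the idea is the same: the rule comes from the examples, not from the programmer.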

Neuron - aka Perceptron
A mathematical function that takes a number of inputs, multiplies each one by its own weight, adds the results together and gives a single output value. The weights, which are the parameters of the network, change over time as the network learns. This output is then fed as an input into other neurons.
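A single neuron can be sketched in a few lines of Python. The bias term and the sigmoid activation used here are common choices that we are assuming for illustration; the input and weight values are made up.

```python
import math


def neuron(inputs, weights, bias=0.0):
    """Weighted sum of the inputs, squashed into the range 0..1 by a sigmoid."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-weighted_sum))


# Hypothetical values: three inputs, three weights, one output.
output = neuron([0.5, 0.2, 0.9], [0.4, -0.6, 0.3])
```

Training a network means nudging the values in `weights` (and `bias`) until the outputs match the desired ones.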

Neural Networks
An arrangement of neurons/perceptrons, usually in layers, that lets the architecture learn the underlying patterns and relationships in a dataset. The working of these neural networks is loosely inspired by the working of neurons in a human brain.
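As a small illustration of why the arrangement matters, the sketch below wires three simple neurons together to compute XOR ("one or the other, but not both"), something no single neuron of this kind can do on its own. The weights are set by hand here purely for demonstration; in a real network they would be learned from data.

```python
def step_neuron(inputs, weights, bias):
    """Fire (1) if the weighted sum plus bias is positive, otherwise stay silent (0)."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0


def xor_network(a, b):
    # Hidden layer: two neurons look at the same inputs in different ways.
    h_or = step_neuron([a, b], [1, 1], -0.5)      # fires if a OR b is on
    h_nand = step_neuron([a, b], [-1, -1], 1.5)   # fires unless a AND b are both on
    # Output layer: fires only if both hidden neurons fired.
    return step_neuron([h_or, h_nand], [1, 1], -1.5)
```

Calling `xor_network(0, 1)` gives 1, while `xor_network(1, 1)` gives 0.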

Dataset
The set of data and the ground truth outputs that are used to train a machine learning model. For instance, in the case of object detection, the set of data would comprise images and the ground truth would be the annotations that you want your model to learn.
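For a rough idea of what this looks like in practice, here is a tiny, entirely hypothetical object detection dataset written as a Python list. The file names, labels and box coordinates are invented, and real datasets usually follow an established annotation format instead of this ad hoc one.

```python
# Each entry pairs an input (the image) with its ground truth (the annotations).
# Boxes are (x, y, width, height) in pixels; all values are made up.
dataset = [
    {
        "image": "street_001.jpg",
        "annotations": [
            {"label": "car", "box": (34, 120, 200, 90)},
            {"label": "person", "box": (260, 100, 40, 110)},
        ],
    },
    {
        "image": "park_002.jpg",
        "annotations": [
            {"label": "dog", "box": (80, 150, 60, 45)},
        ],
    },
]

# Collect every label that the model would be asked to learn.
labels = sorted({a["label"] for item in dataset for a in item["annotations"]})
```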

Machine Learning model
A mathematical model that recognises certain types of patterns. You train a model over a set of data, providing it with an algorithm that it can use to learn from those data.

Did you know?
A neural network is a subclass of machine learning models.

Image / Video keywording
The ability to detect concrete and abstract contents inside an image or video.

Facial recognition
The ability to identify faces in images and videos and provide valuable information about them.

Feature vector
The mathematical representation of the distinctive qualities or features of an object in a piece of data, in the form of a list of numbers. This mathematical representation is used for statistical analysis.
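The sketch below turns a tiny grayscale image into a feature vector of three hand-picked numbers. The choice of features (average brightness, contrast, share of bright pixels) is an assumption made for illustration; modern systems typically let a neural network produce vectors with hundreds of values.

```python
def feature_vector(image):
    """Describe a grayscale image (rows of 0..255 pixel values) with three numbers."""
    pixels = [p for row in image for p in row]
    mean_brightness = sum(pixels) / len(pixels)
    contrast = max(pixels) - min(pixels)
    bright_fraction = sum(p > 127 for p in pixels) / len(pixels)
    return [mean_brightness, contrast, bright_fraction]


# A made-up 2x2 checkerboard image.
checkerboard = [[0, 255], [255, 0]]
vector = feature_vector(checkerboard)
```

Once images are reduced to lists of numbers like this, they can be compared, clustered or fed to a model.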

On-Premise
On-premises software is technology that is installed and run on devices on the premises of the individual or organization using the software, rather than at a remote facility such as a server farm or cloud.

Metrics
Metrics are similar to a student’s mark sheet. They’re used to evaluate the performance of a machine learning system. The most commonly used metrics are accuracy, precision, recall, F1-score and the area under the Receiver Operating Characteristic (ROC) curve, known as AUC.
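Several of these metrics can be computed by simply counting correct and incorrect predictions. The sketch below does this for a yes/no classifier, where 1 means "yes" and 0 means "no"; the example labels and predictions are made up.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = yes, 0 = no)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly said yes
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # said yes, should be no
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # said no, should be yes
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly said no
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical ground truth and model predictions for eight images.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

Here the model gets 6 of 8 images right (accuracy 0.75), and 2 of its 3 "yes" answers are correct (precision of about 0.67).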

Custom Concepts
The Mobius SDK provides nearly 11k keywords, aka concepts, out-of-the-box. However, the user may want to create new and highly specific concepts. Therefore, Mobius Labs provides the ability to train any number of new concepts, so users are not limited to a predefined set.

Shot Detection
An important concept in videos is that of ‘shots’. A shot is a sequence of frames where the semantics (that is, the content) only changes slowly. In order to perform a meaningful analysis of a video, it is highly beneficial to identify so-called ‘video shot boundaries’, or ‘shot boundaries’ for short: the points where one shot ends and the next begins.
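One naive way to find shot boundaries is to compare each frame with the one before it and flag a boundary when the picture changes abruptly. The sketch below does exactly that on made-up frames of four pixels each. It is a simplified illustration and not a description of how the Mobius SDK detects shots; the threshold of 50 is an arbitrary assumption.

```python
def mean_abs_diff(frame_a, frame_b):
    """Average absolute difference between two frames given as flat pixel lists."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def shot_boundaries(frames, threshold=50):
    """Return the indices of frames that start a new shot."""
    return [
        i
        for i in range(1, len(frames))
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold
    ]


# Hypothetical video: two dark frames followed by two bright frames.
frames = [
    [10, 10, 10, 10],
    [12, 11, 10, 9],
    [200, 210, 190, 205],
    [198, 212, 191, 204],
]
boundaries = shot_boundaries(frames)
```

The only large jump is between the second and third frame, so the new shot starts at index 2.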

Video Highlighting
The highlighting feature provides ‘highlight scores’ for video frames, which make it possible to identify the most important/interesting parts of a video. This can be very useful, for example, to create a summary of a video that can be shown when someone is browsing through a video database.
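Once every frame has a score, building a short summary can be as simple as keeping the best-scoring frames and playing them in their original order. The sketch below assumes the scores have already been produced by some model; the numbers and the function name are hypothetical.

```python
def top_highlights(scores, k):
    """Pick the k highest-scoring frame indices and return them in playback order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Made-up highlight scores for six frames.
scores = [0.1, 0.8, 0.3, 0.9, 0.2, 0.7]
summary_frames = top_highlights(scores, 3)
```

With these scores the summary is made of frames 1, 3 and 5.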

Similarity Search
The Similarity Search module of the Mobius Vision SDK allows users to find images that are visually similar to an input image.
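A common way to build this kind of search is to compare feature vectors: images whose vectors point in a similar direction are treated as visually similar. The sketch below uses cosine similarity on made-up three-number vectors. It illustrates the general idea only and does not show the Mobius Vision SDK's actual interface; the file names and values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """1.0 means the vectors point the same way, 0.0 means they are unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, library, top_k=2):
    """Return the names of the top_k library images closest to the query vector."""
    ranked = sorted(
        library,
        key=lambda name: cosine_similarity(query, library[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical feature vectors for three images, plus one for the input image.
library = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "forest.jpg": [0.1, 0.9, 0.1],
    "coast.jpg": [0.8, 0.2, 0.1],
}
query = [1.0, 0.1, 0.0]
results = most_similar(query, library)
```

For this input the two seaside images come back first and the forest image is left out.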

Authors
Trisha Mandal, Content and Communications
Trisha manages all things related to content and communications at Mobius Labs. She recently completed a Master’s degree from Humboldt Universität, Berlin and now spends all her time writing, reading, film photography-ing and taking care of her very moody Westie named Fubi.