From Encodings to Embeddings





In this article, we will talk about two fundamental concepts in data representation and machine learning: encoding and embedding. Part of the content comes from one of my lectures in the CS246 Mining Massive Datasets (MMDS) course at Stanford University. I hope you find it useful.

Introduction
All Machine Learning (ML) methods work with input feature vectors, and almost all of them require the input features to be numerical. From an ML perspective, there are four types of features:

Numerical (continuous or discrete): numerical data can be either continuous or discrete. Continuous data can take any value within a range, whereas discrete data takes only distinct, separate values. An example of a continuous numerical variable is `height`, and an example of a discrete numerical variable is `age`.
