March 2, 2018
Background
Machine learning is a field of computer science that gives computer systems the ability to “learn” with data, without being explicitly programmed. Machine learning can be broken down into three broad categories: Recommender, Classification, Clustering.
- Recommender—Recommender systems suggest items based on past behavior or interest. These items can be other users in a social network, or products and services in retail websites. There are some algorithm like Pearson correlation and euclidean distance.
- Classification—Classification (otherwise known as supervised learning) infers or assigns a category to previously unseen data, based on discoveries made from some prior observations about similar data. Examples of classification include email spam filtering and detection of fraudulent credit card transactions.
- Clustering—A clustering system (also known as unsupervised learning) groups data together into clusters. It does so without learning the characteristics about related data. Clustering is useful when you’re trying to discover hidden structures in your data, such as user habits.
Broadly, there are 3 types of Machine Learning Algorithms..
Supervised Learning This algorithm consist of a target / outcome variable (or dependent variable) which is to be predicted from a given set of predictors (independent variables). Using these set of variables, we generate a function that map inputs to desired outputs. The training process continues until the model achieves a desired level of accuracy on the training data. Examples of Supervised Learning: Regression, Decision Tree, Random Forest, KNN, Logistic Regression etc.
Unsupervised Learning In this algorithm, we do not have any target or outcome variable to predict / estimate. It is used for clustering population in different groups, which is widely used for segmenting customers in different groups for specific intervention. Examples of Unsupervised Learning: Apriori algorithm, K-means.
Reinforcement Learning: Using this algorithm, the machine is trained to make specific decisions. It works this way: the machine is exposed to an environment where it trains itself continually using trial and error. This machine learns from past experience and tries to capture the best possible knowledge to make accurate business decisions. Example of Reinforcement Learning: Markov Decision Process
More Details
Below is series of blog covering most populat machine learning algorithm with apache spark.