# Ensemble Modeling – Bagging

Introduction: Ensemble learning is a machine learning paradigm in which multiple models (often called “weak learners”) are trained to solve the same problem and then combined to get better results. The three most common types of ensembles are bagging, boosting, and stacking. In this post we will start with bagging, and then move on to boosting and stacking in separate … Continue reading Ensemble Modeling – Bagging
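To make the idea of combining weak learners concrete, here is a minimal sketch of bagging in plain Python. It is not taken from the post: the threshold "stump" used as the weak learner, the toy data, and the names `fit_stump`, `bagging_fit`, and `bagging_predict` are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def fit_stump(points):
    """Weak learner (assumed for illustration): a one-feature threshold rule.

    points is a list of (x, label) pairs with labels 0 or 1. Every observed x
    is tried as a threshold, in both orientations, and the rule with the
    fewest training errors is kept.
    """
    best = None
    for threshold, _ in points:
        for above in (0, 1):
            errors = sum(
                (above if x >= threshold else 1 - above) != y for x, y in points
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, above)
    _, threshold, above = best
    return lambda x: above if x >= threshold else 1 - above


def bagging_fit(points, n_models=25, seed=0):
    """Train each weak learner on its own bootstrap sample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(points) examples with replacement.
        sample = [rng.choice(points) for _ in points]
        models.append(fit_stump(sample))
    return models


def bagging_predict(models, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: small x values are class 0, large ones class 1.
data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0),
        (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]
ensemble = bagging_fit(data)
print(bagging_predict(ensemble, 0.0), bagging_predict(ensemble, 5.0))
```

Each stump sees a slightly different resampled dataset, so individual stumps may place the threshold differently; the vote smooths out those differences.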

# Gradient Descent from Scratch

Gradient Descent is one of the most fundamental optimization techniques used in Machine Learning. But what is a gradient? What do we descend, and what do we even optimize in the first place? Those might be some of the questions that come to mind when first encountering Gradient Descent. Let’s … Continue reading Gradient Descent from Scratch
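As a rough answer to "what do we descend", the sketch below minimizes a one-dimensional function by repeatedly stepping against its gradient. The function f(x) = (x - 3)^2, the learning rate, the step count, and the name `gradient_descent` are assumptions picked for illustration, not the post's own code.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly move against the gradient to reduce the function value."""
    x = start
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


# Assumed example objective: f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
# Its minimum is at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(minimum)
```

With this learning rate, the distance to the minimum shrinks by a constant factor on every step, so the result ends up very close to 3.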

# What are Support Vector Machines?

Support vector machines are a type of machine learning classifier, arguably one of the most popular kinds. They are especially useful for numerical prediction, classification, and pattern recognition tasks. Support vector machines operate by drawing decision boundaries between data points, aiming for the decision boundary that best separates the data points … Continue reading What are Support Vector Machines?

# A short introduction to Time Series

Are you trying to predict time series but don’t know where to start? This blog post will provide a comparison of the most prominent techniques and show you how to implement them. Business Problem: Time series prediction can be used in a number of business areas. You can think of a number of areas and … Continue reading A short introduction to Time Series

# Interview Question: What Machine Learning Metric to Use

As part of our interview cycle, candidates work with some data and build a simple model. After we talk through the modeling and data work, I ask them to come up with a business case for the model. Once they have done so, I follow up with: How would you measure the success of this … Continue reading Interview Question: What Machine Learning Metric to Use

# Simple Guide to the confusion matrix

A confusion matrix is a table that is often used to describe the performance of a classification model (or “classifier”) on a set of test data for which the true values are known. The confusion matrix itself is relatively simple to understand, but the related terminology can be confusing. Confusion matrix: A classification problem can be evaluated … Continue reading Simple Guide to the confusion matrix
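For a binary classifier, the table boils down to four counts. The following is a small sketch in plain Python; the labels, the predictions, and the function name `confusion_matrix` are made up for illustration and do not come from the post.

```python
def confusion_matrix(y_true, y_pred):
    """Count the four outcomes of a binary classifier (labels 0 and 1)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, predicted in zip(y_true, y_pred):
        if truth == 1 and predicted == 1:
            counts["tp"] += 1  # true positive
        elif truth == 0 and predicted == 1:
            counts["fp"] += 1  # false positive
        elif truth == 1 and predicted == 0:
            counts["fn"] += 1  # false negative
        else:
            counts["tn"] += 1  # true negative
    return counts


# Hypothetical test labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
matrix = confusion_matrix(y_true, y_pred)
print(matrix)  # → {'tp': 3, 'fp': 1, 'fn': 1, 'tn': 3}
```

Metrics such as accuracy, precision, and recall can all be derived from these four counts.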

# Overfitting in Machine Learning

In this guide, we’ll walk you through exactly what overfitting means, how to spot it in your models, and what to do if your model is overfitting. By the end, you’ll know how to deal with this tricky problem once and for all. Table of Contents: Examples of Overfitting, Signal vs. Noise, Goodness of fit … Continue reading Overfitting in Machine Learning

# Data Imputation Techniques in Machine Learning

Have you come across the problem of handling missing values for features in machine learning (ML) models at prediction time? This is different from handling missing feature data during the training/testing phase of ML models. Data scientists are expected to come up with an appropriate strategy for handling missing data during both the model training/testing phase and model prediction time … Continue reading Data Imputation Techniques in Machine Learning

# Difference between classification and association algorithms

The term data mining refers loosely to finding relevant information or discovering knowledge from large volumes of data. Like knowledge discovery in artificial intelligence, data mining attempts to discover statistical rules and patterns automatically from data. Knowledge discovered from a database can be represented by a set of rules. The following is an example … Continue reading Difference between classification and association algorithms

# Federated Learning

Introduction: Federated Learning (FL) is a distributed machine learning approach which enables training on a large corpus of decentralised data residing on devices like mobile phones. FL is one instance of the more general approach of “bringing the code to the data, instead of the data to the code” and addresses the fundamental problems of … Continue reading Federated Learning