Here I blog on all topics related to Big Data and Data Science. Articles may be Executive Summaries, Tool Analyses, Tool Comparisons, Architectural Patterns, Introductions to complex topics, or 'How to' tutorials that share code snippets.

Decision Trees through an Example
We have so far seen what decision trees are, why we need them, which measures help in creating a decision tree and how...
Sai Geetha M N
Nov 7, 2022 · 5 min read
321 views
0 comments

Decision Trees - Feature Selection for a Split
In the previous two articles "Decision Trees- How to decide the split?" and "Decision Trees - Homogeneity Measures", I have laid the...
Sai Geetha M N
Sep 17, 2022 · 4 min read
963 views
1 comment

Decision Trees - Homogeneity Measures
Having had an introduction to what homogeneity is and to the 3 basic types of measures that can be used in the previous article on...
Sai Geetha M N
Sep 4, 2022 · 4 min read
1,049 views
0 comments

Decision Trees - How to decide the split?
In the introduction to Decision trees, we have seen that the whole process is to keep splitting one node into two based on certain...
Sai Geetha M N
Aug 16, 2022 · 3 min read
140 views
0 comments

Why Decision Trees?
As we saw in the last article introducing Decision Trees, decision trees can be used for classification or regression. But the same can...
Sai Geetha M N
Jul 30, 2022 · 2 min read
116 views
0 comments

Decision Trees - An Introduction
Decision trees are an algorithm class that forms the foundation for Random Forests, a class of algorithms that is extensively used in...
Sai Geetha M N
Jul 23, 2022 · 4 min read
279 views
0 comments

Hierarchical Clustering Through an Example
I have taken a problem statement of an NGO wanting to find the top 5-10 countries from a list of 169 who are in dire need of aid, in the...
Sai Geetha M N
Jan 23, 2022 · 4 min read
719 views
0 comments

Hierarchical Clustering - Types of Linkages
We have seen in the previous post about Hierarchical Clustering, when it is used and why. We glossed over the criteria for creating...
Sai Geetha M N
Jan 16, 2022 · 3 min read
4,129 views
0 comments

Hierarchical Clustering: A Deep Dive
In the last five blog posts, I have discussed the basics of Clustering and then, K-Means clustering in detail. In my "Introduction to...
Sai Geetha M N
Nov 26, 2021 · 5 min read
224 views
0 comments

K-Means Clustering through An Example
Now that we have understood the basics of K-Means Clustering, let us dive a little deeper today. Let us look at one practical problem and...
Sai Geetha M N
Nov 4, 2021 · 10 min read
341 views
0 comments

Steps towards Data Science or Machine Learning Models
Having completed the basics of K-Means clustering in the last 3 weeks, I was tempted to take you through an example problem through code....
Sai Geetha M N
Oct 21, 2021 · 4 min read
124 views
0 comments


K-Means Clustering: Part 3 of 3
Theoretically and mathematically, we have understood a great deal about K-Means Clustering through Part 1 and Part 2 of this series. If...
Sai Geetha M N
Oct 8, 2021 · 6 min read
59 views
0 comments

K-Means Clustering: Part 2 of 3
Last week, we looked at the basic understanding of how K-Means Clustering works through the 5-step process where the two steps of...
Sai Geetha M N
Oct 1, 2021 · 4 min read
40 views
0 comments

K-Means Clustering: Part 1 of 3
Having looked at Clustering in general and also having heard that K-Means is one of the simplest and most popular clustering algorithms,...
Sai Geetha M N
Sep 23, 2021 · 4 min read
99 views
0 comments

When can you use Linear Regression?
It's been a while since my last post, as I was caught up with a couple of speaking engagements - one at a university for engineering...
Sai Geetha M N
Sep 9, 2021 · 5 min read
270 views
1 comment

Prediction Vs Forecasting in Supervised Learning
In supervised learning and especially in the context of Linear regression, we often use these two terms: Prediction and Forecast. We also...
Sai Geetha M N
Aug 3, 2021 · 4 min read
93 views
0 comments

Feature Selection in Machine Learning
Selecting the right features that contribute to your model is an art and a science. I call it art because much pain can be saved if you...
Sai Geetha M N
Jul 19, 2021 · 7 min read
897 views
0 comments

Machine Learning - Rendezvous Architecture
The Rendezvous architecture proposed by Ted Dunning and Ellen Friedman in their book on Machine Learning Logistics was a wonderful...
Sai Geetha M N
Jul 8, 2021 · 8 min read
522 views
0 comments

Big Data Architecture for Machine Learning
Machine Learning by itself is a branch of Artificial Intelligence that has a large variety of algorithms and applications. One of my...
Sai Geetha M N
Jul 1, 2021 · 7 min read
466 views
0 comments

Multicollinearity
Multicollinearity is a concept relevant to all the input data used in a Machine Learning algorithm. This has to be understood...
Sai Geetha M N
May 27, 2021 · 5 min read
196 views
0 comments