Here I blog on all topics related to Big Data and Data Science. Article types include Executive Summaries, Tool Analyses, Tool Comparisons, Architectural Patterns, Introductions to complex topics, and 'How to' or Tutorial posts that share code snippets.
Sai Geetha M N
Oct 21, 2021 · 4 min read
Steps towards Data Science or Machine Learning Models
Having completed the basics of K-Means clustering in the last 3 weeks, I was tempted to walk you through an example problem in code...
120 views · 0 comments
Sai Geetha M N
Jul 19, 2021 · 7 min read
Feature Selection in Machine Learning
Selecting the right features that contribute to your model is an art and a science. I call it art because much pain can be saved if you...
828 views · 0 comments
Sai Geetha M N
Jul 8, 2021 · 8 min read
Machine Learning - Rendezvous Architecture
The Rendezvous architecture proposed by Ted Dunning and Ellen Friedman in their book on Machine Learning Logistics was a wonderful...
513 views · 0 comments
Sai Geetha M N
Jun 25, 2021 · 6 min read
Data Scientists, Data Engineers, ML Engineers And More - Demystified
As the world of Big Data, Machine Learning and Artificial Intelligence is taking off, there is an overlap of roles and responsibilities...
187 views · 0 comments
Sai Geetha M N
May 20, 2021 · 4 min read
Outliers and their treatment
Outliers in data analysis and data preparation are to be considered in specific ways so that the data that is fed to a machine learning...
122 views · 2 comments
Sai Geetha M N
Apr 29, 2021 · 8 min read
Linear Regression Through Code - Part 1
#Tutorial In an earlier blog post, I have spoken about "What is Regression?" and the basic linear equation too. This is one of the...
123 views · 0 comments
Sai Geetha M N
Apr 28, 2021 · 2 min read
Types of Variables - Definition
#Definition There are different characteristics of data that are used for analysis and machine learning. One very fundamental...
42 views · 0 comments
Sai Geetha M N
Apr 22, 2021 · 9 min read
Data Validation - During Ingestion into Data Lake
Any enterprise that wants to harness the power of data almost always begins by building a data lake. By definition, a data lake is a...
3,357 views · 6 comments
Sai Geetha M N
Mar 19, 2021 · 4 min read
Hadoop for Analysts - Apache Druid, Apache Kylin and Interactive Query Tools
#ToolComparison #ArchitectureDecision Introduction: Traditional Data Warehouses have existed in the industry for quite some time now. They...
674 views · 3 comments