Here I blog on topics related to Big Data and Data Science. Articles fall into these types: Executive Summaries, Tool Analyses, Tool Comparisons, Architectural Patterns, Introductions to complex topics, and 'How to' or Tutorial posts that share code snippets.
Data Scientists, Data Engineers, ML Engineers And More - Demystified
As the world of Big Data, Machine Learning and Artificial Intelligence is taking off, there is an overlap of roles and responsibilities...
Sai Geetha M N
Jun 25, 2021 · 6 min read
192 views
0 comments


HBase Design - Guidelines & Best Practices
We have looked at HBase Fundamentals and HBase Architecture in the last two weeks. Today I will look at a few best practices and...
Sai Geetha M N
Jun 18, 2021 · 13 min read
1,357 views
1 comment

HBase Architecture
We looked at the basics of HBase in the previous article, last week. Today we will understand the Architecture of HBase. We all agree...
Sai Geetha M N
Jun 10, 2021 · 7 min read
559 views
0 comments

HBase Fundamentals
HBase is a NoSQL DB that uses some capabilities of the Hadoop ecosystem to provide its features. NoSQL DBs (a.k.a. Not Only SQL) are...
Sai Geetha M N
Jun 3, 2021 · 9 min read
1,460 views
2 comments

MultiCollinearity
Multicollinearity is a concept relevant to all the input data that is used in a Machine Learning algorithm. This has to be understood...
Sai Geetha M N
May 27, 2021 · 5 min read
196 views
0 comments

Outliers and their treatment
Outliers in data analysis and data preparation are to be considered in specific ways so that the data that is fed to a machine learning...
Sai Geetha M N
May 20, 2021 · 4 min read
127 views
2 comments

Feature Scaling and its Importance
Feature Scaling is a very important aspect of data preparation for many Machine Learning Algorithms. Let us understand what is feature...
Sai Geetha M N
May 13, 2021 · 5 min read
98 views
1 comment

Linear Regression Through Code - Part 2
Last week, in Part 1, I walked through all the preliminary steps to be done before you can build a Linear Regression model. This week we...
Sai Geetha M N
May 6, 2021 · 10 min read
90 views
0 comments

Linear Regression Through Code - Part 1
#Tutorial In an earlier blog post, I have spoken about "What is Regression?" and the basic linear equation too. This is one of the...
Sai Geetha M N
Apr 29, 2021 · 8 min read
134 views
0 comments

Types of Variables - Definition
#Definition There are different characteristics of data that are used for analysis and machine learning. One very fundamental...
Sai Geetha M N
Apr 28, 2021 · 2 min read
43 views
0 comments

Data Validation - During Ingestion into Data Lake
Any enterprise that wants to harness the power of data almost always begins with building a data lake. By definition, a data lake is a...
Sai Geetha M N
Apr 22, 2021 · 9 min read
3,407 views
6 comments

ACID Vs BASE - A definition
#Definitions ACID is a characteristic of RDBMS databases. Atomic: Each task in a transaction succeeds or the entire transaction is rolled...
Sai Geetha M N
Apr 17, 2021 · 1 min read
76 views
0 comments

Making the Right Database Choice
#ArchitecturalDecision If someone were to ask, "Should I use a SQL or a NoSQL database?", the obvious answer is "it depends". Depends on what?...
Sai Geetha M N
Apr 16, 2021 · 4 min read
856 views
3 comments

Regression Algorithms
#ExecutiveSummary #MLModels What is Regression? Regression is a statistical model/method used to determine the strength and character of...
Sai Geetha M N
Apr 8, 2021 · 2 min read
109 views
0 comments

Machine Learning Process - A Success Recipe
#MachineLearningProcess #ExecutiveSummary Introduction It is said that "The world's most valuable resource is no longer oil, but data"....
Sai Geetha M N
Apr 1, 2021 · 6 min read
94 views
0 comments

Hadoop for Analysts - Apache Druid, Apache Kylin and Interactive Query Tools
#ToolComparison #ArchitectureDecision Introduction Traditional Data Warehouses have existed in the industry for quite some time now. They...
Sai Geetha M N
Mar 19, 2021 · 4 min read
694 views
3 comments

Machine Learning Algorithms Categories
Machine Learning Algorithms learn from data as humans learn from experience. But the type of learning and the goal varies from algorithm...
Sai Geetha M N
Mar 16, 2021 · 4 min read
358 views
0 comments

The Machine Learning Landscape
If you are looking to start learning the basics of Machine Learning, you are in the right place. My blog will cover overviews of...
Sai Geetha M N
Mar 10, 2021 · 3 min read
611 views
3 comments