Foundations of Data Science

Data Science and Artificial Intelligence (AI), together with AI's components such as Statistical Learning (SL), Machine Learning (ML), and Deep Learning (DL), are recognized as main drivers of organizational value creation. According to Dr Jim Gray, Data Science is the fourth paradigm, one that drives innovative solutions to organizational problems.

The course will start with basic concepts in probability, such as joint and conditional probabilities, and will discuss how these concepts are implemented in ML algorithms for Market Basket Analysis and Recommender Systems. It will then move on to random variables, discrete and continuous probability distributions, sampling, estimation, and the central limit theorem.
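As a rough illustration of how joint and conditional probabilities appear in Market Basket Analysis, the sketch below estimates support, confidence, and lift from a handful of baskets. The transactions and the function names are invented for this example and are not taken from the course material.

```python
# Hypothetical toy baskets, invented purely for illustration.
TRANSACTIONS = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
    frozenset({"bread", "milk", "eggs"}),
]


def support(itemset, transactions):
    """Estimate P(itemset): the share of baskets containing every item."""
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) = P(both) / P(antecedent)."""
    joint = support(antecedent | consequent, transactions)
    return joint / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Ratio of the conditional probability to the unconditional one.

    A value near 1 suggests the two itemsets occur independently.
    """
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


if __name__ == "__main__":
    bread, milk = frozenset({"bread"}), frozenset({"milk"})
    print("support(bread, milk):", support(bread | milk, TRANSACTIONS))
    print("confidence(bread -> milk):", confidence(bread, milk, TRANSACTIONS))
    print("lift(bread -> milk):", lift(bread, milk, TRANSACTIONS))
```

In this toy data, bread and milk each appear in four of five baskets and together in three, so the confidence of the rule "bread implies milk" is about 0.75 and its lift is slightly below 1.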

An important step in ML model building is feature selection, which helps avoid overfitting and underfitting. ML models such as regression and logistic regression use hypothesis testing to select features. The course will include discussions on various hypothesis tests and how they are used in feature selection.

Every ML model has an optimization stage, either to fine-tune the feature weights or to find an optimal set of features. Discussions will include important optimization techniques and algorithms, such as Gradient Descent, that play an important role in AI and ML model development.

Data must be represented as a matrix for AI and ML model development, and matrix operations such as inversion and multiplication are elementary steps in model development. These fundamental concepts in linear algebra will be discussed.
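To make the link between optimization and linear algebra concrete, here is a minimal sketch that fits a straight line y = w*x + b in two ways: by Gradient Descent on the mean squared error, and by solving the normal equations with an explicit 2x2 matrix inverse. The data, the learning rate, and the function names are assumptions chosen for illustration, not part of the course.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w*x + b by repeatedly stepping against the MSE gradient."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def normal_equation(xs, ys):
    """Solve (X^T X) theta = X^T y for the design matrix X = [x, 1].

    X^T X is the 2x2 matrix [[sum(x^2), sum(x)], [sum(x), n]], so its
    inverse can be written out by hand.
    """
    n = len(xs)
    sxx = sum(x * x for x in xs)
    sx = sum(xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sy = sum(ys)
    det = sxx * n - sx * sx
    w = (n * sxy - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return w, b


if __name__ == "__main__":
    # Hypothetical noise-free data generated from y = 2x + 1.
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [1.0, 3.0, 5.0, 7.0, 9.0]
    print("gradient descent:", gradient_descent(xs, ys))
    print("normal equation: ", normal_equation(xs, ys))
```

Both routes should recover a slope close to 2 and an intercept close to 1 on this data. The closed-form solution works here because the problem is tiny; iterative methods such as Gradient Descent are typically preferred when the matrix is too large to invert directly.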

At the end of this course, participants will be able to:

  • Describe the role of probability theory, optimization, and linear algebra in the field of Artificial Intelligence
  • Define probability distributions such as binomial and normal and their applications in ML model development
  • Conduct hypothesis tests such as the Z-test and t-test, and explain how they are used in ML model development
  • Explain optimization and linear algebra concepts and their applications in ML and AI
  • Conduct hypothesis testing, optimization, and linear algebra using Excel
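As one possible illustration of how a t-test can inform feature selection, the sketch below computes the t-statistic for the slope of a simple linear regression. A large absolute value suggests the feature has a non-zero effect and is worth keeping. The data and function name are hypothetical, and the critical value quoted in the comment is the standard two-sided 5% value for 3 degrees of freedom.

```python
import math


def slope_t_statistic(xs, ys):
    """Return (slope, t) for H0: slope = 0 in the model y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    standard_error = math.sqrt(sse / (n - 2) / sxx)
    return slope, slope / standard_error


if __name__ == "__main__":
    # Hypothetical measurements, invented for illustration.
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [2.1, 3.9, 6.2, 7.8, 10.1]
    slope, t = slope_t_statistic(xs, ys)
    # With n - 2 = 3 degrees of freedom, |t| > 3.182 rejects H0 at the 5% level.
    print("slope:", slope, "t:", t, "keep feature:", abs(t) > 3.182)
```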

Those interested in learning foundations of Data Science need to enrol for the course at

The course went live on 15th March 2021.