Data Science Course Content

Module 1: Introduction to Data Science

  • What is Data Science?
  • Why Python for data science?
  • Relevance in industry and need of the hour
  • How leading companies are harnessing the power of data science with
    Python
  • Phases of a typical analytics/data science project and the role of
    Python
  • Anaconda vs. Python

Module 2: Python Essentials (Core)

  • Overview of Python – Starting with Python
  • Introduction to installation of Python
  • Introduction to Python Editors & IDEs (Canopy, PyCharm, Jupyter, Rodeo, IPython, etc.)
  • Understand Jupyter notebook & Customize Settings
  • Concept of Packages/Libraries – Important packages (NumPy, SciPy, scikit-learn, Pandas, Matplotlib, etc.)
  • Installing & loading Packages & Namespaces
  • Data Types & Data objects/structures (strings, Tuples, Lists, Dictionaries)
  • List and Dictionary Comprehensions
  • Variable & Value Labels – Date & Time Values
  • Basic Operations – Mathematical – string – date
  • Reading and writing data
  • Simple plotting
  • Control flow & conditional statements
  • Debugging & Code profiling
  • How to create classes and modules and how to use them

Module 3: Scientific Distributions used in Python for Data Science

  • NumPy
  • SciPy
  • Pandas
  • scikit-learn
  • statsmodels
  • NLTK, etc.

Module 4: Accessing / Importing and Exporting Data using Python Modules

  • Importing Data from various sources (CSV, TXT, Excel, Access, etc.)
  • Database Input (Connecting to database)
  • Viewing Data objects – subsetting, methods
  • Exporting Data to various formats
  • Important Python modules: Pandas, BeautifulSoup

Module 5: Data Manipulation – Cleansing – Munging using Python Modules

  • Cleansing Data with Python
  • Data Manipulation steps (Sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, Data type conversions, renaming, formatting, etc.)
  • Data manipulation tools (Operators, Functions, Packages, control structures, Loops, arrays, etc.)
  • Python Built-in Functions (Text, numeric, date, utility functions)
  • Python User Defined Functions
  • Stripping out extraneous information
  • Normalizing data
  • Formatting data
  • Important Python modules for data manipulation (Pandas, NumPy, re, math, string, datetime, etc.)

Module 6: Data Analysis – Visualization using Python

  • Introduction to exploratory data analysis
  • Descriptive statistics, Frequency Tables and summarization
  • Univariate Analysis (Distribution of data & Graphical Analysis)
  • Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
  • Creating Graphs (bar, pie, line chart, histogram, boxplot, scatter, density, etc.)
  • Important Packages for Exploratory Analysis (NumPy Arrays, Matplotlib, seaborn, Pandas, scipy.stats, etc.)

Module 7: Basic Statistics & Implementation of Stats Methods in Python

  • Basic Statistics – Measures of Central Tendency and Variance
  • Building blocks – Probability Distributions – Normal distribution – Central Limit Theorem
  • Inferential Statistics – Sampling – Concept of Hypothesis Testing
  • Statistical Methods – Z/t-tests (One sample, independent, paired), ANOVA, Correlation and Chi-square
  • Important modules for statistical methods: NumPy, SciPy, Pandas

Module 8: Python: Machine Learning – Predictive Modelling – Basics

  • Introduction to Machine Learning & Predictive Modeling
  • Types of Business problems – Mapping of Techniques – Regression vs. classification vs. segmentation vs. Forecasting
  • Major Classes of Learning Algorithms – Supervised vs. Unsupervised Learning
  • Different Phases of Predictive Modeling (Data Pre-processing, Sampling, Model Building, Validation)
  • Overfitting (Bias-Variance Tradeoff) & Performance Metrics
  • Feature engineering & dimension reduction
  • Concept of optimization & cost function
  • Concept of the gradient descent algorithm
  • Concept of Cross-validation (Bootstrapping, K-Fold validation, etc.)
  • Model performance metrics (R-square, RMSE, MAPE, AUC, ROC curve, recall, precision, sensitivity, specificity, confusion matrix)

Module 9: Machine Learning Algorithms & Applications – Implementation in Python

  • Linear & Logistic Regression
  • Segmentation – Cluster Analysis (K-Means)
  • Decision Trees (CART/C5.0)
  • Ensemble Learning (Random Forest, Bagging & boosting)
  • Artificial Neural Networks (ANN)
  • Support Vector Machines (SVM)
  • Other Techniques (KNN, Naïve Bayes, PCA)
  • Introduction to Text Mining using NLTK
  • Introduction to Time Series Forecasting (Decomposition & ARIMA)
  • Important Python modules for Machine Learning (scikit-learn, statsmodels, SciPy, NLTK, etc.)
  • Fine-tuning the models using hyperparameters, grid search, pipelines, etc.
