Data Science Course using Python

What is Data Science?
Why Python for data science?
Relevance in industry and need of the hour
How are leading companies harnessing the power of Data Science with Python?
Different phases of a typical Analytics/Data Science project and the role of Python
Anaconda vs. Python

Overview of Python – Starting with Python
Introduction to installation of Python
Introduction to Python Editors & IDEs (Canopy, PyCharm, Jupyter, Rodeo, IPython, etc.)
Understand Jupyter notebook & Customize Settings
Concept of Packages/Libraries – Important packages (NumPy, SciPy, scikit-learn, Pandas, Matplotlib, etc.)
Installing & loading Packages & Namespaces
Data Types & Data objects/structures (strings, Tuples, Lists, Dictionaries)
List and Dictionary Comprehensions
Variable & Value Labels – Date & Time Values
Basic Operations – Mathematical – string – date
Reading and writing data
Simple plotting
Control flow & conditional statements
Debugging & Code profiling
How to create classes and modules and how to use them
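
As a small illustration of the comprehension and date topics listed above, here is a minimal sketch using only the standard library. The `records` data and all variable names are hypothetical, chosen for the example.

```python
from datetime import date, timedelta

# Hypothetical sample data: (name, signup date as ISO string, score).
records = [
    ("asha", "2021-03-15", 82),
    ("ben", "2021-07-01", 67),
    ("chen", "2022-01-20", 91),
]

# List comprehension: names of everyone scoring above 70, in title case.
high_scorers = [name.title() for name, _, score in records if score > 70]

# Dictionary comprehension: map each name to a parsed date object.
signup_dates = {name: date.fromisoformat(day) for name, day, _ in records}

# Date arithmetic: time between the earliest and latest signup.
span = max(signup_dates.values()) - min(signup_dates.values())

print(high_scorers)                               # ['Asha', 'Chen']
print(span.days)                                  # 311
print(signup_dates["ben"] + timedelta(days=30))   # 2021-07-31
```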

Cleansing Data with Python
Data Manipulation steps (Sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, Data type conversions, renaming, formatting, etc.)
Data manipulation tools (Operators, Functions, Packages, control structures, Loops, arrays, etc.)
Python Built-in Functions (Text, numeric, date, utility functions)
Python User Defined Functions
Stripping out extraneous information
Normalizing data
Formatting data
Important Python modules for data manipulation (Pandas, NumPy, re, math, string, datetime, etc.)
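
The cleansing steps above (removing duplicates, stripping extraneous characters, formatting, normalizing) can be sketched with the standard-library `re` module alone. The rows and the helper names `clean_name` and `parse_amount` are hypothetical; on real tabular data the same steps would more likely be done with Pandas.

```python
import re

# Hypothetical raw rows: (customer name, amount as text).
raw_rows = [
    ("  Alice SMITH ", "$1,200.50"),
    ("bob  jones", "300"),
    ("  Alice SMITH ", "$1,200.50"),   # exact duplicate of the first row
    ("Carla Diaz", " 750.00 USD"),
]

def clean_name(text):
    """Collapse runs of whitespace, trim the ends, and use title case."""
    return re.sub(r"\s+", " ", text).strip().title()

def parse_amount(text):
    """Strip everything except digits and the decimal point, then convert."""
    return float(re.sub(r"[^0-9.]", "", text))

# Remove exact duplicates while preserving the original order.
unique_rows = list(dict.fromkeys(raw_rows))

cleaned = [(clean_name(n), parse_amount(a)) for n, a in unique_rows]

# Min-max normalization: rescale amounts to the range [0, 1].
# Assumes the amounts are not all identical (otherwise hi - lo is zero).
amounts = [a for _, a in cleaned]
lo, hi = min(amounts), max(amounts)
normalized = [(a - lo) / (hi - lo) for a in amounts]

for (name, amount), norm in zip(cleaned, normalized):
    print(f"{name:<12} {amount:>10.2f} {norm:.3f}")
```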

Introduction to exploratory data analysis
Descriptive statistics, Frequency Tables and summarization
Univariate Analysis (Distribution of data & Graphical Analysis)
Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
Creating Graphs (Bar/pie/line chart/histogram/boxplot/scatter/density, etc.)
Important Packages for Exploratory Analysis (NumPy Arrays, Matplotlib, seaborn, Pandas, scipy.stats, etc.)
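
To make the summarization topics above concrete, the following sketch computes descriptive statistics, a frequency table, and a cross tab with the standard library only. The `rows` data is hypothetical, and the packages named in the list would be the usual choice on real datasets.

```python
from collections import Counter
import statistics

# Hypothetical survey data: (region, plan, monthly spend).
rows = [
    ("North", "basic", 20.0),
    ("North", "premium", 55.0),
    ("South", "basic", 25.0),
    ("South", "basic", 22.0),
    ("North", "premium", 60.0),
    ("South", "premium", 58.0),
]

spend = [s for _, _, s in rows]

# Univariate summary of a numeric variable.
summary = {
    "count": len(spend),
    "mean": statistics.mean(spend),
    "median": statistics.median(spend),
    "stdev": statistics.stdev(spend),   # sample standard deviation (n - 1)
    "min": min(spend),
    "max": max(spend),
}

# Frequency table for one categorical variable.
plan_counts = Counter(plan for _, plan, _ in rows)

# Cross tab for two categorical variables: counts per (region, plan) pair.
cross_tab = Counter((region, plan) for region, plan, _ in rows)

print(summary)
print(plan_counts)
print(cross_tab)
```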

Basic Statistics – Measures of Central Tendency and Variance
Building blocks – Probability Distributions – Normal distribution – Central Limit Theorem
Inferential Statistics – Sampling – Concept of Hypothesis Testing
Statistical Methods – Z/t-tests (One sample, independent, paired), ANOVA, Correlation and Chi-square
Important modules for statistical methods: NumPy, SciPy, Pandas
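
As a worked example of the one-sample t-test mentioned above, the statistic is

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

with $n - 1$ degrees of freedom, where $s$ is the sample standard deviation. The sketch below computes it by hand; the function name `one_sample_t` and the sample values are hypothetical. Turning the statistic into a p-value requires the t distribution, which SciPy provides (for example through `scipy.stats.ttest_1samp`).

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)        # sample standard deviation (n - 1)
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1

# Hypothetical measurements; the null hypothesis is a mean of 5.0.
sample = [5.1, 4.9, 5.6, 5.8, 5.0, 5.4, 5.2, 5.3]
t_stat, dof = one_sample_t(sample, mu0=5.0)

print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```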

Introduction to Machine Learning & Predictive Modeling
Types of Business problems – Mapping of Techniques – Regression vs. classification vs. segmentation vs. Forecasting
Major Classes of Learning Algorithms – Supervised vs Unsupervised Learning
Different Phases of Predictive Modeling (Data Pre-processing, Sampling, Model Building, Validation)
Overfitting (Bias-Variance Tradeoff) & Performance Metrics
Feature engineering & dimension reduction
Concept of optimization & cost function
Concept of the gradient descent algorithm
Concept of Cross-validation (Bootstrapping, K-Fold validation, etc.)
Model performance metrics (R-square, RMSE, MAPE, AUC, ROC curve, recall, precision, sensitivity, specificity, confusion matrix)
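
The cost function, gradient descent, and performance metric topics above fit together in a few lines. This is a minimal sketch for a one-variable linear model with a mean-squared-error cost; the data, learning rate, and step count are assumptions picked so the example converges, not recommended settings.

```python
import math

# Hypothetical data generated from y = 2x + 1 with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

def mse(w, b):
    """Cost function: mean squared error of the line y = w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient_descent(lr=0.05, steps=2000):
    """Minimize the MSE by repeatedly stepping against its gradient."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

w, b = gradient_descent()

# Performance metrics on the training data: RMSE and R-square.
rmse = math.sqrt(mse(w, b))
mean_y = sum(ys) / len(ys)
ss_res = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys))
ss_tot = sum((y - mean_y) ** 2 for y in ys)
r_squared = 1 - ss_res / ss_tot

print(f"w = {w:.4f}, b = {b:.4f}, RMSE = {rmse:.6f}, R^2 = {r_squared:.6f}")
```

Because the metrics here are computed on the same data used for fitting, they say nothing about overfitting; cross-validation addresses that by scoring on held-out folds.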

Linear & Logistic Regression
Segmentation – Cluster Analysis (K-Means)
Decision Trees (CART/C5.0)
Ensemble Learning (Random Forest, Bagging & boosting)
Artificial Neural Networks (ANN)
Support Vector Machines (SVM)
Other Techniques (KNN, Naïve Bayes, PCA)
Introduction to Text Mining using NLTK
Introduction to Time Series Forecasting (Decomposition & ARIMA)
Important Python modules for Machine Learning (scikit-learn, statsmodels, SciPy, NLTK, etc.)
Fine-tuning the models using hyperparameters, grid search, pipelines, etc.
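
Of the techniques listed above, K-Means is compact enough to sketch from scratch. The version below is a plain-Python illustration of the standard assignment/update loop (Lloyd's algorithm) with fixed starting centroids; the points, the starting centroids, and the function names are hypothetical. In practice a library implementation such as the `KMeans` estimator in scikit-learn would be used instead.

```python
import math

def assign(points, centroids):
    """Assignment step: group each point with its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters

def k_means(points, centroids, max_iter=100):
    """Alternate assignment and update steps until the centroids stop moving."""
    centroids = list(centroids)
    for _ in range(max_iter):
        clusters = assign(points, centroids)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else old
            for old, members in zip(centroids, clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)

# Hypothetical 2-D points forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = k_means(points, centroids=[(1.0, 1.0), (8.0, 8.0)])

print(centroids)
print([len(c) for c in clusters])
```

The result depends on the starting centroids, which is why library implementations typically run several initializations and keep the best one.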
