Essential skills required for data science (DS):
1. Extract and clean data using Python/R
2. Analyse data using statistics
3. Present data using Python (NumPy/pandas) or tools like Tableau
4. Build predictive models using machine learning algorithms
You should know:
1. Python
2. R
3. Statistics
4. Machine learning algorithms like Linear Regression, Logistic Regression, etc.
5. Tools like Tableau
You can use platforms like Kaggle (https://www.kaggle.com/) to work on data science projects.
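As a rough illustration of the first two skills (extracting and cleaning data, then analysing it with statistics), here is a minimal sketch that uses only the Python standard library. The CSV content, the column names and the helper `load_clean_prices` are made up for this example; in a real project you would more likely load a file with pandas.

```python
import csv
import io
import statistics

# Hypothetical raw data: one missing price and one malformed price.
RAW_CSV = """product,price
A,10.0
B,
C,14.0
D,not_a_number
E,12.0
"""

def load_clean_prices(text):
    """Extract the price column, dropping rows that cannot be parsed."""
    prices = []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            prices.append(float(row["price"]))
        except (TypeError, ValueError):
            continue  # skip missing or malformed values
    return prices

prices = load_clean_prices(RAW_CSV)
summary = {
    "count": len(prices),
    "mean": statistics.mean(prices),
    "median": statistics.median(prices),
}
print(summary)
```

Dropping bad rows is only one possible cleaning strategy; imputing a value (for example the median) is a common alternative.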
https://www.analyticsvidhya.com/blog/2017/01/the-most-comprehensive-data-science-learning-plan-for-2017/
3.2: Basics of Mathematics and Statistics
Time suggested: 8 weeks (February 2017 – March 2017)
Topics to be covered:
- Descriptive Statistics – 1 week
- Probability – 2 weeks
- Inferential Statistics – 2 weeks
- Linear Algebra – 1 week
- Structured Thinking – 2 weeks
Descriptive Statistics – 1 week
- Course (mandatory) – Descriptive Statistics from Udacity is a basic, must-do course to get started.
- Books (optional) – Supplement your online course with the Online Stats Book, a good book for anyone looking to learn basic statistics.
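To make the topic concrete, here is a small sketch of the usual descriptive measures computed with Python's built-in `statistics` module. The sample values are invented for illustration only.

```python
import statistics

# Hypothetical sample, e.g. daily sales counts.
data = [4, 8, 15, 16, 23, 42]

mean = statistics.mean(data)        # central tendency
median = statistics.median(data)    # middle value, less sensitive to outliers
stdev = statistics.stdev(data)      # sample standard deviation (divides by n - 1)
data_range = max(data) - min(data)  # spread between the extremes

print(mean, median, round(stdev, 2), data_range)
```

Note how the single large value (42) pulls the mean above the median, which is one reason both are usually reported together.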
Probability – 2 weeks
- Course (mandatory) – Introduction to probability – The science of uncertainty is an excellent course on edX to learn concepts of probability like conditional probability and probability distributions.
- Books (optional) – The textbook Introduction to probability – Berkeley’s Stats 134 standard textbook will supplement the course above and can be used as good reference material.
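Conditional probability is easiest to appreciate with a worked example. The sketch below applies Bayes' rule to a hypothetical screening test; the prior, sensitivity and false positive rate are assumed numbers, and `bayes_posterior` is a helper name chosen for this illustration. Exact fractions are used so the arithmetic can be checked by hand.

```python
from fractions import Fraction

def bayes_posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' rule."""
    # Total probability of a positive test (true positives + false positives).
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive

# Assumed numbers: 1% prevalence, 90% sensitivity, 5% false positive rate.
posterior = bayes_posterior(Fraction(1, 100), Fraction(9, 10), Fraction(1, 20))
print(posterior, float(posterior))
```

Even with a fairly accurate test, the posterior stays well below one half here because the condition is rare to begin with.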
Inferential Statistics – 2 weeks
- Course (mandatory) – Intro to Inferential Statistics from Udacity – Once you have gone through the descriptive statistics course, this course will take you through statistical modeling techniques and advanced statistics.
- Books (optional) – Online Stats Book – This online book can be used as a quick reference for inference tasks.
Linear Algebra – 1 week
- Course (mandatory)
- Linear Algebra – Khan Academy: This concise and excellent course on Khan Academy will equip you with the skills necessary for Data Science and Machine Learning.
- Books (optional)
- Linear Algebra/ Levandosky – This is a book often cited among Stanford graduates for Linear Algebra.
- The Manga guide to Linear Algebra – This is a fun-filled Linear Algebra book which keeps Machine Learning in context. You will never forget these Algebra lessons for sure.
Structured Thinking – 2 weeks
- Articles (mandatory): These articles will guide you to structure your thinking process to approach problems in a better way so as to improve your efficiency.
- Competitions (mandatory): No amount of theory can beat practice. This is a strategic thinking problem which will test you on your thinking process. Also, keep an eye on business case studies as they help in structuring your thoughts tremendously.
3.3: Introducing the tool – R / Python
Time suggested: 8 weeks (April 2017 – May 2017)
Topics to be covered:
- Tools (R/Python) – 4 weeks
- Exploration and Visualization (R/Python) – 4 weeks
- Feature Selection/ Engineering
Tools
1. R
- Course – Interactive Intro to R Programming Language by DataCamp – An excellent course by DataCamp to give you hands-on experience in R. The course includes interactive examples. You will never feel bored while learning R.
- Books – R for Data Science – This is your one-stop solution for referencing basic materials on R.
- Blogs/Articles
- This article will serve as a great reference that collates the entire process of model building, starting from the installation of RStudio/R.
- R-bloggers – This is one of the most recommended blogs for R users. Every R practitioner should keep this blog bookmarked. It has some of the most effective and practical R tutorials. Bookmark it now.
2. Python
- Course (mandatory) – Intro to Python for Data Science – An interactive course developed by DataCamp to facilitate Data Science learning using Python.
- Books (mandatory) – Python for Data Analysis – This book covers various aspects of Data Science, from loading data to manipulating, processing, cleaning and visualizing it. A must-keep reference guide for pandas users.
- Blogs/Articles (optional)
- A Complete Tutorial to Learn Data Science with Python from Scratch: This article will serve as a quick guide to learning Data Science using Python.
Exploration and Visualization
1. R
- Course
- Exploratory Data Analysis – This is an awesome course by Johns Hopkins University on Coursera. You will need no other course to perform visualization and exploratory work in R.
- Blogs/Articles
- Comprehensive guide to Data Exploration in R – This is a one-stop article that I suggest you go through carefully, following every step, because the steps mentioned in the article are the same steps you will use while solving any data problem or hackathon problem.
- Cheat sheet – Data Exploration in R – This cheat sheet contains all the steps in data exploration with code. I suggest you print it and paste it on your wall for quick reference.
2. Python
- Course (optional)
- Intro to Data Analysis – This is an excellent course by Udacity on Data Exploration using NumPy and pandas.
- Blogs/Articles (mandatory)
- Comprehensive guide to Data Exploration using Python NumPy, Matplotlib and Pandas – This is a comprehensive article which uses the most popular Python libraries for exploration and visualization purposes.
- 9 popular ways to perform Data Visualization in Python – This article presents the most commonly used graphs and plots in Data Exploration along with Python code. This is a must-bookmark article for people working in Data Science using Python.
- Books (optional) – Python for Data Analysis – A one-stop solution for your Data Exploration and Visualization in Python.
Feature Selection/ Engineering
- Blog – A Comprehensive Guide to Data Exploration: This article will explain the underlying techniques of feature engineering and different methods for feature creation.
- Books (optional) – Mastering Feature Engineering: This book is a masterpiece for learning feature engineering. Not only will you learn how to implement feature engineering in a systematic way, you will also learn the different methods involved in it.
3.4: Basic & Advanced machine learning tools
Time suggested: 12 weeks (June 2017 – August 2017)
Topics to be covered (June 2017 – July 2017):
- Basic Machine Learning Algorithms.
- Linear Regression
- Logistic Regression
- Decision Trees
- KNN (K-Nearest Neighbours)
- K-Means
- Naïve Bayes
- Dimensionality Reduction
- Advanced algorithms (August 2017)
- Random Forests
- Dimensionality Reduction Techniques
- Support Vector Machines
- Gradient Boosting Machines
- XGBOOST
Linear Regression
- Course
- Machine Learning by Andrew Ng – There is no better resource to learn Linear Regression than this course. It will give you a thorough understanding of linear regression and there is a reason why Andrew Ng is considered the rockstar of Machine Learning.
- Blogs/Articles
- Books
- The Elements of Statistical Learning – This book is sometimes considered the holy grail of Machine Learning and Data Science. It explains Machine Learning concepts mathematically from a Statistics perspective.
- Machine Learning with R – This is a book I personally use to get a brief understanding of Machine Learning algorithms along with their implementation code.
- Practice
- Black Friday – Like I already said – No amount of theory can beat practice. Here is a regression problem that you can try your hand at for a deeper understanding.
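Before moving on to a practice problem, it can help to see how little code a one-variable linear regression needs. The sketch below fits a line by ordinary least squares using the closed-form formulas; the function name `fit_line` and the toy data are assumptions for illustration, not taken from any of the resources above.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both up to the same factor).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

With several features the same idea is normally solved with matrix operations or a library such as scikit-learn rather than hand-written sums.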
Logistic Regression
- Course (mandatory)
- Machine Learning by Andrew Ng – Week 3 of this course will give you a deeper understanding of one of the most widely used classification algorithms.
- Machine Learning: Classification – Weeks 1 and 2 of this practice-oriented Specialization course using Python will satisfy your thirst for knowledge about Logistic Regression.
- Blogs/Articles (optional)
- Logistic Regression by Machine Learning Mastery – This is an excellent non-code-based approach to Logistic Regression to deepen your knowledge. I suggest you have a look at it.
- Books (optional)
- Introduction to Statistical Learning – This is an excellent book with quality content on Logistic Regression’s underlying assumptions, statistical nature and mathematical linkage.
- Practice (mandatory)
- Loan Prediction – This is an excellent competition to practice and test your new Logistic Regression skills by predicting whether a person's loan was approved or not.
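For intuition, here is a minimal from-scratch sketch of logistic regression with a single feature, trained by batch gradient descent on the log loss. The learning rate, number of epochs, function names and toy data are all assumptions chosen for this illustration; real work would normally rely on a library implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the average log loss with respect to w and b.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical feature (e.g. a score) and binary outcome.
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)

def predict(x):
    """Classify as 1 when the estimated probability is at least 0.5."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0

print([predict(x) for x in xs])
```

The model outputs a probability; the 0.5 cut-off used in `predict` is a convention that can be moved depending on the cost of each kind of error.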
Decision Trees
- Course (mandatory)
- Machine Learning: Classification – Weeks 3 and 4 of this course cover how decision trees work, preventing overfitting and handling missing values.
- Blogs/Articles (mandatory)
- Technical Overview of decision trees – This is a quick overview of decision trees and a must-read for anyone new to decision trees.
- Complete tutorial on tree based modeling – This is a Python-based tutorial on decision trees. For the sake of decision trees, read only sections 1-6 in this article.
- Books (mandatory)
- Introduction to Statistical Learning – Sections 8.1 and 8.3 explain the basics of decision trees through theory and practical examples.
- Machine Learning with R – Chapter 5 of this book provides one of the best available explanations of Machine Learning algorithms. Here, decision trees are explained in an extremely non-intimidating and easy style.
- Practice (mandatory)
- Loan Prediction – This is an excellent competition to practice and test your new Decision Tree skills by predicting whether a person's loan was approved or not.
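At the heart of a decision tree is the search for the split that makes the resulting groups as pure as possible. The sketch below shows that single step for one numeric feature using Gini impurity; the helper names, the income values and the labels are hypothetical, and a full tree would repeat this search recursively on each side of the split.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(xs, ys):
    """Return (threshold, weighted_gini) for the best rule 'x <= threshold'."""
    best = (None, float("inf"))
    # The largest value is skipped: splitting there leaves the right side empty.
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical applicants: income (in thousands) and loan decision.
incomes = [20, 25, 30, 45, 50, 60]
approved = ["N", "N", "N", "Y", "Y", "Y"]
threshold, impurity = best_split(incomes, approved)
print(threshold, impurity)
```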
KNN (K-Nearest Neighbors)
- Course (mandatory)
- Machine Learning – Clustering and Retrieval: Week 2 of this course progresses to k-nearest neighbors from 1-nearest neighbor and also describes the best ways to approximate the nearest neighbors. It explains all the concepts of KNN using Python.
- Blogs/Articles (mandatory)
- Introduction to k-nearest neighbors: simplified – This basic article describes when to use KNN, the ways in which k can be chosen and how the KNN algorithm works.
- Learning KNN algorithm using R – This article is a comprehensive guide to learning KNN with hands-on code for future reference.
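The idea behind KNN fits in a few lines: find the k training points closest to the query and let them vote. The sketch below is a brute-force version with Euclidean distance (it needs Python 3.8+ for `math.dist`); the function name `knn_predict` and the two toy clusters are assumptions for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train is a list of (point, label) pairs; returns the majority label
    among the k points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Two hypothetical clusters of labelled points.
train = [
    ((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
    ((6, 6), "b"), ((7, 6), "b"), ((6, 7), "b"),
]
print(knn_predict(train, (2, 2)), knn_predict(train, (6, 5)))
```

Sorting every training point for each query is fine for small data; the approximate nearest neighbor methods mentioned in the course exist because this does not scale well.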
K-Means
- Course
- Machine Learning Course – Unsupervised Learning with K-means algorithm: Week 8 of this course discusses how the K-means algorithm is used for handling unlabeled data.
- Blog
- An Introduction to Clustering and different methods of clustering: In this article, you will learn what k-means clustering is and the intricacies involved in it. It gives a step-by-step explanation of how the K-means algorithm works.
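The step-by-step procedure behind K-means (assign each point to its nearest centroid, then move each centroid to the mean of its points, and repeat) can be sketched as follows. This is a simplified one-dimensional version with fixed starting centroids; the function name, the starting values and the data are assumptions for illustration.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Lloyd's algorithm on 1-D data, starting from the given centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
print(centroids, clusters)
```

The result depends on the starting centroids, which is why practical implementations usually try several random initialisations.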
Naive Bayes
- Course
- Intro to Machine Learning: Take this course to see Naive Bayes in action. In this course, Sebastian Thrun explains Naive Bayes in simple English.
- Blog / Article
- 6 Easy Steps to Learn Naive Bayes Algorithm (with code in Python): This article will take you through the Naive Bayes algorithm in detail. In this guide, you will learn how the Naive Bayes algorithm works, its applications and much more. It will also give you hands-on knowledge of building a model using Naive Bayes.
- Naive Bayes for Machine Learning: This is one of the most comprehensive articles I have come across. Go through this article to get a complete understanding of why the Naive Bayes algorithm is important for machine learning.
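As a small hands-on illustration, the sketch below builds a multinomial-style Naive Bayes text classifier from word counts, with add-one (Laplace) smoothing and log probabilities. The function names and the four tiny "documents" are invented for this example and are not taken from the articles above.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs is a list of (words, label). Returns the counts needed to predict."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in docs:
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab

def predict_nb(model, words):
    """Pick the label with the highest log prior plus summed log likelihoods."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(count / total_docs)  # log prior
        for w in words:
            # Add-one smoothing avoids zero probability for unseen words.
            score += math.log((word_counts[label][w] + 1)
                              / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

docs = [
    (["win", "money", "now"], "spam"),
    (["free", "money"], "spam"),
    (["meeting", "at", "noon"], "ham"),
    (["lunch", "at", "noon"], "ham"),
]
model = train_nb(docs)
print(predict_nb(model, ["free", "money"]), predict_nb(model, ["meeting", "noon"]))
```

The "naive" part is the assumption that words are independent given the class, which is what allows the per-word terms to be simply added in log space.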
Dimensionality Reduction
- Course
- Machine Learning – Dimensionality Reduction: Week 8 of this course will walk you through dimensionality reduction and how Principal Components Analysis can be used for data compression of complex data.
- Blog / Article
- Beginners Guide To Learn Dimension Reduction Techniques: In this article, you will learn why dimension reduction is important in machine learning and the various techniques of dimension reduction.
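To give a feel for what Principal Components Analysis does, here is a simplified sketch that finds the first principal component of two-dimensional data by power iteration on the covariance matrix, then projects the points onto it. The function name, the starting vector and the data are assumptions for illustration; libraries normally compute this through an eigendecomposition or SVD instead.

```python
import math

def first_principal_component(points, iterations=100):
    """Power iteration on the 2x2 sample covariance matrix of 2-D points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    cxx = sum(x * x for x, _ in centered) / (n - 1)
    cyy = sum(y * y for _, y in centered) / (n - 1)
    cxy = sum(x * y for x, y in centered) / (n - 1)
    # Assumed start vector; it must not be orthogonal to the component.
    v = (1.0, 0.0)
    for _ in range(iterations):
        v = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = math.hypot(v[0], v[1])
        v = (v[0] / norm, v[1] / norm)
    # Each 2-D point is compressed to a single coordinate along v.
    projected = [x * v[0] + y * v[1] for x, y in centered]
    return v, projected

# Toy data lying exactly on the line y = 2x.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
direction, projected = first_principal_component(points)
print(direction, projected)
```

Because these points lie on a line, one coordinate per point is enough to reconstruct them, which is the data compression idea mentioned above.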
Random Forests
- Videos (mandatory)
- How Random Forest algorithm works? – Watch this video to have a visual perspective of how the Random Forest algorithm works.
- Books (optional)
- Introduction to Statistical Learning – Chapter 8 explains the basics of Random Forests, including bagging and boosting, through theory and practical examples.
- Applied predictive modeling – Chapter 8
- Blogs/Articles (mandatory)
- A tutorial on tree based modeling from scratch – This is an excellent article on tree-based modeling using Python. I suggest you bookmark it right now.
- Random Forests – This blog explains the entire working, the nuts and bolts, of Random Forest.
Gradient Boosting Machines
- Blogs/Articles (mandatory)
- Presentation (mandatory): Here is an excellent presentation on GBM. It covers the prominent features of GBM and the advantages and disadvantages of using it to solve real-world problems. It is a must-see for anybody trying to understand GBM.
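The core loop of gradient boosting for regression with squared loss can be sketched in a few lines: start from the mean, then repeatedly fit a small tree (here a one-split "stump") to the current residuals and add a shrunken version of it to the model. Everything below (function names, number of rounds, learning rate, data) is an assumption for illustration and leaves out the refinements real GBM libraries add.

```python
def fit_stump(xs, residuals):
    """Best single-threshold regressor: predicts the mean on each side."""
    best = None
    for t in sorted(set(xs))[:-1]:  # keep both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        err = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, left_mean, right_mean)
    _, t, left_mean, right_mean = best
    return lambda x: left_mean if x <= t else right_mean

def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Returns a prediction function built from a base value plus stumps."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)

# Hypothetical data with a jump between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
model = gradient_boost(xs, ys)
print([round(model(x), 2) for x in xs])
```

The learning rate trades speed for stability: smaller values need more rounds but usually generalise better.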
XGBOOST
- Blogs /Articles (mandatory)
- Official Introduction XGBOOST – Read the documentation of this hackathon-winning algorithm. It is an improvement over GBM and is right now the most widely used algorithm for winning competitions.
- Using XGBOOST in R – An excellent article on deploying XGBOOST in R using a practical problem at hand.
- XGBOOST for applied Machine Learning – An article by Machine Learning Mastery to evaluate the performance of XGBOOST over other algorithms.
Support Vector Machines
- Course (mandatory)
- Machine Learning by Andrew Ng – Week 7 of this course is an interesting place to start your SVM journey.
- Books (mandatory)
- Introduction to Statistical Learning – Chapter 9 of the book contains a detailed discussion of SVMs and the ways to deploy them.
- Blogs/Articles (optional)
- Understanding support vector machines – This is an excellent article for understanding the algorithm practically using examples.
- SVM by Machine Learning Mastery – This article discusses the different types of kernels employed in SVM and their uses.
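Since kernels are central to SVMs, here is a small sketch of two commonly used ones, the linear kernel and the RBF (Gaussian) kernel, written as plain functions. The function names and the default `gamma` value are assumptions for illustration; an SVM library would evaluate such a kernel between pairs of training points internally.

```python
import math

def linear_kernel(a, b):
    """Plain dot product: corresponds to a linear decision boundary."""
    return sum(x * y for x, y in zip(a, b))

def rbf_kernel(a, b, gamma=0.5):
    """exp(-gamma * squared distance): 1 for identical points, decaying
    towards 0 as the points move apart."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)

print(linear_kernel((1, 2), (3, 4)))
print(rbf_kernel((1, 2), (1, 2)), rbf_kernel((0, 0), (1, 1)))
```

A larger `gamma` makes the RBF similarity drop off faster, which tends to give more flexible (and more easily overfitted) decision boundaries.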