I completed this capstone project as part of my MLND Nanodegree. The goal of the project was to explore time series forecasting, and sales forecasting, one of the best-known applications of time series forecasting, was a natural way to test it out.
Detailed information about the project and the data used can be found in the proposal file.
Run the following commands in the terminal to clone the repository and install the dependencies.
git clone https://github.com/meshari343/MLND_capstone
cd MLND_capstone
pip install -r requirements.txt # Install the dependencies
The project uses Python 3.9 and the following packages:
- pandas: a fast, powerful, flexible, and easy-to-use open-source data analysis and manipulation tool.
- numpy: the fundamental package for scientific computing in Python. It provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation, and much more.
- matplotlib: a comprehensive library for creating static, animated, and interactive visualizations in Python. Matplotlib makes easy things easy and hard things possible.
- seaborn: a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.
- SciPy: (pronounced "Sigh Pie") open-source software for mathematics, science, and engineering.
- statsmodels: a Python module that provides classes and functions for estimating many different statistical models, conducting statistical tests, and exploring statistical data.
- scikit-learn: an open-source, commercially usable (BSD-licensed) library that provides simple and efficient tools for predictive data analysis.
- XGBoost: an optimized distributed gradient boosting library designed to be highly efficient, flexible, and portable. It implements machine learning algorithms under the gradient boosting framework.
- math: a built-in Python module that provides access to the mathematical functions defined by the C standard.
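After installing, it can be useful to confirm that the packages listed above are actually available in the active environment. The following is a minimal sketch, not part of the repository: the `PACKAGES` list and the `installed_versions` helper are hypothetical names, and the distribution names are assumed to match what `requirements.txt` installs. It uses only the standard-library `importlib.metadata` module, which is available in Python 3.9. The built-in `math` module is omitted because it ships with Python itself.

```python
import sys
from importlib import metadata

# Assumption: these PyPI distribution names match the entries in requirements.txt.
PACKAGES = [
    "pandas",
    "numpy",
    "matplotlib",
    "seaborn",
    "scipy",
    "statsmodels",
    "scikit-learn",
    "xgboost",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Print the interpreter version followed by one line per package.
    print("Python", sys.version.split()[0])
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as `NOT INSTALLED` suggests that `pip install -r requirements.txt` did not complete in the environment the script is running in.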