Understanding Life Expectancy Predictors Through Modeling

Life expectancy varies widely across the globe, so it is worth understanding what influences it. This project uses exploratory data analysis (EDA) and data transformation techniques to prepare a dataset for model assessment. A single linear regression model is fit to evaluate the effect of those techniques. The final model takes into account death rates, health status, and income.
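As a rough illustration of the modeling step described above, the sketch below fits a linear regression with scikit-learn on synthetic data. It is not the notebook's code: the column names (`adult_mortality`, `bmi`, `income_index`, `life_expectancy`), the generated values, and the choice to standardize features are all assumptions made for the example, standing in for the death rate, health status, and income predictors.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data. Column names and coefficients are hypothetical,
# not taken from the Kaggle dataset.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "adult_mortality": rng.uniform(50, 400, n),  # death-rate proxy
    "bmi": rng.uniform(15, 40, n),               # health-status proxy
    "income_index": rng.uniform(0.3, 0.9, n),    # income proxy
})
df["life_expectancy"] = (
    80
    - 0.05 * df["adult_mortality"]
    + 0.1 * df["bmi"]
    + 10 * df["income_index"]
    + rng.normal(0, 2, n)  # random noise
)

features = ["adult_mortality", "bmi", "income_index"]
X = df[features]
y = df["life_expectancy"]

# Hold out 20% of the rows to assess the fitted model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Standardize the features, then fit ordinary least squares.
model = make_pipeline(StandardScaler(), LinearRegression())
model.fit(X_train, y_train)

# R^2 on the held-out rows, plus one coefficient per standardized feature.
r2 = model.score(X_test, y_test)
coefs = model.named_steps["linearregression"].coef_

print(f"Test R^2: {r2:.3f}")
for name, coef in zip(features, coefs):
    print(f"{name}: {coef:.3f}")
```

Because the features are standardized, the printed coefficients can be compared by magnitude. Running the same fit before and after a transformation step is one simple way to see whether that step helped.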

The Data

The data is provided by Kaggle, found here.


The project is written in Python and uses the following packages:

  • Pandas
  • NumPy
  • Matplotlib
  • scikit-learn
  • Yellowbrick
  • Seaborn


All necessary code is included in the Jupyter notebook. The data file can be found in the data folder.