
Used Car Price Prediction

The project aimed to develop a multiple regression model and a program that estimates used car prices in the market. The team obtained a dataset from Kaggle, focused on six popular car brands, and cleaned and explored the data. Two regression models were created, one fitted for individual car brands and another for all brands combined, with an average R-squared value of 87%. A Python program was developed that lets users enter a car's variables and receive an estimated price, giving potential buyers a tool to check prices before making a purchasing decision.

Python was used to check, validate, and filter the data. To ensure the quality of our data, we first applied basic data cleaning procedures, such as checking for invalid values and null values.
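
The kind of check described above can be sketched as follows. This is a minimal illustration in plain Python, not the project's actual code: the function name `find_problem_rows`, the column names (`year`, `mpg`, `engineSize`, `tax`), and the sample rows are all assumptions made up for the example.

```python
def find_problem_rows(rows, numeric_fields):
    """Return (row_index, field, reason) for each missing or non-numeric value."""
    problems = []
    for i, row in enumerate(rows):
        for field in numeric_fields:
            value = row.get(field)
            # Treat None and blank strings as null values.
            if value is None or str(value).strip() == "":
                problems.append((i, field, "missing"))
                continue
            # Treat anything that cannot be parsed as a number as invalid.
            try:
                float(value)
            except (TypeError, ValueError):
                problems.append((i, field, "not numeric"))
    return problems


# Made-up rows for illustration only; column names are assumed.
sample_rows = [
    {"year": "2017", "mpg": "55.4", "engineSize": "1.4", "tax": "145"},
    {"year": "", "mpg": "60.1", "engineSize": "1.0", "tax": "20"},
    {"year": "2015", "mpg": "n/a", "engineSize": "2.0", "tax": None},
]

problems = find_problem_rows(sample_rows, ["year", "mpg", "engineSize", "tax"])
for row_index, field, reason in problems:
    print(f"row {row_index}: {field} is {reason}")
```

Reporting problems rather than dropping rows immediately makes it easier to inspect what is wrong before deciding how to filter.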

We noticed several unreasonable observations in the dataset, such as used cars from the year 2060, cars rated at 470 miles per gallon (mpg), and cars with zero engine size.

The images here show the unreasonable observations and the code used to rectify the issue by limiting the year to 2000-2022, the MPG to 10-70, the engine size to more than 0, and the tax to more than 0.
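
The filtering rules stated above can be sketched as a simple predicate. This is an illustrative sketch rather than the code shown in the images: the column names and sample records are assumptions, and treating the year and MPG limits as inclusive is also an assumption.

```python
def is_reasonable(car):
    """Apply the stated limits: year 2000-2022, MPG 10-70, engine size > 0, tax > 0."""
    return (
        2000 <= car["year"] <= 2022   # inclusive bounds are an assumption
        and 10 <= car["mpg"] <= 70    # inclusive bounds are an assumption
        and car["engineSize"] > 0
        and car["tax"] > 0
    )


# Made-up records for illustration only; column names are assumed.
cars = [
    {"year": 2017, "mpg": 55.4, "engineSize": 1.4, "tax": 145},   # kept
    {"year": 2060, "mpg": 50.0, "engineSize": 1.6, "tax": 150},   # year out of range
    {"year": 2015, "mpg": 470.0, "engineSize": 1.0, "tax": 20},   # mpg out of range
    {"year": 2012, "mpg": 45.0, "engineSize": 0.0, "tax": 125},   # zero engine size
]

cleaned = [car for car in cars if is_reasonable(car)]
print(f"kept {len(cleaned)} of {len(cars)} records")
```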

Figures: Variable Selection Using SPSS; Heatmap Correlation; Scatter Plot of Individual Car Brand Multiple Regression Results
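
The multiple regression and price-estimation steps described in the summary can be sketched as follows. This is a minimal, assumption-laden illustration and not the project's model. The features (age, mileage, engine size), the function names, and the toy data are all hypothetical, and the fit uses a plain normal-equations solve rather than whatever tool the team used. The toy prices are generated from an exact linear rule, so the fit is perfect by construction and says nothing about real-world accuracy.

```python
def fit_ols(X, y):
    """Fit y = b0 + b1*x1 + ... by solving the normal equations (X'X) b = X'y."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(A[i][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for i in range(k):
            if i != c:
                factor = A[i][c]
                A[i] = [vi - factor * vc for vi, vc in zip(A[i], A[c])]
    return [A[i][k] for i in range(k)]


def predict(coefs, features):
    """Estimate a price from the intercept and one coefficient per feature."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], features))


def r_squared(y, y_hat):
    """R-squared = 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot


# Toy data (made up): columns are age in years, mileage in thousands, engine size.
X = [(1, 10, 1.0), (3, 30, 1.5), (5, 40, 2.0), (7, 90, 1.2), (2, 25, 3.0), (9, 60, 1.6)]
# Prices follow 20000 - 1000*age - 50*mileage + 3000*engine exactly.
y = [21500, 20000, 19000, 12100, 25750, 12800]

coefs = fit_ols(X, y)
fitted = [predict(coefs, row) for row in X]
score = r_squared(y, fitted)
estimate = predict(coefs, (4, 35, 1.8))  # a hypothetical car entered by a user
print(f"R-squared: {score:.3f}, estimated price: {estimate:.0f}")
```

A per-brand model, as described in the summary, would repeat the same fit on each brand's subset of rows.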

FULL REPORT BELOW


© 2023 by Prithvi Raj Juloori. Powered and secured by Wix
