Projects

Predicting S&P 500 index price

Predicting stock prices is a popular use case for machine learning and predictive analytics. The S&P 500 (Standard & Poor’s 500) is a stock market index that tracks the stock performance of 500 large companies in the US.
In this project, I built models to predict the index price from the available variables and historical data, using decision trees, random forests, gradient boosting, neural networks, and Facebook’s Prophet.

Github Code

NYC Airbnb Data Analysis

Airbnb offers lodging, primarily homestays, and tourism experiences. Like Uber, the company does not own any of the real estate, nor does it host any events. It acts as a middleman between the customer and the owner providing the accommodation, and it receives a commission from each booking. This project explores how Airbnb prices vary across different areas of New York and which variables influence the price of a stay.

Github Code

OpenWeatherForecast API

This program fetches weather forecast data using the OpenWeatherMap APIs. It prompts the user to enter a city name or zip code and calls the API with that input. If the connection fails or the input is invalid, an error message is displayed to the user. On success, the forecast for the next 5 days is displayed in 3-hour intervals.
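The flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names (`build_forecast_url`, `fetch_forecast`, `format_forecast`), the default country code, and the choice of imperial units are assumptions. The endpoint is OpenWeatherMap's public 5 day / 3 hour forecast API, and an API key is required to call it.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

# Public endpoint for OpenWeatherMap's 5 day / 3 hour forecast.
FORECAST_URL = "https://api.openweathermap.org/data/2.5/forecast"


def build_forecast_url(location, api_key, country="us"):
    """Build the request URL from a city name or zip code (hypothetical helper)."""
    location = location.strip()
    if not location:
        raise ValueError("Enter a city name or zip code.")
    # Imperial units are an assumption; the API also accepts "metric".
    params = {"appid": api_key, "units": "imperial"}
    if location.isdigit():
        params["zip"] = f"{location},{country}"  # zip lookups take "zip,country"
    else:
        params["q"] = location  # city-name lookup
    return f"{FORECAST_URL}?{urllib.parse.urlencode(params)}"


def fetch_forecast(location, api_key):
    """Call the API; return (payload, None) on success or (None, message) on failure."""
    try:
        url = build_forecast_url(location, api_key)
        with urllib.request.urlopen(url, timeout=10) as response:
            return json.load(response), None
    except ValueError as exc:
        # Empty input, or a response body that is not valid JSON.
        return None, str(exc)
    except urllib.error.HTTPError as exc:
        # Must come before URLError, because HTTPError is a subclass of it.
        return None, f"Request rejected (HTTP {exc.code}); check the city or zip code."
    except urllib.error.URLError as exc:
        return None, f"Connection failed: {exc.reason}"


def format_forecast(payload):
    """Turn the API's `list` of 3-hour entries into printable lines."""
    lines = []
    for entry in payload.get("list", []):
        temp = entry["main"]["temp"]
        description = entry["weather"][0]["description"]
        lines.append(f"{entry['dt_txt']}: {temp:.1f} F, {description}")
    return lines
```

A caller would read the location with `input()`, pass it to `fetch_forecast` together with its own API key, and print either the error message or the lines returned by `format_forecast`.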

Github Code

Customer Churn in Banking Industry

Customers are the most valuable asset of any business, and retaining them has become a basic need for organizations in every industry. This project walks through the steps for predicting customer churn in the banking industry using the CRISP-DM methodology.

Github Code

Suicide Rate Analysis

This analysis compares socio-economic information with suicide rates by year and country.

The dataset comes from Kaggle; the link is given below for reference. It is compiled from four other datasets linked by time and place, and it was built to find signals correlated with increased suicide rates among different cohorts globally, across the socio-economic spectrum.

Kaggle data

Github Code

Predict Credit Card Approval

Predicting credit card approval is a classic use case in the finance industry. A handful of applications can be analyzed manually, but manual review is not feasible when a credit institution receives a large number of requests. Application volumes will keep increasing with the growing number of digital devices and digital transformation across the globe. Credit card companies therefore need a program that analyzes historical patterns in the data, identifies the factors that matter in an application, and decides whether to approve it based on the applicant’s risk score.

In this project, I created a machine learning program that reads existing approval and rejection patterns in credit card applications and builds a model to predict the approval status of future applications. I used predictive analytics and machine learning algorithms such as logistic regression to predict whether an application is approved.
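As a rough illustration of the logistic regression idea, here is a small from-scratch sketch. It is not the project's code, which may well use a library instead. The two features (income and debt ratio), the toy applications, and the function names are all made up for the example.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_regression(rows, labels, learning_rate=0.1, epochs=2000):
    """Fit weights and a bias with batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(rows, labels):
            score = bias + sum(w * x for w, x in zip(weights, features))
            error = sigmoid(score) - label  # derivative of the log loss w.r.t. the score
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        # Step against the average gradient over all rows.
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


def predict_approval(weights, bias, features, threshold=0.5):
    """Return 1 (approve) when the predicted probability reaches the threshold."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if sigmoid(score) >= threshold else 0


# Made-up, pre-scaled applications: [income, debt ratio]; 1 = approved, 0 = rejected.
applications = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.3, 0.8], [0.2, 0.9], [0.1, 0.7]]
decisions = [1, 1, 1, 0, 0, 0]

weights, bias = fit_logistic_regression(applications, decisions)
predictions = [predict_approval(weights, bias, row) for row in applications]
```

The model learns one weight per feature, so the sign and size of each weight give a first hint about which factors push an application toward approval or rejection.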

Github Code

Bike sharing programs are an elegantly simple answer to urban gridlock and air pollution, and they promote healthier lifestyles. It sounds like a simple system, but it’s not. Keeping hundreds of bikes tuned up, at the right stations, and available to users requires the help of technology.

Bike sharing networks are present in cities of all sizes. They give everyone, including visitors, a fun, cheap, and easy way to explore the city, and they have even become part of regular commuter routines. A bike in itself doesn’t use a lot of tech, but technology plays a major role in the rapid expansion of bike sharing networks and helps make them smart. Bike sharing companies use GPS sensors and smartphones to track the bikes. They also keep a credit card on file until the bike is returned (as with a rental car) so they can penalize the customer if the wheels of the bike go missing. Riders, meanwhile, can use apps to track down available rides or bike-share stations when they need them.

In this project, I will explore the various features of the bike sharing network and the relationships among them, and then predict the number of trips from each station. I will split the dataset into a training set and a test set, build models on the training set using decision trees and other algorithms, and use the test set to measure how accurately each model predicts the target variable. I will choose the best model based on that accuracy.
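The split, fit, and evaluate loop described above might look like the sketch below. It makes several simplifying assumptions: a one-split regression tree (a "stump") stands in for a full decision tree, mean absolute error stands in for "accuracy" because the target is a trip count, and the (feature, trips) rows are made up for illustration.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and split them into training and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def mean(values):
    return sum(values) / len(values)


def fit_stump(train):
    """Fit a one-split regression tree on (feature, target) pairs.

    Tries every midpoint between consecutive sorted feature values and keeps
    the split with the lowest squared error on the training data.
    """
    points = sorted(train)
    best = None
    for i in range(1, len(points)):
        if points[i][0] == points[i - 1][0]:
            continue  # no threshold fits between equal feature values
        threshold = (points[i][0] + points[i - 1][0]) / 2
        left = [y for _, y in points[:i]]
        right = [y for _, y in points[i:]]
        left_mean, right_mean = mean(left), mean(right)
        sse = sum((y - left_mean) ** 2 for y in left)
        sse += sum((y - right_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # Every feature value is identical, so fall back to the overall mean.
        overall = mean([y for _, y in points])
        return lambda x: overall
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def mean_absolute_error(model, rows):
    """Average absolute gap between predicted and actual trip counts."""
    return mean([abs(model(x) - y) for x, y in rows])


# Made-up rows: (station feature, number of trips).
rows = [(1, 10), (2, 12), (3, 11), (4, 9), (5, 50), (6, 52), (7, 49), (8, 51)]

train, test = train_test_split(rows)
stump = fit_stump(train)
train_mean = mean([y for _, y in train])
baseline = lambda x: train_mean  # always predicts the average training target

print("stump MAE:", mean_absolute_error(stump, test))
print("baseline MAE:", mean_absolute_error(baseline, test))
```

Comparing each candidate against a simple baseline on the same held-out test set is one way to decide which model is best; the model with the lower test error wins.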

Github Code

Predicting Loan Default Risk

Predicting loan default risk is a critical part of lending because lenders must know whether a loan will result in profit or loss. Generally, loans are profitable and generate revenue for banks through interest. Sometimes, however, a borrower defaults, and the lending bank loses money. It is therefore important that the lender can estimate the risk of a borrower defaulting before lending the money.

Given the many factors that might affect a borrower’s default rate, it may be infeasible to come up with good estimates manually. The objective of this project is to explore whether machine learning models can better predict a borrower’s loan default risk. Exploratory data analysis may let us describe loans and the financial situations of their borrowers and determine the key relationships between default rates and a few other features. We will also investigate key relationships between loan default risk and customer behavior.