DATA PROJECTS

Flight fare prediction by linear regression model

Project Summary: Conducted a linear regression analysis of fare data for JFK-bound flights from various origins. The dataset, originally 32 GB with 6M rows, was analyzed using Tableau and Python with Dask, Pandas, scikit-learn, and NumPy.

Analysis: The model achieved an R-squared of 58% and an RMSE of $141. Non-stop flights were associated with higher fares, while non-refundable tickets were associated with lower fares. The holiday season was identified as a key factor influencing fares.
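For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how R-squared and RMSE are computed from actual and predicted fares. It is not the project's code: the function names and the fare values are made up for illustration, and the project itself used scikit-learn rather than hand-written metrics.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target (dollars here)."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(actual, predicted):
    """Share of the variance in `actual` that the predictions explain."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)           # total sum of squares
    return 1 - ss_res / ss_tot


# Made-up fares and predictions, for illustration only (not project data).
actual_fares = [200.0, 300.0, 400.0, 500.0]
predicted_fares = [220.0, 280.0, 420.0, 480.0]

fare_rmse = rmse(actual_fares, predicted_fares)
fare_r2 = r_squared(actual_fares, predicted_fares)
print(f"RMSE: ${fare_rmse:.2f}, R-squared: {fare_r2:.3f}")
```

Read this way, an RMSE of $141 means a typical prediction error of roughly that many dollars, and an R-squared of 58% means the model explains a little over half of the variation in fares.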

Tools and Techniques: Tableau and Python with Dask, Pandas, scikit-learn, and NumPy 

Data Source: Flight fare dataset 

Data visualization: Tableau 

Data cleaning: Python code

Model: Python code

Exploratory data analysis on vulnerability data

Project Summary: Common Vulnerabilities and Exposures (CVE) is a common knowledge base of application vulnerabilities shared by vendors. The goal of the project was to surface key insights by analyzing 89,000 vulnerabilities across 40,500 products from 16,000 software vendors.

Analysis: Microsoft had the most vulnerabilities, while Adobe had more critical ones. 'SQL injection' ranked as the most severe CWE category, with 'Debian Linux' as the top target. Effective mitigation measures included multifactor authentication and local network protection. The impact on availability increased over time.
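A per-vendor comparison like the one above (total vulnerabilities versus critical ones) can be expressed as a single grouped SQL query. The sketch below is illustrative only: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the table name, column names, and rows are hypothetical, and treating a score of 9.0 or higher as "critical" is an assumption rather than the project's stated definition.

```python
import sqlite3

# In-memory database as a stand-in for the project's MySQL instance.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vulnerabilities (vuln_id TEXT, vendor TEXT, severity_score REAL)"
)

# Made-up rows, for illustration only (not taken from the CVE database).
rows = [
    ("EXAMPLE-1", "VendorA", 5.0),
    ("EXAMPLE-2", "VendorA", 6.5),
    ("EXAMPLE-3", "VendorA", 7.2),
    ("EXAMPLE-4", "VendorB", 9.8),
    ("EXAMPLE-5", "VendorB", 9.1),
]
conn.executemany("INSERT INTO vulnerabilities VALUES (?, ?, ?)", rows)

# Count all vulnerabilities per vendor, and separately those at or above
# the assumed "critical" threshold of 9.0.
query = """
    SELECT vendor,
           COUNT(*) AS total,
           SUM(CASE WHEN severity_score >= 9.0 THEN 1 ELSE 0 END) AS critical
    FROM vulnerabilities
    GROUP BY vendor
    ORDER BY total DESC, vendor
"""
vendor_stats = conn.execute(query).fetchall()
conn.close()

for vendor, total, critical in vendor_stats:
    print(f"{vendor}: {total} total, {critical} critical")
```

In this toy data, VendorA has the most vulnerabilities while VendorB has more critical ones, which is the same kind of contrast the analysis draws between vendors.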

Tools: Tableau, MySQL, Data Visualisation

Data Source: CVE database