DATA PROJECTS
Flight fare prediction with a linear regression model
Project Summary: Conducted a linear regression analysis of flight fare data for JFK-bound flights from various origins. The dataset, originally 32 GB with 6 million rows, was analyzed using Tableau and Python (Dask, Pandas, scikit-learn, and NumPy).
Analysis: The model achieved an R-squared of 0.58 (58%) and an RMSE of $141. Non-stop flights were associated with higher fares, while non-refundable tickets were associated with lower fares. The holiday season was identified as a key factor influencing fares.
Tools and Techniques: Tableau and Python with Dask, Pandas, scikit-learn, and NumPy
Data Source: Flight fare dataset
Data visualization: Tableau
Data cleaning: Python code
Model: Python code
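For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how R-squared and RMSE can be computed for a linear model with binary features. It is not the project's code: it uses plain NumPy least squares rather than scikit-learn, and the data, the feature columns (nonstop, non_refundable, holiday), the coefficients, and the helper names fit_ols, r_squared, and rmse are all invented for illustration.

```python
import numpy as np


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns (intercept, coefs)."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta[0], beta[1:]


def r_squared(y_true, y_pred):
    """Share of the variance in y_true explained by the predictions."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target (dollars)."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic stand-in data (assumption): three 0/1 features per ticket,
# in the order nonstop, non_refundable, holiday. Values are made up.
rng = np.random.default_rng(0)
n = 5000
X = rng.integers(0, 2, size=(n, 3)).astype(float)
assumed_effects = np.array([60.0, -40.0, 80.0])  # illustrative, not real results
y = 300.0 + X @ assumed_effects + rng.normal(0.0, 100.0, size=n)

intercept, coefs = fit_ols(X, y)
y_pred = intercept + X @ coefs

print("coefficients:", np.round(coefs, 1))
print("R-squared:", round(r_squared(y, y_pred), 3))
print("RMSE:", round(rmse(y, y_pred), 1))
```

A positive coefficient on a 0/1 feature means fares are higher on average when the feature is present, which is how statements such as "non-stop flights were associated with higher fares" are read off a fitted model.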
Exploratory data analysis on vulnerability data
Project Summary: Common Vulnerabilities and Exposures (CVE) is a common knowledge base of application vulnerabilities shared across vendors. The goal of the project was to surface key insights by analyzing 89,000 vulnerabilities across 40,500 products from 16,000 software vendors.
Analysis: Microsoft had the most vulnerabilities overall, while Adobe had more critical ones. 'SQL injection' ranked as the most severe CWE category, and 'Debian Linux' was the most targeted product. Effective mitigation measures included multifactor authentication and local network protection. The impact on availability increased over time.
Tools and Techniques: Tableau, MySQL, data visualisation
Data Source: CVE database
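The project itself used MySQL and Tableau. As a rough illustration of the kind of aggregation behind the findings above, here is a small Python sketch that counts vulnerabilities per vendor and averages severity per weakness category. The records, vendor and product names, the score scale, and the 9.0 cutoff for "critical" are all assumptions made for the example, not values from the project.

```python
from collections import Counter

# Hypothetical records: (vendor, product, cwe_category, severity_score).
records = [
    ("VendorA", "ProductX", "SQL injection", 9.8),
    ("VendorA", "ProductX", "Cross-site scripting", 6.1),
    ("VendorA", "ProductY", "Buffer overflow", 7.5),
    ("VendorB", "ProductZ", "SQL injection", 9.1),
    ("VendorB", "ProductZ", "Buffer overflow", 9.3),
]

CRITICAL_THRESHOLD = 9.0  # assumed cutoff for a "critical" score


def count_by_vendor(rows, min_score=0.0):
    """Count vulnerabilities per vendor, keeping only scores >= min_score."""
    return Counter(vendor for vendor, _, _, score in rows if score >= min_score)


def mean_score_by_category(rows):
    """Average severity score for each weakness category."""
    totals, counts = {}, Counter()
    for _, _, category, score in rows:
        totals[category] = totals.get(category, 0.0) + score
        counts[category] += 1
    return {category: totals[category] / counts[category] for category in totals}


total = count_by_vendor(records)
critical = count_by_vendor(records, CRITICAL_THRESHOLD)
severity = mean_score_by_category(records)

print("most vulnerabilities:", total.most_common(1))
print("most critical vulnerabilities:", critical.most_common(1))
print("most severe category:", max(severity, key=severity.get))
```

The toy data is arranged so that one vendor leads on total count while another leads on critical count, showing how both statements can hold at once.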