Passionate about uncovering the story in every dataset. Specializing in data science, machine learning,
and transforming complex data into actionable insights.
As a passionate data science and machine learning enthusiast pursuing a Bachelor's in Computer Science and Information Technology, I am deeply committed to continuous learning and innovation. With a keen interest in uncovering patterns within complex datasets, I thrive in the world of algorithms and predictive models.
My academic journey has not only equipped me with a solid foundation in programming and data analytics but has also ignited a lifelong passion for transforming raw data into actionable insights, driving smarter decisions and impactful solutions.
This project analyzes Amazon's top 50 best-selling books from 2009 to 2019. It presents a visual analysis of the available data, and the main results are summarized below. The GIF animation here shows how user ratings changed over that period.
Key Features:
Top 10 books with the highest ratings
Top-selling authors
Exploratory data analysis (EDA) on top-selling books
Key Findings
Top Selling Authors and Books
Based on this analysis, Jeff Kinney stands out as the best-selling author over the 10-year period. The graph also shows user ratings rising as book prices increase. In other words, user satisfaction increased with price, which is a good sign for sustaining the business.
Here is the list of Authors and number of books they sold in a decade:
Author                                Books Sold (2009-2019)
Jeff Kinney                           12
Gary Chapman                          11
Rick Riordan                          11
Suzanne Collins                       11
American Psychological Association    10
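A ranking like this can be produced by counting how often each author appears in the yearly lists. The sketch below is a minimal, dependency-free illustration, assuming one row per list entry. The rows, the tuple layout, and the `top_authors` helper are hypothetical, not the project's actual data or code, and the source does not state how its counts were derived.

```python
from collections import Counter

# Toy rows standing in for the bestseller list: (title, author, year).
# Illustrative values only, not the project's data.
rows = [
    ("Book A", "Author X", 2009),
    ("Book A", "Author X", 2010),
    ("Book B", "Author Y", 2010),
    ("Book C", "Author X", 2011),
    ("Book D", "Author Z", 2011),
    ("Book B", "Author Y", 2011),
]


def top_authors(rows, n=5):
    """Return the n authors with the most list entries as (author, count) pairs."""
    counts = Counter(author for _title, author, _year in rows)
    return counts.most_common(n)


ranking = top_authors(rows, n=2)
print(ranking)
```

Note that this counts list entries, so a title that stays on the list for several years is counted once per year.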
Relation Between Genre and User Ratings Each Year
The bar chart depicts the preference of users for non-fiction books over fiction books. Over the 10-year period from 2009 to 2019, the highest user rating for non-fiction books was recorded in 2014, while the lowest rating was observed for fiction books in the same year. Interestingly, fiction books managed to outperform non-fiction in 2014 despite this trend.
Setting aside exceptional years, the overall user rating was higher for non-fiction books every year. Over the 10-year period, ratings for non-fiction rose while ratings for fiction fell.
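The yearly comparison between genres comes down to averaging user ratings per year and genre. Here is a minimal sketch of that grouping step, using only the standard library. The field names (`year`, `genre`, `rating`) and the sample values are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Toy records; field names and values are illustrative assumptions.
records = [
    {"year": 2009, "genre": "Fiction", "rating": 4.5},
    {"year": 2009, "genre": "Fiction", "rating": 4.0},
    {"year": 2009, "genre": "Non Fiction", "rating": 4.5},
    {"year": 2010, "genre": "Fiction", "rating": 4.0},
    {"year": 2010, "genre": "Non Fiction", "rating": 4.5},
    {"year": 2010, "genre": "Non Fiction", "rating": 5.0},
]


def mean_rating_by_year_and_genre(records):
    """Map (year, genre) to the mean user rating of that group."""
    groups = defaultdict(list)
    for record in records:
        groups[(record["year"], record["genre"])].append(record["rating"])
    return {key: mean(ratings) for key, ratings in groups.items()}


averages = mean_rating_by_year_and_genre(records)
for (year, genre), value in sorted(averages.items()):
    print(year, genre, value)
```

Each (year, genre) pair then becomes one bar in a grouped bar chart.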
This project demonstrates the power of data visualization in understanding market trends. The insights gained can help publishers, authors, and marketers make informed decisions about pricing strategies, genre selection, and market positioning.
Titanic Disaster Data Analysis
Project Overview
This project presents a comprehensive exploratory data analysis of the Titanic disaster, one of history's most tragic maritime incidents. By analyzing passenger demographics, socioeconomic factors, and survival rates, this study reveals the factors that influenced survival during the disaster.
Analysis Goals
Examine survival rates across different passenger classes
Analyze the impact of gender and age on survival
Investigate the relationship between fare prices and survival
Identify patterns in family size and survival outcomes
Visualize demographic distributions and survival statistics
Key Discoveries
Class and Survival
First-class passengers had significantly higher survival rates compared to second and third-class passengers, highlighting the stark socioeconomic disparities that influenced rescue priorities during the disaster.
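The class comparison is a survival rate computed per passenger class: survivors divided by passengers in that class. A minimal sketch follows, assuming each passenger is a record with a class number and a 0/1 survival flag. The field names and the toy rows are hypothetical and do not reproduce the real figures.

```python
from collections import defaultdict

# Toy passengers; illustrative values, not the real Titanic data.
passengers = [
    {"pclass": 1, "survived": 1},
    {"pclass": 1, "survived": 1},
    {"pclass": 1, "survived": 0},
    {"pclass": 2, "survived": 1},
    {"pclass": 2, "survived": 0},
    {"pclass": 3, "survived": 0},
    {"pclass": 3, "survived": 0},
    {"pclass": 3, "survived": 1},
    {"pclass": 3, "survived": 0},
]


def survival_rate_by_class(passengers):
    """Return {class: share of passengers in that class who survived}."""
    totals = defaultdict(int)
    survivors = defaultdict(int)
    for p in passengers:
        totals[p["pclass"]] += 1
        survivors[p["pclass"]] += p["survived"]
    return {c: survivors[c] / totals[c] for c in sorted(totals)}


rates = survival_rate_by_class(passengers)
print(rates)
```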
Gender Dynamics
The "women and children first" protocol was evident in the data, with female passengers showing dramatically higher survival rates across all classes. This analysis quantifies the impact of this emergency protocol.
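To check a claim of the form "across all classes", the survival rate needs to be broken down by gender within each class rather than by gender alone. The sketch below shows one way to build that cross-tabulation. The `(sex, pclass, survived)` tuple layout and the sample rows are assumptions for illustration only.

```python
from collections import defaultdict

# Toy passengers as (sex, pclass, survived); illustrative values only.
passengers = [
    ("female", 1, 1),
    ("female", 1, 1),
    ("male", 1, 0),
    ("male", 1, 1),
    ("female", 3, 1),
    ("female", 3, 0),
    ("male", 3, 0),
    ("male", 3, 0),
]


def survival_table(passengers):
    """Return {(sex, pclass): survival rate} for every group present."""
    counts = defaultdict(lambda: [0, 0])  # (sex, pclass) -> [survivors, total]
    for sex, pclass, survived in passengers:
        cell = counts[(sex, pclass)]
        cell[0] += survived
        cell[1] += 1
    return {key: s / n for key, (s, n) in counts.items()}


table = survival_table(passengers)
for key in sorted(table):
    print(key, table[key])
```

Comparing the female and male rows for the same class separates the effect of gender from the effect of class.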
Age Factor
Children had higher survival rates overall, though this advantage was more pronounced in higher passenger classes. The data reveals age-based patterns in rescue priorities.
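Age-based patterns are usually examined by bucketing ages into bands and computing a survival rate per band. The following is a rough sketch of that idea. The cut-offs (under 13 as "child", under 18 as "teen"), the tuple layout, and the decision to skip passengers with a missing age are all assumptions chosen for illustration, not the project's actual choices.

```python
def age_band(age):
    """Assign an age to a band; the cut-offs are illustrative assumptions."""
    if age is None:
        return None
    if age < 13:
        return "child"
    if age < 18:
        return "teen"
    return "adult"


def survival_rate_by_age_band(passengers):
    """Return {band: survival rate}, skipping passengers with unknown age."""
    totals, survivors = {}, {}
    for age, survived in passengers:
        band = age_band(age)
        if band is None:
            continue  # one possible way to handle missing ages
        totals[band] = totals.get(band, 0) + 1
        survivors[band] = survivors.get(band, 0) + survived
    return {band: survivors[band] / totals[band] for band in totals}


# Toy passengers as (age, survived); illustrative values only.
passengers = [
    (4, 1), (9, 1), (11, 0),
    (15, 1), (16, 0),
    (30, 0), (45, 1), (52, 0), (28, 0),
    (None, 1),
]

band_rates = survival_rate_by_age_band(passengers)
print(band_rates)
```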
This analysis showcases essential data science skills including data cleaning, exploratory data analysis, statistical interpretation, and effective visualization. The project demonstrates how data analysis can provide historical insights and reveal societal patterns.
Pokemon Data Analysis
Project Overview
This comprehensive data analysis explores the Pokemon universe through a statistical lens, examining battle statistics, type distributions, and evolutionary patterns across multiple generations. The project combines gaming data with data science techniques to reveal interesting patterns and insights.
Research Questions
What are the most common and rare Pokemon types?
How do base statistics vary across different generations?
Which Pokemon types have the highest average stats?
What patterns exist in legendary vs. regular Pokemon?
How have Pokemon designs evolved across generations?
Key Insights
Type Distribution
Water, Normal, and Grass types are the most common, while unique type combinations like Ice/Flying and Dragon/Ground are relatively rare. This distribution reflects both game balance and thematic design choices.
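Type frequencies like these come from two simple counts: one over primary types and one over dual-type combinations. A small sketch is shown here, assuming each entry is a `(name, primary_type, secondary_type)` tuple with `None` for single-type Pokemon. The six entries are a tiny hand-picked sample, so the counts say nothing about the full dataset.

```python
from collections import Counter

# Tiny hand-picked sample: (name, primary type, secondary type or None).
pokemon = [
    ("Squirtle", "Water", None),
    ("Psyduck", "Water", None),
    ("Bulbasaur", "Grass", "Poison"),
    ("Rattata", "Normal", None),
    ("Pidgey", "Normal", "Flying"),
    ("Articuno", "Ice", "Flying"),
]

# How often each primary type occurs.
primary_counts = Counter(primary for _name, primary, _secondary in pokemon)

# How often each dual-type combination occurs (single-type entries skipped).
combo_counts = Counter(
    (primary, secondary)
    for _name, primary, secondary in pokemon
    if secondary is not None
)

print(primary_counts.most_common())
print(combo_counts.most_common())
```

Rare combinations are the entries at the tail of `combo_counts.most_common()`.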
Statistical Analysis
Legendary Pokemon show significantly higher average base stats across all categories. The analysis reveals that Dragon and Psychic types tend to have the highest overall base stat totals among regular Pokemon.
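The legendary versus regular comparison is a group mean over a boolean flag. A minimal sketch follows, assuming each entry carries a base stat total and a `legendary` flag. The names, field names, and numbers are placeholders, not real game values.

```python
from statistics import mean

# Placeholder entries; names and totals are illustrative, not real stats.
entries = [
    {"name": "Regular A", "total": 300, "legendary": False},
    {"name": "Regular B", "total": 400, "legendary": False},
    {"name": "Regular C", "total": 500, "legendary": False},
    {"name": "Legendary A", "total": 580, "legendary": True},
    {"name": "Legendary B", "total": 680, "legendary": True},
]


def mean_total_by_legendary(entries):
    """Return {legendary flag: mean base stat total} for non-empty groups."""
    groups = {True: [], False: []}
    for entry in entries:
        groups[entry["legendary"]].append(entry["total"])
    return {flag: mean(totals) for flag, totals in groups.items() if totals}


means = mean_total_by_legendary(entries)
print(means)
```

The same grouping applied per stat (HP, Attack, and so on) would support the "across all categories" comparison.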
Generation Trends
Earlier generations featured simpler designs and type combinations, while later generations introduced more complex dual-type Pokemon and higher base stat distributions, showing the evolution of game design over time.
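Two numbers per generation are enough to examine this trend: the share of dual-type Pokemon and the mean base stat total. The sketch below computes both, assuming entries of the form `(generation, secondary_type, total)`. The layout and the values are hypothetical and only demonstrate the calculation.

```python
from collections import defaultdict
from statistics import mean

# Placeholder entries: (generation, secondary type or None, base stat total).
entries = [
    (1, None, 300),
    (1, None, 320),
    (1, "Flying", 340),
    (6, "Fairy", 400),
    (6, "Ghost", 420),
    (6, None, 440),
]


def summarize_by_generation(entries):
    """Return per-generation dual-type share and mean base stat total."""
    by_generation = defaultdict(list)
    for generation, secondary, total in entries:
        by_generation[generation].append((secondary, total))

    summary = {}
    for generation in sorted(by_generation):
        rows = by_generation[generation]
        dual = sum(1 for secondary, _total in rows if secondary is not None)
        summary[generation] = {
            "dual_type_share": dual / len(rows),
            "mean_total": mean([total for _secondary, total in rows]),
        }
    return summary


summary = summarize_by_generation(entries)
print(summary)
```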
Technologies & Tools
Python, Pandas, Plotly, Matplotlib, NumPy, Jupyter
Project Value
This project demonstrates the versatility of data analysis skills by applying them to gaming data. It showcases abilities in data manipulation, statistical analysis, interactive visualization, and insight generation, all of which transfer to business and research contexts.
Get In Touch
Have a project in mind? Let's work together!
Let's talk about your project
Feel free to reach out if you are interested in AI and ML. Let's get connected.