Ahmed Shehata


View the Project on GitHub AhmedShehata2002/Portfolio-

Data Scientist

Technical Skills, Libraries and Software Tools

Python; C; Command Line; SQL; Seaborn; Exploratory Data Analysis; NumPy; Pandas; Matplotlib; Scikit-learn; SciPy; Power BI; Tableau; Canva; Microsoft Excel; Microsoft PowerPoint; Microsoft Office; Database Management; Data Preprocessing; Data Wrangling; Data Cleaning; Predictive Modelling; Generative AI; Deep Learning; Model Deployment; Machine Learning; Hyperparameter Tuning; Hypothesis and A/B Testing; Business Intelligence; Ensemble Techniques; Supervised Learning; Classification; Unsupervised Learning; Statistical Analysis; AdaBoost; Gradient Boosting; XGBoost; Dashboard Development; Decision Tree Algorithms; Linear Regression; K-means and Hierarchical Clustering

Education

Work Experience

Data Analyst and Marketing Intern @ FHS Law Consultants (Sept 2022)

Healthcare Data Analyst Intern @ Al Zaabi Healthcare (ADAM & EVE) (Aug 2022)

Property Analyst Intern @ AMIGO Properties (Real Estate) (Jul 2022)

Leaderboard Coordinator and Tournament Oversight Manager @ Abu Dhabi Golf Club (Dec 2019 - Dec 2020)

Special Needs Cycling Instructor and Volunteer @ GMS Special Olympics (Sep 2018 - May 2019)

Event Management and Media Awareness Coordinator @ Emirates Red Crescent (Sep 2018 - May 2019)

Health and Ergonomics Research Assistant @ Perfect Balance Rehabilitation Center (Sep 2018 - May 2019)

Projects

Note: I am unable to publicly share specific statistics or data regarding the project, as they are copyrighted by PGP. For further information about the project, please feel free to contact me directly.


FoodHub Order Analysis: Python Foundations

Project Summary:

In the FoodHub Order Analysis project, conducted during the PGP Course, I used Python to analyze data from a food aggregator company and derive actionable insights. The project aimed to improve the customer experience by understanding demand for the various New York restaurants served through the company’s online portal. Through exploratory data analysis, including variable identification, univariate analysis, and bivariate analysis, I addressed key questions posed by the Data Science team to improve business strategies.

Key Highlights:

Skills List:


E-news Express Project: A/B Testing

Project Summary:

As part of the Business Statistics course, the E-news Express Project evaluated the effectiveness of a new landing page for an online news portal, E-news Express. Using statistical analysis, A/B testing, and visualization techniques, the project assessed the page’s ability to attract new subscribers and examined the relationship between conversion rates and users’ preferred language.
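As a rough illustration of the kind of comparison an A/B test like this involves, the sketch below runs a one-sided two-proportion z-test using only the Python standard library. The function name and the visitor and conversion counts are hypothetical placeholders, not the project's data or necessarily the exact test used in the course.

```python
import math

def two_proportion_ztest(conv_new, n_new, conv_old, n_old):
    """One-sided z-test: is the new page's conversion rate higher than the old page's?"""
    p_new = conv_new / n_new
    p_old = conv_old / n_old
    # Pooled proportion under the null hypothesis of equal conversion rates
    p_pool = (conv_new + conv_old) / (n_new + n_old)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_new + 1 / n_old))
    z = (p_new - p_old) / se
    # Upper-tail p-value from the standard normal distribution
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

# Placeholder counts for illustration only (not the project's figures)
z, p_value = two_proportion_ztest(conv_new=60, n_new=200, conv_old=44, n_old=200)
print(f"z = {z:.3f}, one-sided p-value = {p_value:.4f}")
```

A p-value below the chosen significance level (commonly 0.05) would suggest that the new page converts better than the old one.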

Key Highlights:

Skills List:


ReCell Project: Supervised Learning Foundations

Project Summary:

The ReCell Project, conducted during the Supervised Learning - Foundations course, focused on developing a dynamic pricing strategy for used and refurbished devices. Using Python and linear regression, the project aimed to identify the key factors influencing device prices. By checking the model assumptions and translating the results into actionable insights, the project showed how a data-driven approach can improve pricing decisions.
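To make the regression idea concrete, here is a minimal single-feature sketch of ordinary least squares in plain Python. The toy data (device age versus resale price) and the function names are assumptions made up for illustration; the actual project would have involved more features and real data.

```python
from statistics import mean

def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    y_bar = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical toy data: device age in years vs. resale price
age_years = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0]
price = [410.0, 380.0, 335.0, 300.0, 240.0, 170.0]

intercept, slope = fit_simple_ols(age_years, price)
r2 = r_squared(age_years, price, intercept, slope)
residuals = [p - (intercept + slope * a) for a, p in zip(age_years, price)]

print(f"price = {intercept:.1f} + ({slope:.1f}) * age, R^2 = {r2:.3f}")
# With an intercept in the model, OLS residuals average to (numerically) zero
print(f"mean residual = {mean(residuals):.2e}")
```

Inspecting the residuals is a first step towards checking assumptions such as linearity and constant variance.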

Key Highlights:

Skills List:


INN Hotel Analysis: Supervised Learning Classification

Project Summary:

In the INN Hotels project, conducted during the Supervised Learning - Classification course, the primary objective was to develop a predictive model that identifies, in advance, bookings likely to be canceled. Using supervised learning techniques such as Logistic Regression and Decision Tree algorithms, along with exploratory data analysis and data preprocessing, the project aimed to identify the factors that most influence booking cancellations and to formulate profitable policies for cancellations and refunds.

Key Highlights:

Skills List:

Actionable Insights and Recommendations:

Profitable policies for cancellations and refunds:


Easy Visa Project: Ensemble Techniques Bagging and Boosting

Project Summary:

In the EasyVisa project, conducted during the Ensemble Techniques course, the primary objective was to build a predictive model to streamline the visa approval process and recommend suitable applicant profiles for visa certification or denial. Using ensemble techniques such as Bagging Classifiers (Bagging and Random Forest), Boosting Classifiers (AdaBoost, Gradient Boosting, XGBoost), and a Stacking Classifier, along with exploratory data analysis (EDA) and data preprocessing, the project aimed to identify the factors that most influence visa status and to offer actionable insights for decision-making.

Key Highlights:

Skills List:

Actionable Insights and Recommendations:

Actionable Insights:


ReneWind Project: Model Tuning

Project Summary:

In the “ReneWind” project, our collaboration with a leading wind energy company focused on leveraging machine learning techniques to predict generator failures in wind turbines. By analyzing sensor data, our goal was to reduce maintenance costs and minimize downtime, ultimately enhancing operational efficiency and optimizing resource allocation.

Key Highlights:

Skills:

Actionable Insights and Recommendations:


Trade&Ahead Project: Unsupervised Learning

Project Summary:

In the “Trade&Ahead” project, we collaborated with a financial consultancy firm to analyze stock data and group stocks based on their attributes. Our goal was to provide insights into the characteristics of each group, enabling personalized investment strategies for clients.
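Grouping stocks by their attributes can be sketched with K-means, one of the clustering methods named in the skills list above. The version below is a minimal pure-Python take on Lloyd's algorithm; the two features, the six data points, and the starting centroids are hypothetical, and in practice the features would normally be standardized first.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # leave an empty cluster's centroid in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical stocks described by (volatility, price-to-earnings ratio)
stocks = [(1.0, 12.0), (1.2, 14.0), (0.9, 11.0), (3.0, 40.0), (3.2, 42.0), (2.8, 38.0)]
labels, centroids = kmeans(stocks, centroids=[stocks[0], stocks[3]])
print(labels)
print(centroids)
```

Each centroid summarizes the typical attributes of its group, which is the starting point for describing what characterizes each cluster.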

Key Highlights:

Skills:

Recommendations:

Personal Projects

Credit Card Default


Project

In our project, we aimed to develop a predictive model using machine learning classification techniques to assess the risk of default among loan applicants. Drawing on a range of data points, including credit history and loan amount, we sought to improve the model's accuracy by combining numeric and categorical attributes.

Key findings:

Conclusion: Our examination underscored the critical importance of certain features, such as credit history and checking balance, in determining default risk. Furthermore, the utilization of post-pruning decision trees significantly enhanced our model’s recall rates, emphasizing their effectiveness in this context. These insights provide valuable guidance for future enhancements in credit risk assessment methodologies.
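For readers less familiar with the metric, recall is the share of actual defaulters that a model correctly flags. The sketch below computes it for two sets of made-up predictions; the labels, predictions, and names are hypothetical and do not reproduce the project's results.

```python
def recall(y_true, y_pred, positive=1):
    """Share of actual positives (here: defaults) that the model flags."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    actual_pos = true_pos + false_neg
    return true_pos / actual_pos if actual_pos else 0.0

# Made-up labels and predictions: 1 = default, 0 = no default
y_true      = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
candidate_a = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
candidate_b = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]

print(recall(y_true, candidate_a))  # 0.5
print(recall(y_true, candidate_b))  # 0.75
```

In credit risk, a missed defaulter is usually costlier than a false alarm, which is why recall is often the metric to optimize.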


Coffee Sales Interactive Dashboard
Dashboard

Developed an interactive coffee sales dashboard featuring a timeline, slicers, and charts. Key skills used in the project included XLOOKUP, INDEX MATCH, multiplication formulas, nested IF functions, date and number formatting, checking for duplicates, converting ranges to tables, working with pivot tables and pivot charts, and formatting. The dashboard was then built, updated, and refined for a better user experience. Note: You may have to download the file to use all features.

SQLBankchurn

Analyzing Subscription Churn Rates: Leveraging Advanced SQL Techniques for Actionable Insights
SQL Project

In this project, we’re diving into subscription churn rates for Codeflix, a new streaming service. Our main goal is to figure out how many users are canceling their subscriptions over time. We’re also looking at two different groups of users to see if there are differences in how often they cancel. We’ll dig through the subscription data, following Codeflix’s rules like the minimum subscription length of 31 days. We’ll use some advanced SQL techniques like aggregates, unions, temporary tables, cross joins, case statements, and aliasing to do this. Our aim is to provide insights that help Codeflix keep users around and grow their business.
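The monthly churn calculation can be sketched with Python's built-in `sqlite3` module. The table layout, column names, and rows below are assumptions for illustration rather than the actual Codeflix data. Churn rate here is the number of cancellations in a month divided by the number of subscriptions active on its first day. Because of the 31-day minimum, a subscription cannot start and end within the same month, so only subscriptions that started before the month began are counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subscriptions (
    id INTEGER PRIMARY KEY,
    subscription_start TEXT,
    subscription_end TEXT,   -- NULL while the subscription is still active
    segment INTEGER
);
-- Hypothetical rows for illustration only
INSERT INTO subscriptions VALUES
    (1, '2016-12-01', '2017-01-15', 1),
    (2, '2016-12-05', NULL,         1),
    (3, '2016-12-10', '2017-02-20', 1),
    (4, '2016-12-03', '2017-01-20', 2),
    (5, '2016-12-12', '2017-01-25', 2),
    (6, '2016-12-20', NULL,         2);
""")

query = """
WITH months AS (
    SELECT '2017-01-01' AS first_day, '2017-01-31' AS last_day
    UNION
    SELECT '2017-02-01', '2017-02-28'
),
status AS (
    SELECT
        s.segment,
        m.first_day AS month,
        -- Active: started before the month and not yet ended on its first day
        CASE WHEN s.subscription_start < m.first_day
              AND (s.subscription_end >= m.first_day OR s.subscription_end IS NULL)
             THEN 1 ELSE 0 END AS is_active,
        -- Canceled: started before the month and ended during it
        CASE WHEN s.subscription_start < m.first_day
              AND s.subscription_end BETWEEN m.first_day AND m.last_day
             THEN 1 ELSE 0 END AS is_canceled
    FROM subscriptions AS s
    CROSS JOIN months AS m
)
SELECT month, segment,
       ROUND(1.0 * SUM(is_canceled) / SUM(is_active), 2) AS churn_rate
FROM status
GROUP BY month, segment
ORDER BY month, segment;
"""

rows = conn.execute(query).fetchall()
for month, segment, churn_rate in rows:
    print(month, segment, churn_rate)
```

The cross join pairs every subscription with every month, the CASE expressions flag its status in that month, and the final aggregate turns those flags into a churn rate per month and segment.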

Oura Health Metrics

Personal Health Ring

This is my upcoming project. I will download more than five months of my personal activity and sleep metrics to assess my performance based on my “readiness level”, activity level, and sleep. Other metrics will include body temperature, REM sleep, HRV, steps, equivalent walking distance, and recovery index. I will use this data to assess how I could further improve my health and wellbeing, and to provide an overview of my physical performance over the last few months.