Data Science Tutorials

Hypothesis testing and regression: Determining which marketing strategy will increase sales and conversion, and how various factors influence them

This post presents a detailed analysis for a start-up that aims to make a data-driven decision on the marketing strategy that maximizes sales and conversion, and to understand the factors driving them.

Predicting review sentiments using NLP and Deep Learning with PyTorch

NLP
Python
Deep Learning

This post describes an end-to-end project on predicting the sentiment of customer reviews using Natural Language Processing and Deep Learning with PyTorch.

One-Hot representation: Encoding text for Natural Language Processing (NLP) project - Part I

NLP
Python

One-hot encoding is one of the techniques used to represent text for natural language processing tasks. This post discusses one-hot encoding as Part I of the discussions on text representation in NLP.

Term Frequency-Inverse Document Frequency (TF-IDF): Encoding text for Natural Language Processing (NLP) project - Part II

NLP
Python

This post discusses TF-IDF encoding as Part II of the discussions on text representation in NLP.

Online Marketing Ads click prediction: End-to-End workflow for machine learning solution

Machine learning
Python

This post demonstrates the process of predicting the number of ad clicks, with a focus on the procedures and logic behind the modeling. It details the exploratory analysis undertaken to inform the modeling process and the implementation of non-parametric machine learning models such as decision trees and ensemble models, along with hyperparameter optimization and bagging and boosting techniques to improve model performance. The post also provides insights into the influence of various features on the model and, more importantly, whether the modeling exercise was worth it, by comparing the results to a benchmark model. Not all techniques were implemented, including some capable of further reducing the error. Nonetheless, a non-exhaustive overview of how to improve the model is provided.

Interactive time series data visualization – Dygraphs in R

visualization
time series

This post demonstrates how to undertake customized interactive visualization using time series data.

Analyzing the impact of platform features on booking sales

Linear regression
revenue impact assessment

This post analyzes how the various growth strategies employed by a business actually influence its sales. The data-driven approach aims to assess how the firm should move forward. To gain such actionable insights, the post demonstrates how to undertake a linear regression with a focus on the interpretability of the results.

Product analysis: A/B testing and KPI analysis of products for a start-up

product analysis
R
A/B testing
Hypothesis testing
Product KPI analysis

This post discusses identifying and analyzing KPIs for products, as well as undertaking A/B testing to make recommendations that are relevant to product teams. The post is an end-to-end project that demonstrates how to support product development with statistical analysis.

Hierarchical clustering -- Which states in Nigeria have similar expenditure profiles?

Unsupervised learning

Clustering analysis is a popular unsupervised machine learning technique that categorizes data into groups based on similarity.

Data visualization

visualization

This post covers the use of ggplot for data visualization.

Welcome to Datasiast!

Datasiast exists for enthusiastic data science knowledge sharing and practice.
