salesanalyzer_mds
A Python package designed to simplify retail sales data analysis for small to medium-sized businesses. This tool offers a set of pre-built functions that make it easy to identify market segments, predict future sales, and analyze seasonal revenue trends.
Why sales_analyzer?
Small to medium-sized businesses (SMBs) often lack the resources for in-house data teams or complex analytics tools. sales_analyzer is here to bridge that gap by providing easy-to-use, specialized functions that allow businesses to extract valuable insights from their sales data without requiring deep expertise in data science.
Key Benefits:
Tailored for SMBs: No need for expensive or complex tools. Our package is designed specifically for small to medium-sized businesses to help them make data-driven decisions with ease.
Easy-to-use functions: Simple, pre-built functions for common retail sales tasks so you can get started right away.
Cost-effective: Instead of hiring a full-time data analytics team or paying for expensive software, this package offers an affordable, one-stop solution to meet your business’s analytical needs.
Actionable Insights: Gain a better understanding of your market segments and sales trends, which can inform inventory management, marketing strategies, and customer outreach.
How It Fits into the Python Ecosystem
While existing Python packages such as Pandas and Scikit-learn provide powerful general-purpose tools for data manipulation and machine learning, they require significant customization and specialized knowledge to be applied effectively to retail sales analysis. sales_analyzer complements these tools by streamlining common retail-specific tasks. It provides a suite of pre-built, easy-to-use functions specifically tailored to sales data, so businesses don’t need to spend time customizing solutions for their needs.
Installation
$ pip install salesanalyzer_mds
Functions
segment_revenue_share: Segments products into three price categories (cheap, medium, expensive) and calculates each category's share of total revenue.
predict_sales: Predicts future sales based on the provided historical data and the target.
sales_summary_statistics: Calculates a variety of summary statistics that provide insights into overall sales performance, customer behavior, and product performance.
Usage
salesanalyzer_mds can be used to extract sales data insights from available data.
Set up imports
from salesanalyzer_mds.sales_summary_statistics import sales_summary_statistics
from salesanalyzer_mds.segment_revenue_share import segment_revenue_share
from salesanalyzer_mds.predict_sales import predict_sales
import pandas as pd # additional import to handle your sales data
Load your sales data as a pandas DataFrame
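As a minimal sketch of this step, the snippet below reads a small in-memory CSV with pandas. The column names UnitPrice, Quantity, Description, Country and InvoiceDate match the ones used in the calls further down; InvoiceNo and CustomerID, along with all the sample rows, are assumptions made for illustration only.

```python
import io

import pandas as pd

# Hypothetical sample data. InvoiceNo and CustomerID are assumed column
# names; adjust them to whatever your own export uses.
csv_text = """InvoiceNo,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
1001,Mug,6,2024-01-05 10:15,2.50,17850,United Kingdom
1001,Lantern,2,2024-01-05 10:15,12.00,17850,United Kingdom
1002,Teapot,1,2024-01-06 14:30,45.00,13047,France
"""

# For a real file, pass its path instead of the StringIO buffer.
your_sales_data = pd.read_csv(io.StringIO(csv_text), parse_dates=["InvoiceDate"])

print(your_sales_data.dtypes)
```

Parsing InvoiceDate as a datetime at load time keeps date handling out of the later analysis steps.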
Retrieve the insights:
Summary statistics
sales_summary_statistics(your_sales_data)
The sales_summary_statistics() function returns a pandas DataFrame with:
‘total_revenue’: The total revenue generated by all sales.
‘unique_customers’: The number of unique customers.
‘average_order_value’: The average value of an order (sum of revenue per invoice).
‘top_selling_product_quantity’: The product with the highest quantity sold.
‘top_selling_product_revenue’: The product with the highest total revenue.
‘average_revenue_per_customer’: The average revenue generated by each customer.
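To make the definitions above concrete, here is a rough pandas sketch of how each figure could be computed by hand. It is not the package's implementation; the InvoiceNo and CustomerID column names and the sample rows are assumptions.

```python
import pandas as pd

# Hypothetical sample data with assumed column names.
sales = pd.DataFrame({
    "InvoiceNo": [1001, 1001, 1002, 1003],
    "CustomerID": [1, 1, 2, 1],
    "Description": ["Mug", "Lantern", "Teapot", "Mug"],
    "Quantity": [6, 2, 1, 4],
    "UnitPrice": [2.5, 12.0, 45.0, 2.5],
})

# Revenue per row: 15.0, 24.0, 45.0, 10.0
revenue = sales["Quantity"] * sales["UnitPrice"]

total_revenue = revenue.sum()                      # 94.0
unique_customers = sales["CustomerID"].nunique()   # 2

# Sum revenue per invoice (39.0, 45.0, 10.0), then average across invoices.
average_order_value = revenue.groupby(sales["InvoiceNo"]).sum().mean()

# Product with the largest summed quantity and the largest summed revenue.
top_selling_product_quantity = sales.groupby("Description")["Quantity"].sum().idxmax()
top_selling_product_revenue = revenue.groupby(sales["Description"]).sum().idxmax()

average_revenue_per_customer = total_revenue / unique_customers  # 47.0
```

In this sample the top product differs by measure: Mug leads on quantity while Teapot leads on revenue, which is why the two statistics are reported separately.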
Segment revenue share
segment_revenue_share(your_sales_data,
price_col='UnitPrice',
quantity_col='Quantity',
price_thresholds=None) # replace column names with your data column names
The segment_revenue_share() function returns a pandas DataFrame showing the total revenue share for each price segment: ‘cheap’, ‘medium’, ‘expensive’.
Custom price thresholds can be set using the price_thresholds parameter. If not specified, thresholds are determined automatically from the data.
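The idea behind price segmentation can be sketched in plain pandas as follows. This is an illustration under stated assumptions, not the package's code: the helper name revenue_share_by_price_segment is hypothetical, and falling back to the 1/3 and 2/3 price quantiles is only one possible way to pick thresholds automatically.

```python
import pandas as pd


def revenue_share_by_price_segment(df, price_col="UnitPrice",
                                   quantity_col="Quantity",
                                   price_thresholds=None):
    """Hypothetical helper: revenue share per price segment."""
    prices = df[price_col]
    if price_thresholds is None:
        # Assumption: use the 1/3 and 2/3 price quantiles as cut points.
        low, high = prices.quantile([1 / 3, 2 / 3])
    else:
        low, high = price_thresholds

    # Prices <= low are cheap, (low, high] are medium, > high are expensive.
    segment = pd.cut(prices,
                     bins=[-float("inf"), low, high, float("inf")],
                     labels=["cheap", "medium", "expensive"])

    revenue = prices * df[quantity_col]
    share = revenue.groupby(segment, observed=False).sum() / revenue.sum()
    return share.rename("revenue_share")


# Hypothetical sample data.
sales = pd.DataFrame({
    "UnitPrice": [2.5, 12.0, 45.0, 2.5],
    "Quantity": [6, 2, 1, 4],
})

share = revenue_share_by_price_segment(sales, price_thresholds=(5, 20))
print(share)
```

With thresholds of 5 and 20, the two rows priced at 2.5 fall into the cheap segment, the row at 12.0 into medium, and the row at 45.0 into expensive.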
Predict sales
predict_sales(your_sales_data,
new_data, # new sales data to base the predictions on
numeric_features = ['UnitPrice'],
categorical_features = ['Description', 'Country'],
target = 'Quantity',
date_feature = 'InvoiceDate')
The predict_sales() function returns a DataFrame of predicted values and prints the MSE (mean squared error) score.
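For readers curious what such a prediction workflow can look like, here is a small scikit-learn sketch that mirrors the numeric, categorical, date and target arguments above. It is not how predict_sales() is implemented: the Ridge model, the month and day-of-week date features, the add_date_parts helper and the sample data are all assumptions chosen to keep the example short.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical historical sales data.
history = pd.DataFrame({
    "UnitPrice": [2.5, 12.0, 45.0, 2.5, 12.0, 45.0],
    "Description": ["Mug", "Lantern", "Teapot", "Mug", "Lantern", "Teapot"],
    "Country": ["United Kingdom", "United Kingdom", "France",
                "France", "United Kingdom", "France"],
    "InvoiceDate": pd.to_datetime(["2024-01-05", "2024-01-12", "2024-02-03",
                                   "2024-02-10", "2024-03-01", "2024-03-09"]),
    "Quantity": [6, 2, 1, 4, 3, 1],
})
# Hypothetical new rows to predict for (no target column).
new_data = history.drop(columns="Quantity").iloc[:2]


def add_date_parts(df, date_feature):
    """Hypothetical helper: replace the date column with numeric parts."""
    out = df.copy()
    out["month"] = out[date_feature].dt.month
    out["dayofweek"] = out[date_feature].dt.dayofweek
    return out.drop(columns=date_feature)


preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["UnitPrice", "month", "dayofweek"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Description", "Country"]),
])
model = make_pipeline(preprocess, Ridge())

X = add_date_parts(history.drop(columns="Quantity"), "InvoiceDate")
y = history["Quantity"]
model.fit(X, y)

# MSE on the training rows, for illustration only; a held-out split
# would give a more honest estimate.
mse = mean_squared_error(y, model.predict(X))
print(f"MSE: {mse:.3f}")

predictions = pd.DataFrame({
    "predicted_Quantity": model.predict(add_date_parts(new_data, "InvoiceDate"))
})
```

Setting handle_unknown="ignore" on the encoder lets the pipeline score new rows that contain products or countries absent from the historical data.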
Developer notes:
Install Development Version
Clone the repository and navigate into the project root directory.
Create a new environment with Python 3.10:
conda create -n salesanalyzermds python=3.10
conda activate salesanalyzermds
Install Poetry by following these instructions, and then run the following bash command to install the necessary dependencies:
poetry install
Running The Tests
To test the salesanalyzer_mds package, run pytest from the root project directory:
pytest tests/
To assess the branch coverage for this package:
pytest --cov=salesanalyzer_mds --cov-branch
Dependencies
This package relies on the following dependencies as outlined in pyproject.toml:
python = ">=3.10"
scikit-learn = ">=1.6.1"
pandas = ">=2.2.3"
pytest = ">=8.3.4"
jupyter = ">=1.1.1"
myst-nb = ">=1.1.2"
sphinx-autoapi = ">=3.4.0"
sphinx-rtd-theme = ">=3.0.2"
Contributors
Yeji Sohn
Daria Khon
Franklin Aryee
Contributing
Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.
License
salesanalyzer_mds was created by Yeji Sohn, Daria Khon, and Franklin Aryee. It is licensed under the terms of the MIT license.
Credits
salesanalyzer_mds was created with cookiecutter and the py-pkgs-cookiecutter template.