salesanalyzer_mds.sales_summary_statistics
==========================================

.. py:module:: salesanalyzer_mds.sales_summary_statistics


Functions
---------

.. autoapisummary::

   salesanalyzer_mds.sales_summary_statistics.sales_summary_statistics


Module Contents
---------------

.. py:function:: sales_summary_statistics(sales_data: pandas.DataFrame, quantity_col: str = 'Quantity', price_col: str = 'UnitPrice', customer_col: str = 'CustomerID', invoice_col: str = 'InvoiceNo', description_col: str = 'Description') -> pandas.DataFrame

   Generate summary statistics for sales data, including total revenue,
   unique customers, average order value, top-selling products by quantity
   and by revenue, and average revenue per customer.

   Args:
   -----
   sales_data (pandas.DataFrame):
       A DataFrame containing sales data with at least the columns named by
       quantity_col, price_col, customer_col, invoice_col, and description_col.
   quantity_col (str):
       The name of the column containing the quantity sold.
   price_col (str):
       The name of the column containing the unit price of the product.
   customer_col (str):
       The name of the column containing the customer ID.
   invoice_col (str):
       The name of the column containing the invoice number.
   description_col (str):
       The name of the column containing the product description.

   Returns:
   --------
   pandas.DataFrame:
       A DataFrame containing the calculated summary statistics.
       If no sales data is provided, returns an empty DataFrame.

   The function computes the following statistics:

   - 'total_revenue': The total revenue generated by all sales.
   - 'unique_customers': The number of unique customers.
   - 'average_order_value': The average value of an order, where the value
     of an order is the sum of revenue on one invoice.
   - 'top_selling_product_quantity': The product with the highest total quantity sold.
   - 'top_selling_product_revenue': The product with the highest total revenue.
   - 'average_revenue_per_customer': The average revenue generated by each customer.

   Example:
   --------
   >>> import pandas as pd
   >>> from salesanalyzer_mds.sales_summary_statistics import sales_summary_statistics
   >>> df = pd.DataFrame({
   ...     'Quantity': [10, 5, 3, 15],
   ...     'UnitPrice': [100, 200, 150, 100],
   ...     'CustomerID': [1, 2, 1, 3],
   ...     'InvoiceNo': ['INV001', 'INV002', 'INV003', 'INV004'],
   ...     'Description': ['Product A', 'Product B', 'Product A', 'Product C']
   ... })
   >>> summary_df = sales_summary_statistics(df)
   >>> print(summary_df)
      total_revenue  unique_customers  average_order_value
   0         3950.0                 3                987.5
     top_selling_product_quantity top_selling_product_revenue
   0                    Product C                   Product C
      average_revenue_per_customer
   0                   1316.666667
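
The figures in the example output can be cross-checked by hand with plain
pandas. The following is a minimal sketch, assuming revenue is
``Quantity * UnitPrice`` per row and that averages are taken over per-invoice
and per-customer totals. It is not the package's implementation, and the
variable names are hypothetical.

.. code-block:: python

   import pandas as pd

   # Same sample data as in the docstring example.
   df = pd.DataFrame({
       "Quantity": [10, 5, 3, 15],
       "UnitPrice": [100, 200, 150, 100],
       "CustomerID": [1, 2, 1, 3],
       "InvoiceNo": ["INV001", "INV002", "INV003", "INV004"],
       "Description": ["Product A", "Product B", "Product A", "Product C"],
   })

   # Revenue per row (assumed definition: quantity times unit price).
   revenue = df["Quantity"] * df["UnitPrice"]

   total_revenue = float(revenue.sum())
   unique_customers = df["CustomerID"].nunique()

   # Mean of the per-invoice revenue totals.
   average_order_value = revenue.groupby(df["InvoiceNo"]).sum().mean()

   # Products with the largest summed quantity and summed revenue.
   top_by_quantity = df.groupby("Description")["Quantity"].sum().idxmax()
   top_by_revenue = revenue.groupby(df["Description"]).sum().idxmax()

   # Mean of the per-customer revenue totals.
   average_revenue_per_customer = revenue.groupby(df["CustomerID"]).sum().mean()

Under these assumptions the values agree with the printed summary: for
instance, Product A sells 13 units in total against 15 for Product C, so
Product C is the top seller by quantity.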