@Jacob Moore To steal an analogy from a Custora post: imagine the Pareto distribution as a "coin flip" for whether a customer will return or not, scored on a 0 to 1 scale. If the score is above 0.5, we assume they're coming back. The curve is exponential because, for most businesses, a large proportion of customers purchase only once. The mu parameter controls how steep this curve is, so if your business has a high churn rate, the model will also predict that most customers have churned.

So the distribution is really saying "what is the likelihood that this customer will ever return?"
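
To make the coin-flip picture concrete, here is a rough Python sketch. It assumes the simplest reading of the description above: a single customer's chance of still being active decays exponentially with time at a dropout rate `mu`. The function names, the time unit, and the example `mu` values are made up for illustration and are not taken from any particular library.

```python
import math


def prob_still_active(mu: float, t: float) -> float:
    """Probability that a customer has not churned by time t.

    Assumes an exponential lifetime with dropout rate mu (illustrative only).
    """
    return math.exp(-mu * t)


def will_return(mu: float, t: float, threshold: float = 0.5) -> bool:
    """Apply the 'above 0.5 means they are coming back' rule of thumb."""
    return prob_still_active(mu, t) > threshold


if __name__ == "__main__":
    # Hypothetical dropout rates: a low-churn and a high-churn business.
    for mu in (0.05, 0.5):
        p = prob_still_active(mu, t=4)
        print(f"mu={mu}: P(active at t=4)={p:.3f}, returning={will_return(mu, 4)}")
```

With a larger `mu` the curve drops faster, so the same elapsed time pushes more customers below the 0.5 cutoff. A full Pareto/NBD model also lets `mu` vary from customer to customer, which this sketch leaves out.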

