Answer to Question #279285 in Statistics and Probability for Madimet

Question #279285

1. Define the difference between a well-defined relationship and a poorly-defined relationship and draw graphs to depict the differences. 2. What are the unique characteristics of interpretation in logistic regression? 3. Explain the concept of odds and why it is used in predicting probability in a logistic regression procedure 


Expert's answer
2021-12-15T09:14:36-0500

1.

A well-defined relationship is one in which the data show a clear, consistent pattern: on a graph, the points cluster tightly around an identifiable trend (for example, a straight line), so the relationship between the variables is easy to describe and to model. A poorly-defined relationship is the opposite: the points are widely scattered, the pattern is weak or ambiguous, and no clear trend can be read from the graph.
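As a minimal sketch of the graphical difference, the simulation below (assuming numpy; the slope, intercept, and noise levels are hypothetical choices) generates the same underlying trend twice, once with little noise and once buried in heavy noise. The correlation coefficient summarizes what a scatter plot would show: points hugging a line versus a diffuse cloud.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)

# Well-defined relationship: points fall close to a clear line (low noise).
y_well = 2.0 * x + 1.0 + rng.normal(0.0, 0.5, x.size)

# Poorly-defined relationship: the same trend is buried in heavy noise.
y_poor = 2.0 * x + 1.0 + rng.normal(0.0, 20.0, x.size)

r_well = np.corrcoef(x, y_well)[0, 1]
r_poor = np.corrcoef(x, y_poor)[0, 1]

print(f"correlation, well-defined:   {r_well:.2f}")
print(f"correlation, poorly-defined: {r_poor:.2f}")
```

Plotting `(x, y_well)` and `(x, y_poor)` side by side would give the two graphs the question asks for: a tight line versus a scattered cloud.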



2.

Logistic regression models the probabilities for classification problems with two possible outcomes; it extends the linear regression model to such problems.

The logistic regression model uses the logistic function to squeeze the output of a linear equation between 0 and 1. The logistic function is defined as:

"logistic(\\eta)=\\frac{1}{1+e^{-\\eta}}"

For classification, we prefer probabilities between 0 and 1, so we wrap the right side of the equation into the logistic function. This forces the output to assume only values between 0 and 1.

"P(y^{(i)}=1)=\\frac{1}{1+exp(-(\\beta_0+\\beta_1x_1^{(i)}+...+\\beta_px_p^{(i)}))}"

These are the interpretations for the logistic regression model with different feature types:

  • Numerical feature: If you increase the value of feature x_j by one unit, the estimated odds change by a factor of \exp(\beta_j).
  • Binary categorical feature: One of the two values of the feature is the reference category (in some software, the one encoded as 0). Changing the feature x_j from the reference category to the other category changes the estimated odds by a factor of \exp(\beta_j).
  • Categorical feature with more than two categories: One way to handle multiple categories is one-hot encoding, meaning each category gets its own column. For a categorical feature with L categories you only need L-1 columns, otherwise the model is over-parameterized; the L-th category serves as the reference category. Any other encoding usable in linear regression works as well. The interpretation of each category is then equivalent to the interpretation of a binary feature.
  • Intercept \beta_0: When all numerical features are zero and the categorical features are at their reference categories, the estimated odds are \exp(\beta_0). The interpretation of the intercept weight is usually not of interest on its own.
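The first bullet can be checked numerically. This sketch (assuming numpy; the weights are hypothetical, not fitted to any data) computes the odds before and after increasing a numerical feature by one unit, and compares their ratio to \exp(\beta_1):

```python
import numpy as np

def odds(eta):
    # odds = P(y=1) / P(y=0), which equals exp(eta) under the logistic model
    p = 1.0 / (1.0 + np.exp(-eta))
    return p / (1.0 - p)

beta0, beta1 = -2.0, 0.7  # hypothetical weights
x = 3.0

# Ratio of odds after vs. before a one-unit increase in the feature.
ratio = odds(beta0 + beta1 * (x + 1)) / odds(beta0 + beta1 * x)
print(ratio, np.exp(beta1))
```

The two printed values agree regardless of the starting value of x, which is exactly the "factor of exp(beta_j)" interpretation.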


3.

For interpretation, we can rearrange the equation above so that only the linear term remains on the right side of the formula:

"ln(\\frac{P(y=1)}{1-P(y=1)})=log(\\frac{P(y=1)}{P(y=0)})=\\beta_0+\\beta_1x_1+...+\\beta_px_p"

We call the term inside the ln() function the "odds" (the probability of the event divided by the probability of no event); wrapped in the logarithm, it is called the log-odds.
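The identity above says that the logit (log-odds) undoes the logistic function, recovering the linear predictor exactly. A minimal numerical check (assuming numpy; the weights are hypothetical):

```python
import numpy as np

beta0, beta1 = 0.5, -1.2  # hypothetical weights
x = 2.0
eta = beta0 + beta1 * x   # the linear predictor

# Forward through the logistic function, then back through the logit.
p = 1.0 / (1.0 + np.exp(-eta))
log_odds = np.log(p / (1.0 - p))

print(log_odds, eta)
```

The two printed values are identical: the log-odds are linear in the features even though the probability itself is not.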

The interpretation of an odds ratio is that an odds ratio greater than 1 indicates a positive association (a higher value of the predictor makes outcome group 1 more likely), while an odds ratio less than 1 indicates a negative association (a higher value of the predictor makes outcome group 0 more likely).


Why odds are used in predicting probability:

Probability and odds have different properties, and those properties give odds some advantages in statistics. In logistic regression, the odds ratio represents the constant effect of a predictor X on the likelihood that one outcome will occur.

In regression models, we often want a measure of the unique effect of each X on Y. If we try to express the effect of X on the likelihood of a categorical Y taking a specific value in terms of probability, that effect is not constant: the same one-unit change in X shifts the probability by different amounts depending on where X starts. The odds ratio, by contrast, stays the same across the whole range of X, which makes it the natural constant-effect measure.

