1. Define the difference between a well-defined relationship and a poorly-defined relationship and draw graphs to depict the differences.
2. What are the unique characteristics of interpretation in logistic regression?
3. Explain the concept of odds and why it is used in predicting probability in a logistic regression procedure.
1.
A well-defined problem is one that has a clear goal or solution, and for which problem-solving strategies are easily developed. A poorly-defined problem is the opposite: it is unclear, abstract, or confusing, and does not suggest a clear problem-solving strategy.
2.
Logistic regression models the probabilities for classification problems with two possible outcomes. It’s an extension of the linear regression model for classification problems.
The logistic regression model uses the logistic function to squeeze the output of a linear equation between 0 and 1. The logistic function is defined as:
"logistic(\\eta)=\\frac{1}{1+e^{-\\eta}}"
For classification, we want outputs that can be interpreted as probabilities, so we wrap the right side of the linear regression equation in the logistic function. This forces the output to take only values between 0 and 1.
"P(y^{(i)}=1)=\\frac{1}{1+exp(-(\\beta_0+\\beta_1x_1^{(i)}+...+\\beta_px_p^{(i)}))}"
The unique characteristic of interpretation in logistic regression is that, because of the logistic transformation, the weights no longer affect the outcome linearly as they do in linear regression. Instead, the coefficients are interpreted on the log-odds scale, and the interpretation depends on the feature type: for a numerical feature, a one-unit increase in x_j changes the log odds by β_j, or equivalently multiplies the odds by exp(β_j); for a binary or categorical feature, switching from the reference category has the same multiplicative effect on the odds.
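A minimal sketch of this interpretation in practice, assuming Python with NumPy and scikit-learn and a hypothetical simulated dataset: exponentiating a fitted coefficient gives the factor by which the odds change per unit increase of the feature.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical simulated data: one numeric feature, binary outcome.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
p_true = 1.0 / (1.0 + np.exp(-(-0.5 + 1.2 * x[:, 0])))  # assumed "true" model
y = rng.binomial(1, p_true)

model = LogisticRegression().fit(x, y)

beta_1 = model.coef_[0, 0]          # change in log odds per unit increase of x
odds_ratio = np.exp(beta_1)         # multiplicative change in the odds
print(f"beta_1 = {beta_1:.3f}, odds ratio = exp(beta_1) = {odds_ratio:.3f}")
```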
3.
For the interpretation, we can rearrange the equation so that only the linear term is on the right side of the formula:
"ln(\\frac{P(y=1)}{1-P(y=1)})=log(\\frac{P(y=1)}{P(y=0)})=\\beta_0+\\beta_1x_1+...+\\beta_px_p"
We call the term inside the ln() the "odds" (the probability of the event divided by the probability of no event); wrapped in the logarithm, it is called the log odds.
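As a quick numeric illustration (made-up numbers): if P(y=1) = 0.8, the odds are 0.8 / 0.2 = 4 and the log odds are ln(4) ≈ 1.39; if P(y=1) = 0.5, the odds are 1 and the log odds are 0.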
The interpretation of the odds ratio is that an odds ratio greater than 1 indicates a positive association (i.e., a higher value of the predictor makes group 1 of the outcome more likely), and an odds ratio less than 1 indicates a negative association (i.e., a higher value of the predictor makes group 0 of the outcome more likely).
Why odds are used in predicting probability:
The reason is that probability and odds have different properties that give odds some advantages in statistics. In logistic regression, the odds ratio represents the constant effect of a predictor X on the likelihood that one outcome will occur.
In regression models, we often want a measure of the unique effect of each X on Y. If we try to express the effect of X on the likelihood of a categorical Y having a specific value through probability, the effect is not constant: probabilities are bounded between 0 and 1, so the same change in X shifts the probability by different amounts depending on the starting point, whereas it always multiplies the odds by the same factor.
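A small sketch illustrating this point, assuming Python with NumPy and made-up coefficient values: the odds ratio for a one-unit increase in X is always exp(β_1), while the corresponding change in probability depends on where X starts.

```python
import numpy as np

# Made-up logistic model: intercept -2.0, slope 0.7 (illustrative values only).
b0, b1 = -2.0, 0.7

def prob(x):
    return 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))

def odds(x):
    p = prob(x)
    return p / (1.0 - p)

for x in [0.0, 2.0, 4.0]:
    ratio = odds(x + 1) / odds(x)          # constant: always equals exp(b1)
    delta_p = prob(x + 1) - prob(x)        # varies with the starting value of x
    print(f"x={x:.0f}: odds ratio = {ratio:.3f} "
          f"(exp(b1) = {np.exp(b1):.3f}), probability change = {delta_p:+.3f}")
```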