Basic linear regression interpretation
How to understand the results of a regression

Data source

AustinApartmentRent

From Thomas Sager at the University of Texas at Austin

https://www.dropbox.com/s/g04jhka2u2ubrnm/AustinApartmentRent.xls?dl=0

SAS regression code

libname BA "/home/jaekyoungkim0/BA";
PROC IMPORT DATAFILE="/home/jaekyoungkim0/BA/AustinApartmentRent.xls"
		    OUT=BA.apts
		    DBMS=XLS
		    REPLACE;
RUN;

* Get summary statistics on RENT in BA.APTS (the dataset created by PROC IMPORT above);
proc means data=BA.apts;
	var rent;
RUN;

* Test normality of RENT in BA.APTS;
proc univariate data=BA.apts normal;
	var rent;
RUN;

* Regress RENT (Y) on AREA (X) in BA.APTS;
proc reg data=BA.apts;
	model rent = area;
RUN;

* Regress RENT (Y) on AREA (X) in BA.APTS, add predicted and residual values to a WORK output dataset,
and test normality of the residuals;
proc reg data=BA.apts;
	model rent = area;
	output out=apts2 p=pRent r=residRent;
RUN;

proc univariate data=apts2 normal;
	var residRent;
RUN;

* To get exact standard errors for the predicted rent and the mean rent of a 1000 sq ft apt,
append a 61st apt with area = 1000 and a missing value for rent;
* Note that BA.APTS2 is a different dataset from WORK.APTS2 created above;
data BA.apts2;
	set BA.apts;
	/* Write out each data line read from BA.APTS */
	output;
	/* After writing the 60th observation, write one extra row */
	if _n_ = 60 then do;
		area = 1000;
		bathrooms = 2;
		rent = .;	/* a period is the missing value code */
		output;
	end;
RUN;

* Regress RENT (Y) on AREA (X) in BA.APTS2 and, for every observation, print the predicted
value, the standard error of the mean prediction, and limits for both the mean Y and an individual Y;
* The 61st apt has a missing RENT, so it is not used in the regression,
but its predicted value and limits are still calculated;
* CLI = confidence (prediction) limits for an individual apt;
* CLM = confidence limits for the mean rent of apts with that area;
* The default confidence level is 95% (the ALPHA= option on the MODEL statement changes it);
proc reg data=BA.apts2;
	model rent = area / CLI CLM;
RUN;
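
The CLM and CLI limits requested in the last PROC REG step follow the standard simple linear regression formulas. The sketch below uses symbols that do not appear in the code above and are introduced here only for illustration: x_0 is the area of interest (1000 sq ft for the added 61st apt), n is the number of apts used in the fit (60), b_0 and b_1 are the estimated intercept and slope, s is the root mean squared error, \bar{x} is the mean area, and S_{xx} is the sum of squared deviations of area from its mean.

```latex
\begin{align*}
\hat{y}_0 &= b_0 + b_1 x_0 \\
\text{CLM:}\quad & \hat{y}_0 \pm t_{0.975,\,n-2}\; s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}} \\
\text{CLI:}\quad & \hat{y}_0 \pm t_{0.975,\,n-2}\; s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}},
\qquad S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2
\end{align*}
```

The extra 1 under the square root accounts for the variation of a single apt around the mean, so the CLI limits are always wider than the CLM limits at the same area. Both intervals are narrowest at x_0 = \bar{x} and widen as x_0 moves away from the mean area.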

Regression result

[Image: SimpleLinearRegression]

Interpretation
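
As a general reading guide, and not a restatement of the specific numbers in the output, the model fitted by `model rent = area` can be written as below. The notation (\beta_0, \beta_1, \varepsilon_i, b_0, b_1, S_{xx}, S_{xy}) is assumed here for illustration, and area is taken to be measured in square feet, as the "1000 sq ft" comment in the code suggests.

```latex
\begin{align*}
\text{rent}_i &= \beta_0 + \beta_1\,\text{area}_i + \varepsilon_i \\
b_1 &= \frac{S_{xy}}{S_{xx}}
     = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad b_0 = \bar{y} - b_1 \bar{x} \\
R^2 &= 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}
     = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
\end{align*}
```

Under these assumptions, the slope estimate b_1 (the AREA row of the parameter estimates table) is the estimated change in mean rent for each additional square foot, the intercept b_0 is the fitted rent at area = 0 and usually has no practical meaning on its own, and R^2 is the share of the variation in rent explained by area. The t test on the AREA row tests the null hypothesis \beta_1 = 0.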

*****
Written by Jaekyoung Kim on 30 January 2018