** Commands we have used so far

* summarize all the variables in the dataset
sum

* sort the data in ascending order and show the 3 smallest values of cars
gsort cars
list in 1/3

* sort the data in descending order and show the 3 largest values of cars
gsort -cars
list in 1/3

* generate a new variable
* toocrowded is 1 if there are more than 400 cars on the highway
* it's 0 otherwise
* (note: Stata treats a missing value as larger than any number,
*  so an observation with missing cars would also be coded 1)
gen toocrowded = (cars > 400)

* this creates a squared version of cars
gen cars2 = cars^2

* histogram
hist traveltime

* tabulate data by groups, for example by highway. Useful for string variables.
tab highway
tab highway if highway != "SqHill"

* look at the correlation between two variables
pwcorr cars traveltime

* regress traveltime on cars
* (fit a line through the data and give us the intercept and slope)
reg traveltime cars
reg traveltime cars if highway == "SqHill"
reg traveltime cars if highway == "SqHill" & cars > 400

* make a scatter plot where traveltime is on the y axis and cars is on the x axis
scatter traveltime cars
scatter traveltime cars if toocrowded
scatter traveltime cars if !toocrowded

* compare two scatter plots by overlaying them on the same graph
twoway (scatter traveltime cars if toocrowded) || (scatter traveltime cars if !toocrowded)
twoway (scatter traveltime cars if highway == "SqHill") || (scatter traveltime cars if highway == "Clarion")