Just as the original Titanic VHS was released on two video cassettes, this Titanic analysis is being published in two posts. In this post, part 2, I’m going to explore random forests for the first time and compare the results with those of the logistic regression I did last time.

Random forest vs. logistic regression

Last time I explained how logistic regression uses a link function to transform non-linear relationships into linear ones.

Yes, this is yet another post about using the open-source Titanic dataset to predict whether someone would live or die. At this point, there’s not much new that I (or anyone) can add to the accuracy of survival predictions on the Titanic, so I’m going to use this as an opportunity to explore a couple of R packages and teach myself some new machine learning techniques. I will be doing this over two blog posts.

