Two weeks ago I claimed that women report higher job satisfaction when they work in countries where tech is more male-dominated. And then instead of backing up my claim last week, I got sidetracked by questions of sample size and statistical power. In a previous blog post I introduced the Kaggle survey on women in tech and I did some basic data cleaning for that survey. To save time and get to the point, I now pick up where I left off.
This week I am trying to embed a shiny app on a static website using blogdown. In a couple of weeks I get to present a short introduction of blogdown at the first ever R-ladies meetup in the Netherlands following a presentation on Rmarkdown and Shiny. It will be a nice bonus if I can show how to embed shiny apps in blogdown!

Kaggle tech survey

For this demonstration I’m going to use data from the freely available Kaggle survey on data science and machine learning.
Just as the original Titanic VHS was published on two video cassettes, this Titanic analysis is also being published in two posts. In this post, part 2, I’m going to explore random forests for the first time, and I will compare the results to the outcome of the logistic regression I did last time.

Random forest vs. Logistic regression

Last time I explained how logistic regression uses a link function to transform non-linear relationships into linear ones.
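As a quick reminder of what that link function looks like, here is a sketch of the standard logit link. The symbols are my own choice for illustration and do not come from the earlier post: $p$ is the probability of survival, $x_1, \dots, x_k$ are the predictors, and $\beta_0, \dots, \beta_k$ are the coefficients.

```latex
% Logit link: the log-odds of survival are modelled as a linear function of the predictors
\log\!\left(\frac{p}{1 - p}\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k

% Inverting the link turns the linear predictor back into a probability between 0 and 1
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}
```

So the relationship between the predictors and $p$ is S-shaped, but on the log-odds scale it is a straight line, which is what lets the model be fitted as a linear one.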
Yes, this is yet another post about using the open source Titanic dataset to predict whether someone would live or die. At this point, there’s not much new I (or anyone) can add to accuracy in predicting survival on the Titanic, so I’m going to focus on using this as an opportunity to explore a couple of R packages and teach myself some new machine learning techniques. I will be doing this over two blog posts.