This week I’m not doing a data analysis project so much as a data cleaning project. One of the most common problems I come across in data cleaning is how to get summary statistics for various groups in the data. It’s also one of the most annoying, because I invariably forget how to do it and end up going back to old code to copy and paste.