I presented tonight at the Bay Area Hadoop User Group, talking briefly about Twitter’s use of Hadoop and Pig. Here are the slides:
I’ve spent a fair amount of time helping people get started with Apache Pig. One common stumbling block is the GROUP operator. It looks familiar, since it serves a similar function to SQL’s GROUP BY clause, but it behaves just differently enough in Pig Latin to be confusing. Hopefully this brief post will shed some light on what exactly is going on.