Cerebral Mastication

Something to chew on...

Virtual Conference: R the Language

On Tuesday, May 4th, at 9:30 PM Central (10:30 PM Eastern), I’ll be giving a live online presentation as part of the Vconf.org open conference series. I’ll be speaking about R and why I started using it a couple of years ago. This is NOT going to be a technical presentation but rather an illustration of how an R convert was created and why R became part of my daily tool set.

Simulating Dart Throws in R

Back in November 2009, Wired wrote an article about some grad students who decided to try to stochastically model throwing darts. Because I don’t actually read printed material, I didn’t see the article until a couple of months ago. My immediate thought was, “Hey, I drink beer. I throw darts. I build stochastic models. Why haven’t I done this?” Well, we all know why I haven’t done this. I have a job and a 2-year-old daughter and I like my wife.

I don't even know how wrong I am!

“As we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – the ones we don’t know we don’t know.” US Defense Secretary Donald Rumsfeld, February 12, 2002

I’ve been a longtime reader of the blog “Messy Matters” (which invokes terrible images now that I am potty training a toddler).

Chicago R User Group... It's for the sexy people!

Morris Day, y’all!

I think we all know what Morris Day was talking about when he wrote the lyrics to “The Bird”: That’s right, he was talking about the new R User Group in Chicago! a.k.a. Chicago RUG! We know that R is sexy because statistical analysis is sexy. That is, if you’re doing it right! Even Mike Driscoll at Dataspora knows that Data Geeks have to get their sexy on.

The Future of Math is Statistics

The future of math is statistics… and the language of that future is R: I’ve often thought there was way too little “statistical intuition” in the workplace. I think Arthur Benjamin would agree.

Lookup Performance in R

Rumor has it that Joe Adler, author of the O’Reilly book R in a Nutshell, has joined LinkedIn as a data scientist. But that does not keep him from still pumping out some interesting content over at OReilly.com. His latest article is about lookup performance in R. He does a great job giving code samples and explaining what he is doing. Worth reading, for sure.

Real-World, Real-Time Analytics

Stop wasting time reading my drivel. You need to head over to the DataWrangling.com blog and read Peter Skomoroch’s interview with Bradford Cross of FlightCaster. Peter wrote up this interview back in August 2009, so I’m a little late to this party. There are some really great quotes in this interview. Here are a few of my fav quotes from Cross: Here’s a problem I think anyone who works with data and models can relate to:

You can Hadoop it! It's elastic! Boogie woogie woog-ie!

This blog’s name in Chinese!

I just came back from the future and let me be the first to tell you this: Learn some Chinese. And more than just cào nǐ niáng (肏你娘) which your friend in grad school told you means “Live happy with many blessings”. Trust me, I’ve been hanging with Madam Wu and she told me it doesn’t mean that. So how did I travel to the future to visit with Madam Wu, you ask?

Using the R multicore package in Linux with wild and passionate abandon

One of my primary uses for R is to build stochastic simulations of insurance portfolios and reinsurance treaties. It’s not uncommon for each of my simulations to take 20 seconds or more to complete (if you’re doing the math, that’s about 55 hours for 10K sims or approximately 453 games of solitaire). Initially I ran my sims in R on an Oracle VirtualBox (Oracle now owns VirtualBox! gasp) running Ubuntu.
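
For anyone who does want to do the math, here is a back-of-the-envelope sketch in Python. The 20 seconds and 10K sims come from the paragraph above; the worker count of 4 and the perfect-scaling assumption are purely hypothetical and only there to show why spreading sims across cores is tempting.

```python
# Figures taken from the text: roughly 20 seconds per simulation, 10K simulations.
SECONDS_PER_SIM = 20
N_SIMS = 10_000


def total_hours(n_sims, seconds_per_sim, workers=1):
    """Wall-clock hours, assuming the work splits perfectly across workers.

    Perfect scaling is an idealized assumption; real runs carry overhead.
    """
    return n_sims * seconds_per_sim / workers / 3600


serial_hours = total_hours(N_SIMS, SECONDS_PER_SIM)
# 4 workers is a hypothetical number, not something stated in the post.
four_worker_hours = total_hours(N_SIMS, SECONDS_PER_SIM, workers=4)

print(f"serial: {serial_hours:.1f} h, 4 workers: {four_worker_hours:.1f} h")
```

Run serially, that works out to a bit over 55 hours, which matches the figure quoted above.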

Remote Backup Fail and How to Silently Copy Files

Recently I’ve run into frustrations with Iron Mountain Connected Backup, so I’ve been looking for alternatives. Alternatives: I’ve been running Jungle Disk at home and really like it. I could use that at work, except I have not set up an Amazon or Rackspace account with my work credit card. But I am in Chicago and my database server/file server is in Dallas, TX. So I decided to just mirror my laptop onto a shared drive on my server.