Friday, March 3, 2017

Takeaways

We've come to the end of a pretty fascinating class, and I have to say I've been a little overwhelmed by it all. The focus of the class was data - big data, specifically - and how we can start to make sense of it all. I admit that I came into the class with an agenda - wanting to know more about any tools that could make my life as a Prospect Research Analyst easier.

I won't touch on everything we covered in the class, but I did want to take stock and make a few final notes:

Data Warehouses

I learned a couple of hard lessons on this one. First of all, OLAP and OLTP are two completely different ways of looking at information. Even though they may contain much the same data, the two systems are built with very different purposes in mind. I was also lucky to have the chance to take a glimpse at my own company's DW and see the extraction and transformation processes described in our textbook laid out in an honest-to-goodness working environment. I don't know that I'll ever wrap my mind completely around the OLAP layouts (25 years of OLTP is burned into my brain), but it's nice to understand, at least a little, how the process works, and why it works that way.
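
A minimal sketch of that difference in purpose, using Python's built-in sqlite3 module. The gifts table, donor names, and amounts are all made up for illustration (this is not my company's DW), and a real warehouse would sit in its own database with a dimensional layout rather than share one table. The point is only the shape of the two questions: OLTP fetches or changes one record at a time, while OLAP summarizes many records at once.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, hypothetical gifts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE gifts ("
        "gift_id INTEGER PRIMARY KEY, donor TEXT, year INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO gifts (donor, year, amount) VALUES (?, ?, ?)",
        [
            ("Avery", 2015, 100.0),
            ("Avery", 2016, 250.0),
            ("Blake", 2016, 50.0),
            ("Casey", 2016, 500.0),
        ],
    )
    conn.commit()
    return conn


def oltp_lookup(conn, gift_id):
    """OLTP-style question: fetch one specific transaction by its key."""
    return conn.execute(
        "SELECT donor, year, amount FROM gifts WHERE gift_id = ?", (gift_id,)
    ).fetchone()


def olap_rollup(conn):
    """OLAP-style question: summarize every transaction by year."""
    return conn.execute(
        "SELECT year, COUNT(*), SUM(amount) FROM gifts "
        "GROUP BY year ORDER BY year"
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(oltp_lookup(conn, 2))
    print(olap_rollup(conn))
```

The lookup returns a single row, while the rollup returns one summary row per year, which is roughly the kind of question a warehouse is built to answer quickly.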

Tableau

Oh, how I want to spend more time with this! The possibilities of visualizations in Tableau are endless, as anyone can see pretty readily from checking any popular news or science website. I didn't realize it before, but Tableau is everywhere, lurking as the visual wizard in the background, helping to make sense of the vast amounts of data in both public and private domains. The organization where I work will be implementing Tableau dashboards soon, so I'm really looking forward to spending more hands-on time with the software and digging deep into our data.

Also in this module, we were encouraged to download DQ Analyzer, a tool that I'm not sure I can live without now that I know about it. It makes data cleansing so much more manageable! I'm looking forward to spending more time with this tool, too, and to learning how to create the algorithms and parameters that will bring it more fully to life.

Google Analytics

Google Analytics is a powerful, if somewhat flawed, tool. We learned about traffic, bounces, conversions and all the rest, but as with many of the other modules of this class, my main takeaway was that a website (or visualization) should serve its audience. For some sites, a conversion may mean an item purchase. For others, it may simply mean the retrieval of a piece of information. The site should be built (and/or monetized) with the end goals in mind.
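
Here is a small sketch of why the definition of a conversion matters. It assumes the usual ratios (bounce rate as single-page sessions over all sessions, conversion rate as goal-completing sessions over all sessions). The Session fields and the sample visits are hypothetical and are not pulled from Google Analytics itself.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """One hypothetical visit to a site."""
    pages_viewed: int
    purchased: bool
    downloaded_report: bool


def bounce_rate(sessions):
    """Share of sessions that viewed exactly one page."""
    if not sessions:
        return 0.0
    bounces = sum(1 for s in sessions if s.pages_viewed == 1)
    return bounces / len(sessions)


def conversion_rate(sessions, goal):
    """Share of sessions that met the goal; `goal` is a function of a Session."""
    if not sessions:
        return 0.0
    return sum(1 for s in sessions if goal(s)) / len(sessions)


# Made-up sample data: four visits to the same site.
SAMPLE_SESSIONS = [
    Session(pages_viewed=1, purchased=False, downloaded_report=False),
    Session(pages_viewed=3, purchased=True, downloaded_report=False),
    Session(pages_viewed=2, purchased=False, downloaded_report=True),
    Session(pages_viewed=4, purchased=False, downloaded_report=True),
]

if __name__ == "__main__":
    print("bounce rate:", bounce_rate(SAMPLE_SESSIONS))
    # A store might count purchases as conversions...
    print("purchases:", conversion_rate(SAMPLE_SESSIONS, lambda s: s.purchased))
    # ...while an information site might count report downloads.
    print("downloads:", conversion_rate(SAMPLE_SESSIONS, lambda s: s.downloaded_report))
```

The same four visits give a different conversion rate depending on which goal is passed in, which is the "serve its audience" point in numbers.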

Network Analysis

This was another revelation - something I've always suspected might exist in the world, but didn't realize was so easily available. In another blog post I mentioned that my workplace recently started working with a relationship management software vendor. Unfortunately, the visuals for the relationships on their platform are very simplistic. My hope is to someday access their API and bring the network of our donors (and potential donors) to life. With Gephi, this should be a snap.
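
A rough sketch of what the hand-off to Gephi might look like. I don't know what the vendor's API returns, so this simply assumes the relationships arrive as pairs of names (the names below are invented). It collapses repeated pairs into weighted, undirected edges and writes them as a CSV edge list with the Source, Target, Type and Weight columns that Gephi's spreadsheet import recognizes.

```python
import csv
import io
from collections import Counter


def build_edge_csv(relationships):
    """Turn (person, person) pairs into a CSV edge list for Gephi.

    Repeated pairs are merged into one undirected edge, and the number
    of times a pair appears becomes that edge's weight.
    """
    weights = Counter()
    for a, b in relationships:
        if a == b:
            continue  # skip self-relationships
        # Sort each pair so (A, B) and (B, A) count as the same edge.
        weights[tuple(sorted((a, b)))] += 1

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Type", "Weight"])
    for (source, target), weight in sorted(weights.items()):
        writer.writerow([source, target, "Undirected", weight])
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical relationships, standing in for whatever the API provides.
    pairs = [("Avery", "Blake"), ("Blake", "Avery"), ("Blake", "Casey")]
    print(build_edge_csv(pairs))
```

Saving that text to a .csv file and importing it as an edges table should be enough to get a first picture of the network, with layout and styling done inside Gephi.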

A few final bookmarks for myself, just some random tutorials and examples that I picked up along the way, and that I hope to refer back to as I try to learn more about these great tools:

A fun example of a great visualization (with dinosaurs!).
To remind myself that Google Data Studio exists!
Mining Twitter data (yet another thing I want to explore more).
A possibly interesting collaborative data science tool that I need to check out.
A book I want to study more thoroughly.
Egg users - yep, that's the term for them.
Gephi tutorial by Jen Golbeck that I found really helpful.

So - it's been a fun class. And it's been incredibly enlightening. I wish my classmates well in their future data science endeavors!