We've come to the end of a pretty fascinating class, and I have to say I've been a little overwhelmed by it all. The focus of the class was data - big data, specifically - and how we can start to make sense of it all. I admit that I came into the class with an agenda - wanting to know more about any tools that could make my life as a Prospect Research Analyst easier.
I won't touch on everything we covered in the class, but I did want to take stock and make a few final notes:
Data Warehouses
I learned a couple of hard lessons on this one. First of all, OLAP and OLTP are two completely different ways of looking at information. Even though they may contain much the same data, the two systems are built with very different purposes in mind: OLTP to record individual transactions quickly and reliably, OLAP to summarize large volumes of history for analysis. I was also lucky to have the chance to take a glimpse at my own company's DW and see the extraction and transformation processes described in our textbook laid out in an honest-to-goodness working environment. I don't know that I'll ever wrap my mind completely around the OLAP layouts (25 years of OLTP is burned into my brain), but it's nice to understand, at least a little, how the process works, and why it works that way.
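To make the OLTP/OLAP contrast a little more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The gifts table, its columns, and the sample rows are all invented for illustration. A real warehouse would load the analytical data into its own schema through an extract-and-transform step rather than querying the transactional table directly, so treat this only as a picture of the two query shapes.

```python
import sqlite3

# Hypothetical gift data: the table and column names are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gifts (gift_id INTEGER PRIMARY KEY, donor TEXT, "
    "fund TEXT, gift_year INTEGER, amount REAL)"
)

# OLTP-style work: many small writes, one record at a time.
gifts = [
    ("Avery", "Scholarships", 2015, 500.0),
    ("Avery", "Scholarships", 2016, 750.0),
    ("Blake", "Athletics", 2016, 1000.0),
    ("Casey", "Scholarships", 2016, 250.0),
]
with conn:
    conn.executemany(
        "INSERT INTO gifts (donor, fund, gift_year, amount) VALUES (?, ?, ?, ?)",
        gifts,
    )

# OLTP-style read: fetch one specific record by its key.
one_gift = conn.execute(
    "SELECT donor, amount FROM gifts WHERE gift_id = ?", (3,)
).fetchone()

# OLAP-style read: scan every row and summarize along dimensions (fund, year).
totals_by_fund_year = conn.execute(
    "SELECT fund, gift_year, SUM(amount) FROM gifts "
    "GROUP BY fund, gift_year ORDER BY fund, gift_year"
).fetchall()

print(one_gift)
print(totals_by_fund_year)
```

The first query touches a single row and would run constantly in a transactional system; the second reads the whole table to answer a "how much, by what, over time" question, which is the kind of work a warehouse is laid out to make fast.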
Tableau
Oh, how I want to spend more time with this! The possibilities of visualizations in Tableau are endless, as anyone can see pretty readily from checking any popular news or science website. I didn't realize it before, but Tableau is everywhere, lurking as the visual wizard in the background, helping to make sense of the vast amounts of data in both public and private domains. The organization where I work will be implementing Tableau dashboards soon, so I'm really looking forward to spending more hands-on time with the software and digging deep into our data.
Also in this module we were encouraged to download DQ Analyzer, a tool that I'm not sure I can live without, now that I know about it. It makes data cleansing so much more manageable! I'm looking forward to spending more time with this tool, also, learning how to create the algorithms and parameters that will bring it more fully to life.
Google Analytics
Google Analytics is a powerful, if somewhat flawed, tool. We learned about traffic, bounces, conversions and all the rest, but like many of the other modules of this class, my main takeaway was that a website (or visualization) should serve its audience. For some sites, a conversion may mean an item purchase. For others it may simply mean the retrieval of a piece of information. The site should be built (and/or monetized) with the end goals in mind.
Network Analysis
This was another revelation - something I've always suspected might exist in the world, but didn't realize was actually so easily available. In another blog post I mentioned that my workplace recently started working with a relationship management software vendor. Unfortunately, the visuals for the relationships on their platform are very simplistic. My hope is to someday access their API and bring the network of our donors (and potential donors) to life. With Gephi, this should be a snap.
A few final bookmarks for myself, just some random tutorials and examples that I picked up along the way, and that I hope to refer back to as I try to learn more about these great tools:
A fun example of a great visualization (with dinosaurs!).
To remind myself that Google Data Studio exists!
Mining Twitter data (yet another thing I want to explore more).
A possibly interesting collaborative data science tool that I need to check out.
A book I want to study more thoroughly.
Egg users - yep, that's the term for them.
A Gephi tutorial by Jen Golbeck that I found really helpful.
So - it's been a fun class. And it's been incredibly enlightening. I wish my classmates well in their future data science endeavors!
Friday, March 3, 2017
Sunday, February 19, 2017
Six Degrees of Bill Gates
As a prospect researcher, I'm asked about once a week by a fundraiser how they might be able to get in touch with Bill Gates. Or Oprah. Or the King of Sweden. It's a legitimate question. Anyone who does fundraising knows that there are the donors - the passionate folks who want to make a difference with their financial currency - and then there are the connectors - often completely different folks who seem to know everybody and are willing to help out by using their social currency. One of the keys to successful fundraising is being able to collect and utilize both kinds of currency.
For years, we researchers in Arizona relied on what was known as "The Red Book". (Not related in any way to Mao Tse-tung. Honestly.) Our "The Red Book: A Community Directory" is a hardback book bound in bright red velveteen, which lists the Who's Who of Arizona business and society. Fundraisers and social climbers and others have been using it for years. In 1988, for instance, if you wanted to find out who might know somebody who worked for Governor Rose Mofford, because you really needed to catch Ms. Mofford's ear about an important project, you, as a prospect researcher, would go to the Red Book and start thumbing through it, trying to find anybody who might work in the governor's office, and hoping to heck you recognized a name.
Now everything's been simplified, of course. Or made more complex but also more accessible. Facebook lets you request a friendship with anyone in the world, almost, although that request might be ignored. And LinkedIn can tell you just how many "nodes" or connections lie between yourself and Bill Gates. (I just checked. I have no Links between myself and Bill Gates. Bill Gates and I are not connected. I don't know him. I don't know anyone who knows him, or anyone who knows somebody that might know him, ad infinitum.)
But the most exciting advance in social networking for philanthropy is the practical, proactive implementation of network science. At the University of Arizona, we've recently started using a network analysis tool that helps us uncover the relationships between our most important "social currency" philanthropists and the "financial currency" philanthropists we wish to meet. Nowadays when we're asked by a fundraiser how they might be able to contact Bill Gates, we type Mr. Gates' name into the network tool and we can see exactly who in our organization knows him, or who might know someone who knows him. (I just typed my name and Mr. Gates' name into our network analysis tool, and see that we have two degrees of separation. I could call somebody who could call him. Pretty nifty.)
So for me, this week has been one of the most interesting of all our modules in MIS 587, since I'm starting to understand how the visualizations in our new research tool at my job actually work, how the nodes and edges give weight and meaning to the connections when we enter a name into the search box.
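The "degrees of separation" lookup described above boils down to finding the shortest path between two nodes. Here is a minimal sketch using a plain breadth-first search in Python. The names and connections are made up, and this is only an assumption about the general technique, not a description of how the vendor's tool actually works.

```python
from collections import deque


def degrees_of_separation(graph, start, target):
    """Return the number of edges on the shortest path, or None if unconnected."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, dist = queue.popleft()
        for contact in graph.get(person, ()):
            if contact == target:
                return dist + 1
            if contact not in seen:
                seen.add(contact)
                queue.append((contact, dist + 1))
    return None


# Hypothetical relationships: each pair is one edge ("these two know each other").
edges = [
    ("Researcher", "Board Member"),
    ("Board Member", "Famous Donor"),
    ("Researcher", "Gift Officer"),
    ("Stranger", "Other Stranger"),
]

# Build an undirected adjacency map: every person is a node.
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

print(degrees_of_separation(graph, "Researcher", "Famous Donor"))  # 2
print(degrees_of_separation(graph, "Researcher", "Stranger"))      # None
```

Two degrees here means exactly the "I could call somebody who could call him" situation: one person sits between the two ends of the path.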
But I think there's still much more to explore in terms of philanthropy and network science. For instance, nonprofits have recently flipped the model around and have started to band together to effectively increase the size of their node and also their connectedness, bringing their potential donors together into a larger common pool. And crowdfunding has drastically changed the landscape for smaller donations, making it easier for a person to donate to any given cause, but harder for the cause to stand out in a crowded field. And I'm willing to bet that, sometime in the next year or two, some network scientist somewhere will most likely write a paper or create an app or a tool that will change the face of philanthropy once again. And I can't wait to see what happens next.
Additional reading:
Sunday, February 12, 2017
Web Analytics
This week's introduction to Google Analytics was interesting, but all too brief. There seem to be many factors that can corrupt or skew the data, such as ad blockers, private browsing, and especially lack of planning on the web developer's part. Does the site flow clearly to an end goal? If not, does it serve its purpose, informatively or otherwise? Is conversion well defined? Are engagement times meaningful? What kinds of engagement are meaningful? These all depend on the type of site being studied.
The metrics within Google Analytics provide various ways to slice and dice visitor data. There's also the ability to create dashboards, which I enjoyed exploring. But the thing I found most interesting was Google's Data Studio. Going from Tableau to Data Studio is a natural progression (or regression, maybe, since Data Studio seems somewhat less complex than Tableau). It's easy to create a basic dashboard to monitor site visits and to customize it, like the one I created for the MISonline site below:
However, I'm reminded of the old adage, GIGO (garbage in, garbage out). Without a thorough understanding of the numbers - what they actually mean, as opposed to what we might think they mean - even a dashboard could be misleading. Throughout the class, a few themes are emerging for me:
- Do I understand how the data was acquired and what it represents?
- Do I understand the audience that I'll be presenting the data to?
- Do I know how to present the data in a meaningful way?
I wonder if our organization uses Data Studio. I wonder if we understand our site visitors, where they come from, and for what reasons they use our site. Are we accurately filtering out spam visits, backlinks, and bots? We're a nonprofit organization, so have we set conversion goals for donation pages, etc.?
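As a rough illustration of what those questions mean in practice, here is a minimal Python sketch that filters out obvious bot traffic and then computes a bounce rate and a conversion rate. Everything in it is an assumption made for the example: the session records, the bot markers, the goal page path, and the simplified definitions (a bounce is a one-page session, a conversion is any session that reaches the goal page). Google Analytics has its own, more detailed definitions and filters.

```python
from dataclasses import dataclass


@dataclass
class Session:
    user_agent: str
    pages: list  # page paths viewed during the visit, in order


# Hypothetical values chosen for this sketch only.
BOT_MARKERS = ("bot", "crawler", "spider")
GOAL_PAGE = "/donate/thank-you"


def is_bot(session):
    """Crude check: flag sessions whose user agent contains a bot marker."""
    agent = session.user_agent.lower()
    return any(marker in agent for marker in BOT_MARKERS)


def summarize(sessions):
    """Compute simplified bounce and conversion rates over non-bot sessions."""
    human = [s for s in sessions if not is_bot(s)]
    total = len(human)
    if total == 0:
        return {"sessions": 0, "bounce_rate": 0.0, "conversion_rate": 0.0}
    bounces = sum(1 for s in human if len(s.pages) == 1)
    conversions = sum(1 for s in human if GOAL_PAGE in s.pages)
    return {
        "sessions": total,
        "bounce_rate": bounces / total,
        "conversion_rate": conversions / total,
    }


sessions = [
    Session("Mozilla/5.0", ["/"]),
    Session("Mozilla/5.0", ["/", "/donate", "/donate/thank-you"]),
    Session("Mozilla/5.0", ["/events", "/about"]),
    Session("Mozilla/5.0", ["/news"]),
    Session("ExampleBot/1.0", ["/"]),
]

report = summarize(sessions)
print(report)
```

Leaving the bot session in would push the bounce rate from 50% up to 60% and the conversion rate from 25% down to 20%, which is the GIGO problem in miniature.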
I'm having lunch with our webmaster this week. I'll have to find a way to ask these questions without seeming like a know-it-all who has studied this stuff for all of a week. (Or maybe I'll stay quiet at lunch and read up on web analytics a little more before next month's lunch.)
Bookmarks for further reading (before next month):
That's all for this week. A thunderstorm is coming in, and I'd like to watch the rain approach over the Tucson desert. Web analytics can wait until tomorrow.
Sunday, February 5, 2017
Respect the Audience (or, the TEDapalooza post)
I've been thinking a lot about visual communication recently, and communication in general. Maybe that's because the organization where I work is preparing to implement intranet dashboards for our fundraisers to help them track their goals and progress. Or maybe it's because I'm preparing to give a PowerPoint presentation to a large audience in a few months, and part of that presentation covers the various ways to establish effective communication between Prospect Researchers (us behind-the-scenes folks) and frontline fundraisers, and also between fundraisers and potential donors.
Or maybe I'm thinking about visual communication because I'm taking a class in Data Analysis and our topic this week has been the many forms of visual representation that can be used to present various types of data.
One of our class-assigned "readings" was actually a TED Talk by Hans Rosling. You may have seen the video already, as I had, although it didn't, at least the first time I saw it several years ago, have the impact it's had on me now.
This time around, I was struck by several points, such as:
- Visualization isn't always an end in itself. Sometimes it can be a means to an end. Sometimes it can be a beginning - a way to start thinking about deeper or more complex issues.
- Simply animating a visualization - making it move against time or other dimensions - can breathe emotional life into what would otherwise be a static chart or graph.
- Mr. Rosling employs storytelling throughout his presentation, but does so at a population-by-population level, instead of an individual-by-individual level. The human mind likes to be told a story. It likes to empathize. We, as humans, are naturally influenced by narrative. Even though the stories in his Talk deal with populations and statistics, Mr. Rosling's storytelling is extremely effective.
Statistics can lie, of course. Everybody knows that. The critical thinkers among us, every time we read the Washington Post or The Nation, try to consider what might be influencing any statistics or visuals we see. The data sources are important, for instance. Are they primary sources, or secondary, or worse? Timeliness and accuracy of the data are important, too. Of utmost importance is the selection and presentation of the data. Have any facts been omitted? Has the context been fully rendered and disclosed?
Narratives, or stories, can also lie, and can do so more effectively than mere statistics. Facebook and the internet at large are full of one-off, tear-inducing (or anger-inducing) articles about individuals that have been insulted or victimized in some way. But even if the story is true, and illustrates an issue perfectly, it doesn't take much Googling to find an equally true story illustrating the other side equally as well. We live in a minefield of competing narratives.
In his book "Against Empathy," Paul Bloom argues that it makes more sense for us to look at issues from a higher level. "The concern about empathy is not that its consequences are always bad ... It's that its negatives outweigh its positives." It's too easy to be deceived, he means, by a heart-wrenching photograph and a plea for action. Acting only on empathy, we often make bad decisions.
But this is where the audience comes in. And the storyteller, too, whether she be a DataViz wizard or a sappy romantic who prefers to sing sad songs of lost love.
Sometimes the audience needs only a snapshot (for instance, a Fundraiser needing to know his goals). Sometimes the audience needs an overview, over time (a Research Analyst, needing to see the correlation between his work and overall gift revenue). And sometimes the audience truly wants to be told a story (a potential donor, perhaps, looking for a worthy recipient of a charitable gift).
We, as the storytellers or visualizers, must learn to accommodate the needs of the audience, but also not discredit ourselves in the process. We should tell a story, if that's what's needed, but we must also support the story with strong data. And, as audience members - which all of us are in one way or another - we must demand more than a sad picture or a sappy song, and we must not take a pretty chart at face value.
In thinking about my upcoming PowerPoint presentation to that large audience I mentioned earlier, I'd like to try to emulate what Mr. Rosling accomplishes in his TED Talk. I'd like to illustrate my points on a population-by-population level, drilling down to individual narratives only when it's needed to illustrate a larger point ... drilling down only when it's supported by a larger, valid, timely, unbiased set of facts.
That's all for this week.
Bonus content: For those of my classmates interested in the psychology of PowerPoint, here's one last TED Talk that nicely recaps many of the things we've covered in our class this past week (simplicity, clarity of communication, unclutteredness, one-screen dashboards, etc.):
Sources for this post
TED Talks:
Dave Lieber, "The Power of Storytelling to Change the World"
David JP Phillips, "How to Avoid Death by PowerPoint"
Hans Rosling, "The Best Stats You've Ever Seen"
Book:
Paul Bloom, "Against Empathy"
Sunday, January 15, 2017
My Data is Your Data
Photo courtesy of Frederico Cintra via Flickr
Usually, I think of my job as something like "helping to bring donors together with causes they're passionate about" or "helping to support the University's hundreds of educational and social and cultural and scientific programs." But this week was my first week in the MIS 587 online class at the University of Arizona, and after hearing the lectures and reading the recommended articles, I realized that even though the ends of my professional work are donors and programs, the means of getting them together are, quite often, data.
In the first lecture of our course this week, we learned that Business Intelligence covers all parts of data and its uses: collecting data, cleaning it, reporting on it, storing and analyzing it.
And also this week, at work, I spent much of an entire day cleaning an incoming data set consisting of thousands of records which will soon be imported into our University's donor database. I spent another morning listening to a product vendor preach the benefits of receiving up-to-the-minute notifications of our donors' social media clicks. And I spent several more hours combing through online data sources (property assessor databases, state corporation commission databases, old newspaper records, archived internet pages) to compile individual profiles of potential donors so that our gift officers will be able to talk with them in a more meaningful and directed way. Was this Business Intelligence? I think so. I hope at least that it's intelligent business - helping the University to connect with passionate donors. At our organization, as with just about any nonprofit of any size throughout the world, data is helping to drive these connections.
However.
Even though the nonprofit and for-profit worlds are both subject to scrutiny, nonprofits must often live up to a higher standard, especially when it comes to the use of personal data. In the UK, nonprofits have recently come under heavy criticism and even fines for using their donors' data in ways not specifically approved by the donors. The UK's Data Protection Act (DPA) covers which uses of personal information are and are not allowable. The DPA applies to all uses of personal data, not only to uses by nonprofits, and the list of violations is a lengthy one, affecting both commercial and non-commercial enterprises.
So far, in the United States, personal data isn't protected under a sweeping federal law such as the United Kingdom's DPA, but is instead protected sector-by-sector, with specific laws such as the Fair Credit Reporting Act (FCRA) for banking and credit data, the Health Insurance Portability and Accountability Act (HIPAA) for personal health and hospital data, and (the one I'm most familiar with) the Family Educational Rights and Privacy Act (FERPA), which restricts how and when and by whom the personal data of students and alumni of educational institutions may be accessed and used. Interestingly, the U.S. is one of the few industrialized countries (obligatory PDF warning) not to have a comprehensive federal privacy protection law.
It's no secret that we freely hand over our data each day to nameless strangers, through Facebook, LinkedIn, and other social media. Or when we sign up for an online video or shopping service. Or pay a bill online. Or sign up for an online newsletter. We've come to expect that our name and phone number and addresses (both email and home) will be sold and re-sold, traded and grouped and packaged and parsed. We'll be put into voting blocs, and buying blocs. Every click on a link is a wave of the hand, signaling our unique, personal interests. (I once made the mistake of going to the MeUndies site because I'd heard their ads on a certain podcast for months and finally wanted to know what all the fuss was about, and then for weeks afterwards I was followed everywhere I went online by MeUndies ads. I couldn't even browse the web if people I knew were standing nearby since I was too embarrassed to have them see images of the soft, comfortable, extremely affordable underwear on my screen.)
Anyway.
As technology's impact on our lives grows, I think we're becoming more aware that our personal data is doing untold, unknown things in the background of our lives. Some of those unknown things are useful (such as, I hope, connecting donors with their passions), and some are just annoying (robocalls), and some are downright dangerous. Personally, I suspect we're getting more and more comfortable with handing this information over to the world, knowing, or at least hoping, that the information will ultimately benefit us. Just today alone, for instance, I've shared my data with Firefox and Chrome and Google and YouTube and a couple of different ISPs, and Amazon, and Amazon's Alexa (seriously, the Echo is a pretty great little machine if you can get past the fact that it's constantly listening to everything going on in your home), and with the Wink smart home hub and its motion detectors, and various home appliances, and with Sprint and its GPS and data usage tracking, and with an unnamed fast food restaurant and my unnamed bank which helped to pay for the sausage biscuit and iced mocha, and with probably at least a dozen other companies and data aggregators I didn't even think about, since I'm fairly confident that the bits of information those companies know about me today will help to make my life tomorrow a little better.
But still. Knowing what I know, doing what I do for a living, researching people and their passions, and now learning even more through the MIS class about how our personal data is being used, I'm starting to think twice before giving a "like" to my Aunt Mabel's latest status update. I know just how quickly a simple "like" or a "poke" can turn around and either "like" - or possibly "poke" - me back.
(THIS POST IS NOT SPONSORED BY OR AFFILIATED IN ANY WAY WITH MEUNDIES, THE SOFT, COMFORTABLE, EXTREMELY AFFORDABLE UNDERWEAR PERFECT FOR EVERY BODY.)
Sources:
- Article: Information Commissioner's Office (2016) "ICO investigation reveals how charities have been exploiting supporters"
- Article: MIT Technology Review (2013) "Big Data Gets Personal"
- Article (PDF): Jay, R. P. (2014) "Data Protection & Privacy 2014"
- Website: Information Commissioner's Office (2016) "Actions We've Taken"
- Website: United Kingdom Data Protection Act 1998
- Website: Privacy Rights Clearinghouse