Defining & measuring “non-business” email

Something we help our customers measure and address is the flood of non-business/commercial email that corporate email systems and their users receive every day.  This is the load of wanted email that people receive daily – everything from Google alerts to joke-of-the-day emails.

This stuff may be wanted, but it is hardly business-related, and it is not worth the significant cost of archiving it, let alone wading through it when trying to find an email during a review, investigation, or legal discovery exercise.

So rather than this becoming a commercial for our intelligent classification product, I wanted to provide some actual data based on my own inbox.

I started tracking on 3/16, counting this kind of email in both my own inbox and the mail still arriving at our former alliance manager’s address (I picked up his email when he left the company).

The results:

Number of days:  22 including weekends
Number of emails: 419
Total size:  12.5MB

Wow.  That’s a lot of "informational" email filling up my box, clogging our network, and being archived off.  Wouldn’t it make more sense to get most of this information via RSS feed?  At a minimum this stuff should be tagged, routed, and saved for what it is – non-business email.

Maybe this isn’t a huge number to you, but if you have thousands of employees and extrapolate these numbers over a year you quickly need to begin measuring in terabytes.
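To put rough numbers on that extrapolation, here is a minimal Python sketch.  The sample figures come from my count above; the 5,000-employee headcount is purely hypothetical, and the sketch assumes every employee receives this kind of email at the same rate as my sample (which covered two addresses, so treat it as a ballpark, not a measurement).

```python
# Figures from the 22-day sample above (weekends included).
SAMPLE_DAYS = 22
SAMPLE_EMAILS = 419
SAMPLE_MB = 12.5


def annual_volume(employees, days_per_year=365):
    """Extrapolate the sample to a full year for a given headcount.

    Assumes each employee receives non-business email at the sample's
    daily rate. Returns (emails_per_year, terabytes_per_year).
    """
    emails_per_day = SAMPLE_EMAILS / SAMPLE_DAYS
    mb_per_day = SAMPLE_MB / SAMPLE_DAYS
    emails = emails_per_day * days_per_year * employees
    # Decimal units: 1 TB = 1,000,000 MB.
    terabytes = mb_per_day * days_per_year * employees / 1_000_000
    return emails, terabytes


# Hypothetical company of 5,000 employees.
emails, tb = annual_volume(employees=5000)
print(f"{emails:,.0f} emails, {tb:.2f} TB per year")
```

At that hypothetical headcount the sample rate works out to roughly 35 million emails and a little over 1 TB per year, before any archive copies or backups.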

This is far from the most sophisticated scenario our email classification product can handle, but it demonstrates the importance of differentiating between high-value and low-value correspondence, especially when the cost per email of legal review can be over $2.

So, how does this work?  Artificial intelligence, Bayesian analysis, proprietary algorithms?  Truth be told, a very practical methodology:

1.  Identify known sending addresses that distribute this kind of stuff (examples include googlealerts-noreply@google.com) – 100% confidence on these and we have a master list of hundreds that can be immediately deployed or reviewed by the legal folks if need be
2.  Identify known words/phrases in the sender’s address that are indicative of this kind of stuff (examples include alert, news, etc.) – still high confidence but needs to be validated with an activity profile
3.  Identify known words/phrases in the body text (an example is boilerplate unsubscribe language) – prone to false positives if the list of words/phrases is too broad – we recommend less is more for starters
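The three steps above can be sketched in a few lines of Python.  This is only an illustration of the ordering (most confident rule first), not our product's implementation: the lists are tiny placeholders, the labels are names I made up for this example, and the activity-profile validation mentioned in step 2 is not modeled.

```python
# Placeholder lists for illustration only; a real master list would hold
# hundreds of reviewed entries.
KNOWN_SENDERS = {"googlealerts-noreply@google.com"}   # step 1
SENDER_KEYWORDS = ("alert", "news")                   # step 2
BODY_PHRASES = ("unsubscribe",)                       # step 3, kept short on purpose


def classify(sender, body):
    """Return (label, reason) using the three rules, most confident first."""
    sender = sender.strip().lower()

    # Step 1: exact match against known sending addresses.
    if sender in KNOWN_SENDERS:
        return "non-business", "known sender"

    # Step 2: indicative word anywhere in the sender's address.
    # Still needs validation against an activity profile (not shown).
    if any(word in sender for word in SENDER_KEYWORDS):
        return "likely non-business", "sender keyword"

    # Step 3: boilerplate phrase in the body text; most prone to false positives.
    if any(phrase in body.lower() for phrase in BODY_PHRASES):
        return "possibly non-business", "body phrase"

    return "unclassified", "no rule matched"


print(classify("googlealerts-noreply@google.com", "Your daily alert"))
print(classify("bob@example.com", "Lunch on Friday?"))
```

Returning the reason alongside the label makes it easy to see which rule fired, which helps when the legal folks want to review the lists.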

Can you accomplish this without email classification software?  Yes, on an individual level if you want to experiment.  Just set up a mailbox rule in Outlook using the framework above and you can see this in action.  Email classification software deployed on a company-wide basis can do this automatically for all users.

Jigsaw & contact information as currency

I certainly don’t claim to be an early adopter of technology (the irony about my chosen career path is not lost on me, but more on that later) and come across many things long after they burst on the scene and turn into widely known and used products.

Jigsaw is one of those services/products that I have recently begun to understand.  Mostly because my inbox and voice mail box are filling up with messages from folks that seem to know exactly how to reach me (vs. finding me through our office phone switch).  This is not too surprising given my role as external spokesperson for MessageGate, but one particular inquiry and follow-up cell phone call made me ask the person how they found me.

The answer – Jigsaw.

I was certainly intrigued that my contact vitals are posted somewhere out there.  Once I understood how they build their contact base, I was left with a thought about how they have turned business cards and address books into currency that is freely traded.

In a way Jigsaw has created a dynamic marketplace of contact information where those who need to reach people they don’t know can trade on the value of the people they do know or know how to reach.

Which leads me to my next thought – who "owns" the value of my contact information? 

Does frequency of contact correlate with importance?

So here’s a question that comes up often in helping our customers interpret information about their messaging traffic and is something for us to think about on an individual level.

Does the frequency (how often) someone emails, calls, or otherwise "pings" you define their level of importance in your social network?

I see two sides to this:

1.  Those that have not contacted you before or have done so again after a long period of time are not part of your frequent contact universe. 

  • This includes unsolicited emails, telemarketing calls (which I both create through programs and receive as a favorite target for vendors), spam, spim, etc.
  • These *should* be defined as low-value contacts

2.  Those that contact you often are part of your frequent contact universe

  • This could include spouse, children, co-workers, supervisors, etc.
  • These should be defined as high-value contacts

But how do you distinguish a high-value new contact event from a low-value new contact event?  Or how do you even differentiate between a common low-value contact (newsletter, system alert, relentless salesperson) and a common high-value contact (supervisor, co-worker)?

Maybe through some sort of validation – be it a referral like LinkedIn or a social intelligence tool like Visible Path.

Or the addressing or structure of the message could shed some light.  Where you are a cc or one of many cc’s, this could indicate low-value correspondence.
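As a rough illustration of the addressing idea, here is a small Python sketch that labels how a message was addressed to you.  The function name, the labels, and the threshold of five cc recipients for "many" are all assumptions I picked for the example, and the label is only one signal, not a verdict on the sender.

```python
def addressing_signal(me, to, cc, many=5):
    """Label how a message was addressed to `me`.

    `to` and `cc` are lists of addresses. `many` is a hypothetical
    threshold for what counts as a crowded cc line.
    """
    me = me.lower()
    to = [addr.lower() for addr in to]
    cc = [addr.lower() for addr in cc]

    if me in to:
        return "direct"
    if me in cc:
        return "one of many cc's" if len(cc) >= many else "cc"
    # Not listed at all, e.g. a bcc or a distribution list.
    return "not listed"


print(addressing_signal("me@example.com", ["me@example.com"], []))
print(addressing_signal("me@example.com", ["boss@example.com"], ["me@example.com"]))
```

A signal like this would need to be combined with contact frequency before treating anything as low-value; a supervisor who cc's you is still a supervisor.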

As more and more instant communications tools are made available, the number of inquiries demanding our attention will only increase.  Having a way to intelligently or at least proactively segment this traffic regardless of source seems the next logical step.

Opening Day

After tinkering with this for some time and wondering who in the world has time for this type of thing, I have now officially joined the blogosphere.

Whether anyone cares about my ramblings and musings is another thing, but I will do my best to create relevant and compelling content.  I read many other sites and will link shamelessly, but my motivation is not to make money from this.

So, what is my motivation?

I have spent my entire professional career focused on collaboration in some form or another, from the days of telecommunications deregulation to low earth orbit satellites to cellular networks to electronic communications and the current thorn in the side of corporate America – email.

Most of that experience is through early stage or start-up companies anchored by several years of consulting with the now vanished Arthur Andersen organization.   

My selection of "Reply to All" as a title of this blog is indicative of the challenges and habits created by our technology "enablers" in our modern workplace.  I seek to touch on how we communicate as people and how young companies struggle and strive to make the process better, faster, cheaper, etc.