
Defining a Service Level for the Messaging System


Extract from Electronic Messaging Association Messaging Magazine January/February 1996

Author: Clive Horton, President, ReSoft International

A few weeks ago, I was invited at the eleventh hour to attend a show by the old Motown group The Temptations at our local theater in Stamford, CT. The show was terrific: one hundred percent entertainment by a group of professionals who really knew their stuff and went through all their old songs and dance routines. Most of the audience had grown up in the mid-sixties, and the whole thing took me back nostalgically to my youth as I thought about how easy things seemed to be in those days, wishing, perhaps, that we were back there again.

The changing role of e-mail

At the same time, a thought popped into my head about some messaging throughput problems a particular customer had experienced earlier that same day, which got me thinking about how it used to be in the 'old days' of messaging. In the not-so-distant past, albeit not as far back as when The Temptations were top of the charts, e-mail was a much simpler method of communication.

Consider your own use of e-mail a few years ago. You probably had e-mail attached to your main system, maybe PROFS or All-In-One. With this came the ability to compose messages and send them to colleagues on the same system and perhaps, if you had adopted an expensive mail switch, to other users in your own organization, regardless of their mail system, and maybe to selected external partners in remote locations. And if you were really advanced, you may also have been able to attach revisable documents to share information with other users.

And supporting it was simple as pretty much everything was centralized. If the system went down, everybody, but everybody, knew about it.

But fast forward to today. The past few years have seen an explosion in technology developments that directly affect the use of e-mail - more affordable network bandwidth for sending larger items - more affordable and more powerful intelligent desktops that put the user in control of more functions - and, as a result, more pressure from users to do more at the desktop. As we know, this has encouraged users to want to share information electronically. Coupled with the mail-enabling capabilities available today in many applications and word-processing systems, this has changed the face of the e-mail system - it is no longer a casual communication system, but the backbone for delivering many time-sensitive and mission-critical pieces of information to users internally and externally.

End users have now begun to realize the importance of the messaging network, and they are demanding that it be treated the same way as their other mission-critical applications - with the dreaded service level agreement from the IS department to guarantee availability. And why not? Why shouldn't the messaging system be treated the same as, say, the accounting system or the banking system, where one of the critical success factors is an agreed-to service level that is defined, published and, most importantly, measured on a regular basis?

In these times of quality measurement and empowerment, not only is this critical to the business, but delivering mail within agreed guidelines also becomes critical to the Messaging manager and to how the success of that role is measured.

But where do you start?

When we analyze the dynamics of measuring the availability of the messaging system, the issue is complicated by the differences between Messaging as an application and, say, the accounting or banking system.

Messaging is a distributed system. One of its major benefits is being able to run at a local level without impacting other user groups. It can run without needing immediate interaction with other parts of the network. This localizes the application and reduces traffic across the network - not everyone needs to go up to the mainframe and back again every time. But at the same time, this distribution is an obstacle to centralizing the support skills. If a local post office slows down or stops working, who will know about it first? The local user!

Messaging is an asynchronous process. Although local response time is important to the individual user, this is generally not a messaging issue. Once the message has left the user interface and entered the mail system, the issue becomes how long it takes to get to the other end - wherever that may be - and how consistent this delivery is. In the past, acceptable speed of delivery for such an asynchronous process could be measured in days (or not measured at all). Today users measure it in minutes!

Messaging is a communications system, and communicating requires connectivity to everyone the user needs to talk to, both internally and externally - vendors, sales reps, customers, etc. In most cases this requires gateways between different systems, in different formats and across different networks, as well as interfaces to the Internet and X.400 networks. The resulting mix of platforms makes it challenging to maintain a consistent method of measurement across the organization. Now the measurement problem starts to get bigger.

Messaging is a scalable system. Outside of the strengths and weaknesses of particular e-mail systems, their interoperability or migration capabilities, e-mail is a single application. Organizations need to be confident that the Messaging application will support them today and grow with them to meet their needs in the future. Therefore, the tools to manage the system need to scale from an initial implementation to the broadest user population - through company mergers, buyouts and downsizing, and through moves between different messaging platforms for strategic reasons - while always maintaining consistency in the background. It is not unusual to see a messaging system comprising several hundred post offices of different types, each of which can fail at any time and for any reason.

Messaging is the single biggest application you support - compared to other applications such as Accounting, Banking or Manufacturing, the messaging system is one of the few systems that touches just about everybody across the organization. So its availability, or lack of it, affects everyone.

Where do we find tools?

So how do you set and agree with your users a realistic service level agreement for messaging availability - across many different Post Offices, gateways, routers and MTAs, running on many different platforms, across many different locations, both internal and external? And, of equal importance, how do you demonstrate compliance with what you have committed to?

Before we can define what the Service Level for messaging should be, we need an accurate method of measuring it. The foundation for such a method can be found in the mail monitoring system. A sound mail monitoring system provides a robust and highly scalable way of proactively alerting the e-mail administrator to poorly performing and non-performing e-mail nodes across the largest and most diverse of corporate networks.

Without a mail monitoring system, the users typically find out about a problem in the chain of events before the Mail Administrator does - primarily because messaging is a regular and important part of their business process and some part of that process has noticeably failed to deliver. Mail monitoring can send out individual tests across the e-mail network to determine the availability of all the links in the e-mail chain. Severity thresholds can be set for each test to determine how severe the lateness of each test is. And routines can be incorporated to alert the appropriate person in the event of outages and tardiness and can report those in the appropriate manner (via e-mail, pager etc.) far earlier than the user will find out. The result - MIS is proactively managing the real-time availability of the e-mail system.
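The idea of severity thresholds can be illustrated with a minimal Python sketch. It is not taken from any particular monitoring product: the `Thresholds` class, the `classify` function, the 10- and 30-minute limits and the e-mail/pager routing are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical per-test lateness limits; real values are site-specific."""
    warning: timedelta
    critical: timedelta


# Assumed routing of each severity to a notification channel.
ALERT_CHANNEL = {"warning": "e-mail", "critical": "pager"}


def classify(sent_at: datetime, returned_at: Optional[datetime],
             now: datetime, limits: Thresholds) -> str:
    """Grade one test message as 'ok', 'warning' or 'critical'.

    A test that has not returned yet (returned_at is None) is graded on
    how long it has been outstanding, so an outage can be flagged
    without waiting for the reply.
    """
    elapsed = (returned_at if returned_at is not None else now) - sent_at
    if elapsed >= limits.critical:
        return "critical"
    if elapsed >= limits.warning:
        return "warning"
    return "ok"


limits = Thresholds(warning=timedelta(minutes=10),
                    critical=timedelta(minutes=30))
sent = datetime(1996, 1, 8, 9, 0)
now = datetime(1996, 1, 8, 9, 45)

# A test sent 45 minutes ago that is still outstanding.
severity = classify(sent, None, now, limits)
if severity != "ok":
    print(f"{severity}: notify administrator via {ALERT_CHANNEL[severity]}")
```

Grading outstanding tests against the clock, rather than only on their return, is what lets the administrator hear about an outage before the users do.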

But the mail monitoring system can be leveraged in a very different way - to build a history of the performance of the e-mail network. For instance, consider all the mail monitoring tests going across the network. Each one comes back to a central point and carries time and date stamp information showing how long that test took. Now wouldn't it be a good idea to capture that information and use it to determine the performance of all tests across a given time period?

For one thing, such information could be used to determine the performance history for individual nodes. Think of the power of being able to analyze all tests performed against a particular node over the past month, three months or six months. This would help you investigate the complaint by a user group in Peoria that claims its mail delivery seems to slow down on a Monday morning, where currently you have no method to trap that information. Such an analysis may help to point you in the direction of a solution - perhaps the problem is related to network bandwidth. Perhaps doing some load balancing on the network will relieve this post office of some of the traffic passing through this location. Perhaps the post office disk space needs expanding or reorganizing.
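A per-node history of this kind could be analyzed along the following lines. This is only a sketch in Python: the record layout `(node, sent_at, minutes)`, the node names and the sample figures are invented for illustration, and the median is just one reasonable summary statistic.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def median_by_weekday(results, node):
    """Median test transit time (minutes) per weekday for one node.

    `results` is an iterable of (node, sent_at, minutes) tuples, a
    hypothetical layout for the captured test history.
    """
    buckets = defaultdict(list)
    for test_node, sent_at, minutes in results:
        if test_node == node:
            buckets[WEEKDAYS[sent_at.weekday()]].append(minutes)
    return {day: median(times) for day, times in buckets.items()}


# Invented sample history; 8 and 15 January 1996 were Mondays.
history = [
    ("peoria-po", datetime(1996, 1, 8, 9, 0), 40),
    ("peoria-po", datetime(1996, 1, 15, 9, 0), 50),
    ("peoria-po", datetime(1996, 1, 9, 9, 0), 5),
    ("peoria-po", datetime(1996, 1, 16, 9, 0), 7),
    ("stamford-po", datetime(1996, 1, 8, 9, 0), 4),  # other node, ignored
]
print(median_by_weekday(history, "peoria-po"))
```

With the sample data, Monday stands out clearly against Tuesday, which is the kind of pattern that would confirm or refute the Peoria complaint.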

But even more important to the users, if this information could be consolidated into a macro-level report for the whole messaging system, then a service level can be implemented because it can be measured. Consider being able to automatically create a report that relates the actual availability across the whole messaging system over the last week or the last month. A report that maintains accuracy by taking into account the characteristics of each of the post offices on the network, for instance, reporting only within that post office's stated operating hours and ignoring time allocated for regularly scheduled preventative maintenance. A report that takes into account issues such as overlapping mail tests when calculating outages and slowdowns.
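One way to respect operating hours and scheduled maintenance when calculating availability is sketched below in Python. The interval representation (hours on a single day, as `(start, end)` pairs), the function names and the sample figures are assumptions for illustration, and the outages passed in are assumed not to overlap one another.

```python
def subtract(window, holes):
    """Return the parts of `window` not covered by any interval in `holes`."""
    pieces = []
    cursor, end = window
    for hole_start, hole_end in sorted(holes):
        if hole_end <= cursor or hole_start >= end:
            continue                      # hole lies outside what is left
        if hole_start > cursor:
            pieces.append((cursor, hole_start))
        cursor = max(cursor, hole_end)
    if cursor < end:
        pieces.append((cursor, end))
    return pieces


def availability(operating, maintenance, outages):
    """Percent availability of one post office over one operating window.

    Only time inside the stated operating hours and outside scheduled
    maintenance is measured; outage time outside it is ignored.
    """
    measured = subtract(operating, maintenance)
    measured_hours = sum(end - start for start, end in measured)
    if measured_hours == 0:
        raise ValueError("no measurable time in the reporting period")
    down_hours = 0.0
    for out_start, out_end in outages:
        for win_start, win_end in measured:
            overlap = min(out_end, win_end) - max(out_start, win_start)
            if overlap > 0:
                down_hours += overlap
    return 100.0 * (1 - down_hours / measured_hours)


# Open 08:00-18:00, maintenance 12:00-13:00, outages 07:00-09:00 and 12:30-14:00.
pct = availability((8, 18), [(12, 13)], [(7, 9), (12.5, 14)])
print(f"{pct:.1f}% available")
```

In this example only two of the three and a half outage hours count against the post office, because the rest fell before opening time or inside the maintenance slot.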

Overlapping tests occur when there is an outage at a remote point which causes scheduled tests traveling to or through that point to stack up. The mail monitoring system should sense the delay, alert the operator and suspend its testing, but probably not before a number of tests have been sent and become late. Once the problem is corrected and mail is flowing again, the outstanding tests will return and be time-stamped. But the service level reporting procedure should not simply add up the durations of those multiple tests, or the statistical impact of the outage will be erroneously inflated. Where tests through a specific node overlap in time, the reporting algorithm must tally that time only once, or the resulting report will be neither accurate nor credible.
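The "tally that time only once" rule amounts to merging overlapping intervals per node before summing them. The Python sketch below shows one way to do this; the tuple layout `(node, sent_minute, returned_minute)` and the sample data are assumptions for illustration.

```python
from collections import defaultdict


def outage_minutes(late_tests):
    """Total late time per node, counting overlapping tests only once.

    `late_tests` is an iterable of (node, sent_minute, returned_minute)
    tuples for tests that exceeded their threshold.
    """
    by_node = defaultdict(list)
    for node, sent, returned in late_tests:
        by_node[node].append((sent, returned))

    totals = {}
    for node, spans in by_node.items():
        spans.sort()
        total = 0
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:          # overlaps the current span: extend it
                cur_end = max(cur_end, end)
            else:                         # gap: close the span, start a new one
                total += cur_end - cur_start
                cur_start, cur_end = start, end
        totals[node] = total + (cur_end - cur_start)
    return totals


# Three tests stacked up behind one 60-minute outage, plus a later 10-minute delay.
late = [
    ("peoria-po", 0, 60),
    ("peoria-po", 10, 60),
    ("peoria-po", 20, 60),
    ("peoria-po", 120, 130),
]
naive = sum(returned - sent for _, sent, returned in late)
print(naive, outage_minutes(late))
```

Simply adding the four durations gives 160 minutes, whereas the merged figure for the node is 70 minutes, which is the time mail was actually delayed.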

What can go wrong will go wrong

Now that we have found a way to accurately measure the service level, how do we define the Service Level for the organization? This is much more specific to each organization. In defining a service level, it is important to define something that is achievable, as determined by the size, complexity and geographical spread of the network. The two key factors are the time it should take for mail to be delivered if nothing goes wrong, and the percentage of mail that is delivered within this time. If you decide that mail should be delivered end to end within 45 minutes, it is not realistic to expect 100% of mail to be delivered within that time. Things will go wrong in a complex network, particularly where parts of the system are outside your direct control. So negotiate a number less than 100% that both MIS and the users can live with. Some organizations have found that 95% delivery within 30 minutes is acceptable; others use 98% within 2 hours. Whatever you choose to publish should reflect the complexities of your own network.
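Checking measured results against a target such as "95% within 30 minutes" is simple arithmetic, sketched here in Python. The function names and the sample delivery times are invented for illustration; a test that never arrived is represented as `None` and counted as late.

```python
def percent_on_time(delivery_minutes, target_minutes):
    """Percentage of tests delivered within the target time."""
    if not delivery_minutes:
        raise ValueError("no test results to report on")
    on_time = sum(1 for m in delivery_minutes
                  if m is not None and m <= target_minutes)
    return 100.0 * on_time / len(delivery_minutes)


def meets_service_level(delivery_minutes, target_minutes, agreed_percent):
    """True if the measured percentage reaches the agreed percentage."""
    return percent_on_time(delivery_minutes, target_minutes) >= agreed_percent


# Twenty invented test results: nineteen delivered promptly, one never arrived.
deliveries = [12] * 17 + [25, 29, None]

print(percent_on_time(deliveries, 30))
print(meets_service_level(deliveries, 30, 95.0))
print(meets_service_level(deliveries, 30, 98.0))
```

The same twenty results satisfy a 95% commitment but not a 98% one, which is why the published figure has to be negotiated against what the network can realistically deliver.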

Once the service level has been established, a tool should be introduced to help you maintain compliance. Its mail monitoring component should collect statistics on all of the node availability tests it sends out, and its reporting facilities should allow you to consolidate the collected data, reporting the actual service level to the end users and publishing statistics on demand at a global level and for each individual post office.

In summary, a mail monitoring tool applied to measuring your service level:

  • Provides you with early warning of potential messaging and network problems
  • Provides an opportunity to proactively manage and recover from problems
  • Helps maintain a higher level of user satisfaction

I dusted off some of my old albums and dragged out my Best of The Temptations to play on an aging record deck that has lived in the attic. It was great to reminisce but the record was so scratched and the quality so poor compared to today's compact discs. It's good in many ways that things have moved on.

Author Profile

Clive Horton is President of ReSoft International LLC in New Canaan CT. He has been involved in the messaging industry for over 10 years and formed ReSoft International in 1994 to provide tools to help companies better manage multiple messaging systems across their organizations. He can be reached at clive.horton@re-soft.com 


Copyright © ReSoft International LLC 1997-1999.
All rights reserved. All trademarks, servicemarks are respected.