Wednesday, 20 May 2009

What’s your attitude to risk?

Last week, my son lost his USB flash drive, containing all of his school course work. No problem, I thought, we’ll just use the backup copy on the home computer, or the one from the school network.

Guess what? That’s right, the only copy was on the flash drive.

Guess what else? Yep, the course work was due for submission 2 days later.

Cue much panic, a frantic search of every conceivable place the flash drive could have been lost, and, once that was exhausted, two very late nights for my boy to recreate the work he’d lost.

And guess what else? The day after the course work was due, we found the flash drive in the rubber seal of the washing machine drum – and incredibly it still worked! Despite this lucky escape, we’re now doubly conscious of the risks of relying on USB flash drives and are definitely changing the way we use them.

This whole 'data loss' episode reminded me of the extensive press coverage over the last 18 months or so of lost information, most of it far more serious than some misplaced exam course work. That coverage has led to a widespread sense among businesses that they must act to improve their information security policies and systems.

Once a business decides to act, the vast majority look at beefing up their encryption and data transfer technologies. Often this is the right response, but it can be pretty expensive, and some businesses decide that the risk of inaction is worth taking – just leave the information security processes as they are and wait and see what happens.

What if you could help mitigate your information risks another way? That is, without paying a sizable ransom to the various security vendors out there?

One of the biggest areas of information security risk is the development of data handling systems. I think it’s fair to say that most businesses, despite what their formal policy says, get hold of live production data – including real customer details – and copy it into their test or development environments.

Why do they do this? Simple: real data helps developers produce working code, because they can have confidence that what they write handles real-world conditions – and likewise for testers in a test environment.

So if your developers and testers are using real customer data, you need to provide production-strength security controls around those systems. This is where the information security vendors start to rub their hands with glee. On top of this, your dev and test platforms become much more rigid and inflexible, making changes more difficult, more expensive and slower to deliver.

The conventional alternative to using real live data is to generate data to match the ‘planned for’ test conditions. The problem with this approach is that data in production systems is alive and changes constantly. Working purely from the analysts’ expectations of the data, it’s nearly impossible to predict the conditions the code will meet when released into live service. This often leads to a high incidence of production faults.

Sounds like you’re caught between a rock and a hard place. Using real data means building in expensive and restrictive security protocols, but inventing your own data leads to more production faults.

There is a third way. It is possible to create realistic test data that truly mimics live conditions and retains the utility of live customer data, but contains no personally identifiable information. That means preserving the data quality errors of the source system, the demographic profile of an area, transactional profiles and so on.

How?

By taking an extract of real data, securely analysing it, then randomising the values within agreed parameters and substituting locally defined reference material, it is possible to preserve all of these characteristics without exposing a single real customer. If your business is smart enough, it will be doing this already.
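To make the idea concrete, here’s a minimal sketch in Python of the kind of profile-then-randomise step described above. Everything in it is hypothetical: the column names (postcode, phone), the name lists and the file names are purely illustrative, and a real implementation would profile far more characteristics (transactional profiles, referential links and so on) and run the analysis inside a secured environment.

```python
# Illustrative sketch only: profile a production extract, then emit synthetic
# customers that keep its statistical shape but contain no real details.
# Column names, reference lists and file names here are all hypothetical.
import csv
import random
from collections import Counter

# Locally defined reference material used to replace real names
FIRST_NAMES = ["Alice", "Bob", "Chris", "Dawn", "Elena", "Frank"]
SURNAMES = ["Smith", "Jones", "Taylor", "Brown", "Wilson", "Evans"]

def profile(rows):
    """Capture the characteristics we want to preserve: the demographic
    spread (postcode areas) and a data quality error rate (blank phones)."""
    areas = Counter(r["postcode"][:2] for r in rows if r["postcode"])
    blank_phone_rate = sum(1 for r in rows if not r["phone"]) / len(rows)
    return areas, blank_phone_rate

def synthesise(n, areas, blank_phone_rate):
    """Generate n fake customers that mimic the profiled distributions."""
    area_codes, weights = zip(*areas.items())
    rows = []
    for _ in range(n):
        area = random.choices(area_codes, weights=weights)[0]
        rows.append({
            "name": f"{random.choice(FIRST_NAMES)} {random.choice(SURNAMES)}",
            "postcode": f"{area}{random.randint(1, 9)} {random.randint(1, 9)}XX",
            # Reproduce the source system's data quality errors at the same rate
            "phone": "" if random.random() < blank_phone_rate
                     else f"07700 {random.randint(100000, 999999)}",
        })
    return rows

if __name__ == "__main__":
    with open("production_extract.csv", newline="") as f:
        live = list(csv.DictReader(f))  # analysed securely, never shipped to test
    areas, blank_rate = profile(live)
    fake = synthesise(len(live), areas, blank_rate)
    with open("test_data.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "postcode", "phone"])
        writer.writeheader()
        writer.writerows(fake)
```

The point is that only the aggregate profile ever leaves the secure analysis step; the synthetic rows that developers and testers actually see contain no real customer details.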

My question is: if your business is not doing this now, what really is your attitude to risk?

Wednesday, 22 April 2009

Oracle and Sun - will this spawn a new DW appliance?

OK, I admit it - I'm a big, big fan of Solaris. Why? Well, it's rock solid, has great productivity features and is full of practical and pragmatic ways to deliver high value enterprise class systems management. And, best of all, it's free on x64 platforms.

But that's just one of the goodies waiting for Oracle following their purchase of the once mighty Sun Microsystems. Java and MySQL are also headliners in terms of software, but there's also the rich vein of hardware products, most notably (for Data Warehousing) the server storage line.

Having all this software and hardware experience and technology in one business looks very compelling: it puts Oracle in a position to provide customers with a seamlessly integrated offering from a single supplier.

Oracle ventured into the DW appliance space alongside partner HP last October with the HP Oracle Exadata Storage Server and HP Oracle Database Machine. These can be compelling offerings, particularly if you have a lot of internal Oracle experience and DBAs. But they're still offerings from two separate vendors, and purchasing departments like to have one throat to choke.

Enter the Oracle Sun acquisition, stage right. But it's not as simple as replicating the HP Oracle offering. For all the performance benefits of Exadata, there is still the significant maintenance overhead that typically comes with a row-oriented storage DBMS.

This is one of the killer features of appliances like Netezza (and others such as Greenplum, HP Neoview and Sybase IQ) - they really are plug and play, as they show repeatedly in proof-of-concept engagements. They turn up, plug it in, suck in a huge amount of data and leave it running. So if Oracle is really going to compete effectively in the appliance market, it needs to make some serious improvements to the ease of setup and management of its appliance offerings.

The good news for Oracle is that, if Solaris is anything to go by, they now have the ability to take advantage of the huge brainpower and track record at Sun to build low maintenance, high productivity, holistic systems to serve the DW appliance market. The lingering doubt I have is that Netezza has already taken a lead, and I suspect Oracle will be playing catch up for some time.

Wednesday, 1 April 2009

Unnecessary data capture

Last week The Joseph Rowntree Reform Trust published a report telling us that a quarter of all government databases are unnecessary and hold too much personal data, potentially breaching not only privacy laws but also human rights.

What’s the story behind this though? What drives governments to hold unnecessary information about us?

Like most large organisations, governments like to hold data ‘just in case’. For example, the UK Parliamentary Home Affairs Committee has strong reservations about the genuine need for the extra surveillance data being gathered by UK government agencies, but the defence is always that we might need it.

That is, there is no current need for holding the data, but there might be in the future. And because most of these systems take forever to change, preparing a system for ‘just in case’ saves time and money later, right?

Wrong.

The world’s leading manufacturing organisations have moved from a ‘just in case’ model to a ‘just in time’ model. Now, it’s far more complicated than that, but these highly successful organisations do not add features ‘just in case’. Instead, they have developed processes that allow them to develop products rapidly, allowing them to respond to changes in market conditions rather than hope that their best guesses work out.

In the software development world, this approach has been paralleled by the development and increasing adoption of ‘agile’ development practices. Scott Ambler’s Agile Unified Process (AUP) is a great starting place for businesses that are more comfortable with traditional rigid development methods and allows organisations to build their databases using a flexible framework that enables rapid adoption of new requirements when necessary.

This type of framework reduces the cost of delivery and maintenance while allowing you to respond quickly and add new features – even to highly complex systems – and maintain or improve quality. It also bucks the rule that the later you add a requirement, the more it will cost: agile software development helps to significantly flatten this ‘cost of escalation’ curve.

So if it’s so good, why isn’t everyone doing it and succeeding at it?

Most large organisations have spent a great deal of time and effort developing consistent, repeatable delivery processes (most often waterfall methods) that capture learning from one project and make sure those lessons are applied to all future projects. On the face of it this sounds like they’re applying best practice. However, applying every lesson to every project is usually the wrong thing to do. Not every problem found in one project will recur in another, and the delivery method can easily become a super heavyweight behemoth that frustrates every attempt to deliver at speed. I’ve worked in multiple large organisations where the fast-track method took at least nine months!

When faced with an entrenched organisational culture that says nine months is fast, the adoption of ‘agile’ methods is a real shock to the system. Many people struggle to accept that the new way is better, and this often leads to serious cultural clashes. Organisations seeking to reduce their delivery overheads and get to market quicker by adopting agile methods must do this in tandem with a top down cultural change programme, or the new methods will be seriously hampered.

And this is perhaps why governments still insist on capturing and keeping everything: it’s too hard for them to introduce a change programme that deals with the culture as well as the technology.

Other agile approaches, like Lean Agile (my preferred approach when conditions allow), Scrum and Crystal Clear, also have tremendous value alongside AUP. Just remember that the full benefits of these methods will only come to fruition with strong executive support.

Monday, 16 March 2009

How do SMEs manage?

Data management has a long history within large enterprises. This is because they've long since realised that the data that they capture day to day can be used to provide insight into what their customers actually do with their products and services, and help to plan new and better ways to serve existing clients and attract new customers.

Or to put it another way - marketing.

OK - nothing new there - everyone's doing that aren't they?

But what about small and medium-sized enterprises? Data management tools, methods and software all seem to be aimed exclusively at large enterprises, largely because the software is often very expensive, or because it requires an army of developers and project managers. This pushes sophisticated data management out of the reach of most SMEs.

So what?

Well, the likes of Gartner and IBM estimate that the volume of stored information is doubling every 18 months or so (OK, so maybe IBM have a vested interest in saying that), so it won't be long before SMEs face real problems coping with these volumes. To remain competitive, SMEs are going to have to catch up with the big boys pretty fast, or find their edge blunted by issues like data quality creep or systems that conflict with, rather than complement, each other.

So what's out there to help?

Some vendors are starting to realise that there is a need. The vendors that traditionally serve SMEs with data management tools have lacked the investment poured into enterprise-class technologies, funded by the large bills blue chip companies pay for the best technology. Consequently, there's not been a great deal for SMEs to shout about.

But that's changing. The enterprise vendors like IBM, Informatica and Ab Initio are now looking to draw in more clients in the 'Medium' part of SME. And new entrants to the market like Expressor are targeting everyone from Mom and Pop businesses to the very largest corporations. Even so, many of these tools require experts to apply them, and while they are all capable of using templates to get you started quickly, very few of them come with templates out of the box.

There is therefore a lot of opportunity for systems integrators (SIs) to apply their skills in this market. But they can’t expect SMEs to hand over wads of cash to develop these templates on site. SIs will have to come to the party with well-formed, well-targeted, robust offerings that help SMEs drastically reduce risk, time to market, and delivery and operational costs.

Having worked in very large businesses for the last 17 years, the question I still have is: at what point will SMEs realise that they have a problem managing their data, and how easily will they be able to deal with it using their traditional approaches?