Automating TempDB Configuration in Azure

One of the unique things about managing SQL Server on Azure VMs is that we use the local D: drive for TempDB. The D: drive (or /dev/sdb1 for those of you running on Linux) is a direct-attached solid state drive (on nearly all VM tiers) which offers the lowest latency storage in Azure, making it ideal for TempDB’s busy workload. There is only one catch–that temporary volume in Azure is ephemeral, meaning the data gets wiped whenever your VM is rebooted.

gear-472008_640

You may think this isn’t a big deal–TempDB gets recreated every time your instance restarts, so who cares if the files get wiped? Well, it’s not so much the files as the directory that the files live in. You could just place your files at the root of the D:\ drive, however that would require your SQL Server service start as admin. Since we like to follow security best practices, we aren’t going to do that. I usually follow this process as defined by Jim Donahoe b|t) in this post.

I was teaching Azure infrastructure last week, and decided that it might be a good idea to do this using Desired State Configuration (DSC) which is part of automation. DSC allows you to use PowerShell scripts and a specific template format to define the configuration a group of machines (or a single machine). Documentation on DSC is sporadic, this project is a work in progress, because I had a client deadline.

But before I can even think about DSC, I needed to code the process in PowerShell. I start out by calling Dbatools which greatly simplifies my TempDB config. By using Set-DBATempDBConfig I just need to pass in the volume size, which I can get from WMI–I’m allocating 80% of the volume to TempDB (I use 80% because it’s below the cutoff of most monitoring tools) and then the script does the rest. I have a good old tempdb script,  but by using DBATools I eliminate the need to figure how to run that in the context of automation.

You can import PowerShell modules into Azure Automation now–this is a relatively recent change. I don’t have this fully baked into DSC yet, but you can see the PowerShell to create the PowerShell script (yes, read that correctly) and the scheduled task in Github.

I’d welcome any feedback and I will add a new post when I finish the DSC piece of this.

Would You Fly a Plane with One Engine? Or Run Your Airline with One Data Center(re)?

For those of you who may of been in the US or outside of Europe this past weekend, you may not have heard about the major British Airways IT outage, that took down their entire operations for most of Saturday and into Sunday. Rumors, which were later confirmed, were that a switch from primary to backup power at their primary data centre (they’re a UK company, so I’ll spell it in the Queen’s English), lead to a complete operations failure. I have a bit of inside information, since my darling wife was stuck inside of Terminal 5 at Heathrow.

Image result for jet with missing engine

There’s a requirement for planes that travel across oceans call ETOPs, which stands for Extended Range Operation with Two-Engine Airplanes, however in parlance is know as Engines Turn or Passengers Swim. This protocol and requirements are a set of rules that ensure if a plane has a problem over a body of water, it can make it back to shore for a safe landing. As someone who flies across oceans a decent amount, I am very happy the regulatory bodies have these rules in place.

However, there are no such rules for data centers that run airline operations. In fact, in January, Delta Airlines had a major failure which took down most of its operations for a couple of days. Most IT experts have surmised that Delta was running a single data center for it’s operations. Based on the evidence from Saturday’s incident with BA, I have to assume that they are, as well. One key bit of evidence, was that BA employees were unable to access email. They are an Office 365 customer, so theoretically, even if on-premises systems were down e-mail should work. However, if they were using Active Directory Federation Services, so that all of their passwords were stored on-prem, then the data center being down, would mean they couldn’t authenticate, and therefore would not have email.

This was my biggest clue that BA was running with a single data center—was that email didn’t work. While some systems, particularly some of the mainframe systems that may handle flight operations, have a tendency to not do well with failover across sites, Active Directory is one of the best distributed systems there is, and is extremely resilient to failures. In fact, given BA’s global business, I’m really surprised they didn’t have ADFS servers in locations around the world.

Enter the Cloud

Denny and I sat talking yesterday and running some numbers on what we thought a second data center would cost a company like BA. Our rough estimate (and this is very rough) was around $30-40 million USD. While that is a ton of money, it is estimated that weekend’s mess may cost BA up to  £150 million (~$192MM USD). However, companies no longer have to build multiple data centers in order to have redundancy, as Microsoft (and Amazon, and Google) have data centers throughout the world. The cloud gives you the flexibility to protect critical systems, and at a much cheaper cost. I’ve designed DR strategies for small firms that cost under $100/month, and I’ve had real-time failover that supported 99.99% uptime. With the resources of a firm like BA, this should be a no-brainer given the risk profile.

What About Outsourcing?

Much has been made of the fact that BA has outsourced much of its IT functions to TCS and various other providers. Some have even tried to place blame on the providers for this outage. Frankly, I don’t have enough detail to blame anyone, and it seems more like the data center operator’s issue. However, I do think it speaks to the lack of attention and resources paid to technology at a company that clearly depends on it heavily. Computers and data are more important to business now than ever, and if your firm doesn’t value that, you are going to have problems down the road.

Conclusions

In the cloud era, I’m convinced no business, no matter how big or small should run with a single data center. It is way too cheap and easy to ship your backups to multiple sites, and be online in a matter of hours with a cloud provider. Given the importance and consolidation of airlines to our world economy, it probably wouldn’t be a terrible idea if their regulators created regulations requiring failover and failover testing. Don’t let this happen to your stock price.

//platform.twitter.com/widgets.js

Why Are You Still Running Your Own Email Server?

One of the things I tell customers when doing any sort of architectural consulting, is to identify their most important business systems. Invariably something that gets left off of that list is email. Your email is your most critical system. ERP may run your profit centers, but email keeps it moving.

With that in mind, and given all the security risks that exist in the world (see: Russian hacking scandal, other email leaks of the week) it doesn’t make a lot of sense for most organizations to run their own Exchange environments when Microsoft is really good at it.

I had a discussion with an attorney at a company in a heavily regulated industry recently. The attorney mentioned that after investigating, she determined that the company didn’t have journaling turned on for their Exchange servers. (For you DBAs, journaling is effectively full recovery mode for Exchange—it’s more complicated that, but that is a nice analogy). Given that we are Office 365 customers, I wanted to check the difficulty of enabling this in our environment. I found out, full e-discovery capabilities that integrate with e-discovery systems are as easy as one click of a mouse (and a credit card to make sure you are on the right service level).

Another great security feature that was really painful to integrate with email login is multi-factor authentication. Once again, this requires a mouse click or two, and your credit card. You can even quickly do things like whitelisting your office’s IP address so that your users don’t have to use MFA when in the office.

These features are great, but it doesn’t even cover all the threat protection that Microsoft has built into Office 365 and Azure. You can read about that here, but Microsoft can even protect you from threats like spearphising. (Hi Vlad!) . Just like encryption. Don’t be a news story—just be secure.

%d bloggers like this: