Analyzing Your Dump Files

I’m blogging about this, because A) it’s something really awesome that the SQL Server team built and B) it seems to have terrible SEO, because it took me like three Google searches to find the page. With the introduction of SQL Server Management Studio 17, the Tiger team at Microsoft built a plugin that allows you to debug and resolve memory dumps you may have encountered during otherwise normal operations. This is really awesome, as it is something that usually requires a support case with CSS.

For those of you wearing aluminum hats, this does require you to upload the dump file to Azure, where it is analyzed for free (as in beer) by Microsoft’s software. You can even choose your region if you have data provenance concerns. And according to this blog post, the memory dumps are stored in accordance with Microsoft’s Privacy Policy.

You will need SSMS 17 for this, as well as to install the plug-in, which you can get here.

After that you can quickly get feedback on your dumps. Microsoft has even built an API, so if you want to build something automated to upload your dump files using Python or PowerShell, you can.

The Self-Tuning Database?

There was a lot of talk on Twitter over the weekend about automation and the future of the DBA role. I’ve spoken frequently on the topic, and even though the PASS Summit program committee has had limited interest in my automation sessions, they have been amongst my most popular talks at other large conferences. Automating mundane tasks makes you a better DBA, and raising the level of the tasks you automate allows you to focus on activities that really help your business, like better data quality, and watching cat videos on YouTube.

But Tuning is Hard

Performance tuning has always been something of a black art. I remember that Oracle called 9i the self-tuning database, because we no longer had to manually calculate how many blocks (pages) of memory were allocated to each pool. That was a joke—and those databases still required a lot of manual effort for tuning. However, that was then, and we’re in the future now. We’ll call the future August. Stay with me here, I’m going to talk about some theories I have.

So It’s August 2017

Let’s say you were the vendor of a major RDBMS, who also happened to own a major hyperscale cloud, and you had invested heavily in collecting query metadata in the last two releases of your RDBMS. Let us also accept the theory that the best way to get an optimal execution plan is to generate as many potential execution plans as possible. Most databases attempt a handful of plans before picking the best available plan—this is always a compromise, as generating execution plans involves a lot of math, and is very expensive from a CPU perspective. Let us also posit that as owner of the hyperscale cloud, you have a lot of available processing power, and you’ve had your users opt in to reviewing their metadata for performance purposes. Additionally, you’ve taken care of all of the really mundane tasks like automating backups, high availability, consistency checks, etc. So now it’s on to bigger fish.

Still With Me?

Ok, so we have all the tools in place to build our self-tuning database, so let’s think about what we would need to do. Let’s take a somewhat standard heuristic I like to use in query tuning—if a query takes more than 30 ms to execute, or is executed more than 1,000 times in a day, we should pay attention to it for tuning purposes. That’s a really big filter—so we’ve already narrowed down the focus of our tuning engine (and we have this information in our runtime engine, which we’ll call the Query Boutique). We have also had our users opt in to letting us use their metadata to help improve performance.
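To make that concrete: in SQL Server 2016 and later, Query Store already collects the statistics this heuristic needs. Here’s a rough sketch of the filter against the Query Store catalog views; the thresholds are the ones from the heuristic above, and everything else (including looking at only the last day) is just my illustration.

```sql
-- Rough sketch: surface queries averaging over 30 ms, or executed more
-- than 1,000 times in the last day, from the Query Store catalog views.
SELECT
    q.query_id,
    qt.query_sql_text,
    SUM(rs.count_executions)      AS executions_last_day,
    AVG(rs.avg_duration) / 1000.0 AS avg_duration_ms   -- avg_duration is in microseconds
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q
    ON q.query_text_id = qt.query_text_id
JOIN sys.query_store_plan AS p
    ON p.query_id = q.query_id
JOIN sys.query_store_runtime_stats AS rs
    ON rs.plan_id = p.plan_id
JOIN sys.query_store_runtime_stats_interval AS i
    ON i.runtime_stats_interval_id = rs.runtime_stats_interval_id
WHERE i.start_time >= DATEADD(DAY, -1, SYSUTCDATETIME())
GROUP BY q.query_id, qt.query_sql_text
HAVING AVG(rs.avg_duration) > 30000        -- more than 30 ms on average
    OR SUM(rs.count_executions) > 1000;    -- more than 1,000 executions/day
```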

So we identify the problem queries in your database. We then export the statistics from your database into our backend tuning system. We look for (and attempt to apply) any missing indexes to the structure, to evaluate the benefit of each missing index. We then attempt to generate all of the execution plans (yes, all of them—this processing is asynchronous, and doesn’t need to be real time). We could even collect read/write statistics on given objects and apply a weighted value to a given index. We could then take all of this data and run it through our back-end machine learning service, to ensure that our query self-tuning algorithm was accurate, and to help improve our overall process.

We could then feed this data back into the production system as a pinned execution plan. Since we are tracking the runtime statistics, if the performance of the plan drops off, or we noticed that the statistics changed, we could push out a new execution plan and start the whole process over again.
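The “pinned plan” part isn’t science fiction, by the way: Query Store plan forcing does exactly this today. A minimal sketch, with placeholder IDs you’d look up in the Query Store views:

```sql
-- Pin a known-good plan for a query (IDs are placeholders; find real
-- ones in sys.query_store_query and sys.query_store_plan).
EXEC sp_query_store_force_plan @query_id = 42, @plan_id = 7;

-- If the runtime stats show the forced plan regressing, un-force it and
-- let the tuning process start over.
EXEC sp_query_store_unforce_plan @query_id = 42, @plan_id = 7;
```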

So there, beyond a couple of illustrative snippets, I didn’t write any real code, but I laid out the architecture for a self-tuning database. (I thought about all of this on a bike ride; I’m sure Conor could do a lot better than me.) I’m sure this would take years and many versions to come into effect. Or not at all. To be successful in this changing world of data, you need to stay ahead of the curve: learn how clouds work, how to script, how to automate, and how to add value.

Drive By #$%^ing—er, Tweeting.

Note: This post may contain some vulgarities, but no obscenity, at least as defined by the Supreme Court in Miller v. California (1973).

So, my morning started off early, with a lovely bike ride. It was nice and cool in Southern California, where I am this week, so I had a lovely 20 miles. Then, I took my phone out of my pocket and was confronted with two really shitty posts. The first was on Twitter, and the second was so shallow that it may as well have been a tweet.

I love Twitter, I’ve been on for almost ten years, and as a data nerd and sports fan, it is like the be-all and end-all of services. However, for discourse, 140 characters leaves no room for any nuance or subtlety. (See the President of the United States, not that he had any nuance or subtlety to begin with.) When you legitimately want to critique something, prose works far better than 140 characters.

The First #$%^, er, Tweeter

So I got back from my ride, and the first thing I saw was:

[screenshot of the tweet]

Richie, that tweet was school on Sunday, dude. Not cool—I get that you may not like the sessions picked, or the general direction of the organization (I certainly disagree with a ton of what PASS does, and am moderately vocal about it). But when you write a tweet like that, you are basically implying that a bunch of shitty speakers submitted a bunch of shitty sessions, and the program committee organized a total shit show. You probably didn’t mean that—I personally think the new emphasis on development isn’t the right approach for a database conference. However, that’s a) not my decision, and b) a more nuanced thought than “Summit sucks, haha.”

The old saying, “if you don’t have anything nice to say, don’t say anything,” is wrong. However, if you don’t have anything constructive to say, don’t insult the 150+ speakers, volunteers, and potential sponsors who might be reading your stream.

The Second #$%^, er, Blogger

I’m not linking to this guy’s shit. Because it’s shit.

[screenshot of the blog post]

Here’s the gist of this post—Azure SQL Database and Azure Blob Storage are completely $%^&ing insecure because they have public IP addresses. Never mind that you can completely lock them down against all IP addresses, and against Azure itself. (Granted, a database or storage account that no one can access is probably of limited value.) However, these services are fully access controlled and have built-in security. Additionally, in the case of SQL DB, you have the option of built-in threat detection that will detect anomalous behavior like SQL injection or rogue logins.
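For the record, locking down an Azure SQL server is a couple of lines of T-SQL against the documented firewall procedures. The rule name and addresses below are illustrative (203.0.113.x is a reserved documentation range):

```sql
-- Server-level rule (run in the master database): allow connections only
-- from a single known address; everything else is refused.
EXEC sp_set_firewall_rule
    @name = N'OfficeOnly',
    @start_ip_address = '203.0.113.10',
    @end_ip_address = '203.0.113.10';

-- Or scope the rule to a single database (run in that database).
EXEC sp_set_database_firewall_rule
    @name = N'OfficeOnly',
    @start_ip_address = '203.0.113.10',
    @end_ip_address = '203.0.113.10';
```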

Currently, Azure doesn’t have the ability to put your database on a VNet. I’d be shocked if the Azure team is not working on said feature. In the article, the author makes a point of Amazon having the ability to do this for RDS. That’s cool, and probably why Microsoft is working on it. But instead of focusing on how to secure your app in a hybrid platform, he just shits on the vendor with a clickbait headline.

Wheaton’s Law

Wheaton’s Law, which also happens to be the core of our HR policy at DCAC, is simply “Don’t be a dick”. Think before you write and tweet—don’t write clickbait crap, and if you want to criticize an org, do it carefully, and write a blog post. Or send someone a DM.

24 Hours of PASS Summit Preview—Getting Started with Linux

Next week is the 24 Hours of PASS—which is a free event PASS puts on to showcase some of the sessions you will get to see at November’s PASS Summit. I’m going to be giving a preview of my session on Linux—I’ll be talking about how to get started learning Linux, installing a SQL Server test environment in a Docker container or a VM, and a little bit about some of the ancillary configuration of SQL Server running on the Linux platform.

The Porsche 919 that won the recent 24 Hours of Le Mans

While SQL Server on Linux is something that may not immediately impact you—this is definitely something you want to be ready for. Just this week I talked to some executives from a major Linux vendor, who have been amazed at the amount of interest this has had amongst their customers. A lot of folks in IT strongly dislike working with “the other major database vendor” (name withheld so I don’t get sued, bingle sailboats and databases), and a lot of those organizations like to run their database servers on Linux.

So grab a beer or a whisky (if you’re in the US) as I’ll be presenting at 9 PM (2100) EDT (GMT-4)/0100 GMT. You can sign up here. If you can’t make it (because you live in the EU, and would rather sleep than listen to me ramble about shell commands) there will be a recording, but you will still need to register.

Would You Fly a Plane with One Engine? Or Run Your Airline with One Data Center(re)?

For those of you who may have been in the US or outside of Europe this past weekend, you may not have heard about the major British Airways IT outage that took down their entire operations for most of Saturday and into Sunday. Rumors, which were later confirmed, were that a switch from primary to backup power at their primary data centre (they’re a UK company, so I’ll spell it in the Queen’s English) led to a complete operations failure. I have a bit of inside information, since my darling wife was stuck inside Terminal 5 at Heathrow.

There’s a requirement for planes that travel across oceans called ETOPS, which stands for Extended Range Operation with Two-Engine Airplanes, though in common parlance it’s known as Engines Turn or Passengers Swim. These requirements are a set of rules that ensure that if a plane has a problem over a body of water, it can make it back to shore for a safe landing. As someone who flies across oceans a decent amount, I am very happy the regulatory bodies have these rules in place.

However, there are no such rules for the data centers that run airline operations. In fact, in January, Delta Air Lines had a major failure which took down most of its operations for a couple of days. Most IT experts have surmised that Delta was running a single data center for its operations. Based on the evidence from Saturday’s incident with BA, I have to assume that they are, as well. One key bit of evidence was that BA employees were unable to access email. They are an Office 365 customer, so theoretically, even if on-premises systems were down, email should work. However, if they were using Active Directory Federation Services, so that all of their passwords were stored on-prem, then the data center being down would mean they couldn’t authenticate, and therefore would not have email.

My biggest clue that BA was running with a single data center was that email didn’t work. While some systems, particularly some of the mainframe systems that may handle flight operations, have a tendency to not do well with failover across sites, Active Directory is one of the best distributed systems there is, and is extremely resilient to failures. In fact, given BA’s global business, I’m really surprised they didn’t have ADFS servers in locations around the world.

Enter the Cloud

Denny and I sat talking yesterday, running some numbers on what we thought a second data center would cost a company like BA. Our rough estimate (and this is very rough) was around $30-40 million USD. While that is a ton of money, it is estimated that the weekend’s mess may cost BA up to £150 million (~$192MM USD). However, companies no longer have to build multiple data centers in order to have redundancy, as Microsoft (and Amazon, and Google) have data centers throughout the world. The cloud gives you the flexibility to protect critical systems at a much lower cost. I’ve designed DR strategies for small firms that cost under $100/month, and I’ve built real-time failover that supported 99.99% uptime. With the resources of a firm like BA, this should be a no-brainer given the risk profile.

What About Outsourcing?

Much has been made of the fact that BA has outsourced many of its IT functions to TCS and various other providers. Some have even tried to place blame on the providers for this outage. Frankly, I don’t have enough detail to blame anyone, and it seems more like the data center operator’s issue. However, I do think it speaks to the lack of attention and resources paid to technology at a company that clearly depends on it heavily. Computers and data are more important to business now than ever, and if your firm doesn’t value that, you are going to have problems down the road.

Conclusions

In the cloud era, I’m convinced that no business, big or small, should run with a single data center. It is way too cheap and easy to ship your backups to multiple sites and be online in a matter of hours with a cloud provider. Given the importance and consolidation of airlines in our world economy, it probably wouldn’t be a terrible idea for their regulators to require failover and failover testing. Don’t let this happen to your stock price.

SQL Data Warehouse—Size Limits? (No, Not Really)

One of the promises of Azure SQL Data Warehouse is the ability to have petabyte scale: the ability to quickly scale data, and have that data scale independently of compute resources. So when one of my clients emailed me yesterday with this screenshot, needless to say I was concerned.

[screenshot: database properties showing a max size of 5 GB]

As you can see, the properties screen shows a max size of 5 GB. So I sent a frantic email to one of my colleagues on the SQL Data Warehouse team (thanks, JRJ). Turns out, due to the distributed architecture of SQL DW, that max database size refers to the size of the control node, and not the underlying data nodes that service your queries. So feel free to ignore it.

SQLCLR in SQL Server 2017

Security of your databases has become the most important part of your job as a DBA. No one wants to be the next company to leak a list of customers, leak photos of children and their parents using your company’s product, or have a key marketing database released. Security is also a very complicated, multi-layered thing. There’s not a switch you can turn on to suddenly have good security. Additionally, you can be doing all the right things, and a few bad lines of application code can subvert all of the measures you have taken to protect your data.

With this in mind, Microsoft has made some big changes to CLR in SQL Server 2017. SQLCLR has always been an interesting area of the engine—it allows for the use of .NET code in stored procedures and user-defined types. For certain tasks, it’s an extremely powerful tool—things like RegEx and geo functions can be much faster in native CLR than trying to do the equivalent operation in T-SQL. It’s always been a little bit of a security risk, since under certain configurations, CLR had access to resources outside of the context of the database engine. This was protected by boundaries defined in the CLR host policy. We had SAFE, EXTERNAL_ACCESS, and UNSAFE levels that we could set. SAFE simply limited access of the assembly to internal computation and local data access. For the purposes of this post, we will skip UNSAFE and EXTERNAL_ACCESS, but suffice it to say, those levels allow much deeper access to the rest of the server.
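For reference, the permission set is declared when the assembly is catalogued. A minimal example, with an illustrative assembly name and path:

```sql
-- Catalog an assembly as SAFE: historically, the promise that it only
-- does computation and local data access. Name and path are illustrative.
CREATE ASSEMBLY RegexHelpers
FROM 'C:\assemblies\RegexHelpers.dll'
WITH PERMISSION_SET = SAFE;
```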

Code Access Security in .NET (which is used to manage these levels) has been marked obsolete. What does this mean? The boundaries that are marked SAFE may not be guaranteed to provide security. So “SAFE” CLR may be able to access external resources, call unmanaged code, and acquire sysadmin privileges. This is really bad.

So What Now?

Microsoft is recommending enabling “CLR strict security” in SQL Server 2017, which means users cannot create any CLR assemblies unless they have elevated permissions. This could be a breaking change anywhere you want to use dynamic CLR. The user needs to have CREATE ASSEMBLY permission, and one of the following options has to be true:

  • Database has the TRUSTWORTHY property on AND the database owner (DBO) has the UNSAFE ASSEMBLY permission on the server
  • Assembly is signed with a certificate or asymmetric key that has a corresponding login with the UNSAFE ASSEMBLY permission on the server

In short, the engine treats all CLR assemblies as unsafe, even if they are created with SAFE or EXTERNAL_ACCESS permission sets.

You will need to sign your assemblies. Also, if you try to restore a database with CLR assemblies into SQL Server 2017, you may run into issues if your assemblies are unsigned.
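Here’s a sketch of what that looks like in practice. It assumes the DLL has already been signed with a certificate; the certificate, login, and file names are all illustrative.

```sql
-- Enable the new control (it's flagged as an advanced option).
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'clr strict security', 1;
RECONFIGURE;

-- Whitelist a signed assembly: extract the certificate the DLL was
-- signed with, create a login from it, and grant that login
-- UNSAFE ASSEMBLY. Names and path are illustrative.
USE master;
CREATE CERTIFICATE CLRSigningCert
    FROM EXECUTABLE FILE = 'C:\assemblies\RegexHelpers.dll';
CREATE LOGIN CLRSigningLogin FROM CERTIFICATE CLRSigningCert;
GRANT UNSAFE ASSEMBLY TO CLRSigningLogin;
```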

But Now I Have to Do Work?

What does this mean for you, the SQL DBA or developer? This is a breaking change (and breaking changes are a BIG DEAL at Microsoft—they don’t like to break things unless there are really good reasons). This means that if you are using CLR, and you want to move to SQL Server 2017 (or really, let’s face it, these security risks exist in lower versions of the code), you need to work out a signing system for your CLR assemblies, and possibly some key management infrastructure. You need to evaluate your older CLR code to ensure that none of it is running under UNSAFE or EXTERNAL_ACCESS (unless you want to turn off “CLR strict security”, which you really don’t want to do). Also, if you want to run SQL Server 2017 on Linux, you will be limited to the SAFE permission set.
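Evaluating that older code can start with a simple inventory of what’s catalogued, and under which permission set:

```sql
-- List user-defined assemblies and their permission sets; anything
-- showing EXTERNAL_ACCESS or UNSAFE deserves a closer look before 2017.
SELECT name, permission_set_desc, clr_name, create_date
FROM sys.assemblies
WHERE is_user_defined = 1;
```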

We Only Have Third Party Apps…

Good luck with that. This is a big change, and we all know lots of smaller independent software vendors (ISVs) have just gotten around to supporting SQL Server 2014. In most of my work with ISVs, I haven’t seen a ton of CLR in vendor apps; that being said, I’m sure it exists in many of them. The angle I would take as a DBA towards my ISV(s) is to use the “audit hammer” with the vendor. If you are not familiar with this approach to managing your vendors, you can mention that your internal audit process is now failing unsigned assemblies that can’t run under CLR strict security. This is probably accurate anyway—CLR has been an audit flag for years. I would also recommend a comprehensive testing process to ensure your CLR functions as you would expect.

Don’t Assume that Your Sessions Need to Be 500 Level

You know what people say about what happens when you assume something, right? Don’t be an ass and make assumptions. One trend I’ve seen in recent years is that some conferences tend to prefer super advanced, internals, deep-dive sessions. These sessions are very cool, and we all like to totally geek out about internals, the inner workings of the query optimizer, and secret trace flags that make SQL Server return all query results as emojis. (Dear Conor, can you work on that one?)

Anyway, while many of us in the SQL Server community have massive depth of knowledge about the product, and computer systems in general, it is important to realize that not everyone attending SQL events is an MCM, MVP, or MCSE. It’s great if you want to do a session on backup and restore, or how to install SQL Server (make sure to explain why you are doing each step). Topics like Intro to T-SQL and Writing Your First Stored Procedure can be fantastic.

I’m not saying the whole event needs to have 100 level sessions, but making an effort to reach out to people who are just getting started could go a long way to broadening the reach of a professional association.

PASS Summit Speaker Selection Changes—My Take

Monday PASS announced its changes to the speaker selection process; you can read the details here. This is a big change—there will be preselected speakers from a pool of folks who have a long track record of successfully speaking at PASS Summit. Is this a good thing or a bad thing? I think it is a good thing, so long as it is implemented carefully. I don’t want to see new speakers get locked out of speaking at Summit, but I also want to see good sessions from good speakers.
This change will allow PASS to better structure the conference. A good example of this is Nic Cain’s (b|t) idea for curated tracks. In order to have a “new DBA” or “new BI analyst” track, you need to have speakers collaborating with each other in order to build the complementary track.

Another consideration is NDA content—a good example of this was last year. I would have loved to have done a talk about SQL Server on Linux, which I knew was going to be un-NDAed by Summit; however, since submission is public, I couldn’t submit an abstract on it.

My advice for new speakers who want to speak at Summit? It’s the same as it’s always been—blog (note: the blogging part of this is important!) and speak about whatever interests you. Of course, you need to be strategic about submitting to a large conference like Summit, but to a local SQL Saturday, or a user group? Those organizers are begging for speakers. Additionally, consider speaking at PASS Virtual Chapters—there are many of them, they meet every month, and there is no cost (other than your time) involved with speaking there.

As you develop as a speaker and writer, you’ll get better known, and develop your own niche. You will also get rejected. Getting rejected sucks—trust me, I submitted to three Summits before I was chosen (I was also nominated for MVP like 10 times before getting it, but I digress). When you get rejected, look at your abstracts and try to understand how you can make them better. Have a friend or colleague review them. This is an ongoing process.

I don’t think most speakers will notice a big difference with this new process. The speakers who are preselected were likely going to get selected anyway. The big difference is they will have chosen their topic versus being subject to the whim of the program committee. If you’re a new speaker, speak as much as you can. VCs are free, and your local user group needs speakers. If you live in the middle of nowhere, a lot of user groups will welcome remote presentations. Hone your skills. Write some blog posts (you may have noticed this is my third blogging mention; fire up your keyboard). There’s a new version of SQL Server this year. Get inspired!
