SQL Server on Linux Licensing

Now that SQL Server 2017 has gone GA and SQL Server on Linux is a reality, you may wonder how it effects your licensing bill? Well there’s good news—SQL Server on Linux has exactly the same licensing model as on Windows. And Docker, if you are using it for non-development purposes (pro-tip: don’t use Docker for production, yet) is licensing just like SQL Server in a virtual environment, you can either license all of the cores on the host, or simply the cores that your container is using.

But What About the OS?

So let’s compare server pricing:

  • Windows Server 2016 Standard Edition– $882
  • Red Hat Enterprise Linux with Standard Support—$799 
  • SuSE Enterprise Standard Support—$799
  • Ubuntu—Free as in beer

As you can see most of the licenses are the same whether you are on Windows or Linux. I’m not going to get into the religious war to say which is the best distro or if you need paid support, just simply putting the numbers out there.

HA and DR

There’s still more to come on the HA and DR story, so stay tuned. But assume the standard free license with SA for an idle secondary.

The Self-Tuning Database?

There was a lot of talk on Twitter over the weekend, about automation, and the future of the DBA role. I’ve spoken frequently on the topic, and even though the PASS Summit program committee has had limited interest in my automation sessions, they have been amongst my most popular talks at other large conferences. Automating mundane tasks makes you a better DBA, and escalating the level of those tasks allows you to focus on activities that really help your business like better data quality, and watching cat videos on YouTube.

But Tuning is Hard

Performance tuning has always been something of a black art. I remember that Oracle called 9i the self-tuning database, because we no longer had to manually calculate how many blocks (pages) of memory where allocated to each pool. That was a joke—and those databases still required a lot of manual effort for tuning. However, that was then, and we’re in the future now. We’ll call the future August. Stay with me here, I’m going to talk about some theories I have.

So It’s August 2017

Let’s say you were a vendor of a major RDBMS, who also happened to own a major hyperscale cloud, and you had invested heavily in collecting query metadata in the last two releases of your RDBMS. Let us also accept the theory that the best way to get an optimal execution plan, is to generate as many potential execution plans as possible. Most databases attempt a handful of plans, before picking the best available plan—this is always a compromise as generating execution plans involves a lot of math, and is very expensive from a CPU perspective. Let us also portend that as owner of the hyperscale cloud, you also have a lot of available processing power, and you’ve had your users opt-in to reviewing their metadata for performance purposes. Additionally, you’ve taken care of all of the really mundane tasks like automating backups, high availability, consistency checks, etc. So now it’s on to bigger fish.

rube-goldberg-stamp

Still With Me?

Ok, so we have all the tools in place to build our self-tuning database, so let’s think about what we would need to do. Let’s take a somewhat standard heuristic I like to use in query tuning—if a query takes more than 30ms to execute or is executed more than 1000/times in a day, we should pay attention to it for tuning purposes. That’s a really big filter—so we’ve already narrowed down the focus of our tuning engine (and we have this information in our runtime engine, which we’ll call the Query Boutique). We have also had our users opt-in to use using their metadata to help improve performance.

So we identify our problem queries in your database. We then export the statistics from your database, into our backed tuning system. We look for (and attempt to apply) any missing indexes to the structure, to evaluate the benefit of a missing index. We then attempt to generate all of the execution plans (yes, all of them—this processing is asynchronous, and doesn’t need to be real time). We could even collect read/write statistics on given objects and apply a weighted value to a given index. We could then take all of this data and run it through our back end machine learning service, to ensure that our query self tuner algorithm was accurate, and to help improve our overall process.

We could then feed this data back into the production system as a pinned execution plan. Since we are tracking the runtime statistics, if the performance of the plan drops off, or we noticed that the statistics changed, we could force out our new execution plan, and start the whole process over again.

So there, I didn’t write a line of code, but I laid out the architecture for a self-tuning database. (I thought about all of this on a bike ride, I’m sure Conor could do a lot better than me Smile)  I’m sure this would take years and many versions to come into effect. Or not at all. To be successful in this changing world of data, you need to stay ahead of the curve, learn about clouds work, how to script, how to automate, and how to add value.

24 Hours of PASS Summit Preview—Getting Started with Linux

Next week is the 24 Hours of PASS—which is a free event PASS puts on to showcase some of the sessions you will get to see at November’s PASS Summit. I’m going to be giving a preview on my session on Linux—I’ll be talking about how to get started learning Linux, installing a SQL test environment on a Docker container or a VM, and a little bit about some of the ancillary configuration of SQL Server running on the Linux platform.

Porsche 919 Six Hours of Spa

The Porsche 919 that won the recent 24 Hours of LeMans

While SQL Server on Linux is something that may not immediately impact you—this is definitely something you want to be ready for. Just this week I talked to some executives from a major Linux vendor, who have been amazed at the amount of interest this has had amongst their customers. A lot of folks in IT strongly dislike working with “the other major database vendor” (name withheld so I don’t get sued, bingle sailboats and databases), and a lot of those organizations like to run their database servers on Linux.

So grab a beer or a whisky (if you’re in the US) as I’ll be presenting at 9 PM (2100) EDT (GMT –5)/0100 GMT. You can sign up here. If you can’t make it (because you live in the EU, and would rather sleep than listen to me ramble about shell commands) there will be a recording, but you will still need to register.

24 Hours of PASS—Summit Preview Linux

Next week is the 24 Hours of PASS—which is a free event PASS puts on to showcase some of the sessions you will get to see at November’s PASS Summit. I’m going to be giving a preview on my session on Linux—I’ll be talking about how to get started learning Linux, installing a SQL test environment on a Docker container or a VM, and a little bit about some of the ancillary configuration of SQL Server running on the Linux platform.

Porsche 919 Six Hours of Spa

The Porsche 919 that won the recent 24 Hours of LeMans

While SQL Server on Linux is something that may not immediately impact you—this is definitely something you want to be ready for. Just this week I talked to some executives from a major Linux vendor, who have been amazed at the amount of interest this has had amongst their customers. A lot of folks in IT strongly dislike working with “the other major database vendor” (name withheld so I don’t get sued, bingle sailboats and databases), and a lot of those organizations like to run their database servers on Linux.

So grab a beer or a whisky (if you’re in the US) as I’ll be presenting at 9 PM (2100) EDT (GMT –5)/0100 GMT. You can sign up here. If you can’t make it (because you live in the EU, and would rather sleep than listen to me ramble about shell commands) there will be a recording, but you will still need to register.

SQL Server on Linux Clustering—A Few Other Notes

So I was chatting with fellow MVP Allan Hirt (b|t) about the cluster build that I wrote about yesterday, and I had a few more realizations about the Linux HA process as it stands right now. I haven’t talked to the the Linux product team at Microsoft about this, but I hope to in the near future to get a better idea of where things are headed. So these are my notes as of now, strictly relating to failover cluster instances (FCI), AlwaysOn Availablity Groups are coming, but are not in the latest CTPs of SQL Server on Linux.

It was faster than building a Windows cluster

It took me a while, I laughed, I cried, I cursed a lot, but if I look at the time it took for me to actually build the cluster and install SQL Server, it was a much faster process. Much of this comes down to the efficiency of the SQL Server installation process on Linux, which is as simple as running yum install mssql-server (mostly). Which leads me to my next point..

Installation options would be nice

The cluster building process is a little kludgy. Basically, you install two standalone instances of SQL Server, and then remove the data files from one them, and copy them into your NFS share. Having the option to do the equivalent of an “Add Node” install, would mean you wouldn’t need to worry about cleaning up your second node.

There’s no cluster validation, explicitly

This is a bit scarier, or easier depending on your view point. There are tests at various parts of the process to make sure things are working. For example, the first step of building your Linux cluster is to authorize the nodes to take part in the cluster, which validates certain security and network settings. However, the storage validation consists of starting and stopping SQL Server on each node to make sure it can talk to the storage and startup. Given that Microsoft doesn’t own the clusterware for this solution, I’m not sure how much they can enhance that, or if they will. This is a good open question.

There’s no dns

(Happy Late Birthday Kris!) One interesting thing I realized after talking to Allan was that I did all of my networking setup through the /etc/hosts file on each individual node. I remember doing this for RAC, and I think it may be a requirement of Pacemaker, but you will still want to make a DNS entry for your cluster identifier. When you do this on Windows, if you are using Active Directory for DNS, the installation does this for you. Not in Linux, you will need to do this yourself.

Screen Shot 2017-01-04 at 11.40.26 AM

Get comfortable with command line and scripting

There’s no cluster wizard to get you through the process. I think this isn’t a huge deal—Denny and I were talking yesterday about how relatively easy it would be to script the whole process in bash (I’m holding off until I find out if Microsoft is doing this), and most Linux sysadmins are really comfortable with writing bash scripts. But if you aren’t comfortable with Linux and the command line, now is the time to brush up, before things go prod.

Summary

We are in the very early days of this process, there is much that will likely change. From a functional and conceptual perspective, this is very similar to the way a SQL Server Failover Cluster works in Windows, but the implementation is quite different. I’d like to see things resemble Windows a bit more, at least from a SQL Server perspective, but we’ll see where the product heads.

SQL Server on Linux–Clustering

First of word of warning on this post—if you are reading it and it isn’t January of 2017, I suspect things may have changed significantly in the months going forward.

Screen Shot 2017-01-03 at 3.47.08 PM

So I did It, I built a SQL cluster on Linux. The process is documented here on BOL, I’m not going to walk you through it, I’ll probably do that in a later post, I just wanted to mention some things I ran into during this build process. First, I did this using VMWare Fusion on my Mac, but I think any virtualization platform that allows virtual networks should work. Secondly, even though BOL says you need Red Hat Enterprise Linux (and you do if you are doing this in prod and require support), I was able to do all of this on CentOS, which is the free as in beer version of RHEL.

In my scenario, I built 3 VMs, one to serve as an NFS server, the other two to be my SQL Servers. Currently, there is no cluster version of the install, it’s the standard installation for standalone SQL on Linux, you then point SQL Server at the NFS mount you created which serves as your shared storage. I had an initial permissions problem on writing my data files there—I did a bad thing on the NFS server and opened up the directory to the world (777), and was then able to copy files there. I’ll follow up on that.

One other thing that wasn’t in BOL, that I had to troubleshoot my way through is that just like a cluster on Windows, you have a cluster identifier and floating IP address. I had to add that to /etc/hosts on each of my nodes to get it to resolve. The article mentions turning off fencing for non-prod environments—I had to do that in order to get failover working correctly in my environment.

Finally, failover was a bit wonky at first, and I had to spend too much time troubleshooting an odd problem. I wrote a connect item for it., but select @@servername and select name from sys.servers returns the name of the host, and not the cluster name. I’m sure the team will fix this in the near future.

%d bloggers like this: