Don’t Let Your Infrastructure Team Design Your Data Protection Strategy

In the last two days, I’ve been part of two discussions: one about the need to run CHECKDB on modern storage (yes, the answer is always yes, and twice on Sundays), and another about problems with third-party backup utilities. Both of these discussions were born out of infrastructure teams wanting to treat database servers like web, application, and file servers, and to have one tool to manage them all. Sadly, the world isn’t that simple, and database servers need to be treated differently (I’m writing this generically, because I’ve seen this issue crop up with both Oracle and SQL Server). Here’s how it happens: invariably, your infrastructure team has moved to a new storage product, or bought an add-on for a virtualization platform, that will meet all of their needs. In one place, with one tool.


So what’s the problem with this? As a DBA, you lose visibility into the solution. Your backups are no longer .bak and .trn files; instead, they are off in some mystical repository. You have to learn a new tool to do the most critical part of your job (recovering data), and you may not have the control over your backups that you would otherwise have. Want to stripe backups across multiple files for speed? Can’t do that. Want to do more granular recovery? There’s no option in the tool for that. Or my favorite one: want to do page-level recovery? Good luck getting that from an infrastructure backup tool. I’m not saying all third-party backup tools are bad; if you buy one from a database-specific vendor like RedGate, Idera, or Quest, you can get value-added features in conjunction with your native ones.
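
To make that concrete, here is a minimal sketch (hypothetical database name and backup paths) of two things native backups give you that most infrastructure-level tools don’t: striping a backup across multiple files, and restoring a single damaged page.

```sql
-- Minimal sketch: stripe a native full backup across four files for throughput.
BACKUP DATABASE SalesDB
TO  DISK = N'G:\Backup\SalesDB_1.bak',
    DISK = N'G:\Backup\SalesDB_2.bak',
    DISK = N'G:\Backup\SalesDB_3.bak',
    DISK = N'G:\Backup\SalesDB_4.bak'
WITH COMPRESSION, CHECKSUM, STATS = 5;

-- Page-level recovery of a single damaged page (file 1, page 12345).
-- All stripes of the backup set must be listed; follow this with log
-- backup/restore steps and a final RESTORE WITH RECOVERY.
RESTORE DATABASE SalesDB
PAGE = '1:12345'
FROM DISK = N'G:\Backup\SalesDB_1.bak',
     DISK = N'G:\Backup\SalesDB_2.bak',
     DISK = N'G:\Backup\SalesDB_3.bak',
     DISK = N'G:\Backup\SalesDB_4.bak'
WITH NORECOVERY;
```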

Also, an FYI: if you are using a third-party backup tool and something goes terribly sideways, don’t bother calling Microsoft CSS; they will direct you to the vendor of that software, since they can’t support solutions they didn’t write.

Most of the bad tools I’m referring to operate at the storage layer, taking VSS snapshots of the database after quickly freezing I/O. In the best cases this is inconsequential; Microsoft lets you do it in Azure (in fact, it’s a way to get instant file initialization on a transaction log file, which I’ll write about next week). However, in some cases these tools can have faults, or take too long to complete a snapshot, and that can cause an availability group to fail over or, in the worst case, corrupt the underlying database while taking a backup, which is pretty ironic.
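
If you want to know whether the tool in question is doing this to your databases, msdb keeps track. A quick sketch of the backup history query I’d start with:

```sql
-- Minimal sketch: check whether recent "backups" are actually snapshot
-- backups taken through a virtual device (typical of infrastructure tools).
SELECT TOP (50)
       bs.database_name,
       bs.backup_start_date,
       bs.backup_finish_date,
       bs.type,            -- D = full, I = differential, L = log
       bs.is_snapshot,     -- 1 = snapshot-based backup
       bs.is_copy_only,
       bmf.device_type     -- 7 = virtual device
FROM msdb.dbo.backupset AS bs
JOIN msdb.dbo.backupmediafamily AS bmf
     ON bs.media_set_id = bmf.media_set_id
ORDER BY bs.backup_finish_date DESC;
```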

While snapshot backups can be a good thing for very large databases, in most cases, with a good I/O subsystem and some backup tuning (using multiple files, increasing the transfer size), you can back up very large databases in a normal window. I manage a system that backs up 20 TB every day with Ola Hallengren’s scripts and no storage magic. Not all of these storage-based solutions are bad, but as you move to larger vendors who are further and further removed from what SQL Server or Oracle actually are, you will likely run into problems. So ask a lot of questions, and ask for plenty of testing time.
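
For reference, here is roughly what that tuning looks like with Ola’s DatabaseBackup procedure. The directory, file count, and transfer size below are illustrative values, and I’m assuming the scripts are installed in the database you run this from; check the parameter documentation for your version.

```sql
-- Minimal sketch of a tuned full backup using Ola Hallengren's DatabaseBackup.
EXECUTE dbo.DatabaseBackup
    @Databases       = 'USER_DATABASES',
    @Directory       = N'G:\Backup',        -- assumed backup target
    @BackupType      = 'FULL',
    @Compress        = 'Y',
    @CheckSum        = 'Y',
    @NumberOfFiles   = 8,                   -- stripe across multiple files
    @MaxTransferSize = 4194304,             -- 4 MB transfer size
    @BufferCount     = 50,
    @CleanupTime     = 72;                  -- hours to keep old backups
```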

So if you don’t like the answers you get, or the results of your testing, what do you do? The place to make your argument is the business team for the applications you support. Don’t do this without merit to your argument; you don’t want to unnecessarily burn a bridge with the infrastructure folks. But at the end of the day, your backups, and more importantly your recovery, ARE YOUR JOB AS A DBA, and you need a way to get the best outcome for your business. So make the argument to your business unit that “Insert 3rd Party Snapshot Magic Here” isn’t a good data protection solution, and have them raise it with infrastructure management.

SQL Server on Linux Licensing

Now that SQL Server 2017 has gone GA and SQL Server on Linux is a reality, you may wonder how it affects your licensing bill. Well, there’s good news: SQL Server on Linux has exactly the same licensing model as on Windows. And Docker, if you are using it for non-development purposes (pro tip: don’t use Docker for production, yet), is licensed just like SQL Server in a virtual environment; you can either license all of the cores on the host, or just the cores that your container is using.

But What About the OS?

So let’s compare server pricing:

  • Windows Server 2016 Standard Edition: $882
  • Red Hat Enterprise Linux with Standard Support: $799
  • SUSE Linux Enterprise Server with Standard Support: $799
  • Ubuntu: free (as in beer)

As you can see, the license costs are roughly the same whether you are on Windows or Linux. I’m not going to get into the religious war over which distro is best or whether you need paid support; I’m simply putting the numbers out there.

HA and DR

There’s still more to come on the HA and DR story, so stay tuned. But assume the standard arrangement applies: with Software Assurance, an idle secondary doesn’t require its own license.

Monitoring Availability Groups—New Tools from SolarWinds

As I mentioned in my post a couple of weeks ago, monitoring the plan cache on a readable secondary replica can be a challenge. My customer was seeing dramatically different performance depending on whether a node was primary or secondary. As amazing as the Query Store in SQL Server 2016 is, it does not capture statistics from a readable secondary. So that leaves you writing XQuery to mine the plan cache DMVs for the query information you are trying to identify.
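
For reference, the kind of query I mean looks something like this sketch, run while connected to the readable secondary; the table name in the XQuery filter is a placeholder for whatever object you are hunting for.

```sql
-- Minimal sketch: mine the plan cache on the readable secondary for plans
-- that touch a particular table, ordered by CPU.
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (25)
       qs.execution_count,
       qs.total_worker_time / qs.execution_count   AS avg_cpu_us,
       qs.total_logical_reads / qs.execution_count AS avg_reads,
       st.text                                     AS query_text,
       qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle)    AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE qp.query_plan.exist('//Object[@Table="[YourBigTable]"]') = 1  -- XQuery filter
ORDER BY qs.total_worker_time DESC;
```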

My friends at SolarWinds (Lawyers: see disclaimer at bottom of post) introduced version 11.0 of Database Performance Analyzer (DPA, a product you may remember as Ignite), which has full support for Availability Group monitoring. As you can see in the screenshot below, DPA gives a nice overview of the status of your AG, and also lets you dig into the performance on each node.

[Screenshot: DPA Availability Group status overview]

There are a host of other features in their new releases, including some new hybrid features in their flagship product, Orion. A couple of things jumped out at me: there is now support for Amazon RDS and Azure SQL Database in DPA, and there is some really cool correlation data that will let you compare performance across your infrastructure. So when you, the DBA, are arguing with the SAN, network, and VM teams about where a performance problem lives, this tool can quickly isolate the root cause. With less fighting. These are great products; give them a look.

Disclaimer: I was not paid for this post, but I do paid work for SolarWinds on a regular basis.

An “Ask” for Microsoft—A Global Price List

And yes, I just used ask as a noun (I feel dirty); I wouldn’t do that in any other context but this one. In reviewing my end-of-year blog metrics, my number one post from last year was one that listed the list price of SQL Server. I wrote that post because a) I wanted clicks and b) I knew what a pain it was to find the pricing in Microsoft documents. However, the bigger issue is that to really figure out what SQL Server costs, you need to go to another site to get Windows pricing, and probably another site to find out what adding System Center to your server might cost.

This post came about because Denny and I were talking the other night, after someone posted to the Data Platform MVP list asking how much the standalone R Server product costs. We found a table on some Microsoft site:

[Table: Microsoft’s R Server listing, with the price given only as “Commercial Software”]

I’m not sure what math is required to translate “Commercial Software” into a numeric value, but it is definitely a type conversion and those perform terribly. Eventually I found this on an Azure page:

This image is charged exactly like SQL Server 2016 Enterprise image, but it contains no Database elements and has the core ScaleR and DeployR functionality optimized for Windows environments. For production workloads we recommend that you use a virtual machine size of DS4 or higher.

This leads me to believe that R Server has the same pricing as SQL Server, but with the documents I have I am not certain of that fact.

What Do I Want?

What I want is pricing.microsoft.com, a one-stop shop where I can find pricing for all things Microsoft, whether they are Azure, on-premises, or software as a service. At worst, it should be one click from the product name to its pricing page. Ideally, I’d like it all in a single table, but let’s face it, software pricing can be complex, and each product probably needs its own page with pricing details.

The other thing that would be really cool, and this is more of an Azure thing, is to have pricing data built into the API for deploying solutions. That way I could build pricing-based intelligence into my automation code, to roll out cost-optimized solutions for Azure.

Anyone else have feature suggestions?

Updated: Jason Hall has a great comment below that I totally forgot about. Oracle has a very good price list (it definitely wins the number-of-commas award) that is very easy to access. So, dear readers in Redmond: Oracle does it, and you should too!

Updated: Some of this is available in Azure, though it’s not perfect: https://msdn.microsoft.com/en-us/library/azure/mt219004?f=255&MSPPError=-2147217396. Amazon just announced enhancements to their version of this service: https://awsinsider.net/articles/2017/01/09/pricing-notifications.aspx

But What About Postgres?

Since I wrote my post yesterday about Oracle and SQL Server, I’ve gotten a lot of positive feedback (except from one grouchy Oracle DBA). That said, I should probably steer clear of Redwood Shores anytime soon. However, there was one interesting comment from Brent Ozar (b|t):

[Screenshot: Brent Ozar’s comment asking whether Postgres would work instead]

While Postgres is a very robust database that is great for custom-developed applications, this customer has built a pretty big solution on top of SQL Server, so that’s not really an option.


However, let’s look at the features they are using in SQL Server and compare them to Postgres. Since this is a real customer case, it’s easy to compare.

1. Columnstore indexes—Microsoft has done an excellent job on this feature, and in SQL Server 2016, new features like batch mode pushdown drive really solid performance on large analytic queries (there’s a sketch of the feature after the benchmark links below). Postgres has a columnstore project, but it is not fully developed. There’s also this add-on https://www.citusdata.com/blog/2014/04/03/columnar-store-for-analytics/ which does not offer batch execution mode performance enhancements and, frankly, offers mediocre performance.

You can compare this benchmark:

https://www.monetdb.org/content/citusdb-postgresql-column-store-vs-monetdb-tpc-h-shootout

to the SQL Server one:

SQL Server 2016 posts world record TPC-H 10 TB benchmark
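
As a concrete illustration of the feature being compared (the fact table here is entirely made up), a clustered columnstore index is a one-liner, and big aggregations against it run in batch mode:

```sql
-- Minimal sketch: a hypothetical fact table with a clustered columnstore index.
CREATE TABLE dbo.FactSales
(
    SaleDate     date          NOT NULL,
    StoreID      int           NOT NULL,
    ProductID    int           NOT NULL,
    Quantity     int           NOT NULL,
    SalesAmount  decimal(19,4) NOT NULL
);

CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactSales
    ON dbo.FactSales;

-- Large aggregations over this table are processed in batch mode.
SELECT StoreID, SUM(SalesAmount) AS TotalSales
FROM dbo.FactSales
GROUP BY StoreID;
```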

2. Always On Availability Groups—In this system design, we are using readable secondaries as a way to deliver more data to customers (a sketch of the configuration follows below). It doesn’t work for every system, but in this case it works really well. Postgres has a readable-secondary option, but it is far less mature than the SQL Server feature; for example, you can’t create a temp table on a Postgres readable secondary.
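
As mentioned above, here is roughly what the SQL Server side of that design looks like; the availability group, replica, and routing URL names are all made up.

```sql
-- Minimal sketch: allow read-only connections on the secondary replica and
-- route ReadOnly-intent sessions to it.
ALTER AVAILABILITY GROUP [AG_Sales]
MODIFY REPLICA ON N'SQLNODE2'
WITH (SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));

ALTER AVAILABILITY GROUP [AG_Sales]
MODIFY REPLICA ON N'SQLNODE2'
WITH (SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N'TCP://sqlnode2.contoso.com:1433'));

ALTER AVAILABILITY GROUP [AG_Sales]
MODIFY REPLICA ON N'SQLNODE1'
WITH (PRIMARY_ROLE (READ_ONLY_ROUTING_LIST = (N'SQLNODE2')));
```

Clients that add ApplicationIntent=ReadOnly to their connection string then get routed to the readable secondary automatically.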

3. Analysis Services Tabular—There is no comparison here. Postgres has some OLAP functions that are comparable to windowing functions in T-SQL, but it has nothing like an in-memory calculation engine.

4. R Services—You can connect R to Postgres. However, SQL Server’s R Services leverages the SQL Server engine to process data, while with Postgres you are back to R’s traditional approach of needing the entire dataset in memory. Once again, this would require a third-party plug-in to work in Postgres.
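
Here’s a hedged sketch of what that looks like on the SQL Server side, reusing the hypothetical fact table from the columnstore example; the R script runs next to the engine and only the query’s result set is handed to it.

```sql
-- Requires: EXEC sp_configure 'external scripts enabled', 1; RECONFIGURE;
-- Minimal sketch: compute an average in R against a SQL Server result set.
EXECUTE sp_execute_external_script
    @language     = N'R',
    @script       = N'OutputDataSet <- data.frame(avg_sales = mean(InputDataSet$SalesAmount));',
    @input_data_1 = N'SELECT SalesAmount FROM dbo.FactSales'
WITH RESULT SETS ((avg_sales float));
```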

5. While Postgres has partitioning, it is not as seamless as it is in SQL Server, and it requires some level of application changes to support.

https://www.postgresql.org/docs/9.1/static/ddl-partitioning.html

While I feel that SQL Server’s implementation of partitioning could be better, I don’t have to change any code to implement it.
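
To illustrate that point, here is a minimal sketch with made-up boundary values and a single filegroup; the application keeps querying the same table name it always has.

```sql
-- Minimal sketch: monthly partitioning declared once, no application changes.
CREATE PARTITION FUNCTION pf_MonthlySales (date)
AS RANGE RIGHT FOR VALUES ('2016-01-01', '2016-02-01', '2016-03-01');

CREATE PARTITION SCHEME ps_MonthlySales
AS PARTITION pf_MonthlySales ALL TO ([PRIMARY]);

CREATE TABLE dbo.FactSalesPartitioned
(
    SaleDate     date          NOT NULL,
    StoreID      int           NOT NULL,
    SalesAmount  decimal(19,4) NOT NULL
) ON ps_MonthlySales (SaleDate);
```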

6. Postgres has nothing like the Query Store. There are data dictionary views that offer some level of insight, but the Query Store is a fantastic addition to SQL Server that helps developers and DBAs alike.
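
For anyone who hasn’t used it, turning the Query Store on and reading from it is straightforward; the database name here is hypothetical.

```sql
-- Minimal sketch: enable the Query Store and pull the slowest captured queries.
ALTER DATABASE SalesDB
SET QUERY_STORE = ON (OPERATION_MODE = READ_WRITE);

SELECT TOP (10)
       qt.query_sql_text,
       rs.avg_duration,        -- microseconds
       rs.count_executions
FROM sys.query_store_query_text   AS qt
JOIN sys.query_store_query        AS q  ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan         AS p  ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats AS rs ON p.plan_id = rs.plan_id
ORDER BY rs.avg_duration DESC;
```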

7. Postgres has no native spatial support. There is a plug-in that provides it, but once again we are adding to the footprint of third-party add-ins we have to manage.
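
And the native spatial support in SQL Server is just a data type; a tiny sketch with made-up coordinates:

```sql
-- Minimal sketch: distance in meters between two points, no add-ins required.
DECLARE @philly  geography = geography::Point(39.9526, -75.1652, 4326);
DECLARE @seattle geography = geography::Point(47.6062, -122.3321, 4326);

SELECT @philly.STDistance(@seattle) AS distance_in_meters;
```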

Postgres is a really good database engine, with a rich ecosystem of developers writing code for it. SQL Server, on the other hand, is a mature product that has had a large push to support analytic performance and scale.

Additionally, this customer is leveraging the Azure ecosystem as part of their process, and that is only possible via SQL Server’s tight integration with the platform.

The SQL Virtualization Tax?

I’ve been working in virtual server environments for a long time, and I’m a big proponent of virtualization. It’s a great way to reduce hardware costs and power consumption, and frankly, for smaller shops it’s also an easy foray into high availability. The main reasons for that high availability are technologies like VMware’s vMotion and Microsoft’s Hyper-V Live Migration: if a physical server in a virtualization farm fails, the underlying virtual servers get moved to other hardware, without any downtime. This is awesome, and one of the best features of a virtual environment. What I don’t like is when software vendors feel they are getting the raw end of the deal with virtualization, so they develop asinine licensing policies around it.

Oracle is my favorite whipping boy in this discussion. Oracle is most typically licensed by the CPU core, and in my opinion, a CPU core should be the number of cores that the operating system can address. Oracle agrees with me, but only in the case of hard partitions (mostly old, expensive UNIX hardware that they happen to sell). Basically, if I have a cluster of 64 physical nodes and one virtual machine with one virtual CPU, Oracle expects me to license EVERY CORE in that cluster. One way around this is to physically lock down your virtual machine to a given hardware pool and then license all of those cores (a smaller number, of course). The other option is to dedicate a bunch of hardware to Oracle and virtualize it; while this works, it definitely takes away a lot of the flexibility of virtualization and is a non-starter for many larger IT organizations.

Microsoft, on the other hand, has generally been pretty fair in its virtualization licensing policies. An Enterprise license for Windows Server bought you four VM licenses, and SQL Server (before 2008 R2) had some very favorable VM licensing. However, starting with SQL Server 2012, things got a bit murkier: for Enterprise Edition, you have to buy a minimum of four core licenses, even if you are only running one or two virtual CPUs. On the plus side, you don’t have to license every core in the VM farm. One thing that caught my eye with the SQL Server 2012 licensing is that if you license all of the physical cores in a VM farm, you can run an unlimited number of VMs running SQL Server, but only if you purchase Software Assurance. Software Assurance costs 29% of the license cost, and it is a recurring annual cost. In the past, Software Assurance was generally only related to the right to upgrade your software (e.g., if you had SA, you could upgrade from SQL Server 2008 R2 to SQL Server 2012). This rule bothered me, but it didn’t really affect me, so I ignored it.

I was talking to Tim Radney (b|t) yesterday, and he mentioned that in order to do vMotion/Live Migration (key features of virtualization), Software Assurance is required. I hadn’t heard this before, but sure enough, it is mentioned in this document from Microsoft.

So, in a nutshell, if you want to run SQL Server in a virtual environment and take advantage of the features you paid for, you have to pay Microsoft an additional 29% per SQL Server license, every year. (On a hypothetical $7,000-per-core license, that works out to roughly another $2,000 per core annually.) I think this stinks; please share your thoughts in the comments.

Vendors, Again—8 Things To Do When Delivering a Technical Sales Presentation

In the last two days, I’ve sat through some of the most horrific sales presentations I’ve ever seen—this was worse than the timeshare in Florida. If you happen to be a vendor and are reading this (especially if you are a database vendor—don’t worry, it wasn’t you), I hope this helps you craft better sales messages. In one of these presentations, the vendor has a really compelling product that I still have interest in, but I was really put off by bad sales form.

I’ll be honest, I’ve never been in sales. I’ve thought about it a couple of times, and I would still consider it if the right opportunity came along, but I do present, a lot. Most of these things apply to technical presentations as well as sales presentations. So here goes.

The top 8 things to do when delivering a sales presentation:

  1. Arrive Early—ask the meeting host to book your room a half hour early and let you in. This way you can get your connectivity going and everything set up before the meeting actually starts, instead of wasting the attendees’ valuable time and, more importantly, cutting into your time to deliver your sales message. Also, starting on time allows you to respect your attendees’ schedules on the back end of the presentation.
  2. Bring Your Own Connectivity—if you need to connect to the internet (and if you have remote attendees, you do), bring your own connectivity. Mobile hotspots are widely available, and if you are in sales you are out of the office most of the time anyway, so consider it a good investment.
  3. Understand Your Presentation Technology—please understand how to start a WebEx and share your presentation. If you have a Mac, have whatever adapters you need to connect to video. If you want to use PowerPoint presenter mode (great feature, by the way), make sure the audience sees your slides and not the presenter view. Not being able to do this is completely inexcusable.
  4. Understand Who Your Audience Is—if you are presenting to very senior infrastructure architects at a large firm, you probably don’t need to explain why solid state drives are faster than spinning disks. Craft your message to your intended audience, especially if it has the potential to be a big account. Also, if you know you are going to have remote attendees, don’t plan on whiteboarding anything unless you have an electronic means to do so; otherwise you are alienating half of your audience.
  5. Don’t Tell Me Who Your Customers Are—I really don’t care that 10 Wall Street banks use your software/hardware/widget. I think vendors all get that same slide from somewhere. Here’s a dirty little secret: large companies have so many divisions/partners/filing cabinets that we probably already own 90% of all available software products. It could be in one branch office that some manager paid for, but yeah, technically we own it.
  6. I Don’t Care Who You Worked For—While I know it may have been a big decision to leave MegaCoolTechCorp for SmallCrappyStorageVendor, Inc., I don’t really care that you worked for MegaCoolTechCorp. If you mention it once, I can deal with it, but if you keep dropping the name it starts to get annoying and distracting.
  7. Get on Message Quickly—don’t waste a bunch of time on marketing, especially when you go back to point #4, knowing your audience. If you are presenting to a bunch of engineers, they want to know about the guts of your product, not what your company’s earnings were. Like I mentioned above, one of the vendors I’ve seen recently has a really cool product, which I’m still interested in, but they didn’t start talking about their product differentiation until 48 minutes into a 60-minute presentation.
  8. Complex Technical Concepts Need Pictures—this is a big thing with me. I do a lot of high availability and disaster recovery presentations—I take real pride in crafting nice PowerPoint graphics that take a complex concept like clustering and simplify it so I can show how it works to anyone. Today’s vendor was explaining their technology, and I was pretty familiar with the technology stack, yet I got really lost because there were no diagrams to follow. Good pictures make complex technical concepts easy to understand.

I hope some vendors read this and learn something. A lot of vendors have pretty compelling products but fail to deliver the sales message, and that is costing them money. I don’t mind listening to a sales presentation, even from a vendor I may not buy something from, but I do really hate sitting through a lousy presentation that distracts me from the product.
