Getting Started with the Cloud with No Budget and an Unsupportive Employer

This thread on Twitter last night really piqued my interest:

 

It really made me think of a conversation I had with a colleague at my last “regular” job. I’m not counting my time at Comcast, because we were effectively a technology firm. I mean a normal, regular company whose core business does not relate to computers or software. My colleague Scott had just attended TechEd 2011, or maybe 2012–the years run together at this point. His comment was “with everything going to the cloud, it seems like all the jobs will be with Microsoft, or helping other customers implement cloud.” In 2011-12, the cloud was still really awful (remember the original SQL Azure? I do, and it was bad), but it was clear what the future would be.

The Future Is Here. What Do We Do Now?

So if you are working in a “traditional” firm, and you feel as though your skills are slipping away as the rest of the technology world moves forward, what should you do? The first thing I’m going to say isn’t an option for everyone, because of personal situations and other challenges, but given the current state of the economy and IT employment, I think it needs to be said. If you are in a job where you are only supporting legacy tech (and by legacy I don’t mean on-premises; some of the most cutting edge SQL Server orgs in the world are 100% on-premises), and you are regularly supporting software whose version matches the regular expression ^200\d$ (think SQL Server 2000, 2005, or 2008), my best bit of advice to you is to start the process of finding another job.

I know changing firms isn’t for everyone, and if you want to become a cloud engineer, you need to build your skills in that space. The crux of the Twitter thread is how do you learn these things when you are in an organization that thinks that cloud computing has something to do with rain? The first thing I would recommend, if you are willing to spend a little money, is to use skillmeup.com (note: both DCAC and my company have business relationships with Opsgility, the parent company). I have taught classes using their labs–you get a real Azure subscription, with a production scenario, and you also get online training associated with the lab.

Other resources like Pluralsight or LinkedIn Learning (note: DCAC has a business relationship with LinkedIn Learning) offer online training; however, I really feel like getting hands-on with the tech is the best way to learn it.

My Budget Isn’t Even That High

Both Amazon and Microsoft offer free trials–I know Azure a lot better, so I’m going to focus on that scenario. (BTW, this ties to another bit of advice I have: learn one cloud first. The concepts between them are pretty similar, and if you learn one cloud really well, transitioning to the other will be much easier than trying to consume all of it at once.) The Microsoft offer gives you $200 to use for 30 days, and if you have an MSDN subscription, you also get somewhere between $50 and $150 per month of credit.

While those numbers are small, especially when talking about services, they can still easily get you started with the basics of the cloud. Virtual machines (which also cost a lot) are for all intents and purposes very similar to your virtual machines on-prem. But if you want to learn how to extend an on-premises Active Directory to the cloud, you can do that by building a Windows Server VM on your laptop and then connecting it to Azure Active Directory. That has minimal cost (AAD is a free service). Learning things like networking and storage also has minimal cost.

One of the most important cloud skills you can have, automation, just involves using PowerShell (or the CLI, depending on what you like). If you haven’t learned a scripting language, you should invest more time in that. You can do this on any trial account at minimal cost, especially once you learn how to clean up the resources as soon as your deployment script has finished with them.
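
Here’s a minimal sketch of that deploy-then-clean-up pattern using the AzureRM cmdlets; the resource group name and location are placeholders, and the middle step is whatever you happen to be practicing:

# Log in and create a throwaway resource group for today's lab
Connect-AzureRmAccount
New-AzureRmResourceGroup -Name "rg-cloud-lab" -Location "EastUS"

# ...deploy whatever you are practicing with into rg-cloud-lab...

# When you are done, delete the whole resource group so nothing keeps billing
Remove-AzureRmResourceGroup -Name "rg-cloud-lab" -Force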

As a SQL Server pro, if you want to start learning Azure SQL, you should get started with Azure SQL Database. It’s cheap, and you can do everything you can do in the $15,000/month database with the $5/month database.
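
To give you an idea of how little friction there is, here’s an illustrative sketch (the server name, resource group, and credentials are all made up) of creating that $5/month Basic-tier database with the AzureRM cmdlets:

# Create a logical server, then a Basic-tier database (roughly $5/month)
$cred = Get-Credential   # admin login for the new logical server
New-AzureRmSqlServer -ResourceGroupName "rg-cloud-lab" -ServerName "mylabsqlserver01" -Location "EastUS" -SqlAdministratorCredentials $cred
New-AzureRmSqlDatabase -ResourceGroupName "rg-cloud-lab" -ServerName "mylabsqlserver01" -DatabaseName "LabDB" -RequestedServiceObjectiveName "Basic"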

tl;dr

This was a long post. Here’s what you should start learning for cloud:

  • scripting (PowerShell, CLI, or REST; it doesn’t matter, learn one of them)
  • networking
  • storage
  • security
  • platform as a service offerings in your field and how they work with networking, storage and security

You can do all of these things with a minimal financial investment, or perhaps even for free.

Summary

You are in charge of your career, not your current employer. If you want to advance your skills, you are going to have to work for it: maybe spend some money, and definitely make a big investment of time. Also, consider going to some training–I just did a precon at SQL Saturday Chicago, and while the attendees aren’t going to be cloud experts after a day, they have a great basis on which to move forward. Books and reading are challenging in a cloud world–it moves quickly and changes fast.

If Your Hardware and OS are Older than Your Interns, Fix It

Yesterday, I wrote my monthly column for Redmond Magazine about dealing with situations where your management doesn’t want to invest in the resources to provide reliability for your hardware and software environment. Whether it’s redundant hardware or even offsite backups, many IT organizations fail at these basic tasks, because the business side of the organization fails to see the benefit, or more likely doesn’t perceive the total risk of having a major outage.


Is this your hardware?

As I was writing the column (and as I mentioned in it), AMC Theatres had a system meltdown the other day, during the early sale of tickets for the premiere of the new Avengers movie. The next day, Arizona Iced Tea (which I didn’t realize was still in business) got ransomwared.

 

While I agree with Andy about testing your restores, I wanted to address a couple of other things. If you are running an old OS like Windows Server 2003, your business is at risk. If for some reason you absolutely have to run a 16-year-old operating system in your environment, you should ensure that it is isolated enough on your network that its exposure is limited.

Additionally, as an IT organization, it’s your job to be yelling up and down at your software vendors who won’t support modern versions of infrastructure components like operating systems and databases. And yes, while I’m a consultant now, I’ve had many real jobs, and I understand that the business chooses the software packages it wants to run. I also understand that when the org gets ransomwared because “SomeShittyApp” needed to run on an unpatched XP box with an SMBv1 share open to the internet, IT are going to be the folks whose jobs are on the line.

One of the other things I brought up in my column is how to handle the PR aspects of a system outage. Let’s face it, if your site and services are down, you need to be able to explain to your customers why, and what your timeline for repair is. When you are investing in your systems and doing all of the right things, it is very easy to be transparent, and “we had redundancy in place, but the failure happened in such a way that we still incurred an outage” sounds a lot better than “yeah, we had to run an OS that’s older than our interns because we didn’t have the budget for an upgrade.”

Finally, if you are on really old hardware (hint: if your servers were originally beige and are now a much yellower shade of beige, you’re on really old hardware), it’s probably cheaper and more efficient to do a cloud migration. You can move to Azure IaaS (or AWS), or if you’re a VMware shop, their cloud offering on AWS provides a very simple migration path, especially if your cloud experience is limited. Just get off that really old hardware and software and onto something that gets patched regularly.

Really, Really Fast Cloud Storage

For a long time, CPUs and memory got faster, but we were stuck with spinning disks that topped out at 15,000 RPM. Then we got SSDs, which were orders of magnitude faster, but we were still ultimately limited by controller throughput. NVMe changes all of that. Do you know what NVMe storage is? If you don’t, you should read my column this month at Redmond Mag.

Did you know Azure now has VMs available with direct-attached NVMe storage? I was looking at a client’s servers this week, and I was going to write a post about how storage latency in the cloud (specifically Azure with Premium Storage) is about 90-95% of what most on-premises environments deliver, based on what I was seeing on my customer’s system. Then I met the Lv2 series of VMs.

So just to give you a point of reference, this customer system is running on one of the largest VM types in Azure, the GS5. We are running with 16 cores (of 32) for licensing reasons, and we have 18 TB of Premium Storage presented to each VM in a single Storage Spaces pool, which gives us plenty of IOPS (~80,000). Remember though—Premium Storage is SSD, but it’s networked, so our data needs to travel over a cable to make it back to the machine. With that in mind, I’ve been seeing single-digit millisecond latency, as viewed from SQL Server’s sys.dm_io_virtual_file_stats DMV. I know there are other ways of better measuring IO performance, using Performance Monitor or looking at the storage tier itself, but when my SQL numbers are that good, I generally don’t care.
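
If you want to pull the same numbers on your own system, a quick query against that DMV (this is a generic illustration, not the exact query I ran) looks something like this:

-- Average read and write latency per database file, across all databases
SELECT DB_NAME(vfs.database_id) AS database_name,
       vfs.file_id,
       1.0 * vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_latency_ms,
       1.0 * vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
ORDER BY avg_read_latency_ms DESC;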

I wrote my column with a focus on some of the newer persistent memory technologies—but some folks were kind enough to tell me that NVMe drives are already available in Azure. Microsoft is kind enough to allow MVPs to have a nice allowance in Azure—I spun up an L64s_v2 VM. Books Online mentioned that the NVMe storage was available, but when I logged into the VM, I didn’t see a volume mounted. I looked in Storage Spaces and found this magic.

[Screenshot: the raw NVMe disks showing up in Storage Spaces]

You’ll need to use Storage Spaces within Windows to create a storage pool, and then create a virtual disk on top of it. I went ahead and used this post from Glenn Berry on how to use the DiskSpd tool from Microsoft. If you have ever used SQLIO to validate storage in the past, DiskSpd is the more modern version of that tool. Anyway, on to the results.
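
For reference, a DiskSpd invocation that produces per-thread output like the table below looks roughly like this (the file path, test file size, duration, and read/write mix here are illustrative values, not the exact parameters I used):

# 8 threads, 8 outstanding IOs each, 8K random IO with a 30% write mix,
# software and hardware caching disabled, latency statistics captured
diskspd.exe -c50G -d120 -r -w30 -t8 -o8 -b8K -Sh -L F:\testfile.dat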

thread    bytes          I/Os       MiB/s      IOPs         AvgLat(ms)  LatStdDev
0         9654157312     1178486    306.90     39283.16     0.101       0.088
1         6744514560     823305     214.40     27443.70     0.145       0.088
2         9005244416     1099273    286.27     36642.71     0.108       0.134
3         9004244992     1099151    286.24     36638.64     0.108       0.147
4         9141108736     1115858    290.59     37195.54     0.107       0.088
5         9164423168     1118704    291.33     37290.41     0.107       0.087
6         9122758656     1113618    290.01     37120.88     0.107       0.086
7         9144197120     1116235    290.69     37208.11     0.107       0.086
total     70980648960    8664630    2256.43    288823.14    0.110       0.10

 

Two key things to look at in the above table—288,000 IOPS—holy crap, Batman. That’s a lot, and I think I could get more by running more threads as well. Next is the latency—that’s 0.11 ms latency—that decimal point isn’t a typo. Really good traditional SANs have 1-3 ms latencies; the crappy array your boss scored a deal on from Ron’s House of Unsupported Hardware is probably pushing 20-30 ms if you are lucky. This storage is roughly 300x less latent than that off-brand array your boss got from Ron.


Don’t buy your storage from Ron—go to Azure, AWS, or one of the many vendors offering this amazing technology.

Medtronic—Your Information Security is Garbage

My wife is a Type 1 diabetic, and uses a Medtronic insulin pump to maintain her blood sugar at healthy levels. While the device performs its job adequately, the surrounding software has always been a total disaster. Whether it’s requiring outdated operating systems or browsers, or the use of Java, Kelly has always had great difficulties getting the data off of her pump and onto a computer.


There is a really great open source project that Scott Hanselman (b|t) uses, called NightScout, but Kelly likes the comfort of using Medtronic as it is widely supported by her doctors.

If I were building something like this, I’d want a Kindle-like experience, with a single-purpose cellular chip that always uploads data to a secure cloud service. Recently, Medtronic has improved this upload process, and it sucks far less than it did in the past.

However, last night something happened with a medical care professional that made me flip out. I was talking with Kelly, and she mentioned that the nurse couldn’t completely see her data through the professional portal (which uses delegated permissions), and that the nurse then asked Kelly for her username and password. As you can imagine, I was livid, and I assumed the nurse was using the portal incorrectly. I told Kelly that this was likely a HIPAA violation (or on the edge of one) and that she should follow up with Medtronic about it. This was the email she received:

Hi Kelly,

I’m sorry that you are uncomfortable with the system Medtronic uses for your 670G follow-up. Medtronic is hippa compliant and certainly does not share this information. We always ASK for your username and password and access your reports with your permission. We use Professional CareLink but access through your Personal (this is the same way your doctor is able to view your reports).

You can certainly change your password and my access is eliminated.

My first comment is that the Medtronic employee can’t spell HIPAA correctly. But it’s really Medtronic’s business process to ask for your username and password? Does anyone at that company realize how terrible a security policy this is (I don’t care if this is health data or data about lawn mowing, it’s awful)? I’m completely taken aback that a major health care company has such shit security, and still claims compliance. All that said, after looking at their software, I’m not surprised.

Medtronic, do your damn job. And hire better dev and architecture staff.

Azure SQL Analytics—Automating Deployment

One of the cooler recent additions to Azure SQL Database is Azure SQL Analytics, a feature that connects your Azure SQL Database to Azure Log Analytics. You can use it in conjunction with Azure SQL Database, Azure SQL Managed Instance, or Azure SQL Database elastic pools. Once this is configured, you can see all sorts of information from your database in Log Analytics.


The first step to enabling this is to create a Log Analytics workspace and add the “Azure SQL Database” solution to your workspace. The steps to create that are in the above link. After that is created, you can enable this in your database by launching the Diagnostic Settings blade in the Azure portal from the monitoring section of your Azure SQL Database. This process is pretty straightforward—it’s just a few clicks and you have data in your log analytics workspace.

However, I’m working with a customer who is building their own service based on Azure SQL Database, and I have fully automated their database deployment process. I wanted to take this a few steps further and add the SQL Analytics step as part of our deployment. This was harder than I expected it to be—the code samples in the Books Online post above weren’t working in my environment. And furthermore, even after I had the code running in the PowerShell ISE (I was having issues using VS Code on my Mac), it was still failing in my Azure Automation runbook.

The command shown in Books Online is:

Set-AzureRmDiagnosticSetting -ResourceId [your SQL Database’s resource id] -WorkspaceId [resource id of the log analytics workspace] -Enabled $true

When I eventually got that working, it wasn’t enabling what I was expecting it to. I found out that the optional -Categories parameter was required in order to enable all of the SQL logging.

My final code looks like this:

#Add Analytics to Database
try {
    # Get the resource ID of the database we just deployed
    $sqlDbInfo = Get-AzureRmSqlDatabase -DatabaseName $DatabaseName -ServerName $ServerName -ResourceGroupName $RGName
    $sqlResourceId = $sqlDbInfo.ResourceId.ToString()

    # Get the resource ID of the Log Analytics workspace (this assumes a single workspace in the subscription)
    $workspaceId = (Get-AzureRmOperationalInsightsWorkspace).ResourceId.ToString()

    # The metrics and log categories to send to Log Analytics
    $metrics = @("AllMetrics")
    $categories = @("SQLInsights","AutomaticTuning","QueryStoreWaitstatistics","QueryStoreRunTimeStatistics","Errors","DatabaseWaitStatistics","Timeouts","Blocks","Deadlocks")

    Set-AzureRmDiagnosticSetting -ResourceId $sqlResourceId -WorkspaceId $workspaceId -MetricCategory $metrics -Categories $categories -Enabled $true -RetentionInDays 20
}
catch {
    $ErrorMessage = $_.Exception.Message
    Write-Error "Error creating analytics. Error was $ErrorMessage"
}

You’ll note I’m having to use the ToString() function on both of my resource IDs, and more importantly, I’m having to specify the metrics and categories required. I tried using “AllCategories”, however that failed, so I took a guess, explicitly named the categories shown in the portal, and it worked.

The other fun fact: if you want to automate this and not run it interactively, you will need to import two modules. For whatever reason, Get-AzureRmOperationalInsightsWorkspace is in the AzureRM.OperationalInsights module, while Set-AzureRmDiagnosticSetting is in AzureRM.Insights. So you will need to add both of those modules, as well as the AzureRM.Sql module, to your runbook.

My SQL Saturday Chicago Precon–Managing and Architecting Azure Data Platform

After the MVP Summit in March, I’m headed to Chicago to speak at SQL Saturday Chicago, and on Friday March 22nd, I’ll be delivering an all-day training session on the Azure Data Platform. The term data platform is somewhat of a Microsoft marketing term, but we will talk about a wide variety of topics that will help you get up to speed on Azure.


All of the morning and some of the afternoon will be spent talking about the core infrastructure of Azure. You’ll learn about topics like:

• Networking
• Storage
• Virtual Machines

While these are topics normally outside of the scope of the DBA, in the cloud you will have to at least understand them. Want to build an Availability Group in Azure? You’ll need to build an internal load balancer and map probe ports into your VMs. Remember how you normally complain to the SAN team about your lack of IOPS? In the cloud, you can fix that yourself. You’ll also learn about what’s different about managing SQL Server in the Azure environment.
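
To give you a taste of what that looks like, here is a rough sketch of the probe and load-balancing rule for an Availability Group listener using the AzureRM cmdlets (every name, address, and port here is invented for illustration, and this is only the load balancer piece of the full setup):

# Frontend IP, backend pool, probe, and rule for an AG listener behind an internal load balancer
$vnet        = Get-AzureRmVirtualNetwork -Name "sql-vnet" -ResourceGroupName "sql-rg"
$subnet      = Get-AzureRmVirtualNetworkSubnetConfig -Name "sql-subnet" -VirtualNetwork $vnet
$frontendIp  = New-AzureRmLoadBalancerFrontendIpConfig -Name "AGListenerIP" -Subnet $subnet -PrivateIpAddress "10.0.1.10"
$backendPool = New-AzureRmLoadBalancerBackendAddressPoolConfig -Name "SQLVMs"
$probe       = New-AzureRmLoadBalancerProbeConfig -Name "AGProbe" -Protocol Tcp -Port 59999 -IntervalInSeconds 5 -ProbeCount 2
$rule        = New-AzureRmLoadBalancerRuleConfig -Name "AGRule" -Protocol Tcp -FrontendPort 1433 -BackendPort 1433 -FrontendIpConfiguration $frontendIp -BackendAddressPool $backendPool -Probe $probe -EnableFloatingIP
New-AzureRmLoadBalancer -ResourceGroupName "sql-rg" -Name "ag-ilb" -Location "EastUS" -FrontendIpConfiguration $frontendIp -BackendAddressPool $backendPool -Probe $probe -LoadBalancingRule $rule
# The VM NICs still need to be added to the backend pool, and the cluster needs to answer on the probe port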

In the afternoon, we’ll spend our time talking about platform as a service (PaaS) offerings from Microsoft. While we will spend most of our time on Azure SQL Database and Azure SQL Managed Instance, I’ll also spend some time talking about other offerings like Cosmos DB, and when it is appropriate to use them.

It will be a packed day, so put your learning hat on. You can register at Eventbrite here—there are five discounted tickets remaining.

My LinkedIn Learning Courses

I’ve had the good fortune to start working with LinkedIn Learning (which was formerly known as Lynda.com, but through acquisition became part of LinkedIn and now Microsoft), and I’ve recorded several SQL Server-oriented courses there. You can find my author page here — the courses cover a wide array of topics: security, concurrency, performance, and development.

In addition to the videos, you can follow along with the code samples I use in my demos. The courses are made up of videos that run around five minutes each, so it’s easy to find the time to view a few of them. If you have any questions about any of my training, hit me up here.

 

Using Azure Key Vault with Azure SQL Database

Recently I was writing a PowerShell script to help a customer automate the process of deploying Azure SQL Databases. One of the challenges of automation that I remember since the dawn of time is how to secure credentials in your automation script. Back in the old UNIX days, we used file permissions to protect password files and then read them in, but it was messy and probably not as secure as I would have liked.

One of the benefits of cloud computing is that it gives smaller organizations access to infrastructure and capabilities that used to be available only to large enterprises. A good example of this is Azure SQL Database geo-replication—in the past, if you wanted to have a database in four regions throughout the world, you had to lease space in four data centers, build a global network, and possibly even get people in place in different parts of the planet to make sure those machines stayed running. Now, with a few mouse clicks you can have your database on four continents (for as cheap as $20/month, or more realistically $1,000/month)—this is where we see the real benefits of cloud computing. Another one of these capabilities is Azure Key Vault—in the past, Hardware Security Modules (HSMs) provided a root of trust, among other security features. Now, in Azure, we can use Key Vault for password management, certificate management, and hardware-backed trust.

Key Vault is especially handy when you need to pass a password into a script. Since it’s fully accessible via PowerShell, the CLI, and a REST API, we can easily call it from a script. This example is pretty basic, but it’s all I needed to securely pass a password into my automation job.


The first thing you will need to do is create a key vault, and then create a secret. Once your secret is created, you will be able to reference it in code.
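
A quick sketch of that setup (the vault and secret names here are placeholders):

# Create the vault, then store the SQL admin password in it as a secret
New-AzureRmKeyVault -VaultName "MyAutomationVault" -ResourceGroupName "rg-cloud-lab" -Location "EastUS"
$secretValue = Read-Host -Prompt "SQL admin password" -AsSecureString
Set-AzureKeyVaultSecret -VaultName "MyAutomationVault" -Name "SqlAdminPassword" -SecretValue $secretValue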

You can see my code example on GitHub here. It’s pretty basic—I’m defining a variable for the password, getting it from the Key Vault, and then passing it into the -SqlAdministratorCredentials parameter of New-AzureRmSqlServer.
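
The pattern looks roughly like this (again, the vault, secret, server, and login names are placeholders, and this is a sketch of the approach rather than my exact script):

# Pull the secret out of Key Vault as a SecureString and build a credential from it
$secret = Get-AzureKeyVaultSecret -VaultName "MyAutomationVault" -Name "SqlAdminPassword"
$cred = New-Object System.Management.Automation.PSCredential ("sqladmin", $secret.SecretValue)

# Use the credential when creating the logical SQL server
New-AzureRmSqlServer -ResourceGroupName "rg-cloud-lab" -ServerName "myautomatedsqlserver01" -Location "EastUS" -SqlAdministratorCredentials $cred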

Columnstore Indexes and Key Lookups–The Worst

Key lookups are one of my least favorite SQL Server execution plan operators. This is where, for each record returned by an index seek, SQL Server goes back to the clustered index to look up the rest of the row. Generally, we either live with this condition (for a very small number of rows) or we fix it by adding the missing columns to our nonclustered index, either as key columns or as included columns.
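
As a reminder of what that usual fix looks like (the table and column names here are made up), an included column turns the seek-plus-lookup into a single covering seek:

-- Hypothetical example: cover the query so no key lookup is needed
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
ON dbo.Orders (CustomerID)
INCLUDE (OrderDate, TotalDue);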

However, in the plan below, we have a different scenario. We have a clustered columnstore index with an additional nonclustered index on the table. This capability was added in SQL Server 2016 to allow point lookups on a column without having to scan many row segments of the index. This works pretty well in some conditions, though it is important to know that it can slow down your inserts significantly.

In the last year or so, working with a large customer who makes fairly heavy use of this pattern, I’ve noticed another concern. Sometimes, and I can’t figure out exactly what triggers it, the execution plan generated will do a seek against the nonclustered index and then do a key lookup against the columnstore, as seen below. This is bad for two reasons: first, the key lookup is super expensive, and columnstores are generally very large; second, the key lookup runs in row execution mode rather than batch mode and drops the rest of the execution plan into row mode, slowing the query down even further.

 

[Execution plan screenshot: an index seek on the nonclustered index feeding a key lookup against the clustered columnstore index]

I’ve anonymized the schema, but that key lookup is against my clustered columnstore index. This query never finished. So what’s the resolution? Index hints. I’m not a big fan of using hints, but sometimes you need to kick the query optimizer in the shin. When I changed this query to use an index hint, it completed in around 7 seconds. The ultimate fix is for Microsoft to fix this costing, but that’s hard. You can vote on my User Voice item here:

https://feedback.azure.com/forums/908035-sql-server/suggestions/36015868-key-lookup-against-columnstore-index-causes-slow-q

If you see this pattern pop up at all, you will definitely want to hint your queries.
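
For illustration (the table and index names are invented, since I anonymized the real schema), the hint just forces the optimizer back onto the clustered columnstore index so the whole plan stays in batch mode:

-- Hypothetical example: force the clustered columnstore index to avoid the rowstore seek + key lookup
SELECT SUM(SalesAmount)
FROM dbo.FactSales WITH (INDEX (CCI_FactSales))
WHERE OrderDateKey = 20181114;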

 

Not all PaaS Services are the Same

In public cloud computing, platform as a service (PaaS) is a model where some key computing capability is delivered as a managed service. Since you are reading my blog, you are probably most familiar with Azure SQL Database (which, if you are old like me, you might still call SQL Azure). Azure SQL Database, for the uninitiated, is Microsoft’s database-as-a-service offering. You make a few mouse clicks or run a few lines of code, pay some money to Microsoft, and you have a database for your application. Microsoft handles your software installation, your settings, backups and restores, and numerous other operational tasks, and lets you focus on writing your code.


Azure SQL Database’s original pricing model simply referred to a number called database throughput/transaction units, or DTUs (Microsoft really likes to rename things, if you haven’t noticed), which are a blended measure of CPU, memory, and read/write IOPS. A pricing model like this allows the cloud provider to abstract away the underlying hardware, which is another simplification of the cloud model. However, especially in the early days of the public cloud, this concept was hard for IT orgs (many of which, at the time, were still buying physical hardware) to grasp.

Different PaaS offerings offer different levels of abstraction, though they do have a few things in common. You are almost never going to install software—patching and installation are nearly always handled by the vendor. However, with some different offerings you may be doing things like configuring storage and networks—a good example of this is with the new SQL Server offering—Managed Instance. This excellent post from the SQLCAT team shows the complexity of sizing data files to get maximum performance out of the underlying storage. Doing this requires a really firm understanding of Azure storage and how it works on the platform. You also need to know a little bit about virtual networking, because you will need to get your managed instance “wired” into a network where it can talk to your application.

Another difference with Managed Instance is that you choose a tier (General Purpose or Business Critical, which affects your storage performance and availability), then the number of cores you would like, and the generation of hardware you would like. While Azure SQL Database is also moving to this model, it is noteworthy to see the difference in abstraction. I like to think of this as a curve with pure infrastructure as a service (IaaS) on one side (VMs in the cloud) and software as a service (SaaS), like Office 365, on the other. Managed Instance is definitely closer to the IaaS side of things.

Another thing you want to verify is the backup situation for your chosen service. While most services do backups (and you may not even have the option to do your own backups, as in Azure SQL DB), it is important to know what your data loss SLA is and what your options are if there’s an outage or a disaster.

I was inspired to write this post because, while delivering training last week, I deployed a new Azure Databricks cluster, and I was totally blown away by how simple the deployment process was—you had two performance options (for advanced deployments you could choose more settings), but it was dead simple from a deployment perspective. I don’t think it’s a good thing or a bad thing that different services have different levels of configuration—I think some services need that, but sometimes it’s nice just to have a box that does a service for you.
