How the Cloud Democratizes Solutions

The grey hairs I see every now and again remind me of how long I’ve been working in IT. I’m now in my 22nd year of having a job (two of those years as an intern, but I did have the ‘sys’ password), and I’ve seen a lot of things come and go. When I started working, servers cost at least tens of thousands of dollars, were the size of refrigerators, and had less processing power than the MacBook Pro I’m typing this post on. More importantly, they came with service contracts that cost a rather large amount of cash per year. What did that buy you? Well, I’m not sure, but your hardware reps surely took you out for at least a steak dinner or two. Eventually, we moved to commodity hardware (those boxes from HP and Dell that cost a tenth of what you paid 20 years ago), and service contracts dropped to a few hundred dollars per server per year. (And then those sales reps started selling expensive storage.)

Most of my career has been spent working in large Fortune 500 enterprises, so I think about capabilities that, at the time, were only available to organizations of that size and budget, and are now available for a few clicks and a few dollars per hour. I’m going to focus on three specific technologies that I think are cool and interesting, though as I write this, I’ve already thought of three or four more.


Massively Parallel Processing

MPP is a data warehouse design pattern that allows for massive scale-out solutions that quickly process very large amounts of data. In the past, it required buying an expensive appliance from Teradata, Oracle, Netezza, or Microsoft. I’m going to focus on Microsoft here, but there are several other cloud options for this model, like Amazon Redshift or Snowflake, amongst others. To get your foot in the door with one of these on-premises solutions, you were looking at an investment of at least $250,000 USD, and that’s a fairly conservative estimate. In a cloud world? SQL Data Warehouse can cost as little as $6/hour, and while that can add up to a tidy sum over the course of a month, you can pause the service when you aren’t using it and pay only for the storage. This allows you to do quick proof-of-concept work and, more importantly, compare solutions to see which one best meets your needs. It also allows a smaller organization to get into the MPP game without a major investment.
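
To make the cost point concrete, here’s a minimal sketch of pausing and resuming a SQL Data Warehouse with the AzureRM PowerShell module. The resource group, server, and database names are placeholders for your own.

```powershell
# Sign in, then pause the warehouse when you're done with it; while paused,
# you pay for storage only. All names below are placeholders.
Login-AzureRmAccount

Suspend-AzureRmSqlDatabase -ResourceGroupName "MyResourceGroup" `
    -ServerName "mysqldwserver" -DatabaseName "MySqlDw"

# Resume it when you need compute again.
Resume-AzureRmSqlDatabase -ResourceGroupName "MyResourceGroup" `
    -ServerName "mysqldwserver" -DatabaseName "MySqlDw"
```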

Secondary and Tertiary Data Centers

Many organizations have two data centers; I’ve only worked for one that had a third. You may ask why this matters. A common question I get when teaching Always On Availability Groups is: if we are split across multiple sites, where do we put the quorum file share? The correct answer is that it should be in a third, independent data center (which virtually no organization has). However, Windows Server 2016 offers a great solution for mere pennies a month: a cloud witness, a “disk” stored in Azure Blob Storage. If you aren’t on Windows Server 2016, it may be possible to implement a similar design using Azure File Storage, but it is not natively supported. Additionally, cloud computing greatly simplifies the process of having multiple data centers. There are no worries about having staff in two locales, or getting network connectivity between the two sites; that’s all done by your cloud vendor. Just build stuff, and make it reliable. And freaking test your DR. (That doesn’t change in the cloud.)
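
For reference, pointing a Windows Server 2016 cluster at a cloud witness is a one-liner once you have a storage account. This is just a sketch; the storage account name and key are placeholders.

```powershell
# Sketch: configure the cluster quorum to use an Azure storage account as a
# cloud witness (Windows Server 2016+). Account name and key are placeholders.
Set-ClusterQuorum -CloudWitness -AccountName "mystorageaccount" `
    -AccessKey "<storage-account-access-key>"
```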

Multi-Factor Authentication

If you aren’t using multi-factor authentication for just about everything, you are doing it wrong. (This was a tell that Joey wrote this post and not the Russian mafia.) Anyway, security is more important than ever (hackers use the cloud too), and multi-factor authentication offers additional levels of security that go far beyond passwords. MFA is not a new thing; I have a collection of dead SecurID tokens dating back to the 90s to prove it. However, implementing MFA used to require you to buy an appliance (later, some software), possibly deploy ADFS, and do a lot of complicated configuration work. Now? It’s simply a setting in the Office 365 portal (if you are using AAD authentication; if you are an MS shop, learn AAD, it’s a good thing). While I complain about the portal, and how buried this setting is, it’s still far easier and cheaper (a few dollars a month) than buying hardware and configuring ADFS.
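
If you’d rather script it than click through the portal, here’s a rough sketch using the MSOnline PowerShell module. The user principal name is a placeholder, and Microsoft’s AAD tooling changes often, so check the current module before relying on this.

```powershell
# Rough sketch: enable MFA for a single user via the MSOnline module.
# The UPN below is a placeholder.
Connect-MsolService

$mfa = New-Object -TypeName Microsoft.Online.Administration.StrongAuthenticationRequirement
$mfa.RelyingParty = "*"
$mfa.State = "Enabled"

Set-MsolUser -UserPrincipalName "user@contoso.com" -StrongAuthenticationRequirements @($mfa)
```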

These are but a few examples of how cloud computing makes capabilities that used to be available only to the largest enterprises available to organizations of any size. Cloud is cool, and it makes things better.

Ignite—Important Data Announcements You Probably Missed

So, if you have been living under a rock for the last year (and yes, he really is president), you may not have heard that SQL Server 2017 was launched yesterday, and it runs on Linux! If you want to know more about that, you can read my article over at Redmond Mag. SQL Server 2017 will be generally available next Monday, October 2nd. But there were a lot of other Microsoft data platform announcements that you might have missed; I know I missed some, even though I watched the keynote and stood in the SQL Server booth all day.


  • Azure SQL Database Machine Learning Support—If you are using a premium Azure SQL Database, you can now take advantage of R and most of the goodies that are available in SQL Server. You can learn more about that here. Python isn’t there yet, and some options are limited, but it’s a nice start.
  • SSIS as a Service Public Preview—Anyone who has tried to use Azure Data Factory knows the struggle (hi, Meagan!). There’s a new release of ADF that will support running SSIS packages, which should make the service much friendlier to use. You can learn more about that here.
  • Public Preview of Managed Instances—We’ve been hearing a little bit about this for a while, and I’ve been fortunate enough to be part of the private preview. This is a hybrid PaaS/IaaS product that will be easy to migrate to. It also has the added bonus of allowing you to bring your own license to the party. It isn’t quite available yet, but based on yesterday’s announcement, it should be soon. There’s a little bit of info in this post from Microsoft.
  • SQL Data Warehouse BIG SCALE—Do you have a Netezza? Are you sick of paying them $500k/yr for support? Azure SQL Data Warehouse has a solution for you (bring a black card only, please). You can go up to 18,000 DWUs on NVMe hardware in the cloud (scaling is a single command; see the sketch after this list). This offering is aimed at large customers who want to retire expensive on-premises appliances and get the flexibility the cloud offers.
  • Azure SQL Database vNet Support—This is something that a lot of customers have been asking for in Azure SQL DB: the ability to not have a public-facing endpoint (even though you could lock it down), giving you total network isolation. The bigger benefit is that it allows more granularity in what connects to your SQL DBs from within Azure. You can learn more about the preview here.
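
Since I mentioned scaling above: changing the size of a SQL Data Warehouse is just a service-objective change. Here’s a minimal PowerShell sketch with the AzureRM module; the resource names are placeholders, and the service objective labels for the new top-end sizes may differ from the mid-range one shown here.

```powershell
# Sketch: scale a SQL Data Warehouse by changing its service objective (DWU level).
# Resource names are placeholders; verify the current DWU objective names before use.
Set-AzureRmSqlDatabase -ResourceGroupName "MyResourceGroup" `
    -ServerName "mysqldwserver" -DatabaseName "MySqlDw" `
    -RequestedServiceObjectiveName "DW6000"
```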

There are a lot of announcements this week, and not a lot of time in the keynotes to get to them all, so I thought this roundup might be helpful.

SQL Data Warehouse—Size Limits? (No, Not Really)

One of the promises of Azure SQL Data Warehouse is petabyte scale: the ability to scale quickly, and to scale storage independently of compute resources. So when one of my clients emailed me yesterday with this screenshot, needless to say I was concerned.

[Screenshot: the database properties screen showing a max size of 5 GB]

As you can see, the properties screen shows a max size of 5 GB. So I sent a frantic email to one of my colleagues on the SQL Data Warehouse team (thanks, JRJ). It turns out that, due to the distributed architecture of SQL DW, that max database size refers to the size of the control node, and not the underlying compute nodes that service your queries. So feel free to ignore that number.
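
If you want a number you can trust, ask the Azure resource itself rather than the control node. Here’s a hedged sketch with the AzureRM module; the names are placeholders, and I’d expect the resource’s configured limit, not the control node’s 5 GB, to come back.

```powershell
# Sketch: read the max size from the Azure resource instead of from SSMS,
# which queries the control node. All names are placeholders.
$db = Get-AzureRmSqlDatabase -ResourceGroupName "MyResourceGroup" `
    -ServerName "mysqldwserver" -DatabaseName "MySqlDw"

"{0:N0} GB" -f ($db.MaxSizeBytes / 1GB)
```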
