Colorado DOT Post Mortem–Don’t Do Stupid #$%^

A good policy in your technology career, or in life in general, is, to quote a former president, "don't do stupid shit." If we look historically at major IT outages or data breaches, there are always some universal tenets: some piece of critical infrastructure was left completely unsecured, or someone put the private keys on GitHub. In fact, just last week I wrote my column for Redmond Mag on some common patterns, mostly specific to SQL Server but applicable to many other types of applications. I wanted to call it Data Breach Bingo, but I didn't have enough space in my column for 25 vulnerabilities.


While we're here, it's important that when thinking about security, you think about the basics first. The GRU operatives working for the Russian government likely aren't targeting your data; it's some bot network looking for a blank SA password on a SQL Server with port 1433 open to the internet. If you are on Twitter, I highly recommend following Swift on Security, a rather serious parody account (of Taylor Swift) that focuses on good security practices. That account also maintains an excellent website called DecentSecurity.com, which covers the basics of how you should be securing your personal and business environments. While you might not be protected from the GRU, you can avoid most of the other "hackxors" in the world, which are really just dumb bots.
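
If you want to confirm that the "blank SA password on an open port 1433" scenario doesn't describe one of your own servers, here's a minimal self-audit sketch, assuming the pymssql package; the hostname is a placeholder, and you should only ever point this at servers you are responsible for.

```python
# Sketch: a quick self-audit that the classic "blank sa password" hole is not
# open on a server you own. The hostname is a placeholder; only point this at
# servers you are responsible for. Assumes the pymssql package is installed.
import pymssql

SERVER = "sql01.example.com"  # placeholder: one of your own SQL Server hosts

try:
    conn = pymssql.connect(server=SERVER, user="sa", password="", login_timeout=5)
except pymssql.Error:
    print("Good: blank sa password rejected (or port 1433 is not reachable).")
else:
    conn.close()
    print("BAD: connected as sa with a blank password -- fix this immediately.")
```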

The reason I wrote this post is that my colleague Meagan Longoria (b|t) shared with me a post-mortem from a ransomware attack on the Department of Transportation in the state of Colorado. I really want to applaud the state for putting out this document, even though it doesn't paint the organization in the best light. I'm going to use this space to talk about some of the things highlighted in the report.

"A virtual server was created on February 18, 2018. The virtual server was directly connected into the Colorado Department of Transportation (CDOT) network, as if it was a local on premise system. The virtual server instance also had an internet address and did not have OIT's standardized security controls in place."

Ok, this sounds pretty much like they deployed a VM to either Azure or AWS, and they had a network connection back into their on-premises network. This in and of itself is not problematic, but not applying security policy in the provider is a problem. Both Azure and Amazon have standard security templates and policies that can be configured and applied at deployment.

"The account utilized to establish the connection into the CDOT network was a domain administrator account – this is the highest level privileged account, and means that 1) the account cannot be disabled for too many failed login attempts, and 2) it provides the highest level of access to the agency domain controllers (gatekeepers for all access to everything in the department)."

There's a lot going on here. I read this as meaning they joined the server to the Active Directory domain and used a domain admin account to perform the join. Neither of those is problematic on its own. The next sentence is wrong: domain admin accounts can absolutely be disabled. However, it indicates to me that the department had the "never disable this login" checkbox checked. Bad move. Also, in an ideal world, you would never have a domain admin log into member servers, so its password would never be in kernel memory, but that's not how this (or many other) hacks work.

“Later, OIT was informed by the vendor that when an external IP address is requested, the vendor automatically opens the Remote Desktop protocol to the internet. The Remote Desktop protocol is how this attack was initiated.”

This is my biggest complaint about both Amazon and Microsoft Azure: especially in the timeframe of this attack, both services defaulted to giving a VM a public IP address and didn't actively discourage opening ports to the internet. While it's a bad practice that should be blocked by policy, up until a few months ago on the Azure side it was very easy to do.
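
As a rough illustration of closing that door programmatically, here's a minimal sketch using the azure-identity and azure-mgmt-network Python packages to add a deny rule for RDP to an existing network security group. The subscription ID, resource group, NSG name, and rule priority are all placeholders you would adapt to your own environment.

```python
# Sketch: add a rule that denies inbound RDP (3389) from the internet to an
# existing network security group. The subscription ID, resource group, and
# NSG name below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import SecurityRule

subscription_id = "00000000-0000-0000-0000-000000000000"  # placeholder
resource_group = "my-resource-group"                       # placeholder
nsg_name = "my-vm-nsg"                                     # placeholder

client = NetworkManagementClient(DefaultAzureCredential(), subscription_id)

deny_rdp = SecurityRule(
    protocol="Tcp",
    source_address_prefix="Internet",   # the built-in Internet service tag
    source_port_range="*",
    destination_address_prefix="*",
    destination_port_range="3389",      # RDP
    access="Deny",
    direction="Inbound",
    priority=100,                       # lower number = evaluated first
)

# This is a long-running operation; wait for it to finish.
client.security_rules.begin_create_or_update(
    resource_group, nsg_name, "deny-rdp-from-internet", deny_rdp
).result()
```

The same idea can be enforced more broadly with Azure Policy (or AWS Config rules on the Amazon side), so nobody has to remember to do it per VM.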

“An attacker discovered this system available on the internet, broke into the Administrator account using approximately 40,000 password guesses until the account was compromised. From there, the attacker was able to access CDOT’s environment as the domain administrator, installing and activating the ransomware attack.”

This is a statement that will sound somewhat impressive to laypeople. 40,000 password guesses sounds like a whole lot; however, for an automated script that would take seconds. This means they either had approximately three-character passwords (I'm guessing they didn't), or the domain administrator was using a common dictionary-based password. You can use add-ons to Active Directory to limit the types of passwords users can choose, and while this may be painful for normal users, it is absolutely critical for high-privilege users like DBAs or system admins. Those users should also have separate accounts for logging in as domain admin, and multi-factor authentication configured.
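
To make "limit the types of passwords" a little more concrete, here's a toy sketch of the kind of dictionary check a banned-password filter performs. The word list and substitution rules are made up for illustration; real products (Azure AD Password Protection, for example) maintain far larger, normalized lists.

```python
# Sketch: a toy banned-password check, illustrating why dictionary-based
# passwords fall to a few thousand automated guesses. The banned list and
# substitution rules are illustrative only, not any real product's rules.

BANNED_WORDS = {"password", "welcome", "colorado", "admin", "letmein"}

# Common "leet speak" substitutions that attack wordlists try automatically.
SUBSTITUTIONS = str.maketrans({"@": "a", "$": "s", "0": "o", "1": "i",
                               "3": "e", "4": "a", "5": "s", "7": "t"})


def is_banned(password: str) -> bool:
    """Return True if the password is a trivially guessable variant of a
    banned word, e.g. 'P@ssw0rd2018' or 'Welc0me!'."""
    stripped = password.lower().rstrip("0123456789!?.")  # drop trailing digits/punctuation
    normalized = stripped.translate(SUBSTITUTIONS)       # undo the leet substitutions
    return normalized in BANNED_WORDS


if __name__ == "__main__":
    for candidate in ("P@ssw0rd2018", "Welc0me!", "xK#9vQ2$mRw7"):
        print(candidate, "->", "rejected" if is_banned(candidate) else "ok")
```

The point is that, to an attacker's wordlist, "P@ssw0rd2018" is effectively the same password as "password", which is why 40,000 guesses is more than enough.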

This is also the part where I would usually say you should have network segmentation configured to block the malware from spreading, but if someone pwns your domain controllers, all the segmentation in the world won't help that much. An attacker could even go as far as to install their malware via Group Policy. Good luck cleaning that one up.

The state of Colorado had a really robust backup solution and some good network practices (smart firewalls) that were able to limit the spread of the malware, so they were able to recover from this attack relatively quickly. It's important to worry about the big things, but doing all of the little things right, and most importantly not doing stupid things, will go a long way toward securing your environment.

Azure SQL Database Price and Performance vs Amazon RDS

Yesterday, GigaOm published a benchmark of Azure SQL Database as compared to Amazon's RDS service. It's an interesting test case that tries to compare the performance of these platform-as-a-service database offerings. One of the many challenges of this kind of study is that the product offerings are not exactly analogous. However, GigaOm was able to build a test case using similarly sized offerings, as shown in the image below.

[Image: the Azure SQL Database and Amazon RDS instance sizes used in the benchmark]

GigaOm's test was derived from the TPC-E workload, a blended OLTP workload that involves both read-only and update transactions. In the raw performance numbers, Azure SQL DB was a little bit better in terms of overall throughput, but where Azure really shines in comparison to AWS is pricing.

In terms of raw pricing, there is a significant difference between Azure SQL DB and AWS pricing for these two instances. Azure is $25,000/month less; however, it's not quite an apples-to-apples comparison, as Azure SQL DB is a single-database service, whereas RDS acts more like Managed Instance or on-premises SQL Server. If you use the elastic pool functionality in Azure SQL DB, which costs the same as single-database SQL DB, you get a better comparison.

[Image: the pricing comparison, including Azure SQL DB elastic pools]

Where the real benefits come in is that Microsoft allows you to bring on-premises licenses to Azure SQL Database, and that Microsoft offers a much lower cost for three-year reservations (which you can now pay for on a month-to-month basis). A big advantage Microsoft has over Amazon is that they control the licensing for SQL Server. You can argue that this isn't fair, but it's Microsoft's ball game, so they get to make the rules.
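
To show how those two levers compound, here's a back-of-the-envelope sketch. Every number in it is a made-up placeholder, not an actual Azure or AWS price, so treat it purely as an illustration of the mechanism.

```python
# Sketch: how bring-your-own-license (Azure Hybrid Benefit) and a 3-year
# reservation stack. EVERY number below is a made-up placeholder, not a real
# Azure or AWS price; use the pricing calculator for actual figures.

list_price_per_month = 40_000.0   # hypothetical pay-as-you-go price
license_share = 0.30              # hypothetical share of that price that is SQL licensing
reservation_discount = 0.35       # hypothetical discount for a 3-year reservation

# Bringing your own license removes the licensing portion of the bill...
after_byol = list_price_per_month * (1 - license_share)
# ...and the reservation discount then applies to the remaining compute cost.
after_reservation = after_byol * (1 - reservation_discount)

print(f"Pay-as-you-go:               ${list_price_per_month:>9,.0f}/month")
print(f"With BYOL:                   ${after_byol:>9,.0f}/month")
print(f"With BYOL + 3-year reserved: ${after_reservation:>9,.0f}/month")
```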

There will always be an inherent feature advantage to being on Azure SQL DB over AWS RDS, as Microsoft owns the code and can roll it out faster. While AWS is a big customer and will be quick to roll out new updates and versions, they are still subject to Microsoft's external release cycle, whereas Azure is not. In my opinion, if you are building a new PaaS solution based on SQL Server, your best approach is to run on Azure SQL DB.

Cloud Field Day–Solo.io #CFD6

As I mentioned, I was in Silicon Valley a couple of weeks ago for an analyst event and got to meet with a variety of companies. The final company we met with on Friday was Solo.io, and I have to say they knocked it out of the park. Their technology was super interesting, and their founder, Idit Levine, and their CTO, Christian Posta, were excellent presenters who were clearly enthusiastic about their product.


So what does Solo.io do? In the modern microservices-oriented world, we have distributed systems that are nearly all API driven. Solo.io has a number of products in this space, but their core product, Gloo, is a modern API gateway that securely bridges modern applications like Lambda or Azure Functions to both legacy monolithic applications and modern databases running in Kubernetes pods.

They also have another open source project called SuperGloo, which is an abstraction layer for service mesh architectures. A service mesh provides modern applications with monitoring, scaling, and high availability through APIs rather than discrete appliances. Istio from Google is the best-known tool in this space, and SuperGloo can work with it and other service meshes in the same architecture.

The other really interesting tool that Solo.io highlighted was Squash, a debugger for distributed systems. If you've ever tried to troubleshoot a distributed system, even figuring out where to start can be challenging. By acting as a bridge between Kubernetes (drink) and the IDE, Squash lets you choose which pods or containers you are debugging, set breakpoints, and change variables at runtime.

Cloud Field Day 6–HashiCorp Consul #CFD6

I was recently in Silicon Valley for Cloud Field Day 6, and one of the companies we met with was HashiCorp. HashiCorp is known mostly for two key products in cloud automation, Terraform and Vault, which enable infrastructure automation and secrets management respectively. Both are open source projects, with support and premium feature offerings available for companies, and both are free to get started with for individuals. Both products are considered best in class and are widely used by many organizations.


We had the honor of hearing from the founder and CTO of HashiCorp, Mitchell Hashimoto, who spoke to us about Consul, a service-based networking tool for dynamic infrastructure (which means things like containers, Kubernetes, and serverless cloud services). Mitchell explained that companies try to apply on-premises networking paradigms to cloud infrastructure, and it doesn't really work.

This is where Consul steps in to make things simpler, focusing on three areas:

  • Service Registry & Health Monitoring
  • Network Middleware Automation
  • Zero trust network with service mesh

The goal of the product is easy adoption: crawl, walk, run. The service registry lets you identify what's deployed on every single platform, and Consul provides a unified view over both DNS and an HTTP API, performs active health monitoring, and builds a catalog of your entire network. Consul launched in 2014, has deployments running 50,000+ agents, and is the most widely deployed service discovery tool on AWS. Consul servers form a cluster and perform leader election, and all membership is handled via gossip. Consul requires one server cluster per data center, each with its own gossip pool; the open source edition requires a fully connected network, while the enterprise edition allows for hub-and-spoke topologies.
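
To make the registry and health-check piece concrete, here's a minimal sketch of registering a service against a local Consul agent over its HTTP API, using Python's requests library. The service name, port, and health-check URL are placeholders, and I'm assuming an agent listening on the default port 8500.

```python
# Sketch: register a service (with an HTTP health check) against a local
# Consul agent, then look it up in the catalog. Assumes an agent listening on
# localhost:8500; the service name, port, and check URL are placeholders.
import requests

CONSUL = "http://127.0.0.1:8500"

registration = {
    "Name": "web",                                # placeholder service name
    "Port": 8080,                                 # placeholder port
    "Check": {
        "HTTP": "http://127.0.0.1:8080/health",   # placeholder health endpoint
        "Interval": "10s",                        # how often the agent probes it
    },
}

# Register the service with the local agent.
requests.put(f"{CONSUL}/v1/agent/service/register", json=registration).raise_for_status()

# Ask the catalog which nodes are running the service.
for entry in requests.get(f"{CONSUL}/v1/catalog/service/web").json():
    print(entry["Node"], entry["ServiceAddress"] or entry["Address"], entry["ServicePort"])
```

The same service is then also resolvable through the agent's DNS interface (web.service.consul), which is the unified DNS-and-API view mentioned above.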

Consul also provides a number of other services, like traffic splitting, which lets you do a rolling deployment of application code while sending a small percentage of traffic to the newly released version of your app in order to check for errors.
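
As a rough sketch of what that looks like (assuming the L7 traffic-management config entries that shipped with Consul's service mesh features in 1.6+), a canary split is expressed as a service-splitter config entry; the example below pushes one through the HTTP API with placeholder service and subset names. In practice the subsets themselves would be defined by a companion service-resolver entry.

```python
# Sketch: send 90% of "web" traffic to subset v1 and 10% to a v2 canary by
# writing a service-splitter config entry. Names and subsets are placeholders;
# assumes a local agent on port 8500 with the service mesh features enabled.
import requests

CONSUL = "http://127.0.0.1:8500"

splitter = {
    "Kind": "service-splitter",
    "Name": "web",                              # placeholder service name
    "Splits": [
        {"Weight": 90, "ServiceSubset": "v1"},  # current version
        {"Weight": 10, "ServiceSubset": "v2"},  # canary of the new release
    ],
}

resp = requests.put(f"{CONSUL}/v1/config", json=splitter)
resp.raise_for_status()
print("service-splitter applied:", resp.json())
```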

Consul is a unique tool: networking in containers and serverless is very challenging, and this product brings it together with old-school technology like mainframes and physical servers. Also, given HashiCorp's track record with their other products, I expect this one to be really successful.

