PASS–An Organization in Trouble

PASS is an organization that has helped my career at many levels. I’ve served as a user group leader, a SQL Saturday organizer, a regional mentor, and a speaker at PASS events around the world. PASS has a big problem–its main revenue source is the annual PASS Summit, which isn’t happening in person this year due to the pandemic. This is a force majeure event, which would have been challenging for any organization, but especially one that’s been managed as poorly as PASS has in recent years. Today, I’m calling on the PASS Board to put C&C in formal review based on its failures to properly manage the organization.

If you don’t know, PASS is run by a for-profit Canadian company called Christianson & Company (C&C), which has been involved with PASS for at least as long as I have. If you look at my good friend Steph Locke’s analysis of the organization’s budget, you can see how little of the organization’s revenue goes back into community activities. While the board of directors continually praises C&C for their efforts, most members and volunteers have not seen any value from these efforts. SQL Saturday subsidies have been cut, and the website has numerous bugs, which has led many events to no longer be run under PASS’ governance, especially in Europe, where most of the major events are no longer SQL Saturdays.

The list goes on: the job board never grew into anything worth further investment, the 2019 Microsoft Modernization events sponsored by Microsoft didn’t get the traction they should have, the execution of the Business Analytics Conference was a complete failure (something C&C should have excelled at), the multiple attempts to create lasting events in Europe (the full PASS Summit and SQL Rally) failed, and all efforts to monetize Summit content have fallen flat.

This brings us to today: while this pandemic is likely (and hopefully) a once-in-a-lifetime event, it has not been handled well for an organization with a full-time management company. The conference dragged its feet moving to a virtual event and was beaten to the punch by many other events. Additionally, as most of those other (admittedly vendor-subsidized) events went online, they became free events. So PASS is in a situation where most of its formerly paid competition, like Microsoft Ignite and VMware VMworld, are now free online conferences, and it needs to prove it can deliver value in a paid conference. And arguably the management company is not handling it: they’ve outsourced it to another third party.

At DCAC, we’ve thought about how to make a virtual event better. We haven’t worked out the details, but during Denny and John’s precons on Azure, I’ll be in an adjacent chat/breakout room in case students in the session want to see a specific demo again or ask a deeper-dive question. A competent management company would have been thinking about this since the pandemic became news in February, when it was fairly obvious that the event was going to be virtual. What I would have done is the following:

  1. Plan a small-ish virtual event in late March or early April with 5 prominent community speakers on whatever platform appeared to be your leading candidate for a virtual summit. This does two things–it lets you work out some of the kinks of a virtual conference, and it helps you demonstrate the value of your event, maybe by having breakout rooms with our speakers. The community would have helped with this.
  2. After demonstrating that value, announce the virtual conference. Plan a series of smaller virtual events to keep up the energy around the event.

I spoke at an event this week that had nearly as many attendees as PASS Summit did. It was run by the volunteers. Great job @eightkbconference

C&C is not a management company. At this point, they are PASS. They have no other revenue, all 20 or so employees work full-time for PASS, and they are on an opaque contract with no end in sight. It’s disingenuous for us to consider them anything less. Their organizational oversight is a volunteer board. Not even the executive committee has a fiduciary responsibility to the organization. No one is paid and therefore, no one has skin in the game. That’s why all the excess money is simply funneled into C&C’s pockets while SQL Saturday sponsorship budgets are cut and there is no value in UG affiliation beyond the Summit discount and a marketing platform they could get free anywhere else. You want more value from PASS? For this and all the reasons I list above, I say the community should call on the PASS BoD to put C&C in review, accept competing offers, and see what the market will come up with.

Licensing SQL Server Reporting Services, Power BI Report Server, Integration Services, and Analysis Services

On the Twitter hashtag #sqlhelp I saw a really dangerous bit of advice that someone had received from their software reseller. It’s dangerous because it could cost your company a lot: somewhere between tens of thousands and hundreds of thousands of (pick your favorite currency).

I’ll paraphrase the tweet so as to protect the guilty: “our reseller said that if we bought a license of enterprise edition, we could run the database engine on one server, and SQL Server Reporting Services on another.” This is 100% wrong, and always has been, per the SQL Server 2019 licensing guide:

Screen Shot 2020-05-19 at 10.56.02 AM

Even though SQL Server Reporting Services is a separate installation now, the licensing is exactly the same as SQL Server. I think some of the confusion in the Twitter stream is related to one of the terms of Power BI Report Server. If you purchase Power BI Report Server through your SQL Server licensing, it is treated exactly like any other SQL Server component for the purposes of licensing. That means, if you need a SQL Server database engine for your report server databases (the databases that contain the PBIRS metadata), you have two choices:

1) Install the database engine side by side with your PBI services

2) Buy additional cores to run the database engine on a different server.

This last bit is where it gets a little confusing. If you buy your PBIRS licenses through having Premium capacity in the Power BI service, you can install SQL Server standard edition, exclusively for the purposes of Power BI or other products like SSRS or SSIS that require a SQL Server database.

Customer may run any number of Instances of any SQL Server database software (SQL Server Standard) included in Power BI Report Server in one OSE on a Server dedicated to Customer’s use for the limited purpose of supporting Power BI Report Server and any other product that includes SQL Server database software. Dedicated Servers used for this purpose, that are under the management or control of an entity other than Customer or one of its Affiliates, are subject to the Outsourcing Software Management clause

That’s from the volume licensing guide.

tl;dr Always assume you need a license for production, unless you are paying for PBI premium and then you may have an engine license you can use just for that.

New Features in Cloud Shell in the Azure Portal

One of the things that was really painful in the early days of Azure, especially for those of us who are consultants with many customers, was the process of switching tenants and logged in sessions. To Microsoft’s credit, they have made this process much better: it’s a single click to switch between logins and/or subscriptions. However, when I’m working with Azure programmatically, as I often do, logging in from my laptop is a little bit more painful.

The workaround for this has been to use Cloud Shell within the Azure Portal. You can even do this on a mobile device, which can be really useful if something bad happens and you don’t have a laptop handy. However, the one problem I had with Cloud Shell was that it was hard to debug in. I would develop a script “offline” and then paste it into the shell, and sometimes miss obvious variables or cut and paste errors. Also, if you were using the PowerShell version of Cloud Shell, saving scripts was not intuitive.

However, when I logged in this morning in an effort to run some PowerShell code against a customer’s tenant, I was greeted by a message (which I didn’t get a screenshot of and can no longer recreate) that said vim, nano, emacs, and code (VS Code) were available as text editors in Cloud Shell. Let’s try it out.

 

Screen Shot 2020-05-18 at 5.06.11 PM

Note, that is how you exit vi like a boss (escape+:x!). So I created a trivial file, big deal.

Screen Shot 2020-05-18 at 5.06.32 PM

I can also see my file. That’s also pretty nifty. But check this out:

Screen Shot 2020-05-18 at 5.09.51 PM

By typing code followed by my file name, I get a limited version of Visual Studio Code (btw, I checked and dark mode seems to be the only choice). You can’t highlight code and execute it using F8, but you do get really nice editing functionality in the portal. And you can save a file and it stays in your home drive in cloud shell.

 

 

 

A Weird Networking Problem with Azure SQL Database (and Fios)

Note: This is my first blog post in a while. You too may have been stressed by recent events in the world. I’m also doing a large content project that I hope you’ll get to enjoy in the near future. This post actually ties into working from home.

 

pexels-photo-373543

So anyway, one of our customers recently spun up a new Azure SQL Database in the Azure South Africa North region, and two of their team members couldn’t connect to it from their homes. The common element–both of them used Verizon Fios as their ISP.

It’s important to note that connectivity for Azure SQL Database is to a public IP address. Each region has a public IP address and a lookup takes place, and then you are connected through a gateway and you eventually connect to the database. The IP address is just a public endpoint. If I run a trace route to a database running in the Azure US East region I see:

Josephs-MacBook-Pro-3:Dropbox joey$ traceroute dcac-demo.database.windows.net
traceroute to cr4.eastus2-a.control.database.windows.net (52.167.104.0), 64 hops max, 52 byte packets
1 192.168.115.1 (192.168.115.1) 2.224 ms 2.849 ms 1.953 ms
2 96.120.76.77 (96.120.76.77) 12.961 ms 12.978 ms 14.438 ms
3 68.86.213.153 (68.86.213.153) 10.461 ms 9.803 ms 8.983 ms
4 68.86.211.170 (68.86.211.170) 16.069 ms 6.967 ms 12.133 ms
5 68.85.158.49 (68.85.158.49) 11.460 ms 15.657 ms 11.314 ms
6 be-201-ar03.ivyland.pa.panjde.comcast.net (96.108.181.21) 14.815 ms 13.470 ms 13.775 ms
7 be-33287-cr01.newark.nj.ibone.comcast.net (68.86.93.173) 18.519 ms 17.500 ms 19.012 ms
8 be-1301-cs03.newark.nj.ibone.comcast.net (96.110.36.89) 16.457 ms 23.452 ms 20.628 ms
9 be-2303-pe03.newark.nj.ibone.comcast.net (96.110.37.74) 15.248 ms 21.317 ms 22.250 ms
10 50.248.118.226 (50.248.118.226) 16.467 ms 18.986 ms 23.555 ms
11 ae23-0.ear01.ewr30.ntwk.msn.net (104.44.239.159) 24.966 ms 22.497 ms
ae24-0.ear01.nyc30.ntwk.msn.net (104.44.239.155) 26.774 ms
12 be-21-0.ibr02.ewr30.ntwk.msn.net (104.44.20.122) 32.792 ms
be-20-0.ibr01.nyc30.ntwk.msn.net (104.44.20.124) 27.437 ms
be-20-0.ibr01.ewr30.ntwk.msn.net (104.44.20.120) 34.036 ms

You can see that at hops 7-9 my connection is on the Comcast backbone network (ibone.comcast.net), and at hop 11 it jumps onto the Microsoft Azure network (msn.net). I truncated the output of the trace route there.
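To make the "which network am I on at each hop" reading a bit more mechanical, here is a minimal Python sketch. It assumes the macOS traceroute layout shown above, and it treats the last two labels of a hop's reverse-DNS name as its network, which is a simplification. The function name `network_handoffs` is made up for illustration.

```python
import re

# Matches numbered hop lines such as
# " 7 be-33287-cr01.newark.nj.ibone.comcast.net (68.86.93.173) 18.519 ms ..."
HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+\((\d+\.\d+\.\d+\.\d+)\)")

def network_handoffs(trace_lines):
    """Return (hop, domain) pairs for each hop where the reverse-DNS domain changes."""
    handoffs = []
    previous = None
    for line in trace_lines:
        match = HOP_RE.match(line)
        if not match:
            continue  # headers and continuation lines have no hop number
        hop, name, _ip = match.groups()
        if name.replace(".", "").isdigit():
            continue  # hop has no reverse DNS name, only an IP address
        domain = ".".join(name.split(".")[-2:])  # assumption: last two labels name the network
        if domain != previous:
            handoffs.append((int(hop), domain))
            previous = domain
    return handoffs

sample = [
    " 5 68.85.158.49 (68.85.158.49) 11.460 ms 15.657 ms 11.314 ms",
    " 6 be-201-ar03.ivyland.pa.panjde.comcast.net (96.108.181.21) 14.815 ms 13.470 ms 13.775 ms",
    " 7 be-33287-cr01.newark.nj.ibone.comcast.net (68.86.93.173) 18.519 ms 17.500 ms 19.012 ms",
    "10 50.248.118.226 (50.248.118.226) 16.467 ms 18.986 ms 23.555 ms",
    "11 ae23-0.ear01.ewr30.ntwk.msn.net (104.44.239.159) 24.966 ms 22.497 ms",
    "ae24-0.ear01.nyc30.ntwk.msn.net (104.44.239.155) 26.774 ms",
]

print(network_handoffs(sample))  # → [(6, 'comcast.net'), (11, 'msn.net')]
```

A healthy route to a far-away region should show several of these handoffs before it reaches the destination.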

Monica Rathbun (b|t), who works with me at DCAC, also had Fios and was able to help us troubleshoot this. When Monica ran a trace route to the public IP address of the Azure SQL Database in South Africa North, it looked like:

C:\Users\monica>Tracert egpaftz.database.windows.net

Tracing route to cr1.southafricanorth1-a.control.database.windows.net [102.133.152.0]
over a maximum of 30 hops:

  1    <1 ms     1 ms    <1 ms  Fios_Quantum_Gateway.fios-router.home [192.168.1.1]
  2     3 ms     1 ms     2 ms  102.133.152.0

That is not how the internet is supposed to work: an endpoint in South Africa should not answer on the very next hop after the home router, in a couple of milliseconds. We tried to figure out how to get in touch with someone at Verizon to fix this, which through normal consumer channels is not fun. While we waited for that to get fixed, we had another option for working around the problem.
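For what it's worth, the pattern in that trace is easy to flag in code. The following Python sketch is a heuristic of my own and not a diagnostic tool: it assumes the Windows tracert layout shown above, and `looks_short_circuited` is a made-up name. It simply checks whether a public destination answers on the hop right after a private gateway.

```python
import ipaddress
import re

# Matches Windows tracert hop lines such as
# "  2     3 ms     1 ms     2 ms  102.133.152.0"
TRACERT_HOP = re.compile(r"^\s*(\d+)\s+.*?(\d+\.\d+\.\d+\.\d+)\]?\s*$")

def hops(tracert_lines):
    """Extract (hop number, IP address) pairs from tracert output lines."""
    result = []
    for line in tracert_lines:
        match = TRACERT_HOP.match(line)
        if match:
            result.append((int(match.group(1)), ipaddress.ip_address(match.group(2))))
    return result

def looks_short_circuited(tracert_lines, destination):
    """True when a public destination answers directly after a private gateway."""
    path = hops(tracert_lines)
    dest = ipaddress.ip_address(destination)
    if len(path) != 2:
        return False
    (_, first), (_, second) = path
    return first.is_private and second == dest and not dest.is_private

fios_trace = [
    "Tracing route to cr1.southafricanorth1-a.control.database.windows.net [102.133.152.0]",
    "over a maximum of 30 hops:",
    "  1    <1 ms     1 ms    <1 ms  Fios_Quantum_Gateway.fios-router.home [192.168.1.1]",
    "  2     3 ms     1 ms     2 ms  102.133.152.0",
]

print(looks_short_circuited(fios_trace, "102.133.152.0"))  # → True
```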

Introducing Private Link

Azure Private Link was recently introduced as generally available for Azure SQL Database and allows you to have a truly private connection to your database. For several years now you have been able to use network endpoints to allow a specific VM or App Service to connect to Azure SQL; however, that design had some limitations. The first was that it still routed the connection to the database over the public IP address. The second was that network endpoints did not support multi-region scenarios. With Private Link your database has its own private IP address on the virtual network where it is deployed.

In our client’s case, this workaround involved connecting to an Azure VPN in US East which would then be connected to South Africa North. One interesting thing about Private Link is that it doesn’t support Azure Virtual Network peering, so you will need to create a Gateway if you want your traffic to traverse virtual networks. Additionally, we were working in two different subscriptions, which meant we had to use the PowerShell from here. Private Link is easy to configure from your Azure SQL logical server, but be sure you have the right permissions in Azure RBAC. I needed to be granted the network contributor role in order to get the GUI to work.

The Conclusion

I had some issues that were permissions related which delayed our implementation of Private Link, and I was fortunate enough to have some members of the Azure Networking product group helping me out. Since Microsoft and Verizon are technically partners (Verizon is an ExpressRoute partner), they were able to get in touch with some folks at Verizon, who were able to resolve the problem. Sort of–the trace routes still look weird, but everyone can connect. Verizon reports that the trace routes look normal from what they can see, which tells me they are doing something really weird on the router itself with how Azure traffic is routed.

Photo by Pixabay on Pexels.com

 

 

 

Storage Field Day 19 MinIO #SFD19

In January I had the chance to attend Storage Field Day 19 in Santa Clara, where we got to meet with a wide variety of startups and large enterprise storage companies. One of the more interesting companies we met with was MinIO, which has a really compelling object-based storage product.

IMG_3141
Some of MinIO’s Customers and Use Cases

I’ve talked about object storage here before, but it’s a very different paradigm than the traditional block-based storage you may currently be using. With block storage, files are split into evenly sized blocks of data (typically somewhere between 64 KB and 1 MB depending on your vendor). Data protection is provided by traditional RAID options.

Object storage, on the other hand, doesn’t split files into blocks. Files are stored as objects which contain the file data, metadata, and a unique identifier. There is no limit on the size or amount of the metadata associated with the file. If you have ever created a managed disk in Azure, taken a backup to URL, or used an Azure SQL Database, you’ve used object-based storage. In object-based storage, redundancy is generally provided by maintaining three copies of the object (e.g. a write isn’t considered complete until it writes to all three copies).
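If the block versus object distinction feels abstract, a toy model may help. The Python sketch below is purely illustrative: the names `split_into_blocks` and `ObjectStore` are invented here, the block size is tiny on purpose, and it models the generic three-copy rule described above, not any particular vendor's implementation.

```python
import uuid

BLOCK_SIZE = 4  # bytes; unrealistically small so the split is easy to see

def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Block storage view: the file becomes a sequence of evenly sized blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

class ObjectStore:
    """Object storage view: whole payload + metadata + unique ID, kept in several copies."""

    def __init__(self, copies=3):
        self.replicas = [dict() for _ in range(copies)]

    def put(self, data, **metadata):
        object_id = str(uuid.uuid4())
        record = {"data": data, "metadata": metadata}
        for replica in self.replicas:
            replica[object_id] = dict(record)
        # The ID is returned (the write is "complete") only after every copy is written.
        return object_id

    def get(self, object_id):
        return self.replicas[0][object_id]

blocks = split_into_blocks(b"0123456789")
print(blocks)  # → [b'0123', b'4567', b'89']

store = ObjectStore()
photo_id = store.put(b"0123456789", content_type="image/png", owner="joey")
print(store.get(photo_id)["metadata"])
```

Note how the metadata is free-form and travels with the object, while the block view knows nothing about what the bytes mean.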

Object storage is designed to solve problems of scale. One of the things I learned at Comcast was that the cost of SAN storage didn’t scale to some of the massive petabyte scale data problems we had. The management overhead, the cost, and sometimes even the storage itself does not scale. This is a problem largely for companies like Microsoft, Amazon, Google, Facebook, etc, who have massive amounts of data to store.  But as data volumes grow there are lots of other firms who have very large volumes that they need to manage.

MinIO is a firm that offers such a solution. MinIO offers open source storage management software with extremely fast performance (183 GB/s reads and 171 GB/s writes). It is fully compatible with Amazon’s S3 API, which has more or less become the de facto standard for object storage. They were working on Azure Blob Storage support when we visited.

One of the ways MinIO is able to get such good performance out of pretty standard hardware is by taking advantage of SIMD processor instructions, which allow more text and number crunching to be performed per CPU instruction, which dramatically increases performance. SQL Server does the same thing in the query processor’s batch mode.

MinIO’s storage can also be used as a persistent store for Kubernetes (drink), or used for systems like Spark and TensorFlow, or as a replacement for Hadoop HDFS. Where you would probably use this in your environment would be to replace your file servers, or as a target for container storage, or maybe even an analytic store. Or maybe you want to become a cloud storage provider and you need to host 50 PB of data in your data center.

Storage Field Day 19–Western Digital #SFD19

As I mentioned in my post last week, I recently had the opportunity to attend Storage Field Day 19, where I got to meet with a wide variety of storage software and hardware companies in Silicon Valley. One of the more interesting companies we met with was a longtime player in storage: Western Digital. (Disclosure: I own shares of Western Digital and was gifted an SSD after the event.) One of the overwhelming themes of the week was the vast amount of data that we are generating, much of which is coming from video and IoT device telemetry. Western Digital estimates that 103 zettabytes (that’s 103 million petabytes, or 103 billion terabytes) of just IoT data will be created by 2023.
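The unit conversion in that estimate is easy to sanity check. Here is a quick Python sketch, assuming decimal (SI) units, where each step up is a factor of 1,000:

```python
# Decimal (SI) storage units, in bytes
TB = 10**12  # terabyte
PB = 10**15  # petabyte
ZB = 10**21  # zettabyte

iot_bytes = 103 * ZB

petabytes = iot_bytes // PB
terabytes = iot_bytes // TB

print(f"{petabytes:,} PB")  # → 103,000,000 PB (103 million)
print(f"{terabytes:,} TB")  # → 103,000,000,000 TB (103 billion)
```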

IMG_3044

We were able to hear from a wide array of executives at Western Digital making up various parts of their business. There are a few market forces that are driving the direction of the company. The first area is gaming: building internal NVMe drives of up to 2 terabytes with bandwidth of up to 3480 MB/second. Performance is one aspect of gaming systems, but design aesthetic and cooling are also very important. PC gaming is a $37.5 billion market, so Western Digital sees this as a major market for them.

While the gaming part of the presentation focused on bleeding edge performance, the rest of the afternoon looked at increasing storage densities. While it went unsaid, I feel like much of the development in the hardware business is increasingly focused on public cloud providers like Microsoft and Amazon, as well as large scale data companies like Facebook and Twitter. Western Digital is at the forefront of this through its development of zoned storage. One of the goals of this extension to the NVMe standard is to allow ultra-fast SSDs to be zoned similar to the way hard drives can be now. This is not technology that you will be implementing in your data center anytime soon; however, it will likely be coming to a cloud provider in the near future.

The other aspect of storage futures is increased density. While many analysts have prematurely speculated about the death of the spinning hard drive (in lieu of lighter, faster, cooler solid-state drives), the density offered by traditional hard drives is unmatched. Western Digital showcased volumes up to 20 TB, as well as multi-actuator drives, which can increase the performance of a spinning disk by an order of magnitude. These drives will consume more power than a traditional drive, but less than two traditional drives. The data on these platters is striped in a RAID-0 fashion on the drive itself.
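RAID-0 style striping is simple to picture in code. The Python sketch below is an idealized toy and not Western Digital's actual on-drive layout: it assumes two actuators, round-robin placement, and that each actuator reads one block per pass. The function names are made up for illustration.

```python
def stripe(blocks, actuators=2):
    """Distribute blocks round-robin across actuators, RAID-0 style."""
    lanes = [[] for _ in range(actuators)]
    for index, block in enumerate(blocks):
        lanes[index % actuators].append(block)
    return lanes

def read_passes(block_count, actuators):
    """Passes needed if every actuator reads one block per pass (ceiling division)."""
    return -(-block_count // actuators)

lanes = stripe(list("ABCDEF"))
print(lanes)               # → [['A', 'C', 'E'], ['B', 'D', 'F']]
print(read_passes(6, 1))   # → 6
print(read_passes(6, 2))   # → 3
```

Real-world gains depend on the workload, seek patterns, and firmware, none of which this model captures.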

The world is heavily dependent on reliable, fast storage for all of the data systems modern life demands. As one of the leading builders of storage media, Western Digital is well positioned to support both end users and hyperscale cloud providers now, and in the future.

Storage Field Day 19–Current State of the Storage Industry #SFD19

Last week I got to spend some time meeting with numerous storage companies in Silicon Valley. I along with another dozen or so delegates met with companies large and small, including Western Digital, Dell EMC, NetApp, and startups like MinIO. I’ll be writing posts in coming weeks to talk about some of the interesting technology we learned about this week.

pexels-photo-157544
Photo by Anthony on Pexels.com

In this post I wanted to focus on some interesting scenarios. It’s something I specifically noticed when we were at Western Digital, but it came up again, particularly with the startups we met with. I had this thought, and then on Sunday Argenis Fernandez (b|t), who recently returned to Pure Storage, posted about it in this tweet.

Argenis was complaining about file systems because when you have very fast (think NVMe, or faster) storage, or storage-class memory, the overhead of all the things the file system does becomes a significant portion of the time that it takes to complete an I/O operation. This isn’t significant when your I/Os take 4-5 milliseconds to complete, but when they are completing in 50 microseconds you notice the time it takes for the file system to timestamp a file.
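A little arithmetic shows why the same overhead goes from invisible to painful. In the Python sketch below, the 10 microsecond file system overhead is a number I made up purely for illustration; only the 5 millisecond and 50 microsecond device times come from the paragraph above.

```python
def overhead_share(device_time_us, fs_overhead_us):
    """Fraction of the total I/O time spent in file system overhead."""
    return fs_overhead_us / (device_time_us + fs_overhead_us)

FS_OVERHEAD_US = 10  # hypothetical fixed cost per I/O, for illustration only

slow_disk = overhead_share(5_000, FS_OVERHEAD_US)  # 5 ms device time
fast_nvme = overhead_share(50, FS_OVERHEAD_US)     # 50 microsecond device time

print(f"5 ms I/O:  {slow_disk:.1%} overhead")   # about 0.2%
print(f"50 us I/O: {fast_nvme:.1%} overhead")   # about 16.7%
```

The overhead itself never changed; the device just got fast enough that it stopped hiding it.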

This leads me to the point I wanted to make in the post. Storage technology futures are very much bifurcated (that’s a fancy word for going in two directions)–on one end there is ultra-high performance NVMe storage for workloads like gaming and ultra-high throughput trading systems. On the other end there is a lot of development around ultra-high density storage for hyperscale providers (that’s basically your public clouds and Facebook).

Did you know that there were hard drives with multiple actuators (needles to a record player for those of you who are old)?

Multi Actuator Technology: A New Performance Breakthrough

The reason why this is happening is that spinning hard drives are here to stay, for density reasons (you may have heard that the world is going to have eleventy billion zettabytes by 2022 and most of it will be in the cloud, or something to that effect), and SSDs still lack the density required to, say, be a cloud provider or host most of the world’s photos.

The cloud providers are also part of the high-speed storage game–mostly to be able to do things like NVMe over Fabrics, which will allow ultra-fast disk to be virtualized and shared.

What does this mean for you as a data professional and consumer of storage? It means things probably aren’t going to change that much for you. If you are working with an all-flash vendor for performance storage, you’ll see the gains as NVMe rolls in, but a lot of the ultra-high speed storage will be limited by the rest of the stack (OS and RDBMS). If you are in the public cloud, I think you will see storage get gradually faster and lower in latency over the next 18 months, and you will see densities increase in hard drives. Your SAN admin will get some better tools, which I’m going to talk about in some coming posts.

 

 

 

 

 

My SQL Saturday Chicago Precon: Leveling Up Your Azure Skills

I’m doing another precon at SQL Saturday Chicago on March 20th, 2020. The event is at Benedictine University in Lisle, IL. This time we’re going to dive a little bit deeper into Azure. While last year’s precon focused on basic infrastructure skills in Azure, this time we’re going to go a little deeper into the specifics of the Data Platform in Azure. I did a variant of this topic in India last August, but I’ve made a few changes based on a number of conversations with customers I had at Microsoft Ignite last year.

aerial photography of building city lights
Photo by Nate on Pexels.com

In this full day session, here’s what we’re going to cover:

  • VMs and Storage (like it or not this is still 80% of the cloud spend, and it’s how people migrate to the cloud)
  • Understanding the Azure services you need to manage your environment (this is going to focus heavily on Azure Automation, Azure Key Vaults, and Logic Apps)
  • Azure Data Platform–This is a Big Topic, but here’s how it breaks down
    • Understand the differences between
      • SQL Server in a VM
      • Azure SQL Database
      • Azure Managed Instance
      • Azure Synapse (nee Azure SQL Data Warehouse)
      • Cosmos DB
    • We’ll talk about the costs/benefits/performance of each of these options, and how they are incorporated into your broader architecture
    • Don’t worry, I’m a DBA, we’ll cover HA/DR here
  • We’ll also talk a bit about cloud focused application architecture
    • Why Redis in front of your database is a good idea
    • Taking advantage of dynamically scaling resources

That’s a lot of topics–it’s a full day, and the deep dive part will be on the data platform, but any and all questions are welcome. There are still a few discount tickets left–you can sign up here.

 

Let’s Talk About Ransomware (Again)

I’ve posted about ransomware a few times before. At DCAC, we’ve worked with a few customers who were attacked, and fortunately we’ve been able to help them recover. But ransomware attacks are trending upward, as seen below–there’s a ransomware attacker that has pulled in $10 million this year (this year being 2020, so far, 7 days in). There have been a number of announcements about ransomware attacks in recent days–a small company that went bankrupt, and the money exchange Travelex, which was recently attacked.

Ransomware attacks are quite different from traditional hacking. Traditionally hackers are in pursuit of high value data, whether it be from a government agency or a large enterprise business, or someone with a vast array of customer data like Equifax. Ransomware goes after easy targets that the attackers think, or hope, will pay. What that means is that if you work for a smaller firm, you are far more likely to be attacked with ransomware than you would be in a traditional hacking attack. Theoretically, large enterprises have more network segmentation and better security controls that make them less vulnerable to these attacks. As the last year has shown, this is definitely not always the case.

It’s All About the Network

The way ransomware typically works is to find a vulnerability via user accounts in order to run exploit software on your network. If the only thing that can be reached on your network is user PCs, that’s the only data that’s going to be encrypted. Where organizations get exploited is that their user network is either directly connected to their business network, or there is just one flat network structure. This means if a user is attacked (typical attack vectors are emailing infected Office documents or PDFs, but others do include more advanced attack vectors like the aforementioned Pulse VPN vulnerability) the ransomware can go after file servers and domain controllers and start encrypting all of your files.

Assume Breach

One of the most important approaches to this, is that in order to build your network structures appropriately, you need to assume that your user network is going to be breached. It’s what Microsoft does with Azure, and it’s what you should do on your network as well. This means you need to do a few things:

  • Segment your network
  • Separate accounts, and maybe even a separate domain for all of your servers
  • Disconnected backups: you can’t encrypt a safe full of tapes
  • Frequent restore testing
  • Basically, if you can connect to a file or database server from your corporate laptop without multi-factor auth, changing credentials, and/or network, you’re at the mercy of your dumbest user to not launch an infected Word file
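One way to think about "assume breach" is as a reachability question: starting from a compromised user PC, what can the attacker get to? The Python sketch below models two tiny, hypothetical networks as adjacency lists (every host name here is invented) and walks them with a breadth-first search.

```python
from collections import deque

def reachable(network, start):
    """Return every host reachable from `start` by following allowed connections."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in network.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen

# Flat network: user PCs can talk straight to the servers.
flat = {
    "user-pc": ["other-user-pc", "file-server", "domain-controller", "sql-server"],
    "file-server": ["backup-share"],
}

# Segmented network: servers are only reachable from a separate admin workstation,
# and backups are offline, so nothing points at them.
segmented = {
    "user-pc": ["other-user-pc"],
    "admin-workstation": ["file-server", "domain-controller", "sql-server"],
}

print(sorted(reachable(flat, "user-pc")))
print(sorted(reachable(segmented, "user-pc")))  # → ['other-user-pc', 'user-pc']
```

In the flat layout a single infected Word file puts the domain controller and the backup share in play; in the segmented one the blast radius stops at the user network.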

All of these controls make our jobs harder, and it sucks, but it’s what we have to do. The one benefit of modern technology is that the cloud makes all these things approachable to even the smallest of businesses. I can build a segmented network in about 20 min in Azure, whereas it would take me a few hours just to find the right equipment at the right price point from a networking vendor if I were trying to do it on-premises.

I’m Just the DBA, What Can I Do?

I get it, and I’ve been there–you’re the DBA at a company where IT isn’t exactly the priority. There are a few approaches you can take–the first, which is the most self-serving for me, is to engage a consulting firm like DCAC. While you know what you are talking about, your management sees you as a worker bee, and may not listen to your complaints about having an open WiFi network that your regulatory agency can log on to and see the public file servers (yes, this actually happened to me). However, when they are paying a consultant $ALot/hr they tend to be more receptive to advice, even if it’s the same thing the DBA has been talking about for years.

Another approach is to hire a penetration testing firm–many large organizations are required to do this by regulation, and they do a good job of identifying vulnerabilities in a firm.

Finally, the easiest thing to execute is to PATCH YOUR SHIT. This is where you can assist the most–you can patch Windows and SQL Server (or Linux if that’s your bag), and you can work with the sysadmin teams in your organization to ensure that all OS, application, and mouse software is patched regularly. It’s not perfect, and it doesn’t fix all of your problems, but it’s a start. Also, test your god damned restores; it’s not enough to have backups if you don’t know that they work.

 

 

PowerShell for Parsing DiskSpd Results to CSV

We have a customer who is moving platforms, and as part of this, I’ve been tasked with testing a lot of different storage configurations. This means lots of DiskSpd, which is a disk performance benchmarking tool from Microsoft. We could argue about the value of synthetic disk benchmarks, but they are good for testing a variety of disk configurations with a standardized tool for comparing results. DiskSpd also has the benefit of adding the runtime configuration to the results file, so as long as you have your results file, you know what parameters you ran the test with. (You do have to document your disk configuration–we are using the name of our output file for this.)

muscles food
Photo by Elle Hughes on Pexels.com

Anyway, I have a bunch of these files, and I needed to get the data into Excel. Since I was too lazy to figure out how to parse a text file in C#, my first thought was to use some combination of sed, awk, and grep in a bash shell. I reached out to my friend Anthony Nocentino (b|t) about his thoughts on the best way to do this, and he immediately said PowerShell.

When I asked about how to do things I wanted to do with specific bash commands, he mentioned the fact that I could use bash statements that supported standard input and output in PowerShell. The linked blog shows how to do this in Windows, however I wrote all of this code in PowerShell natively on my MacBook Pro.

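# Folder that holds the DiskSpd result files (placeholder; set your own path)
$path = '$InsertYourPathHere'
# Collect the DiskSpd result files in that folder
$files = Get-ChildItem -Path $path -File

foreach ($file in $files)
{
    $content = Get-Content $file

    # Second line of the results file, kept as the command that produced the run
    $command = $content | Select-Object -First 1 -Skip 1

    # First line containing "total": swap the pipe separators for commas, drop the "total:" label
    $results = $content | grep total -m1 | sed 's/"|"/","/g' | sed 's/"total:"//g'
    $results = $results.Split(",")

    # One row per results file
    $Output = New-Object -TypeName PSObject -Property @{
        FileName   = $file.Name
        Command    = $command
        TotalBytes = $results[0].Trim()
        TotalIOs   = $results[1].Trim()
        MiBperSec  = $results[2].Trim()
        IOPs       = $results[3].Trim()
        AvgLatency = $results[4].Trim()
        LatStdDev  = $results[5].Trim()
    } | Select-Object FileName, Command, TotalBytes, TotalIOs, MiBperSec, IOPs, AvgLatency, LatStdDev

    $Output | Export-Csv results.csv -Append
}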
As you can see, I’m piping the file content to grep to get the first match of the word “total”, and then using sed to do a couple of find and replace commands to make parsing the line a little bit easier. After I’ve done all that, I split the comma-delimited string into an array of results and output it to a CSV file. This allows you to grab the results, with headers, and open them in your favorite spreadsheet software. For posterity, the code is available in our GitHub repo here.
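If you don’t have grep and sed handy, the same idea (find the first “total” line, split on the pipes, trim the pieces) can be sketched without shell tools. The Python version below is an illustration and not the script from the repo. The sample text is invented rather than real DiskSpd output, the field names mirror the ones used in the PowerShell above, and unlike `grep total -m1` it only looks at lines that start with `total:`.

```python
FIELDS = ["TotalBytes", "TotalIOs", "MiBperSec", "IOPs", "AvgLatency", "LatStdDev"]

def parse_total_line(text):
    """Return the first 'total:' line of a results file as a dict, or None if absent."""
    for line in text.splitlines():
        if line.strip().startswith("total:"):
            values = line.split(":", 1)[1].split("|")
            return dict(zip(FIELDS, (value.strip() for value in values)))
    return None

# Invented sample in the pipe-delimited shape the script expects; not real results
sample = """thread | bytes | I/Os | MiB/s | I/O per s | AvgLat | LatStdDev | file
total:      1048576 |     128 |     10.00 |     1280.00 |    0.781 |     0.120
"""

row = parse_total_line(sample)
print(row["TotalBytes"], row["IOPs"])  # → 1048576 1280.00
```

From there, `csv.DictWriter` in the standard library would take care of writing the rows out with headers.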