Download and extract gzip tar with PowerShell
Alexandru Puiu
August 21, 2019

We needed to download an updated version of a public dataset on a regular basis, so PowerShell plus Windows Task Scheduler came to mind, since the application runs in a Windows environment. It turns out PowerShell doesn’t make this quite trivial.

In PowerShell v5+ we have the Expand-Archive command:

Expand-Archive c:\a.zip -DestinationPath c:\a

but this only handles zip archives; it doesn’t support gzip or tar.

gzip is a compression format based on the DEFLATE algorithm, which combines LZ77 and Huffman coding. There’s a good comparison of popular compression algorithms worth checking out: https://stackoverflow.com/questions/28635496/difference-lz77-vs-lz4-vs-lz4hc-compression-algorithms

tar is an archive format that groups multiple files into one for backup or distribution purposes; the resulting file is often called a tarball.

Combining the two, which is very common, lets you download a single, well-compressed archive containing multiple files and folders. But now we have a couple of layers to deal with. Here are the steps I came up with:

Create a clean temp folder

First we’ll delete any folder we plan to create (in case a previous run of this script failed in the middle), and then create our temp folder:

Remove-Item "c:\temp\maxmind\" -Filter * -Recurse -ErrorAction Ignore
New-Item -ItemType directory -Path C:\temp\maxmind\

Download a file using PowerShell

The Start-BitsTransfer cmdlet from the BitsTransfer module, if available, is really fast at downloading:

Import-Module BitsTransfer
Start-BitsTransfer -Source "https://example.com/download.tar.gz" -Destination "c:\temp\maxmind\temp.tar.gz"
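
Before decompressing, it can be worth checking that the download really is a gzip file, for example in case the server returned an error page under the expected name. Here is a minimal sketch, assuming a hypothetical helper called Test-GzipFile (not part of the original script). Every gzip file starts with the two bytes 0x1F 0x8B, so the helper reads the first two bytes and compares them:

```powershell
# Hypothetical helper (an assumption, not part of the original script):
# returns $true when the file starts with the gzip magic bytes 0x1F 0x8B.
function Test-GzipFile {
    param([Parameter(Mandatory)][string] $Path)

    $stream = [System.IO.File]::OpenRead($Path)
    try {
        $header = New-Object byte[] 2
        $read = $stream.Read($header, 0, 2)
        return ($read -eq 2 -and $header[0] -eq 0x1F -and $header[1] -eq 0x8B)
    }
    finally {
        # Always release the file handle
        $stream.Close()
    }
}
```

It could be called right after the download, e.g. if (-not (Test-GzipFile "c:\temp\maxmind\temp.tar.gz")) { throw "Download is not a gzip file" }.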

Unzipping a GZip with PowerShell

PowerShell doesn’t support gzip natively as far as I found, but we can use the .NET Framework from PowerShell, thanks to RiffyRiot on TechNet: https://social.technet.microsoft.com/Forums/windowsserver/en-US/5aa53fef-5229-4313-a035-8b3a38ab93f5/unzip-gz-files-using-powershell?forum=winserverpowershell

Function DeGZip-File{
    Param(
        $infile,
        $outfile = ($infile -replace '\.gz$','')
        )

    # Named $inStream/$outStream because $input is a PowerShell automatic variable
    $inStream = New-Object System.IO.FileStream $infile, ([IO.FileMode]::Open), ([IO.FileAccess]::Read), ([IO.FileShare]::Read)
    $outStream = New-Object System.IO.FileStream $outfile, ([IO.FileMode]::Create), ([IO.FileAccess]::Write), ([IO.FileShare]::None)
    $gzipStream = New-Object System.IO.Compression.GZipStream $inStream, ([IO.Compression.CompressionMode]::Decompress)

    try {
        # Copy the decompressed bytes to the output file in small chunks
        $buffer = New-Object byte[] 1024
        while($true){
            $read = $gzipStream.Read($buffer, 0, $buffer.Length)
            if ($read -le 0){break}
            $outStream.Write($buffer, 0, $read)
        }
    }
    finally {
        # Close the streams even if decompression fails part-way
        $gzipStream.Close()
        $outStream.Close()
        $inStream.Close()
    }
}

DeGZip-File "C:\temp\maxmind\temp.tar.gz" "C:\temp\maxmind\temp.tar"
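
On .NET Framework 4 and later, Stream.CopyTo can stand in for the manual read/write loop. The sketch below is an alternative to the TechNet function, not a replacement taken from it; the name Expand-GzipFile is made up for illustration:

```powershell
# Hypothetical alternative to DeGZip-File (an assumption, not the TechNet code).
# Relies on Stream.CopyTo, which is available from .NET Framework 4 onwards.
function Expand-GzipFile {
    param(
        [Parameter(Mandatory)][string] $InFile,
        [string] $OutFile = ($InFile -replace '\.gz$', '')
    )

    $inStream = [System.IO.File]::OpenRead($InFile)
    try {
        $outStream = [System.IO.File]::Create($OutFile)
        try {
            $gzip = New-Object System.IO.Compression.GZipStream($inStream, [System.IO.Compression.CompressionMode]::Decompress)
            # CopyTo pumps all decompressed bytes into the output file
            try { $gzip.CopyTo($outStream) }
            finally { $gzip.Dispose() }
        }
        finally { $outStream.Dispose() }
    }
    finally { $inStream.Dispose() }
}
```

Usage mirrors the original: Expand-GzipFile "C:\temp\maxmind\temp.tar.gz" "C:\temp\maxmind\temp.tar". Note that .NET resolves relative paths against the process working directory rather than the current PowerShell location, so absolute paths are the safer choice here.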

Expand Tar archive with PowerShell

Next, we have to extract the tar, for which we can use the Expand-7Zip cmdlet from the 7Zip4PowerShell module:

if (-not (Get-Command Expand-7Zip -ErrorAction Ignore)) {
  Install-Package -Scope CurrentUser -Force 7Zip4PowerShell > $null
}
Expand-7Zip C:\temp\maxmind\temp.tar c:\temp\maxmind\
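
As a possible alternative, newer Windows builds (Windows 10 version 1803 and later, as far as I know) ship a tar.exe that can decompress and extract a .tar.gz in one step, which would make both the gzip function and the extra module unnecessary. This is a sketch under the assumption that tar is on the PATH; the helper name Expand-TarGz is hypothetical:

```powershell
# Hypothetical helper (an assumption, not part of the original script):
# extracts a .tar.gz using the tar executable found on the PATH.
function Expand-TarGz {
    param(
        [Parameter(Mandatory)][string] $Archive,
        [Parameter(Mandatory)][string] $Destination
    )

    if (-not (Get-Command tar -ErrorAction Ignore)) {
        throw "tar was not found on this machine"
    }

    # Make sure the target folder exists; -Force avoids an error if it already does
    New-Item -ItemType Directory -Path $Destination -Force | Out-Null

    # -x extract, -z gunzip, -f archive file, -C target directory
    & tar -xzf $Archive -C $Destination
    if ($LASTEXITCODE -ne 0) { throw "tar exited with code $LASTEXITCODE" }
}
```

With that in place, the gzip and tar steps would collapse into Expand-TarGz "C:\temp\maxmind\temp.tar.gz" "C:\temp\maxmind\". On older servers without tar.exe, the two-step approach above still applies.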

Find and copy the file we need, to our destination

$files=@("GeoLite2-City", "*.mmdb")
Get-ChildItem -Recurse "c:\temp\maxmind\" -Include ($files) | Copy-Item -Destination "c:\data\GeoLite2-City.mmdb"

Then we clean up our temp folder:

Remove-Item "c:\temp\maxmind\" -Filter * -Recurse -ErrorAction Ignore

Lastly, we wrap the whole thing into a PowerShell script, change it to accept parameters for the URL and output, and save it as DownloadAndExtract.ps1:

Param 
( 
  [string] 
  $url,
  [string] 
  $output
)
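
Only the parameter block is shown here, so the following is a sketch of how the assembled DownloadAndExtract.ps1 might look, with $url and $output substituted for the hard-coded values. The *.mmdb filter, the compact CopyTo-based DeGZip-File, and the guard that skips everything when no arguments are given are assumptions added for illustration:

```powershell
# DownloadAndExtract.ps1 (sketch): download a .tar.gz and copy one file out of it
Param
(
  [string] $url,
  [string] $output
)

# Same temp folder as in the individual steps
$temp = "c:\temp\maxmind\"

# Compact gzip helper (assumption: .NET 4+ so that Stream.CopyTo is available)
Function DeGZip-File($infile, $outfile) {
    $inStream = [System.IO.File]::OpenRead($infile)
    $outStream = [System.IO.File]::Create($outfile)
    $gzipStream = New-Object System.IO.Compression.GZipStream($inStream, [System.IO.Compression.CompressionMode]::Decompress)
    try { $gzipStream.CopyTo($outStream) }
    finally { $gzipStream.Close(); $outStream.Close(); $inStream.Close() }
}

# Added guard (not in the original): only run when both arguments are supplied
if ($url -and $output) {
    # 1. Start from a clean temp folder
    Remove-Item $temp -Recurse -ErrorAction Ignore
    New-Item -ItemType Directory -Path $temp -Force | Out-Null

    # 2. Download the archive
    Import-Module BitsTransfer
    Start-BitsTransfer -Source $url -Destination (Join-Path $temp "temp.tar.gz")

    # 3. gzip layer, then tar layer
    DeGZip-File (Join-Path $temp "temp.tar.gz") (Join-Path $temp "temp.tar")
    if (-not (Get-Command Expand-7Zip -ErrorAction Ignore)) {
        Install-Package -Scope CurrentUser -Force 7Zip4PowerShell > $null
    }
    Expand-7Zip (Join-Path $temp "temp.tar") $temp

    # 4. Copy the database file out (assumption: the archive holds one .mmdb file)
    Get-ChildItem -Recurse $temp -Include "*.mmdb" | Copy-Item -Destination $output

    # 5. Clean up
    Remove-Item $temp -Recurse -ErrorAction Ignore
}
else {
    Write-Warning "Usage: DownloadAndExtract.ps1 <url> <output file>"
}
```

Keeping the temp folder in a single variable means the cleanup at the start and at the end always target the same path.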

Now we schedule it in Windows Task Scheduler: create a basic task, then set the schedule.

For the action, we choose Start a Program, with powershell as the program/script and the location of our ps1 script in the arguments:

Arguments: -file "C:\scripts\DownloadAndExtract.ps1" https://example.com/data.tar.gz c:\data\GeoLite2-City.mmdb
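
The same task can also be created from PowerShell instead of the wizard. Here is a sketch using the ScheduledTasks module (Windows 8 / Server 2012 and later); the helper name, the task name and the daily 3 AM schedule are placeholders rather than values from this setup:

```powershell
# Hypothetical helper (an assumption, not part of the original setup):
# registers a daily task that runs the download script.
function Register-DownloadTask {
    param(
        [string] $TaskName   = "DownloadAndExtract",               # placeholder name
        [string] $ScriptPath = "C:\scripts\DownloadAndExtract.ps1",
        [string] $Url        = "https://example.com/data.tar.gz",
        [string] $Output     = "c:\data\GeoLite2-City.mmdb",
        [string] $At         = "3:00AM"                            # placeholder schedule
    )

    # Same argument string as entered in the Task Scheduler wizard
    $arguments = "-File `"$ScriptPath`" $Url $Output"

    $action  = New-ScheduledTaskAction -Execute "powershell.exe" -Argument $arguments
    $trigger = New-ScheduledTaskTrigger -Daily -At $At
    Register-ScheduledTask -TaskName $TaskName -Action $action -Trigger $trigger
}
```

Calling Register-DownloadTask with no arguments uses the example paths from the Arguments line above; any of them can be overridden by name.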

Troubleshooting

If Install-Package cannot be found: https://winaero.com/blog/fix-install-module-missing-powershell/

