Tiny PowerShell Project 8 - Data Extraction

Tiny PowerShell Project 8 - Data Extraction

What's often the most significant expense in a project? It's integration. To put it simply for the non-technical reader, companies purchase technologies from various vendors, and integration is the crucial step of making sure these different technologies function cohesively together.

In the context of software engineering, "integration" refers to the process of combining different software components or systems to function as one unified entity. The term can be used in various contexts and can denote different activities, including:

  1. Software Integration: This involves combining individual software modules or components that have been developed separately. These components might be functions, classes, libraries, or even entire systems. The goal is to ensure that they work harmoniously together.

  2. System Integration: This is the process of connecting different IT systems, including both software and hardware, to work together within an organization. System integrators might use various techniques and products, such as middleware, to make disparate systems communicate and function as one cohesive unit.

In each context, the core idea of integration is to ensure different pieces of software, whether they're components, systems, or services, can effectively work together to achieve a desired result. Proper integration is crucial in software engineering to ensure system consistency, reliability, and scalability.

In the realm of application support, it's commonplace to encounter scenarios requiring data transfer from one system to another. This challenge arises frequently because many technologies lack compatible software stacks or well-defined APIs. You might be surprised to learn how many companies choose batch processing for records over API integration.

Today's tiny project is going to focus on writing an extraction. In this scenario, we'll have to connect to a database, run our query, and store the extracted information to our destination.

The first step would be to securely capture the needed credentials. In essence, we have to write a code that does the following steps:

  • Prompts the user for a password.

  • Converts the password to a SecureString.

  • Encrypts the SecureString.

  • Writes the encrypted string to a file.

  • Use this credential to query our database

  • Store the extracted information to a given destination

Let's see what the code would look like:

# Define constants
$PasswordFilePath = "C:\Users\scoappjobs\Downloads\password.txt"
$User = "yourUserName"
$DatabaseServer = "YourServerName"
$DatabaseName = "yourDatabaseName"
$CsvOutputPath = "C:\Users\$env:username\Downloads\output.csv"

# Check and save password if it doesn't exist
if (-not (Test-Path $PasswordFilePath)) {
    Read-Host "Enter password to secure:" -AsSecureString |
    ConvertFrom-SecureString | Out-File $PasswordFilePath
}

# Load email credentials
$Pass = Get-Content $PasswordFilePath | ConvertTo-SecureString
$Credential = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $User, $Pass

function Get-Data {
    [CmdletBinding()]
    param()

    begin {}

    process {
        try {
            $SQLquery = @"
USE [DatabaseName];

SELECT T1.[FirstNameColumn]
    ,T1.[LastNameColumn]
    ,T2.PhoneColumn
    ,T2.ID1
    ,T2.ID2
    ,T2.ID3
    ,T3.DescriptionColumn
    ,GETDATE()
    ,T4.OfficeNameColumn
    ,T5.[LanguageColumn]
    ,T6.EmailColumn
FROM [DatabaseName].[schema].[Table1] as T1
INNER JOIN [DatabaseName].[schema].[LanguageTable] T5 ON T1.LanguageIDColumn = T5.LanguageID
INNER JOIN [DatabaseName].[schema].[Table2] AS T6 ON T1.IDColumn = T6.IDColumn AND T6.RoleColumn = 'RoleName' 
INNER JOIN [DatabaseName].[schema].[Table3] AS T2 ON T2.IDColumn = T6.IDColumn AND T2.PhoneColumn != '' AND T2.DateColumn IS NULL
INNER JOIN [DatabaseName].[schema].[StatusTable] AS T2Status ON T2.IDColumn = T2Status.IDColumn AND (T2Status.StatusDescriptionColumn = 'Status1' OR T2Status.StatusDescriptionColumn = 'Status2')
INNER JOIN [DatabaseName].[schema].[CareTable] AS T3 ON T2Status.CareIDColumn = T3.CareID AND T3.SettingColumn = 'SettingValue'
INNER JOIN [DatabaseName].[schema].[OfficeTable] AS T4 ON T4.OfficeIDColumn = T2.OfficeIDColumn
"@

            $result = invoke-sqlcmd -query $SQLquery -serverinstance $DatabaseServer -database $DatabaseName -Credential $Credential
            $result | ConvertTo-Csv -Delimiter '|' -NoTypeInformation |
            ForEach-Object { $_.Replace('"', '') } | Out-File $CsvOutputPath -Encoding ascii

        }
        catch {
            Write-Error "$($_.Exception.Message) - Line Number: $($_.InvocationInfo.ScriptLineNumber)"
        }
    }

    end {}
}

# Call the function
Get-Data

Let's break down the code further:

  • Read-Host "Enter password to secure:" -AsSecureString: This prompts the user to enter a password. The -AsSecureString parameter ensures that the input is treated as a SecureString. A SecureString is a special string in .NET that keeps the string data encrypted in memory to protect sensitive information, such as a password. The characters you type will be obscured as you type them (e.g., displayed as asterisks).

  • ConvertFrom-SecureString: This cmdlet converts a SecureString into an encrypted standard string. The encryption is based on the Windows Data Protection API (DPAPI). By default, only the user who encrypted the data (on the same computer where it was encrypted) can decrypt it.

  • Out-File $PasswordFilePath: This cmdlet writes the output (the encrypted standard string from the previous step) to a file. The path to that file is stored in the $PasswordFilePath variable, which should be defined elsewhere in your script or session.

This is often used as a way to save passwords or other sensitive data to disk in a more secure manner than plain text, although the protection it offers is primarily for casual security (since the decryption can be done by the same user on the same machine). For more robust security solutions, you would typically look into other encryption methods or secrets management tools.

Certainly! The provided code essentially retrieves an encrypted password from a file, decrypts it, and then creates a PowerShell credential object with a username and the decrypted password. Let's break it down:

  1. $Pass = Get-Content $PasswordFilePath | ConvertTo-SecureString:

    • Get-Content $PasswordFilePath: This cmdlet retrieves the content of the file specified by $PasswordFilePath. The content of this file is expected to be the encrypted string version of a password, as saved (possibly by the previous code you shared).

    • | ConvertTo-SecureString: The pipe (|) passes the encrypted string to the ConvertTo-SecureString cmdlet, which decrypts it. It uses the Windows Data Protection API (DPAPI) to do this decryption, and as mentioned before, by default, only the user who encrypted the data (on the same computer where it was encrypted) can decrypt it. The result is a SecureString object, which is then stored in the $Pass variable.

  2. $Credential = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $User, $Pass:

    • New-Object -TypeName System.Management.Automation.PSCredential: This creates a new object of type PSCredential, which represents a set of credentials in PowerShell (username and password).

    • -ArgumentList $User, $Pass: The PSCredential type requires two arguments: a username (a plain string) and a password (a SecureString). $User presumably contains the username, and $Pass (from the previous step) contains the password as a SecureString. These are passed to the PSCredential constructor using the -ArgumentList parameter.

The result is a $Credential object, which contains both the username and the SecureString password. This object can then be used in various PowerShell commands that require credentials, such as remote operations or accessing secure resources.

The biggest problem with this extraction is that we're running a query against the database instead of a stored procedure. Using stored procedures over raw SQL queries embedded in application code or scripts offers several benefits:

  1. Maintainability: Once a stored procedure is tested and added to the database, it can be used across multiple applications or scripts without having to duplicate the SQL. If there's ever a need to modify the query, the change is done in one location (the stored procedure) rather than multiple locations in application/script code.

  2. Performance: SQL Server can optimize the execution plan of stored procedures, making them faster to execute over time compared to ad-hoc SQL queries.

  3. Security: Stored procedures provide an extra layer of security. You can restrict permissions on the base tables and give execute permissions on stored procedures, thus ensuring that users can only perform the actions allowed by the stored procedure and nothing more.

  4. Reduced SQL Injection Risk: While parameterized queries (like the one in your script) already reduce the risk of SQL injection, stored procedures can further decrease this risk since the execution of a stored procedure is typically a non-dynamic, single command to the server.

  5. Encapsulation of Business Logic: This allows for business logic to be housed within the database itself, leading to consistent data operations, especially when multiple applications interact with the same database.

It would be ideal to place the SQL query inside a stored procedure. Not only would this encapsulate the logic within the database, but it would also make the PowerShell script cleaner and easier to read. Instead of a lengthy SQL query within the script, you'd have a single invoke-sqlcmd call to execute the stored procedure.

For instance, instead of:

$SQLquery = @"
USE [DatabaseName];
-- A lot of SQL here...
"@
$result = invoke-sqlcmd -query $SQLquery -serverinstance $DatabaseServer -database $DatabaseName -Credential $Credential

You could have:

$result = invoke-sqlcmd -StoredProc "YourProcedureName" -serverinstance $DatabaseServer -database $DatabaseName -Credential $Credential

This assumes you've created a stored procedure named "YourProcedureName" that encapsulates the SQL logic. It's a cleaner, more maintainable, and potentially safer approach.

Did you find this article valuable?

Support Application Support by becoming a sponsor. Any amount is appreciated!