This post explains how we can use the PowerShell substring method to extract and match part of a string.
Before we continue, it’s important to remember that string indexes start at 0. So in the following example string of text:
abcdefghijklmnopqrstuvwxyz
The character “a” is at position index 0, “b” is at position index 1 and “z” is at position index 25.
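As a quick check of zero-based indexing, the following minimal sketch reads individual characters from the example string by index. The variable names (`$alphabet`, `$first`, `$second`, `$last`) are illustrative choices only.

```powershell
# Illustrative variable name; the string is the example alphabet from above
$alphabet = "abcdefghijklmnopqrstuvwxyz"

$first  = $alphabet[0]                     # character at index 0
$second = $alphabet[1]                     # character at index 1
$last   = $alphabet[$alphabet.Length - 1]  # Length is 26, so the last index is 25

Write-Host "$first $second $last"          # prints: a b z
```

Because counting starts at 0, the last valid index is always one less than the string's length.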
PowerShell Substring Basic Example
Let’s start with the most basic example of extracting part of a string by using the PowerShell substring method with one argument:
Substring(int startIndex)
In real-world scenarios, this method is most useful when the string of text we are searching has a consistent pattern. For example, maybe our product database has product references similar to the following:
Reference | Item |
---|---|
ALK-3242334UK | Laptop |
ALK-9876352UK | Keyboard |
ALK-6622553UK | Mouse |
And the references are always in the format:
ALK-[7 Digits]UK (“ALK-” followed by 7 numeric digits and finally “UK”).
By providing just the starting index as an argument to the Substring method, we can tell PowerShell to return everything from that index to the end of the string. For example, let’s assume we want to strip off the starting “ALK-” part of our reference. We know that:
- “A” is character index 0
- “L” is character index 1
- “K” is character index 2
- “-” is character index 3
So we need everything from character index 4 onwards:
$alkaneString = "ALK-3242334UK"
write-host $alkaneString.Substring(4)
Which will output:
3242334UK
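The same call can be applied to every reference in the table. The sketch below is one possible way to do that; the names `$references`, `$prefixLength` and `$trimmed` are assumptions made for illustration. The length check is there because Substring raises an error when the start index is greater than the length of the string.

```powershell
# The three product references from the table above
$references = @("ALK-3242334UK", "ALK-9876352UK", "ALK-6622553UK")

# "ALK-" occupies indexes 0 to 3, so the remainder starts at index 4
$prefixLength = 4

$trimmed = foreach ($reference in $references) {
    # Skip any value too short to contain anything after the prefix
    if ($reference.Length -gt $prefixLength) {
        $reference.Substring($prefixLength)
    }
}

$trimmed
# Output:
# 3242334UK
# 9876352UK
# 6622553UK
```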
PowerShell Substring Basic Example using String Length
But maybe we don’t want to include the “UK” characters at the end? Luckily the substring method also allows us to specify how many characters we want to return from a particular character index position:
Substring(int startIndex, int length)
Since we know that our product references always have 7 digits in the middle, we can specify the start index and length like so:
$alkaneString = "ALK-3242334UK"
write-host $alkaneString.Substring(4,7)
Which will output:
3242334
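To see how both forms of the method fit together, this short sketch splits one reference into its three parts. It assumes the fixed “ALK-[7 Digits]UK” format described earlier; the variable names `$prefix`, `$digits` and `$suffix` are illustrative.

```powershell
$alkaneString = "ALK-3242334UK"

$prefix = $alkaneString.Substring(0, 4)   # indexes 0 to 3
$digits = $alkaneString.Substring(4, 7)   # indexes 4 to 10
$suffix = $alkaneString.Substring(11)     # index 11 to the end of the string

Write-Host "$prefix | $digits | $suffix"  # prints: ALK- | 3242334 | UK
```

Note that the second argument is a length, not an end index: `Substring(4, 7)` returns 7 characters starting at index 4.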
The plot thickens when we are searching for a string or text in some random output. Consider a requirement to extract product references from this string of random text:
Thank you for purchasing the Alkane Laptop (ALK-3242334UK), Keyboard (ALK-9876352UK) and Mouse (ALK-6622553UK). We value your custom and look forward to seeing you in future.
We have no idea at which point in the sentence our product references will appear, so we cannot rely on fixed index positions alone. In this example, a practical way to extract our product references is to use the Select-String cmdlet with a regular expression.
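A minimal sketch of that approach follows. The pattern `ALK-\d{7}UK` is an assumption based on the reference format described earlier, and the variable names are illustrative. Select-String matches case-insensitively by default; adding `-CaseSensitive` would make the match stricter.

```powershell
$text = "Thank you for purchasing the Alkane Laptop (ALK-3242334UK), Keyboard (ALK-9876352UK) and Mouse (ALK-6622553UK). We value your custom and look forward to seeing you in future."

# "ALK-" followed by exactly 7 digits and then "UK"
$pattern = 'ALK-\d{7}UK'

# -AllMatches returns every match in the line, not only the first one
$references = $text |
    Select-String -Pattern $pattern -AllMatches |
    ForEach-Object { $_.Matches } |
    ForEach-Object { $_.Value }

$references
# Output:
# ALK-3242334UK
# ALK-9876352UK
# ALK-6622553UK

# Each match has the known fixed format, so Substring can take over from here
$digits = $references | ForEach-Object { $_.Substring(4, 7) }

$digits
# Output:
# 3242334
# 9876352
# 6622553
```

Here the regular expression finds the references wherever they appear, and Substring is then applied to each match because its internal layout is known.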