Sunday, 12 August 2012

PowerShell and SubVersion with XML

Recently I was asked if it was possible to automate part of our release process whereby our release management department needed to find what SQL scripts in a particular database directory had changed between specific repo revisions on a specific path. It is possible to automate the above and I have chosen PowerShell using the SVN client binaries (which output XML that can be easily processed) to achieve this. I have listed an example of the XML skeleton returned by SVN and the majority of the PowerShell code which will collect the SQL file URLs - once you have the URLs, they can be exported.

So, the SVN client binaries can output, in XML format, which files have been modified on a certain path between certain revisions and PowerShell can process XML using the Select-Xml cmdlet with some XPath. To generate an XML document which can be piped to the PowerShell Select-Xml cmdlet, execute a command similar to the following:

svn.exe log http://path/to/do/an/svn/log/on --verbose --xml --revision 4453:4478

The above will return an XML document ready for piping - the following is an example of the skeleton. It contains two SVN logentries (one logentry = one commit).

<?xml version="1.0"?>
<log>
    <logentry revision="4467">
        <author>jdolan</author>
        <date>2012-02-24T17:06:46.686349Z</date>
        <paths>
            <path
               kind="file"
               action="M">[/project_name/etc]</path>
            <path
               kind="file"
               action="M">[/project_name/etc]</path>
        </paths>
        <msg>[the_commit_message]</msg>
    </logentry>
    <logentry revision="4488">
        <author>ccallaghan</author>
        <date>2012-02-29T12:36:10.386087Z</date>
        <paths>
            <path
               kind="dir"
               copyfrom-path="[/project_name/etc]"
               copyfrom-rev="4566"
               action="A">[/project_name/etc]</path>
        </paths>
        <msg>[the_commit_message]</msg>
    </logentry>
</log>

The pseudo-code for the PowerShell script below is:
  • Get an XML document representing all SVN logentries on a particular path between particular revisions and pipe it to...
  • The Select-Xml cmdlet which uses XPath to get a per logentry and pipe it to...
  • The ForEach-Object cmdlet which iterates over each logentry and allows for the processing to take place. Here we can obtain the revision, if the file has been modified, added or deleted, the commit date, the path to each file and even the log message. After some processing, we can add to a list the fully qualified paths to the SQL files we are interested in.
You will need to fill in the appropriate blanks, such as the root and log URLs, to revision, from revision, the file extension to match, etc - for convenience, I have declared these for you at the top of the script below. As the script is well commented, I'll let it speak for itself rather than dissecting it too much

$fromRev = 4453
$toRev = 4478
$svnUrlRoot = "http://root/path/to/repo"
$svnUrlLog = "$svnUrlRoot/path/to/do/an/svn/log/on"
$fileExt = ".sql"

# Using the svn log command, get an XML document representing the preset criteria 
# and cast it to an XML object, pipe this to the Select-Xml cmdlet to get the 
# logentries and pipe these to the ForEach-Object cmdlet to iterate over them.
([xml] (svn.exe log $svnUrlLog --verbose --xml --revision ""$fromRev"":""$toRev"")) |
        Select-Xml -XPath /log/logentry |
        ForEach-Object {
            $logEntry = $_
            $logEntryPaths = $logEntry.Node.paths.path
           
            # Iterate through each path in the logEntryPaths XmlElement.
            foreach ($logEntryPath in $logEntryPaths)
            {                   
                # Get the contents (relative HTTP path) of the logEntryPath and append it to 
                # the SVN root URL. #text must be in double quotes to ensure the hash / pound symbol is 
                # correctly dealt with by PowerShell.
                $sqlFileUrl = ($svnUrlRoot, $logEntryPath."#text") -join ""
               
                # Ensure that URL's matching the component database 
                # pattern and ending with .SQL are examined.
                if ($sqlFileUrl -imatch $sqlFileUrl.EndsWith($fileExt))
                {
                    $fileAction = $logEntryPath.action
                   
                    # If the SQL file has been deleted, it should be ignored 
                    # as it no longer exists.
                    if (! ($fileAction -ieq "D"))
                    {
                        # Determine if the SQL file already exists
                        # in the array from a previous relevant rev,
                        # if not, add the complete URL to the array
                        # for export later. This prevents
                        # duplicates as a developer may have made
                        # many commits to the same SQL file.
 
                       if ($sqlFileList -inotcontains $sqlFileUrl)
                        {
                            $sqlFileList += $sqlFileTagUrl
                        }
                    }
                }
            }
        }

Now that we have a list called $sqlFileList containing the paths to all the fully qualified SQL files, we can put it into a foreach loop and using the svn export command to export them to our desired location.

If using the standard trunk, tag, and branches repo layout, remember that if executing this script on tag, the paths which the svn log will return, will probably be from trunk so if you wish to export the SQL files from the tag, you will have to replace part of the URL from trunk to tags, before adding the URL to the export list.
       
We also had a requirement whereby we would like to ignore any SQL files which were forward merged. This can be achieved in a variety of ways but we have found the easiest to be the following: ensure there is a standard commit message for forward merging and in the above powershell script, extract the log message per logentry and compare it to the pre-defined message and if it matches, ignore the contents of that revision.

As an aside, much of the above code can be reused to read and process any XML document in PowerShell. Consider replacing the svn log command line above with the Get-Content cmdlet which allows you to read the document from a file instead of the output returned from a command. The XPath and SVN specific logic can also be replaced. Another useful test before processing XML is ensuring the XML element has child nodes. To do this in the above example with logentry, add an if statement that evaluates $logEntry.Node.HasChildNodes which was omitted as the svn log XML will always have child nodes.