andrewjwelch.com
oXygen IDE
KernowKernow
SAXONICA

A CSV to XML converter in XSLT 2.0

Instructions
Download - for a quote for modifications/support please email me
This transform converts a csv file into XML. See the instructions on how to use it.
<xsl:stylesheet
  xmlns:xsl
="http://www.w3.org/1999/XSL/Transform"
  xmlns:fn
="fn"
  xmlns:xs
="http://www.w3.org/2001/XMLSchema"
  version="2.0" exclude-result-prefixes="xs fn">

<xsl:output indent="yes" encoding="US-ASCII" />

<xsl:param name="pathToCSV" select="'file:///c:/csv.csv'" />

<xsl:function name="fn:getTokens" as="xs:string+">
    
<xsl:param name="str" as="xs:string" />
    
<xsl:analyze-string select="concat($str, ',')" regex='(("[^"]*")+|[^,]*),'>
        
<xsl:matching-substring>
        
<xsl:sequence select='replace(regex-group(1), "^""|""$|("")""", "$1")' />
        
</xsl:matching-substring>
    
</xsl:analyze-string>
</xsl:function>

<xsl:template match="/" name="main">
    
<xsl:choose>
    
<xsl:when test="unparsed-text-available($pathToCSV)">
        
<xsl:variable name="csv" select="unparsed-text($pathToCSV)" />
        
<xsl:variable name="lines" select="tokenize($csv, ' ')" as="xs:string+" />
        
<xsl:variable name="elemNames" select="fn:getTokens($lines[1])" as="xs:string+" />
        
<root>
        
<xsl:for-each select="$lines[position() &gt; 1]">
            
<row>
            
<xsl:variable name="lineItems" select="fn:getTokens(.)" as="xs:string+" />

            
<xsl:for-each select="$elemNames">
                
<xsl:variable name="pos" select="position()" />
                
<elem name="{.}">
                
<xsl:value-of select="$lineItems[$pos]" />
                
</elem>
            
</xsl:for-each>
            
</row>
        
</xsl:for-each>
        
</root>
    
</xsl:when>
    
<xsl:otherwise>
        
<xsl:text>Cannot locate : </xsl:text><xsl:value-of select="$pathToCSV" />
    
</xsl:otherwise>
    
</xsl:choose>
</xsl:template>

</xsl:stylesheet>
I've tested the transform with the following template - these few cover the edge cases and all pass.
<xsl:template match="/">
    
<xsl:variable name="tests">
    
<tests>
        
<test>
            
<input>a,b,c</input>
            
<expected><n>a</n><n>b</n><n>c</n></expected>
        
</test>
        
<test>
            
<input>,,</input>
            
<expected><n /><n /><n /></expected>
        
</test>
        
<test>
            
<input>"a","b","c"</input>
            
<expected><n>a</n><n>b</n><n>c</n></expected>
        
</test>
        
<test>
            
<input>"foo,bar",,"foo,bar"</input>
            
<expected><n>foo,bar</n><n /><n>foo,bar</n></expected>
        
</test>
        
<test>
            
<input>"foo,""bar""",,</input>
            
<expected><n>foo,"bar"</n><n /><n /></expected>
        
</test>
        
<test>
            
<input>,,"foo,""bar"""</input>
            
<expected><n /><n /><n>foo,"bar"</n></expected>
        
</test>
    
</tests>
    
</xsl:variable>

    
<xsl:for-each select="$tests/tests/test">
        
<xsl:variable name="tokens" select="fn:getTokens(input/text())" as="xs:string+" />
        
<xsl:variable name="result" select="expected/n" as="xs:string+" />
        
<xsl:variable name="tokenStr" select="string-join($tokens,',')" as="xs:string" />
        
<xsl:variable name="resultStr" select="string-join($result, ',')" as="xs:string" />
        
<xsl:sequence select="if ($tokenStr = $resultStr) then (concat(' Passed: (', $tokenStr, ')')) else concat(' Failed: (', $tokenStr, ') (', $resultStr, ')')" />
    
</xsl:for-each>
</xsl:template>
If this transform fails for a valid CSV file, let me know!





Instructions

This transform is a "standalone" transform, which means it doesn't expect input XML. Because there is no input, you must specify which template the transform should start at - instructions are below on how to do that.
You need to tell the transform where your CSV is. You can do that by supplying the parameter "pathToCSV" to the transform, or by editing the default value in the XSLT.
For free help, please don't email me directly, use the xsl-list for general XSLT questions, the kernow-help for help with Kernow and the saxon-help list for help with Saxon. I'm on all of these lists so I will still get the email, but it makes it much more likely you will get response.

Instructions for running the transform using Kernow

- Download Kernow
- Download the transform
- Run Kernow, go to the Standalone tab and open the xslt
- Click "auto" button to discover the Initial Template - it should say "main"
- Click the Params button, click "Auto Populate", then double click the Value entry to edit the path to your CSV file
- The path must be a valid URI, which basically means for Windows use "file:///c:/" (yes - triple slash) at the start and then forward slashes as the separator. For example, if your CSV file was called "test.csv" in the "dev" folder of your C: drive, you would use file:///c:/dev/test.csv
- Close the params dialog (ensuring you have pressed enter to confirm the changes).
- Run the transform!
- The results are displayed in the output window.
- Tick the "send output to file" checkbox, type the path in directly or use the file chooser then re-run the transform to write the output to a file
- That's it all done :)

Instructions for running the transform from the command line using Saxon

- Download Saxon
- Download the transform
- Use the command java -cp saxon.jar net.sf.saxon.Transform -o output.xml -it main csv-to-xml.xslt pathToCSV=file:/C:/dev/test.csv
- You will need to modify the -cp switch to correctly configure your classpath (if it isn't already done), and the other switches to use the correct paths
- If you get a NoClassDefFoundError then you classpath isn't configured correctly


Valid XHTML 1.0 Transitional