I have about one hundred XML files (all with the same structure) and I want to import them into SAS. Unfortunately, I run into issues related to the XML MAP file (I don't have a MAP file for these files). So I thought of converting the files to CSV through Excel. But if I go this way, I need something that can convert all my XML files to CSV in bulk, because clearly I can't convert every file by hand.
Does anyone know how I can solve this?
Thanks.
3 Answers
I solved my issue with this VBA script:
Public Sub ConvertXmlToXlsx()
    Dim objFSO As Object
    Dim objFolder As Object
    Dim objFile As Object
    Dim xmlFolder As String
    Dim convFolder As String
    Dim newFileName As String

    ' Suppress overwrite and format prompts while the batch runs
    Application.DisplayAlerts = False

    ' Source and destination folders (keep the trailing backslash)
    xmlFolder = "C:\Users\xxx\xxx\xxx\xxx\"
    convFolder = "C:\Users\xxx\xxx\xxx\xxx\"

    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFolder = objFSO.GetFolder(xmlFolder)

    For Each objFile In objFolder.Files
        ' Process only files whose name ends in .xml (case-insensitive)
        If UCase(Right(objFile.Name, 4)) = ".XML" Then
            newFileName = convFolder & objFile.Name & ".xlsx"
            Workbooks.OpenXML Filename:=objFile.Path, LoadOption:=xlXmlLoadImportToList
            ActiveWorkbook.SaveAs Filename:=newFileName, FileFormat:=xlOpenXMLWorkbook
            ActiveWorkbook.Close SaveChanges:=False
        End If
    Next objFile

    Application.DisplayAlerts = True
End Sub

Since you seem to be familiar with SAS (or will have to be soon), I'd use R to read the Excel files and then write them out again as CSV.
The code below sets the working directory, reads the file names into a list, and iterates through the list to convert the files in a few lines.
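library(readxl)

setwd("The directory containing your files")

# Collect only the Excel workbooks in the working directory
files <- list.files(pattern = "\\.xlsx$")

for (f in files) {
  Intermediate <- read_excel(f)
  # Writes e.g. "data.xlsx" to "data.xlsx.csv", without the row-number column
  write.csv(Intermediate, paste0(f, ".csv"), row.names = FALSE)
}

For the following code you can use any XSLT-2.0 processor to convert your XML to a CSV file.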
The XML file should have a structure like this:
<AnyRoot>
  <AnyEntry>
    <Value1></Value1>
    <Value2></Value2>
    <Value3></Value3>
    ...
  </AnyEntry>
  <AnyEntry>
    <Value1></Value1>
    ...
  </AnyEntry>
  ...
</AnyRoot>

For this example I use the following XML file:
<root>
  <Entry>
    <CSVValue1>A</CSVValue1>
    <CSVValue2>"B"</CSVValue2>
    <CSVValue3>C,D</CSVValue3>
    <CSVValue4>"E","F"</CSVValue4>
  </Entry>
  <Entry>
    <CSVValue1>G H</CSVValue1>
    <CSVValue2>""</CSVValue2>
    <CSVValue3></CSVValue3>
    <CSVValue4 />
  </Entry>
  <Entry>
    <CSVValue1>1996</CSVValue1>
    <CSVValue2>Jeep</CSVValue2>
    <CSVValue3>Grand Cherokee</CSVValue3>
    <CSVValue4>MUST SELL!
air, moon roof, loaded</CSVValue4>
    <CSVValue5>4999.00</CSVValue5>
  </Entry>
</root>

And this is the XSLT-2.0 stylesheet you can use to transform all of your XML files to CSV files. As far as I have tested it, it works for all cases described in the specification. But, to be honest, I cannot guarantee that. You have to test it and give some feedback here.
However, here is the XSLT-2.0 code that converts XML to CSV:
<?xml version='1.0' encoding='utf-8'?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" omit-xml-declaration="yes" indent="yes"/>

  <!-- ================================================================= -->
  <!-- XML to CSV Version 1.0 by zx485 on 30-01-2019@01:58 -->
  <!-- Run it with java -jar saxon9he.jar -xsl:XML2CSV.xslt input.xml -->
  <!-- ================================================================= -->

  <!-- Column names are taken from the children of the first entry -->
  <xsl:variable name="csvItems">
    <xsl:for-each select="/*/*[1]/*">
      <Item name="{local-name()}" />
    </xsl:for-each>
  </xsl:variable>

  <!-- Root element: write the header row, then one row per entry -->
  <xsl:template match="/*">
    <xsl:value-of select="$csvItems/Item/@name" separator="," />
    <xsl:text>&#xa;</xsl:text>
    <xsl:apply-templates select="*" />
  </xsl:template>

  <!-- Entry element: write its values separated by commas -->
  <xsl:template match="/*/*">
    <xsl:for-each select="*">
      <xsl:apply-templates select="." />
      <xsl:if test="position() != last()">
        <xsl:text>,</xsl:text>
      </xsl:if>
    </xsl:for-each>
    <xsl:text>&#xa;</xsl:text>
  </xsl:template>

  <!-- Value text: apply the quoting rules -->
  <xsl:template match="text()">
    <xsl:choose>
      <xsl:when test=".='&quot;&quot;'">
        <xsl:value-of select="'&quot;&quot;'" />
      </xsl:when>
      <xsl:when test="contains(.,',') or contains(.,'&#xa;')">
        <xsl:value-of select="concat('&quot;',.,'&quot;')" />
      </xsl:when>
      <xsl:when test="contains(.,'&quot;')">
        <xsl:value-of select="replace(.,'&quot;','&quot;&quot;')" />
      </xsl:when>
      <!-- This branch is never reached: a value containing a comma is
           already handled by the second xsl:when above -->
      <xsl:when test="contains(.,',') and contains(.,'&quot;')">
        <xsl:value-of select="concat('&quot;',replace(.,'&quot;','&quot;&quot;'),'&quot;')" />
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="." />
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>

The output of this is:
CSVValue1,CSVValue2,CSVValue3,CSVValue4
A,""B"","C,D",""E","F""
G H,"",,
1996,Jeep,Grand Cherokee,"MUST SELL!
air, moon roof, loaded",4999.00

If you put this transformation in a loop of a script, you can transform many XML files at once.
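A minimal sketch of such a loop in Python, in case it helps. It assumes that Java is on the PATH, that `saxon9he.jar` and the stylesheet (saved as `XML2CSV.xslt`, as in the comment header above) sit in the current directory, and that Saxon's `-s:`, `-xsl:` and `-o:` options are used to name the source, stylesheet and output file. The function names and the folder layout are made up for illustration.

```python
import subprocess
from pathlib import Path


def saxon_command(xml_file, csv_file, xslt_file="XML2CSV.xslt", jar="saxon9he.jar"):
    """Build the Saxon command line that writes one CSV file for one XML file."""
    return [
        "java", "-jar", str(jar),
        f"-s:{xml_file}",      # source XML document
        f"-xsl:{xslt_file}",   # stylesheet to apply
        f"-o:{csv_file}",      # output file
    ]


def plan_conversions(xml_dir, csv_dir):
    """Pair every *.xml file in xml_dir with a .csv path of the same stem in csv_dir."""
    xml_dir, csv_dir = Path(xml_dir), Path(csv_dir)
    return [
        (src, csv_dir / (src.stem + ".csv"))
        for src in sorted(xml_dir.iterdir())
        if src.is_file() and src.suffix.lower() == ".xml"
    ]


def convert_all(xml_dir, csv_dir, xslt_file="XML2CSV.xslt", jar="saxon9he.jar"):
    """Run the stylesheet once per XML file; stops at the first file that fails."""
    Path(csv_dir).mkdir(parents=True, exist_ok=True)
    for src, dst in plan_conversions(xml_dir, csv_dir):
        subprocess.run(saxon_command(src, dst, xslt_file, jar), check=True)
```

Calling `convert_all("xml_in", "csv_out")` (both folder names are placeholders) would then write one CSV per XML file. Starting a new JVM for each file is slow but simple; for about a hundred files that is probably acceptable.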