Case-insensitive XML parser in C#
An XML document can have two different elements, named MyName and myName respectively, that are intended to be different. Treating them as the same name is an error that can have serious consequences.
If that is not a concern for your documents, here is a more precise solution: use XSLT to process the document into one that has only lowercase element names and lowercase attribute names:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:variable name="vUpper" select=
"'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
<xsl:variable name="vLower" select=
"'abcdefghijklmnopqrstuvwxyz'"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="*[name()=local-name()]" priority="2">
<xsl:element name="{translate(name(), $vUpper, $vLower)}"
namespace="{namespace-uri()}">
<xsl:apply-templates select="node()|@*"/>
</xsl:element>
</xsl:template>
<xsl:template match="*" priority="1">
<xsl:element name=
"{substring-before(name(), ':')}:{translate(local-name(), $vUpper, $vLower)}"
namespace="{namespace-uri()}">
<xsl:apply-templates select="node()|@*"/>
</xsl:element>
</xsl:template>
<xsl:template match="@*[name()=local-name()]" priority="2">
<xsl:attribute name="{translate(name(), $vUpper, $vLower)}"
namespace="{namespace-uri()}">
<xsl:value-of select="."/>
</xsl:attribute>
</xsl:template>
<xsl:template match="@*" priority="1">
<xsl:attribute name=
"{substring-before(name(), ':')}:{translate(local-name(), $vUpper, $vLower)}"
namespace="{namespace-uri()}">
<xsl:value-of select="."/>
</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied to any XML document, for example this one:
<authors xmlns:user="myNamespace">
<?ttt This is a PI ?>
<Author xmlns:user2="myNamespace2">
<Name idd="VH">Victor Hugo</Name>
<user2:Name idd="VH">Victor Hugo</user2:Name>
<Nationality xmlns:user3="myNamespace3">French</Nationality>
</Author>
<!-- This is a very long comment the purpose is
to test the default stylesheet for long comments-->
<Author Period="classical">
<Name>Sophocles</Name>
<Nationality>Greek</Nationality>
</Author>
<author>
<Name>Leo Tolstoy</Name>
<Nationality>Russian</Nationality>
</author>
<Author>
<Name>Alexander Pushkin</Name>
<Nationality>Russian</Nationality>
</Author>
<Author Period="classical">
<Name>Plato</Name>
<Nationality>Greek</Nationality>
</Author>
</authors>
the desired result (element and attribute names converted to lowercase) is produced:
<authors><?ttt This is a PI ?>
<author>
<name idd="VH">Victor Hugo</name>
<user2:name xmlns:user2="myNamespace2" idd="VH">Victor Hugo</user2:name>
<nationality>French</nationality>
</author><!-- This is a very long comment the purpose is
to test the default stylesheet for long comments-->
<author period="classical">
<name>Sophocles</name>
<nationality>Greek</nationality>
</author>
<author>
<name>Leo Tolstoy</name>
<nationality>Russian</nationality>
</author>
<author>
<name>Alexander Pushkin</name>
<nationality>Russian</nationality>
</author>
<author period="classical">
<name>Plato</name>
<nationality>Greek</nationality>
</author>
</authors>
Once the document is converted to your desired form, you can perform any desired processing on the converted document.
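To run such a stylesheet from C#, something along these lines should work. This is only a sketch: the class and method names (LowercaseXml, Transform) are made up for illustration, and it assumes the stylesheet and the input are both available as strings.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical helper, not part of the original answer.
public static class LowercaseXml
{
    // Applies the given XSLT 1.0 stylesheet to the input XML
    // and returns the transformed document as a string.
    public static string Transform(string stylesheetText, string inputXml)
    {
        var xslt = new XslCompiledTransform();
        using (var styleReader = XmlReader.Create(new StringReader(stylesheetText)))
        {
            xslt.Load(styleReader);
        }

        var output = new StringWriter();
        using (var inputReader = XmlReader.Create(new StringReader(inputXml)))
        // OutputSettings carries the stylesheet's xsl:output options
        // (e.g. omit-xml-declaration) over to the writer.
        using (var writer = XmlWriter.Create(output, xslt.OutputSettings))
        {
            xslt.Transform(inputReader, writer);
        }
        return output.ToString();
    }
}
```

The returned string can then be loaded with XDocument.Parse or XmlDocument.LoadXml and queried with all-lowercase XPath expressions.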
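You can create case-insensitive methods (as extension methods, for usability), e.g.:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XDocumentExtensions
{
    // Returns the child elements whose namespace matches exactly and whose
    // local name matches regardless of case.
    public static IEnumerable<XElement> ElementsCaseInsensitive(this XContainer source,
        XName name)
    {
        return source.Elements()
            .Where(e => e.Name.Namespace == name.Namespace
                && e.Name.LocalName.Equals(name.LocalName, StringComparison.OrdinalIgnoreCase));
    }
}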
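A short usage sketch for an extension method like this one. The method is repeated so the snippet compiles on its own; the class name here is changed to CaseInsensitiveXml so it does not collide with the one above, and the Demo class and the sample XML are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CaseInsensitiveXml
{
    // Same logic as the extension method shown above.
    public static IEnumerable<XElement> ElementsCaseInsensitive(
        this XContainer source, XName name)
    {
        return source.Elements()
            .Where(e => e.Name.Namespace == name.Namespace
                && e.Name.LocalName.Equals(name.LocalName, StringComparison.OrdinalIgnoreCase));
    }
}

// Hypothetical demo class.
public static class Demo
{
    public static int CountAuthors()
    {
        // Three spellings of "author" plus one unrelated element.
        var doc = XDocument.Parse(
            "<authors><Author/><author/><AUTHOR/><editor/></authors>");

        // The string converts implicitly to an XName with no namespace.
        return doc.Root.ElementsCaseInsensitive("author").Count();
    }
}
```

Here CountAuthors() should match the three author elements and skip editor. Note that only direct children are searched, as with Elements().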
XML is text. Just ToLower it before loading it into whatever parser you are using.
So long as you don't have to validate against a schema and don't mind the values being all lowercase, this should work just fine.
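A minimal sketch of that trade-off, with an invented class name (ToLowerDemo) and sample input. It uses ToLowerInvariant rather than ToLower so the result does not depend on the current culture; that substitution is an assumption, not part of the answer.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical demo class.
public static class ToLowerDemo
{
    public static string NameAfterToLower()
    {
        string xml = "<Author><Name idd=\"VH\">Victor Hugo</Name></Author>";

        // Lowercasing the raw text changes names AND values.
        XDocument doc = XDocument.Parse(xml.ToLowerInvariant());

        // The element is now reachable as "name", but its text is lowercased too.
        return doc.Root.Element("name").Value;
    }
}
```

The lookup by lowercase name succeeds, but the value comes back as "victor hugo" and the attribute value as "vh", which is the cost mentioned above.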
The fact is that any XML parser will be case sensitive. If it were not, it wouldn't be an XML parser.
I use another solution. People usually want this because they don't want to duplicate the name of each property in an attribute in the class file. So what I do is add a custom attribute to all properties:
[AttributeUsage(AttributeTargets.Property)]
public class UsePropertyNameToLowerAsXmlElementAttribute: XmlElementAttribute
{
public UsePropertyNameToLowerAsXmlElementAttribute([CallerMemberName] string propertyName = null)
: base(propertyName?.ToLower())
{
}
}
This way the XML serializer can map lowercase element names to the CamelCased properties of the classes.
The properties on the classes still carry an attribute that signals that something is different, but you don't have the overhead of marking every property with a name:
public class Settings
{
[UsePropertyNameToLowerAsXmlElement]
public string VersionId { get; set; }
[UsePropertyNameToLowerAsXmlElement]
public int? ApplicationId { get; set; }
}
I would start by converting all tag and attribute names to lowercase, leaving values untouched, by using SAX-style streaming parsing, i.e. with XmlTextReader.
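A rough sketch of that streaming idea. It uses XmlReader.Create, which Microsoft recommends over constructing XmlTextReader directly; the class and method names are invented, and DOCTYPE declarations and the XML declaration are not copied, so treat it as a starting point rather than a complete implementation.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper.
public static class NameLowercaser
{
    private const string XmlnsNs = "http://www.w3.org/2000/xmlns/";

    // Copies the document node by node, lowercasing element and attribute
    // local names while leaving prefixes, namespaces and values untouched.
    public static string LowercaseNames(string xml)
    {
        var output = new StringBuilder();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (var reader = XmlReader.Create(new StringReader(xml)))
        using (var writer = XmlWriter.Create(output, settings))
        {
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                    {
                        bool isEmpty = reader.IsEmptyElement;
                        writer.WriteStartElement(reader.Prefix,
                            reader.LocalName.ToLowerInvariant(), reader.NamespaceURI);
                        if (reader.MoveToFirstAttribute())
                        {
                            do
                            {
                                // Namespace declarations keep their case:
                                // lowercasing them would rename the prefix.
                                string local = reader.NamespaceURI == XmlnsNs
                                    ? reader.LocalName
                                    : reader.LocalName.ToLowerInvariant();
                                writer.WriteAttributeString(reader.Prefix, local,
                                    reader.NamespaceURI, reader.Value);
                            } while (reader.MoveToNextAttribute());
                            reader.MoveToElement();
                        }
                        if (isEmpty) writer.WriteEndElement();
                        break;
                    }
                    case XmlNodeType.EndElement:
                        writer.WriteFullEndElement();
                        break;
                    case XmlNodeType.Text:
                        writer.WriteString(reader.Value);
                        break;
                    case XmlNodeType.CDATA:
                        writer.WriteCData(reader.Value);
                        break;
                    case XmlNodeType.Comment:
                        writer.WriteComment(reader.Value);
                        break;
                    case XmlNodeType.ProcessingInstruction:
                        writer.WriteProcessingInstruction(reader.Name, reader.Value);
                        break;
                    case XmlNodeType.Whitespace:
                    case XmlNodeType.SignificantWhitespace:
                        writer.WriteWhitespace(reader.Value);
                        break;
                }
            }
        }
        return output.ToString();
    }
}
```

Because nothing is buffered beyond the current node, the same loop could read from and write to streams for large documents.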