Check well-formed XML without a try/catch?

Accepted answer
Score: 23

I don't know a way of validating without the exception, but you can change the debugger settings to only break for XmlException if it's unhandled - that should solve your immediate issues, even if the code is still inelegant.

To do this, go to Debug / Exceptions... / Common Language Runtime Exceptions and find System.Xml.XmlException, then make sure only "User-unhandled" is ticked (not Thrown).

Score: 8

Steve,

We had a 3rd party that sometimes accidentally sent us JSON instead of XML. Here is what I implemented:

public static bool IsValidXml(string xmlString)
{
    Regex tagsWithData = new Regex("<\\w+>[^<]+</\\w+>");

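    // Light checking: require at least one <tag>text</tag> pair.
    // Note that this also rejects well-formed XML such as <a/>.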
    if (string.IsNullOrEmpty(xmlString) || tagsWithData.IsMatch(xmlString) == false)
    {
        return false;
    }

    try
    {
        XmlDocument xmlDocument = new XmlDocument();
        xmlDocument.LoadXml(xmlString);
        return true;
    }
    catch (XmlException) // LoadXml throws XmlException for malformed input
    {
        return false;
    }
}

[TestMethod()]
public void TestValidXml()
{
    string xml = "<result>true</result>";
    Assert.IsTrue(Utility.IsValidXml(xml));
}

[TestMethod()]
public void TestIsNotValidXml()
{
    string json = "{ \"result\": \"true\" }";
    Assert.IsFalse(Utility.IsValidXml(json));
}

Score: 6

That's a reasonable way to do it, except that the IsNullOrEmpty is redundant (LoadXml can figure that out fine). If you do keep IsNullOrEmpty, do if(!string.IsNullOrEmpty(value)).

Basically, though, your debugger is the problem, not the code.

Score: 4

Add the [System.Diagnostics.DebuggerStepThrough] attribute to the IsValidXml method. This stops the debugger from breaking on the XmlException, which means you can turn on the catching of first-chance exceptions and this particular method will not be debugged.
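
As a rough sketch of the attribute suggestion above, the method could be decorated like this. The class name XmlChecks is a placeholder, and whether the debugger actually skips the exception may also depend on debugger settings such as Just My Code, so treat this as an illustration rather than a guarantee.

```csharp
using System.Diagnostics;
using System.Xml;

public static class XmlChecks // hypothetical container class
{
    // Marks the method as code the debugger should step through
    // instead of stopping inside it.
    [DebuggerStepThrough]
    public static bool IsValidXml(string xmlString)
    {
        // LoadXml rejects null with an ArgumentNullException, so guard first.
        if (string.IsNullOrEmpty(xmlString))
        {
            return false;
        }

        try
        {
            new XmlDocument().LoadXml(xmlString);
            return true;
        }
        catch (XmlException)
        {
            // Malformed XML ends up here.
            return false;
        }
    }
}
```

The try/catch is still there; the attribute only changes how the debugger treats the method.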

Score: 2

Be careful with XmlDocument: it is possible to load an element along the lines of <0>some text</0> using XmlDocument doc = (XmlDocument)JsonConvert.DeserializeXmlNode(object) without an exception being thrown.

Numeric element names are not valid XML, and in my case an error did not occur until I tried to write the xmlDoc.innerText to a SQL Server datatype of xml.

This is how I validate now, and an exception gets thrown:

XmlDocument tempDoc = (XmlDocument)JsonConvert.DeserializeXmlNode(formData.ToString(), "data");
doc.LoadXml(tempDoc.InnerXml);

Score: 1

The XmlTextReader class is an implementation of XmlReader, and provides a fast, performant parser. It enforces the rules that XML must be well-formed. It is neither a validating nor a non-validating parser since it does not have DTD or schema information. It can read text in blocks, or read characters from a stream.

And an example from another MSDN article to which I have added code to read the whole contents of the XML stream.

string str = "<ROOT>AQID</ROOT>";
XmlTextReader r = new XmlTextReader(new StringReader(str));
try
{
  while (r.Read())
  {
  }
}
finally
{
  r.Close();
}

source: http://bytes.com/topic/c-sharp/answers/261090-check-wellformedness-xml

Score: 0

I disagree that the problem is the debugger. In general, for non-exceptional cases, exceptions should be avoided. This means that if someone is looking for a method like IsWellFormed() which returns true/false based on whether the input is well-formed XML or not, exceptions should not be thrown within this implementation, regardless of whether they are caught and handled or not.

Exceptions are expensive and they should not be encountered during normal successful execution. An example is writing a method which checks for the existence of a file and using File.Open and catching the exception in the case the file doesn't exist. This would be a poor implementation. Instead File.Exists() should be used (and hopefully the implementation of that does not simply put a try/catch around some method which throws an exception if the file doesn't exist, I'm sure it doesn't).
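
To make the file example concrete, here is a minimal sketch of the two styles being contrasted. The class and method names (FileChecks, ExistsViaException, ExistsViaCheck) are made up for illustration and do not come from any library.

```csharp
using System.IO;

public static class FileChecks // hypothetical names throughout
{
    // Discouraged: a missing file is an expected outcome here,
    // yet it is reported through an exception.
    public static bool ExistsViaException(string path)
    {
        try
        {
            using (File.Open(path, FileMode.Open, FileAccess.Read))
            {
                return true;
            }
        }
        catch (FileNotFoundException)
        {
            return false;
        }
        catch (DirectoryNotFoundException)
        {
            return false;
        }
    }

    // Preferred: File.Exists answers the question with a plain bool.
    public static bool ExistsViaCheck(string path)
    {
        return File.Exists(path);
    }
}
```

The argument in this answer is that an IsWellFormed() method should ideally look like the second method, not the first.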

Score: 0

Just my 2 cents - there are various questions about this around, and most people agree on the "garbage in - garbage out" fact. I don't disagree with that, but personally I found the following quick and dirty solution, especially for cases where you deal with XML data from 3rd parties which simply do not communicate with you easily. It doesn't avoid using try/catch, but it uses it with finer granularity, so in cases where the quantity of invalid XML characters is not that big, it helps.

I used XmlTextReader and its method ReadChars() for each parent element, which is one of the commands that do not do well-formed checks the way ReadInnerXml/ReadOuterXml do. So it's a combination of Read() and ReadChars() when Read() stumbles upon a parent node. Of course this works because I can assume that the basic structure of the XML is okay, but the contents (values) of certain nodes can contain special characters that haven't been replaced with their &...; equivalents. (I found an article about this somewhere, but can't find the source link at the moment.)

Score: 0

I'm using this function for verifying strings/fragments:

<Runtime.CompilerServices.Extension()>
Public Function IsValidXMLFragment(ByVal xmlFragment As String, Optional Strict As Boolean = False) As Boolean
    IsValidXMLFragment = True

    Dim NameTable As New Xml.NameTable

    Dim XmlNamespaceManager As New Xml.XmlNamespaceManager(NameTable)
    XmlNamespaceManager.AddNamespace("xsd", "http://www.w3.org/2001/XMLSchema")
    XmlNamespaceManager.AddNamespace("xsi", "http://www.w3.org/2001/XMLSchema-instance")

    Dim XmlParserContext As New Xml.XmlParserContext(Nothing, XmlNamespaceManager, Nothing, Xml.XmlSpace.None)

    Dim XmlReaderSettings As New Xml.XmlReaderSettings
    XmlReaderSettings.ConformanceLevel = Xml.ConformanceLevel.Fragment
    XmlReaderSettings.ValidationType = Xml.ValidationType.Schema
    If Strict Then
        XmlReaderSettings.ValidationFlags = (XmlReaderSettings.ValidationFlags Or XmlSchemaValidationFlags.ProcessInlineSchema)
        XmlReaderSettings.ValidationFlags = (XmlReaderSettings.ValidationFlags Or XmlSchemaValidationFlags.ReportValidationWarnings)
    Else
        XmlReaderSettings.ValidationFlags = XmlSchemaValidationFlags.None
        XmlReaderSettings.ValidationFlags = (XmlReaderSettings.ValidationFlags Or XmlSchemaValidationFlags.AllowXmlAttributes)
    End If

    AddHandler XmlReaderSettings.ValidationEventHandler, Sub() IsValidXMLFragment = False
    AddHandler XmlReaderSettings.ValidationEventHandler, AddressOf XMLValidationCallBack

    ' Dispose the reader once the whole fragment has been read
    Using XmlReader As Xml.XmlReader = Xml.XmlReader.Create(New IO.StringReader(xmlFragment), XmlReaderSettings, XmlParserContext)
        While XmlReader.Read
            'Read entire XML
        End While
    End Using
End Function

I'm using this function for verifying files:

Public Function IsValidXMLDocument(ByVal Path As String, Optional Strict As Boolean = False) As Boolean
    IsValidXMLDocument = IO.File.Exists(Path)
    If Not IsValidXMLDocument Then Exit Function

    Dim XmlReaderSettings As New Xml.XmlReaderSettings
    XmlReaderSettings.ConformanceLevel = Xml.ConformanceLevel.Document
    XmlReaderSettings.ValidationType = Xml.ValidationType.Schema
    If Strict Then
        XmlReaderSettings.ValidationFlags = (XmlReaderSettings.ValidationFlags Or XmlSchemaValidationFlags.ProcessInlineSchema)
        XmlReaderSettings.ValidationFlags = (XmlReaderSettings.ValidationFlags Or XmlSchemaValidationFlags.ReportValidationWarnings)
    Else
        XmlReaderSettings.ValidationFlags = XmlSchemaValidationFlags.None
        XmlReaderSettings.ValidationFlags = (XmlReaderSettings.ValidationFlags Or XmlSchemaValidationFlags.AllowXmlAttributes)
    End If
    XmlReaderSettings.CloseInput = True

    AddHandler XmlReaderSettings.ValidationEventHandler, Sub() IsValidXMLDocument = False
    AddHandler XmlReaderSettings.ValidationEventHandler, AddressOf XMLValidationCallBack

    Using FileStream As New IO.FileStream(Path, IO.FileMode.Open)
        Using XmlReader As Xml.XmlReader = Xml.XmlReader.Create(FileStream, XmlReaderSettings)
            While XmlReader.Read
                'Read entire XML
            End While
        End Using
    End Using
End Function

Score: 0

In addition, when only verifying syntactic correctness of the XML string (when there is no need to resolve an external schema), I think adding an XmlResolver = null setting may be a good idea. This both avoids Web access and improves security (it prevents malicious XML content from directing the code to access bad sites). Code follows (requires C# 2.0 or higher):

public static bool IsValidXml(string candidateString)
{
    try
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.XmlResolver = null;
        XmlDocument document = new XmlDocument();
        document.XmlResolver = null;
        document.Load(XmlReader.Create(new MemoryStream(Encoding.UTF8.GetBytes(candidateString)), settings));
        return true;
    }
    catch (XmlException)
    {
        return false;
    }
}

A more concise version using var and object initializers (C# 3.0 or higher):

public static bool IsValidXml(string candidateString)
{
    try
    {
        var settings = new XmlReaderSettings { XmlResolver = null };
        var document = new XmlDocument() { XmlResolver = null };
        document.Load(XmlReader.Create(new MemoryStream(Encoding.UTF8.GetBytes(candidateString)), settings));
        return true;
    }
    catch (XmlException)
    {
        return false;
    }
}
