Different ways to escape an XML string in C#

I ran into an XML parsing error while coding yesterday. After searching the internet for a solution, I found out that some special characters need to be escaped for XML parsing to work correctly.

These special characters and their replacement values are:

< -> &lt;
> -> &gt;
" -> &quot;
' -> &apos;
& -> &amp;

Here are 4 ways you can encode XML in C#:

1. string.Replace() 5 times

This is ugly but it works. Note that Replace("&", "&amp;") has to be the first replacement; otherwise the & in the entities produced by the other replacements (such as &lt;) would itself get escaped again.


string xml = "<node>it's my \"node\" & i like it<node>";
// "&" must be replaced first; see the note above.
string encodedXml = xml.Replace("&", "&amp;")
                       .Replace("<", "&lt;")
                       .Replace(">", "&gt;")
                       .Replace("\"", "&quot;")
                       .Replace("'", "&apos;");
// RESULT: &lt;node&gt;it&apos;s my &quot;node&quot; &amp; i like it&lt;node&gt;

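
To see why the order matters, here is a minimal sketch that compares the correct order with a variant that replaces & last. The class and method names (EscapeOrderDemo, EscapeAmpersandFirst, EscapeAmpersandLast) are made up for this illustration.

```csharp
using System;

// Hypothetical helper class, for illustration only.
public static class EscapeOrderDemo
{
    // Correct: '&' is replaced first, so the '&' introduced by the
    // later replacements is never touched again.
    public static string EscapeAmpersandFirst(string text)
    {
        return text.Replace("&", "&amp;")
                   .Replace("<", "&lt;")
                   .Replace(">", "&gt;")
                   .Replace("\"", "&quot;")
                   .Replace("'", "&apos;");
    }

    // Wrong: '&' is replaced last, so the '&' in "&lt;" and the other
    // entities gets escaped a second time.
    public static string EscapeAmpersandLast(string text)
    {
        return text.Replace("<", "&lt;")
                   .Replace(">", "&gt;")
                   .Replace("\"", "&quot;")
                   .Replace("'", "&apos;")
                   .Replace("&", "&amp;");
    }
}
```

Calling EscapeOrderDemo.EscapeAmpersandFirst("a < b") gives a &lt; b, while EscapeAmpersandLast("a < b") gives the double-escaped a &amp;lt; b.
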
2. System.Web.HttpUtility.HtmlEncode()

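This is meant for encoding HTML, but HTML escapes these characters the same way, so the result can be used in XML too. It is mostly used in ASP.NET apps. Note that in the version used here HtmlEncode does NOT encode apostrophes ( ' ); from .NET Framework 4.0 onward it encodes them as &#39;, which is still valid XML.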


string xml = "<node>it's my \"node\" & i like it<node>";
string encodedXml = HttpUtility.HtmlEncode(xml);


// RESULT: &lt;node&gt;it's my &quot;node&quot; &amp; i like it&lt;node&gt;
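
If pulling in System.Web is the only obstacle, System.Net.WebUtility.HtmlEncode is a possible alternative (available, as far as I know, from .NET Framework 4.0 onward). The sketch below is only an illustration; the wrapper class HtmlEncodeDemo and its method names are hypothetical, and how apostrophes come out is something to check on your own runtime rather than take from here.

```csharp
using System;
using System.Net;

// Hypothetical wrapper class, for illustration only.
public static class HtmlEncodeDemo
{
    // WebUtility lives in System.Net, so no System.Web reference is needed.
    public static string Encode(string text)
    {
        return WebUtility.HtmlEncode(text);
    }

    // Returns true if a raw apostrophe is still present after encoding,
    // which lets you check the behaviour of the framework you run on.
    public static bool LeavesApostropheRaw(string text)
    {
        return Encode(text).Contains("'");
    }
}
```

For example, HtmlEncodeDemo.Encode("<a & b>") returns &lt;a &amp; b&gt;.
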
3. System.Security.SecurityElement.Escape()

In Windows Forms or Console apps I use this method. If nothing else, it saves me from adding the System.Web reference to my projects, and it encodes all 5 characters.


string xml = "<node>it's my \"node\" & i like it<node>";
string encodedXml = System.Security.SecurityElement.Escape(xml);


// RESULT: &lt;node&gt;it&apos;s my &quot;node&quot; &amp; i like it&lt;node&gt;
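
One way to convince yourself that the escaped string is safe is to embed it in a small document and parse it back. The following is a minimal sketch under that idea; the class SecurityEscapeDemo, its methods and the element name "note" are assumptions made up for the example.

```csharp
using System;
using System.Security;
using System.Xml;

// Hypothetical helper class, for illustration only.
public static class SecurityEscapeDemo
{
    // Builds a one-element document by hand, escaping the text first.
    // The element name "note" is arbitrary.
    public static string BuildXml(string text)
    {
        return "<note>" + SecurityElement.Escape(text) + "</note>";
    }

    // Parses the document back and returns the element text, which
    // should equal the original, unescaped input.
    public static string RoundTrip(string text)
    {
        var doc = new XmlDocument();
        doc.LoadXml(BuildXml(text));
        return doc.DocumentElement.InnerText;
    }
}
```

Passing the sample string from above to SecurityEscapeDemo.RoundTrip should hand back the same string, since the parser turns the five entities back into the original characters.
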
4. System.Xml.XmlTextWriter

Using XmlTextWriter you don't have to worry about escaping anything, since it escapes characters only where needed. For example, in attribute values it doesn't escape apostrophes, while in node values it escapes neither apostrophes nor quotes.


// Requires: using System.Text; using System.Xml;
string xml = "<node>it's my \"node\" & i like it<node>";
using (XmlTextWriter xtw = new XmlTextWriter(@"c:\xmlTest.xml", Encoding.Unicode))
{
    xtw.WriteStartElement("xmlEncodeTest");
    xtw.WriteAttributeString("testAttribute", xml); // escaped as an attribute value
    xtw.WriteString(xml);                           // escaped as element text
    xtw.WriteEndElement();
}


// RESULT (written to the file as a single line; wrapped here for readability):
/*
<xmlEncodeTest testAttribute="&lt;node&gt;it's my &quot;node&quot; &amp; i like it&lt;node&gt;">
&lt;node&gt;it's my "node" &amp; i like it&lt;node&gt;
</xmlEncodeTest>
*/
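
If you want the XML as a string rather than a file, the same calls work on an XmlWriter created over a StringWriter. This is a minimal sketch, not the approach from the original snippet; the class XmlWriterDemo and its methods are hypothetical names, and the element and attribute names are simply reused from the example above.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper class, for illustration only.
public static class XmlWriterDemo
{
    // Writes one element with the text as both attribute value and
    // element content, and returns the resulting XML as a string.
    public static string Write(string text)
    {
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (var sw = new StringWriter())
        {
            using (XmlWriter writer = XmlWriter.Create(sw, settings))
            {
                writer.WriteStartElement("xmlEncodeTest");
                writer.WriteAttributeString("testAttribute", text);
                writer.WriteString(text);
                writer.WriteEndElement();
            } // disposing the writer flushes it into the StringWriter
            return sw.ToString();
        }
    }

    // Parses the output back and checks that both the attribute and
    // the element text match the original input.
    public static bool RoundTrips(string text)
    {
        var doc = new XmlDocument();
        doc.LoadXml(Write(text));
        XmlElement root = doc.DocumentElement;
        return root.InnerText == text
            && root.GetAttribute("testAttribute") == text;
    }
}
```

XmlWriterDemo.RoundTrips(xml) should return true for the sample string, because the writer escapes each position as needed and the parser reverses it.
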


Reference: http://weblogs.sqlteam.com/mladenp/archive/2008/10/21/Different-ways-how-to-escape-an-XML-string-in-C.aspx

2 comments:

花木兰 said...

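What is this useful for? I understand better when things are explained in Chinese; my computer knowledge is only a half-full bucket. ^_^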

萧遥 said...

Teacher Mulan:
Teacher, you've stumped me.
I don't know how to explain this in Chinese.
Please, please don't draw an egg (a zero) on my exam paper, okay?