HtmlContentProvider fails to handle emoji characters

SyntaxEditor for WPF Forum

Posted 7 years ago by Erel Uziel
Version: 16.1.0635
Platform: .NET 4.5
Environment: Windows 10 (64-bit)

I get the following error when calling new HtmlContentProvider(html).GetContent().

The html string was created with HtmlContentProvider.Escape("string with emojis").

The emoji in question is the money bag (U+1F4B0): http://emojipedia.org/money-bag/

System.Xml.XmlException: '?', hexadecimal value 0xD83D, is an invalid character. Line 1, position 43.
at System.Xml.XmlTextReaderImpl.Throw(Exception e)
at System.Xml.XmlTextReaderImpl.Throw(String res, String[] args)
at System.Xml.XmlTextReaderImpl.ParseNumericCharRefInline(Int32 startPos, Boolean expand, StringBuilder internalSubsetBuilder, Int32& charCount, EntityType& entityType)
at System.Xml.XmlTextReaderImpl.ParseCharRefInline(Int32 startPos, Int32& charCount, EntityType& entityType)
at System.Xml.XmlTextReaderImpl.ParseText(Int32& startPos, Int32& endPos, Int32& outOrChars)
at System.Xml.XmlTextReaderImpl.ParseText()
at System.Xml.XmlTextReaderImpl.ParseElementContent()
at System.Xml.XmlTextReaderImpl.Read()
at ActiproSoftware.Windows.Controls.SyntaxEditor.IntelliPrompt.Implementation.HtmlContentProvider.CreateBlockSpan(XmlReader reader, String tagName, Span span)
at ActiproSoftware.Windows.Controls.SyntaxEditor.IntelliPrompt.Implementation.HtmlContentProvider.CreateInline(XmlReader reader)
at ActiproSoftware.Windows.Controls.SyntaxEditor.IntelliPrompt.Implementation.HtmlContentProvider.GetRootInline(XmlReader reader)
at ActiproSoftware.Windows.Controls.SyntaxEditor.IntelliPrompt.Implementation.HtmlContentProvider.CreateElement(String htmlSnippet)
at ActiproSoftware.Windows.Controls.SyntaxEditor.IntelliPrompt.Implementation.HtmlContentProvider.GetContent()

Comments (3)

Posted 7 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Erel,

Could you please put together a simple sample project that reproduces this, so we can debug it and confirm that any change we make fixes the problem? Please email it to our support address, and rename the .zip file extension so it doesn't get blocked as spam. Thanks!


Actipro Software Support

Posted 7 years ago by Erel Uziel

This line is enough to reproduce it:

new HtmlContentProvider(HtmlContentProvider.Escape("\U0001F4B0")).GetContent();
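
The failure can probably be isolated from SyntaxEditor with a plain XmlReader. The sketch below rests on an assumption drawn from the stack trace (ParseNumericCharRefInline, value 0xD83D): that the escaped string contains one numeric character reference per UTF-16 code unit, so U+1F4B0 becomes a high-surrogate reference followed by a low-surrogate reference. EscapePerCodeUnit is a hypothetical stand-in written for illustration, not Actipro's Escape method, and it skips markup characters such as < and &.

```csharp
using System.IO;
using System.Text;
using System.Xml;

static class SurrogateReferenceRepro
{
    // Hypothetical stand-in for the escaping step (an assumption, not Actipro's code):
    // every non-ASCII UTF-16 code unit becomes its own numeric character reference.
    public static string EscapePerCodeUnit(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
        {
            if (c > 0x7F)
                sb.Append("&#x").Append(((int)c).ToString("X")).Append(';');
            else
                sb.Append(c);
        }
        return sb.ToString();
    }

    // Returns true if the snippet parses with the default XmlReaderSettings
    // (CheckCharacters is true by default).
    public static bool ParsesWithDefaults(string xml)
    {
        try
        {
            using (var reader = XmlReader.Create(new StringReader(xml)))
            {
                while (reader.Read()) { }
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

Under that assumption, EscapePerCodeUnit("\U0001F4B0") yields &#xD83D;&#xDCB0;, and ParsesWithDefaults rejects a snippet such as <span>&#xD83D;&#xDCB0;</span> because a reference to a lone surrogate is not a legal XML character.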

Posted 7 years ago by Actipro Software Support - Cleveland, OH, USA

Hi Erel,

Thanks, we were able to reproduce it. To work around it, we had to add a CheckCharacters = false setting to the XmlReader settings. Apparently .NET's XmlReader validates by default that characters fall within the range XML allows.


Actipro Software Support
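
For readers who hit the same exception in their own XML parsing, the workaround described above can be sketched with the standard XmlReaderSettings.CheckCharacters property. This is a minimal illustration, not Actipro's actual change. The LenientXmlRead class and the sample input (one numeric reference per surrogate half of U+1F4B0) are assumptions made for the example.

```csharp
using System.IO;
using System.Text;
using System.Xml;

static class LenientXmlRead
{
    // Concatenates the text nodes of an XML snippet.
    // checkCharacters maps directly onto XmlReaderSettings.CheckCharacters.
    public static string ReadText(string xml, bool checkCharacters)
    {
        var settings = new XmlReaderSettings { CheckCharacters = checkCharacters };
        var text = new StringBuilder();
        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Text)
                    text.Append(reader.Value);
            }
        }
        return text.ToString();
    }
}
```

With checkCharacters set to false, ReadText("<span>&#xD83D;&#xDCB0;</span>", false) should return the two surrogate halves side by side, which together form the original emoji. With true (the default), the same call throws XmlException. The trade-off is that the reader no longer rejects other characters that are illegal in XML, so this fits best where the input is produced by trusted code.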

The latest build of this product (v24.1.1) was released 1 month ago, which was after the last post in this thread.
