How can I fix parser Error in ASP.NET?

What's wrong with my XML?

  • XML Character Woes: I'm getting the error "Reference to undefined entity 'ldquo'" (and 'rdquo', etc.) when I try to open my XML files with IE or Firefox. Can you help me fix it? I know the parser is seeing it as an entity and looking for a definition, but I can't define them in the DTD because I don't know what entities might be coming in. I've set the elements to CDATA hoping the parser would ignore it, but that doesn't change anything. I've also tried changing the entities to the various numerical entities. My goal is just to have valid XHTML entities in the text. These files are certainly going to be converted to HTML at some point, but who knows where else they'll go. They might go back into InDesign, etc. In case it matters: I'm getting the content from InDesign and running it against some scripts to fix it up. InDesign is giving me Unicode, and I'm converting the Unicode special characters to the 'rdquo' style HTML entities.

  • Answer:

    "The only XML entities are: amp, lt, gt, apos, quot. That's it."

    Untrue. http://www.w3.org/TR/REC-xml/#sec-references.

    "XHTML has optional support for the character entities but you'll have to include the entities in xhtml-lat1.ent, xhtml-special.ent, xhtml-symbol.ent"

    Also untrue. http://www.w3.org/TR/xhtml1/dtds.html#a_dtd_XHTML-1.0-Strict. You don't need to do anything special to include them.

    "Is there a Decimal Entity equivalent to HTML-ENTITIES? I couldn't find one in the PHP docs."

    Not that I know of. A while back: http://www.randomchaos.com/documents/?source=php_and_unicode. You could use those like so:

    $contents_for_xml_and_html = unicode_to_entities_preserving_ascii( utf8_to_unicode( $utf8_contents ) );

miniape at Ask.Metafilter.Com


Other answers

I think it's because those are HTML entities, not XML entities. Would & quot ; work better? (spaces because without them it looks like ")

utsutsu
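The distinction utsutsu is pointing at can be checked with any XML parser that has no DTD to consult. The following is a minimal sketch in Python (not the asker's PHP setup; the helper name `parses` is made up for illustration). It shows that a named HTML entity is rejected, while one of XML's five predefined entities and a numeric character reference are both accepted.

```python
import xml.etree.ElementTree as ET

def parses(xml_text: str) -> bool:
    """Return True if a DTD-less XML parser accepts xml_text as well-formed."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Named HTML entity with no declaration: rejected as an undefined entity.
html_named = "<p>&ldquo;Hello&rdquo;</p>"
# One of XML's five predefined entities: accepted.
xml_predefined = "<p>&quot;Hello&quot;</p>"
# Numeric character references: accepted, no declaration needed.
numeric = "<p>&#8220;Hello&#8221;</p>"

print(parses(html_named), parses(xml_predefined), parses(numeric))
```

The first input fails for the same reason the browsers complain: nothing in the document tells the parser what `ldquo` stands for.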

You could just leave the Unicode (UTF-8, I assume) alone and encode the bare minimum required by XML, as long as you store and process the content with Unicode-friendly software throughout. If you want to handle it as ASCII or ISO-8859-1, then using numeric entities should work fine; I've done that myself to get around Unicode-hostile systems. Are you sure you tried it properly?

malevolent

"InDesign is giving me Unicode, and I'm converting the Unicode special characters to the 'rdquo' style html entities"

Don't do that. Use the Unicode values to create numeric character references (such as &#8221;), which work in both HTML and XML.

scottreynen
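As a rough illustration of scottreynen's advice, here is a small Python sketch that goes straight from Unicode text to ASCII-only XML text using decimal character references. The function name `to_ascii_xml_text` is an assumption made for this example; the thread's actual scripts are in PHP.

```python
from xml.sax.saxutils import escape

def to_ascii_xml_text(text: str) -> str:
    """Escape XML markup characters, then replace every non-ASCII
    character with a decimal character reference such as &#8221;."""
    escaped = escape(text)  # & < > become &amp; &lt; &gt;
    # The "xmlcharrefreplace" error handler emits &#NNNN; for anything
    # that cannot be represented in ASCII.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")

sample = "\u201cFish & chips\u201d"
print(to_ascii_xml_text(sample))  # &#8220;Fish &amp; chips&#8221;
```

Escaping the markup characters first matters: doing it afterwards would turn the `&` of each new reference into `&amp;`. Note that `escape` leaves quotation marks alone by default, so this sketch is meant for element text, not attribute values.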

The only XML entities are: amp, lt, gt, apos, quot. That's it. XHTML has optional support for the character entities, but you'll have to include the entities in xhtml-lat1.ent, xhtml-special.ent, xhtml-symbol.ent, and even then many applications won't read the entities correctly. As malevolent said, use UTF-8. All modern applications support it. If yours don't, use numeric entities.

Sharcho

Wow. Thanks. I must have been trying the decimal numbers improperly because it looks like it might work. I would like to use UTF-8, and I'll probably keep a copy in that form, but I'm getting a little resistance from outside forces. Someone we shipped some of this content to (a huge internet company, no less) actually asked for ASCII CSVs, so we're trying to keep it as simple as possible in case something like that comes up again. So basically, scottreynen's solution, but I was using a PHP function, mb_convert_encoding($contents, 'HTML-ENTITIES', "UTF-8");, to convert them. Is there a Decimal Entity equivalent to HTML-ENTITIES? I couldn't find one in the PHP docs. I prefer not to come up with a conversion table if possible.

miniape
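If the text has already been converted to 'rdquo' style entities, the conversion table miniape wants to avoid does not have to be written by hand. The sketch below is in Python, not PHP, and uses the HTML entity table that ships with Python's standard library; the function name `named_to_numeric` is made up for this example. It rewrites named HTML entities as decimal references and leaves XML's five predefined entities and unknown names untouched.

```python
import re
from html.entities import name2codepoint

# Entities every XML parser already understands; leave these alone.
XML_PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}
_NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def named_to_numeric(text: str) -> str:
    """Rewrite named HTML entities (&rdquo;) as decimal references (&#8221;)."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # keep predefined or unknown names as-is
        return "&#%d;" % name2codepoint[name]
    return _NAMED_ENTITY.sub(replace, text)

print(named_to_numeric("&ldquo;A &amp; B&rdquo;&nbsp;&bogus;"))
# &#8220;A &amp; B&#8221;&#160;&bogus;
```

Unknown names such as `&bogus;` are passed through unchanged on purpose, so they still surface as parser errors instead of being silently dropped.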

If you're using a stylesheet (XSL), you can also define the ones you want near the top. For example:

    <?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE xsl:stylesheet [ <!ENTITY nbsp "&#160;"> ]>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

XSLT brand noobie here, but I ran into that problem trying to get non-breaking spaces into table cells and googled the specifics. Here's where I pulled this from, in case the code above doesn't show up right: http://www.biglist.com/lists/xsl-list/archives/200011/msg00056.html. I'm not sure it's the same one I found from work, but the general approach is the same. (Isn't there a way to show pre/code stuff here? I guess I should check the FAQ.)

phrits

On overnight review, it's not limited to stylesheets, I guess. You can declare any of your commonly used entities such that the shorthand (e.g., &nbsp;) references the numeric code.

phrits
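phrits's approach of declaring the shorthand names in the document's own DOCTYPE can be tried outside XSLT as well. This is a minimal Python sketch, assuming a toy `<p>` document invented for illustration; the two entity declarations map the names to their numeric codes, and the parser expands them while reading.

```python
import xml.etree.ElementTree as ET

# The internal DTD subset declares the shorthand names, so the parser
# can resolve &nbsp; and &rdquo; without any external DTD.
doc = """<?xml version="1.0"?>
<!DOCTYPE p [
  <!ENTITY nbsp "&#160;">
  <!ENTITY rdquo "&#8221;">
]>
<p>a&nbsp;b&rdquo;</p>"""

root = ET.fromstring(doc)
print(ascii(root.text))  # 'a\xa0b\u201d'
```

This only helps for names known in advance, which is the limitation the original question raises: any entity that is not declared still triggers the undefined-entity error.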

Thanks to everyone. All is well. I'm marking scottreynen's answer as best because his link has some great resources and his functions worked perfectly.

miniape
