Fails parsing cdata markup that is followed by escaped markup #117

Open
dy-dx opened this issue Feb 1, 2019 · 4 comments

dy-dx commented Feb 1, 2019

When parsing this valid XML:

<rss version="2.0">
  <channel>
    <item>
      <description>
        <![CDATA[<a src="http://foo.com?foo&bar=baz"></a>]]>&amp;
      </description>
    </item>
  </channel>
</rss>

gofeed fails with the error message:

unknown predefined entity &bar=baz"></a>]]>&amp;

ghost commented Feb 1, 2019

I can confirm that this is not a problem with encoding/xml, which decodes this input without errors. See https://play.golang.org/p/wWJicjEa-iv
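For reference, here is a minimal sketch of that kind of check using only the standard library. It is not the code behind the playground link; the `rss` struct, the `sample` constant, and the `decodeDescription` helper are illustrative names assumed for this example.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// sample is the feed from the report: a CDATA section followed by an
// escaped entity inside the same <description> element.
const sample = `<rss version="2.0">
  <channel>
    <item>
      <description>
        <![CDATA[<a src="http://foo.com?foo&bar=baz"></a>]]>&amp;
      </description>
    </item>
  </channel>
</rss>`

// rss is a hypothetical struct that maps only the field we care about.
type rss struct {
	Description string `xml:"channel>item>description"`
}

// decodeDescription unmarshals the document with encoding/xml and returns
// the description text with surrounding whitespace trimmed.
func decodeDescription(doc string) (string, error) {
	var r rss
	if err := xml.Unmarshal([]byte(doc), &r); err != nil {
		return "", err
	}
	return strings.TrimSpace(r.Description), nil
}

func main() {
	desc, err := decodeDescription(sample)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	// The CDATA content is kept verbatim and &amp; is unescaped to "&".
	fmt.Println(desc) // prints: <a src="http://foo.com?foo&bar=baz"></a>&
}
```

The decoder treats the CDATA section and the trailing `&amp;` as two runs of character data in the same element and concatenates them, so no "unknown predefined entity" error is raised.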

mmcdole (Owner) commented Feb 2, 2019

@dy-dx @lutzhorn thank you for the report and confirmation.

CDATA parsing is currently a hack and needs to be rewritten. We aren't using encoding/xml's Unmarshal because it wasn't flexible enough for gofeed's requirements, so we don't get its CDATA handling for free.

I'll try to take a look at CDATA handling soon. Perhaps we can pull some code from encoding/xml itself.
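As a rough illustration of how CDATA can be handled without Unmarshal, the sketch below walks encoding/xml's token stream and concatenates every `xml.CharData` token inside one element. This is not gofeed's implementation and not necessarily what a rewrite would look like; `elementText` and `sample` are hypothetical names used only here.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// sample is a trimmed version of the reported input.
const sample = `<item><description>
  <![CDATA[<a src="http://foo.com?foo&bar=baz"></a>]]>&amp;
</description></item>`

// elementText returns the trimmed character data of the first element
// whose local name is name. CDATA sections and escaped text both arrive
// as xml.CharData tokens, already decoded, so they are simply appended.
func elementText(doc, name string) (string, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	var sb strings.Builder
	depth := 0 // greater than zero while inside the target element
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return "", fmt.Errorf("element %q not found", name)
		}
		if err != nil {
			return "", err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			if depth > 0 || t.Name.Local == name {
				depth++
			}
		case xml.EndElement:
			if depth > 0 {
				depth--
				if depth == 0 {
					return strings.TrimSpace(sb.String()), nil
				}
			}
		case xml.CharData:
			if depth > 0 {
				sb.Write(t) // copies the bytes before the next Token call
			}
		}
	}
}

func main() {
	text, err := elementText(sample, "description")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(text)
}
```

Working at the token level keeps the flexibility of a hand-written parser while leaving entity and CDATA decoding to the standard library.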

OrKoN (Contributor) commented Apr 6, 2019

I have opened a PR to address this issue: #120. PTAL @mmcdole

sudhanshuraheja (Contributor) commented

This issue is fixed on the latest master. Here's the commit: 22a67f9
