Failed to parse description field with escaped CDATA. #440

Open
cdhigh opened this issue Apr 27, 2024 · 0 comments

Bug Description:
As of the current version (2024-04-12), feedparser fails to extract the content of a description field that contains escaped CDATA. I have reduced the issue to a minimal reproducible test case (source RSS link).

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
  <channel>
    <title>xueqiu</title>
    <link>http://xueqiu.com/hots/topic</link>
    <description>xiuqiu</description>
    <item>
      <title>title</title>
      <link>http://xueqiu.com/1630191122/288006046</link>
      <description>&lt;![CDATA[some text]]&gt;</description>
      <pubDate>Sat, 27 Apr 2024 08:26:02 GMT</pubDate>
      <guid>http://xueqiu.com/1630191122/288006046</guid>
      <dc:creator>name</dc:creator>
      <dc:date>2024-04-27T08:26:02Z</dc:date>
    </item>
  </channel>
</rss>

Expectation:
feed.entries[0].description should equal 'some text', but the actual result is an empty string.
If &lt;![CDATA[some text]]&gt; is changed to <![CDATA[some text]]>, the description is parsed correctly.
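
To illustrate the difference between the two forms outside feedparser, here is a minimal sketch that uses only the Python standard library. It is not a description of feedparser's internals. The names `description_text` and `unwrap_escaped_cdata` are hypothetical helpers introduced for this example, and the item is trimmed to the single field in question. With the escaped form, a plain XML parser sees the CDATA markers as literal text, so a post-processing step like the one below might serve as a workaround.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed-down versions of the <item> from the test case above.
ESCAPED = "<item><description>&lt;![CDATA[some text]]&gt;</description></item>"
REAL = "<item><description><![CDATA[some text]]></description></item>"

# Matches a string that consists solely of a literal CDATA wrapper.
_CDATA_WRAPPER = re.compile(r"^\s*<!\[CDATA\[(.*?)\]\]>\s*$", re.DOTALL)


def description_text(xml_source):
    """Hypothetical helper: return the text of the <description> child."""
    return ET.fromstring(xml_source).findtext("description")


def unwrap_escaped_cdata(text):
    """Hypothetical helper: drop a literal <![CDATA[...]]> wrapper if present."""
    match = _CDATA_WRAPPER.match(text or "")
    return match.group(1) if match else text


escaped_text = description_text(ESCAPED)  # markers survive as literal text
real_text = description_text(REAL)        # parser consumes the CDATA section

print(repr(escaped_text))
print(repr(real_text))
print(repr(unwrap_escaped_cdata(escaped_text)))
```

The helper leaves strings without the wrapper unchanged, so it could be applied to every description without special-casing this feed.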
