Incorrect parsing of subelements #317

Open
Animus-Surge opened this issue Jun 27, 2022 · 1 comment
Comments

Animus-Surge commented Jun 27, 2022

Hello,

I have noticed that some keys are empty in the dict returned by parse. Some fields in the RSS feed I'm using are deliberately blank, but the ones I'm concerned with are fields that have subelements. Each subelement is placed at the root of the entry instead of inside an object stored as the value of its parent element. In addition, when different elements contain a subelement with the same name, its value is overwritten by each subsequent instance.

To give an example, let's say I have an entry that looks something like this:

<element1>
  <subelement>Hi</subelement>
</element1>
<!--...-->
<element2>
  <subelement>Hello there</subelement>
</element2>

I would expect parse to return a dict that looks something like this:

{
  //...
  "element1":{
    "subelement":"Hi"
  },
  "element2":{
    "subelement":"Hello there"
  }
}

However, it seems like I get something like this instead:

{
  //...
  "subelement":"Hello there", //"Hi" gets overwritten with "Hello there"
  "element1":"",
  "element2":""
}
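
The overwriting shown above is what any parser produces if it stores every descendant element in one flat dict keyed by tag name. The following is only an illustration of that failure mode using the standard library's `xml.etree.ElementTree`; it is not feedparser's actual implementation, and the `<entry>` wrapper and the `flatten` helper are hypothetical names introduced for the sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry mirroring the example above, wrapped in <entry>
# so that it is a single well-formed XML document.
ENTRY_XML = """
<entry>
  <element1><subelement>Hi</subelement></element1>
  <element2><subelement>Hello there</subelement></element2>
</entry>
"""


def flatten(entry):
    """Store every descendant in one flat dict keyed by tag name (lossy)."""
    flat = {}
    for elem in entry.iter():
        if elem is entry:
            continue  # skip the wrapper element itself
        # A later element with the same tag silently replaces the earlier one.
        flat[elem.tag] = (elem.text or "").strip()
    return flat


flat = flatten(ET.fromstring(ENTRY_XML))
print(flat)
```

Because both `subelement` values share one key, only the last one survives, and the parent elements are left with empty strings.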

Minor edit: I am using feedparser version 6.0.10 and Python 3.10.0.
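
As a possible stopgap, the nested structure described above can be built without feedparser by walking the entry with the standard library's `xml.etree.ElementTree`. This is a minimal sketch, assuming the entry is well-formed XML and that attributes can be ignored; the `<entry>` wrapper and the `element_to_dict` helper are hypothetical names, not part of feedparser.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry mirroring the example above, wrapped in <entry>
# so that it is a single well-formed XML document.
ENTRY_XML = """
<entry>
  <element1><subelement>Hi</subelement></element1>
  <element2><subelement>Hello there</subelement></element2>
</entry>
"""


def element_to_dict(elem):
    """Convert an element recursively.

    A leaf element becomes its stripped text; an element with children
    becomes a dict keyed by child tag. Attributes are ignored.
    """
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag not in result:
            result[child.tag] = value
        elif isinstance(result[child.tag], list):
            # Third or later sibling with the same tag.
            result[child.tag].append(value)
        else:
            # Second sibling with the same tag: keep both in a list.
            result[child.tag] = [result[child.tag], value]
    return result


nested = element_to_dict(ET.fromstring(ENTRY_XML))
print(nested)
```

Each `subelement` stays under its own parent, so neither value is lost. Note that ElementTree reports namespaced tags as `{uri}tag`, so a real feed with namespaces would need those keys normalized.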

@farouknaser1

I am having the same problem. Is there any solution?

2 participants