[HDChina] DLF and ULF not properly parsed #14530

Open
3 tasks done
Siriussee opened this issue Jul 3, 2023 · 6 comments
Labels
Invite needed Needs C# PR Welcome! We would welcome a volunteer to prepare a PR to solve this problem! Rewrite in C#

Comments

@Siriussee

Have you checked our Troubleshooting page for your issue?

  • I have checked the Troubleshooting page

Is there already an issue for your problem?

  • I have checked older issues, open and closed

Have you read our Contributing Guidelines?

  • I have read the Contributing Guidelines

Environment

Using HTTP Client: HttpWebClient2
Using FlareSolverr: http://192.168.1.160:8191	
Using proxy: Disabled
App config/log directory: /config/Jackett
ThreadPool MaxThreads: 32767 workerThreads, 1000 completionPortThreads
Running in Docker: Yes (image build: v0.21.265-ls111)
File /etc/issue: Welcome to Alpine Linux 3.17	
Jackett variant: CoreLinuxMuslAmdx64
OS version: Unix 5.19.9.0 (64bit OS) (64bit process)
Environment version: 6.0.19 (/app/Jackett/)

Description

The parser ignores all free-leech torrents (DLF=0/ULF=1) on HDChina (HDC) and reports every torrent as a regular torrent (DLF=1/ULF=1). One example is Good.Witch.S07.2021.1080p, which is a free-leech torrent on HDC:

<tr>
<td class="t_cat"><a href="https://hdchina.org/torrents.php?cat=21"><img class="caten type_21" src="./HDChina __ Torrents_files/cattrans.gif" alt="欧美剧集包(EU/US TV series pack)" title="欧美剧集包(EU/US TV series pack)"></a></td>
<td class="t_name"><table class="tbname"><tbody><tr><td class="t_pin"></td><td><h3><a title="Good.Witch.S07.2021.1080p.BluRay.Remux.AVC.DTS-HD.MA.2.0-BTN" href="https://hdchina.org/details.php?id=819289&amp;hit=1">Good.Witch.S07.2021.1080p.BluRay.Remux.AVC.DTS-HD.MA.2.0-BTN</a></h3><h4>好女巫 第七季 </h4></td><td class="discount"><p> <img class="pro_free" src="./HDChina __ Torrents_files/trans.gif" alt="Free" onmouseover="domTT_activate(this, event, &#39;content&#39;, &#39;&lt;b&gt;&lt;font class=&quot;free&quot;&gt;Free&lt;/font&gt;&lt;/b&gt;  will end in &lt;b&gt;&lt;span title=&quot;2023-07-03 17:56:35&quot;&gt;5 hours 6 mins&lt;/span&gt;&lt;/b&gt;&#39;, &#39;trail&#39;, false, &#39;delay&#39;,500,&#39;lifetime&#39;,3000,&#39;fade&#39;,&#39;both&#39;,&#39;styleClass&#39;,&#39;niceTitle&#39;, &#39;fadeMax&#39;,87, &#39;maxWidth&#39;, 300);"></p><span title="2023-07-03 17:56:35">5 hours 6 mins</span></td>
<td class="share_rule" valign="middle"><img src="./HDChina __ Torrents_files/share_rule_3.gif" title="Welcome to Distribution"></td>
<td class="act" valign="middle"><a class="imdb" title="IMDB评分" href="https://hdchina.org/retriver.php?id=819289&amp;type=1&amp;siteid=1"><em class="t icon_t i_imdb"></em>7.3</a><br><a href="https://hdchina.org/download.php?hash=2qi52JIgabktrtQRtIY5pA"><img class="download" src="./HDChina __ Torrents_files/trans.gif" alt="download" title="Download Torrent"></a><a id="bookmark46" href="javascript: bookmark(819289,46);"><img class="delbookmark" src="./HDChina __ Torrents_files/trans.gif" alt="Unbookmarked" title="Bookmark"></a><a id="addtorss46" href="javascript:void(0);" onclick="setbookmark(this,819289,1);"><img class="delbookmark_rss" src="./HDChina __ Torrents_files/trans.gif" alt="Unbookmarked" title="DownBox"></a></td>
</tr>

However, in Jackett, it is recognized as a regular torrent:

<item>
      <title>Good Witch S07 2021 1080p BluRay Remux AVC DTS-HD MA 2 0-BTN</title>
      <guid>https://hdchina.org/download.php?hash=2qi52JIgabktrtQRtIY5pA</guid>
      <jackettindexer id="hdchina">HDChina</jackettindexer>
      <type>private</type>
      <comments>https://hdchina.org/details.php?id=819289&amp;hit=1</comments>
      <pubDate>Sun, 02 Jul 2023 20:57:03 -0700</pubDate>
      <size>76106817536</size>
      <grabs>0</grabs>
      <description>好女巫 第七季</description>
      <link>http://[REDACTED]/dl/hdchina/?jackett_apikey=[REDACTED];path=Q2ZESjhCSW5zcm1UejJWTXAya3Z5NFdpMTZtdHREV2FiSVQzSUZHN2xRdHJBSlJNNjhEdW1QMzdVSE9CcWJNTUZsYWpXOERTLVpud2ZTa1ZsX1FyX3dmVXQ0dkFUUlI5VWlWNTd2OXFONHRMdTF3WmJQOUJHS2doQzdBVDNzaDhlWUFsb0hFdWRVTVdFQVY5OEd6N1FYMTV5YUtFYkdvNkRRRzEzVzRmMWZNNGFCY1RQRGtIbW5UYWhTZzBxWExtaVE0U3BR&amp;file=Good+Witch+S07+2021+1080p+BluRay+Remux+AVC+DTS-HD+MA+2+0-BTN</link>
      <category>5000</category>
      <category>100021</category>
      <enclosure url="http://[REDACTED]/dl/hdchina/?jackett_apikey=[REDACTED];path=Q2ZESjhCSW5zcm1UejJWTXAya3Z5NFdpMTZtdHREV2FiSVQzSUZHN2xRdHJBSlJNNjhEdW1QMzdVSE9CcWJNTUZsYWpXOERTLVpud2ZTa1ZsX1FyX3dmVXQ0dkFUUlI5VWlWNTd2OXFONHRMdTF3WmJQOUJHS2doQzdBVDNzaDhlWUFsb0hFdWRVTVdFQVY5OEd6N1FYMTV5YUtFYkdvNkRRRzEzVzRmMWZNNGFCY1RQRGtIbW5UYWhTZzBxWExtaVE0U3BR&amp;file=Good+Witch+S07+2021+1080p+BluRay+Remux+AVC+DTS-HD+MA+2+0-BTN" length="76106817536" type="application/x-bittorrent" />
      <torznab:attr name="category" value="5000" />
      <torznab:attr name="category" value="100021" />
      <torznab:attr name="genre" value="" />
      <torznab:attr name="seeders" value="1" />
      <torznab:attr name="peers" value="13" />
      <torznab:attr name="downloadvolumefactor" value="1" />
      <torznab:attr name="uploadvolumefactor" value="1" />
    </item>
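
For anyone reproducing this, the two volume factors can be read straight from the Torznab item. The following is a minimal Python sketch, not Jackett code: the function name volume_factors is made up for illustration, the item is abridged from the one above, and the fragment is wrapped in a root element that declares the torznab namespace because the pasted item carries no declaration of its own.

```python
import xml.etree.ElementTree as ET

# Namespace used by Torznab feeds for <torznab:attr> elements.
TORZNAB_NS = "http://torznab.com/schemas/2015/feed"


def volume_factors(item_xml: str) -> dict:
    """Return the download/upload volume factors found in one <item> fragment."""
    # Wrap the bare fragment so the torznab prefix is declared and the XML parses.
    wrapped = f'<rss xmlns:torznab="{TORZNAB_NS}">{item_xml}</rss>'
    root = ET.fromstring(wrapped)
    factors = {}
    for attr in root.iter(f"{{{TORZNAB_NS}}}attr"):
        name = attr.get("name")
        if name in ("downloadvolumefactor", "uploadvolumefactor"):
            factors[name] = float(attr.get("value"))
    return factors


# Abridged copy of the item reported in this issue.
ITEM = """<item>
  <title>Good Witch S07 2021 1080p BluRay Remux AVC DTS-HD MA 2 0-BTN</title>
  <torznab:attr name="seeders" value="1" />
  <torznab:attr name="downloadvolumefactor" value="1" />
  <torznab:attr name="uploadvolumefactor" value="1" />
</item>"""

print(volume_factors(ITEM))
```

A free-leech torrent would be expected to show a downloadvolumefactor of 0 here; the item above shows 1 for both attributes.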

Logged Error Messages

No error messages are logged.

Screenshots

No response

@garfield69 garfield69 self-assigned this Jul 3, 2023
@garfield69
Contributor

The HTML snippet you provided looks suspect; for example, a table in the second td block is missing its end tags.
Did you accidentally remove some parts during copy/paste?
Please enable Jackett enhanced logging, repeat the search for good witch s07, and provide the HTML as captured by the Jackett debug logging.
Thanks.

@Siriussee
Author

Siriussee commented Jul 3, 2023

Attached. Thanks for taking care of this.

@garfield69
Contributor

garfield69 commented Jul 3, 2023

The problem is that the discount is not hardcoded into the HTML like most of the other fields; it is filled in dynamically by JavaScript after the HTML has been sent to the browser.
So the browser presents:

<td class="discount"><p> <img class="pro_free" src="./HDChina __ Torrents_files/trans.gif" alt="Free" onmouseover="domTT_activate(this, event, &#39;content&#39;, &#39;&lt;b&gt;&lt;font class=&quot;free&quot;&gt;Free&lt;/font&gt;&lt;/b&gt;  will end in &lt;b&gt;&lt;span title=&quot;2023-07-03 17:56:35&quot;&gt;5 hours 6 mins&lt;/span&gt;&lt;/b&gt;&#39;, &#39;trail&#39;, false, &#39;delay&#39;,500,&#39;lifetime&#39;,3000,&#39;fade&#39;,&#39;both&#39;,&#39;styleClass&#39;,&#39;niceTitle&#39;, &#39;fadeMax&#39;,87, &#39;maxWidth&#39;, 300);"></p><span title="2023-07-03 17:56:35">5 hours 6 mins</span></td>

But since Jackett cannot run the JavaScript, it gets:

<td class="discount"><span class="sp_state_placeholder" id="819292"></span></td>

So Jackett cannot see the discount img.pro_free, and therefore cannot set the download and upload volume factors (DLVF and ULVF).

And there is nothing we can do about it :-(
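
To illustrate the difference between the two cells, here is a minimal Python sketch (not the Cardigann definition, and not Jackett code) that inspects a discount cell and reports what can be derived from it. The class and function names are hypothetical, the browser cell is abridged from the one quoted above, and the mapping of "Free" to DLVF 0 / ULVF 1 follows the values given in the issue description.

```python
from html.parser import HTMLParser


class DiscountCellParser(HTMLParser):
    """Record whether a discount cell holds img.pro_free or only a placeholder span."""

    def __init__(self):
        super().__init__()
        self.is_free = False
        self.placeholder_id = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "img" and "pro_free" in classes:
            self.is_free = True
        elif tag == "span" and "sp_state_placeholder" in classes:
            # The placeholder id is the torrent id the page would look up later.
            self.placeholder_id = attrs.get("id")


def classify(cell_html: str) -> dict:
    """Return volume factors plus the id still awaiting a promotion lookup, if any."""
    parser = DiscountCellParser()
    parser.feed(cell_html)
    parser.close()
    if parser.is_free:
        return {"downloadvolumefactor": 0.0, "uploadvolumefactor": 1.0, "pending_id": None}
    # Without the img, only the defaults can be reported.
    return {"downloadvolumefactor": 1.0, "uploadvolumefactor": 1.0,
            "pending_id": parser.placeholder_id}


# Abridged version of the cell as rendered in the browser.
BROWSER_CELL = ('<td class="discount"><p><img class="pro_free" src="trans.gif" alt="Free"></p>'
                '<span title="2023-07-03 17:56:35">5 hours 6 mins</span></td>')
# Cell as received without running JavaScript.
RAW_CELL = '<td class="discount"><span class="sp_state_placeholder" id="819292"></span></td>'

print(classify(BROWSER_CELL))
print(classify(RAW_CELL))
```

The raw cell yields the default factors of 1/1 and only a pending id, which matches what Jackett reports for this torrent.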

@garfield69 garfield69 removed their assignment Jul 3, 2023
@Siriussee
Author

Siriussee commented Jul 3, 2023

I did a deeper dive. After loading torrents.php, the website sends a POST request to ajax_promotion.php. Would it be possible for Jackett to make that second request as well, as the browser does? The payload looks like this:

&ids[]=819293&ids[]=819292&ids[]=819289&csrf=[REDACTED]
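
A form body of this shape can be assembled with the Python standard library. This is a rough sketch only: the function name promotion_payload and the token value are placeholders, the real CSRF token would have to be obtained from the site, and the leading & seen in the captured payload is omitted.

```python
from urllib.parse import urlencode


def promotion_payload(torrent_ids, csrf_token: str) -> str:
    """Build a form-encoded body with one ids[] entry per torrent plus the csrf field."""
    pairs = [("ids[]", str(torrent_id)) for torrent_id in torrent_ids]
    pairs.append(("csrf", csrf_token))
    # safe="[]" keeps the brackets literal, matching the captured payload.
    return urlencode(pairs, safe="[]")


# "TOKEN" is a stand-in for the redacted csrf value.
body = promotion_payload([819293, 819292, 819289], "TOKEN")
print(body)
```

The ids would come from the sp_state_placeholder spans in the listing, so every torrent on the page can go into a single body.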

@garfield69
Contributor

garfield69 commented Jul 3, 2023

Not with Cardigann at the moment.
The indexer would need to be converted from YAML to C#.
[edit] removed obsolete info

@Siriussee
Author

Ah, thanks for the info. The good news is that only one POST query is needed per search, because any number of torrent IDs can be passed in the payload, so the traffic is only doubled.
As for the engineering hours, I can spend some time on it. I am new to C#, but I am comfortable with web development. Let me know if and how I can contribute.
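
The two-request flow described above could look roughly like the following Python sketch. Everything here is an assumption for illustration: the function names are invented, the two fetch callables are stubs rather than real HTTP calls, and the response format of ajax_promotion.php does not appear in this thread, so the second callable is simply assumed to return the set of ids that are currently free. Promotions other than "Free" are not handled.

```python
def search_with_promotions(query, fetch_listing, fetch_free_ids):
    """Run one search in exactly two requests and attach volume factors to each row."""
    rows = fetch_listing(query)                       # request 1: the torrent listing
    ids = [row["id"] for row in rows]
    free_ids = fetch_free_ids(ids) if ids else set()  # request 2: one batched lookup
    for row in rows:
        row["downloadvolumefactor"] = 0.0 if row["id"] in free_ids else 1.0
        row["uploadvolumefactor"] = 1.0
    return rows


# Stubs standing in for the real requests; they only record that a call happened.
calls = []


def fake_listing(query):
    calls.append("torrents.php")
    return [{"id": "819293"}, {"id": "819292"}, {"id": "819289"}]


def fake_free_ids(ids):
    calls.append("ajax_promotion.php")
    return {"819289"}  # hypothetical: only the Good Witch torrent is free


results = search_with_promotions("good witch s07", fake_listing, fake_free_ids)
print(calls)
print(results)
```

Passing the fetchers in as arguments keeps the request count visible: however many rows the listing returns, the lookup is called once.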
