SynoCatch - Synology RSS Broadcatching Script

Postby h0me5k1n » Wed Oct 07, 2009 1:46 am

[There are more complete instructions further down this thread]

I've written a script based on Linc's MLDonkey broadcatching script (http://mldonkey.sourceforge.net/Broadcatch) to download torrents from RSS feeds. It then uses the "downloadstation" CLI tool (http://downloadstation.jroene.de/) to add the torrents.

synocatch
Code: Select all
#!/opt/bin/bash
# SynoCatch - Torrent downloading by RSS (hacked by h0me5k1n)
#
# Script based on http://linc.homeunix.org:8080/scripts/bashpodder

#need bash and xsltproc installed by ipkg

###CONFIGURATION PARAMETERS
## directory to put the downloaded torrents,with trailing slash
torrentdir="/volume1/downloads/"
# User Vars
CONFG="bp.conf"

# Debug Log (set to /dev/null to turn off)
DEBUG="/dev/null"
#DEBUG="debug.log"

# Make script crontab friendly:
cd $(dirname $0)
echo -e "\nExecuting $0 on $(date)" >> $DEBUG

# feed dump reset
rm -f rssdata

# Read the bp.conf file and wget any url not already in the catch.log file:
while read subscription
     do
     xmldata=$(wget $(echo "$subscription" | sed 's/[^@]*@\(.*\)/\1/') -q -O -)
     expression=$(echo "$subscription" | sed 's/\([^@]*\)@.*/\1/')

     # If $expression is blank or is the same as the source then use a wildcard
     if [ "$expression" = "$subscription" ] || [ -z "$expression" ];
         then
         expression="."
     fi

     # Parsing xml depending on where the torrent url is located inside <link> tags or as the value for the attribute enclosure
     if  fgrep -iq enclosure <<< "$xmldata"
         then
         file=$(echo "$xmldata" | xsltproc parse_enclosure.xsl - 2> /dev/null)
     else
         file=$(echo "$xmldata" | xsltproc parse_link.xsl - 2> /dev/null)
     fi
     
     # Protect against the white space gotchas
     # file=$(tr ' \\\t\r' '_/__' <<< "$file") # doesn't work on Synology DS207
     file=${file//\ /_}
     file=${file//\\/}
   
     for url in $file
         do
         if  echo "$url" | egrep -i -- "$expression" &> /dev/null
             then
             torrent=$(sed 's/\([^#]*\)#.*/\1/' <<< "$url")
             if ! fgrep -i "$torrent" catch.log > /dev/null
                 then
                 # URL and Mininova fixer
                 torrent=$(echo "$torrent" | sed -e "s/mininova.org\/tor/mininova.org\/get/g")
                 
                 # Parse the filename from the end of the $url variable (after the #)
                 torrentname=$(echo "$url" | sed 's/^.*#//')
                 # Append .torrent on the end
                 torrentname="$torrentname.torrent"

                 # Get the torrent, name it correctly and put it in the right directory
                 wget -q -nH -O "$torrentdir$torrentname" "$torrent" && ./downloadstation torrent "$torrentdir$torrentname" && echo "$url" >> catch.log
             fi
         fi
         # rssdata is for test matching
         echo "$url" >> rssdata
     done
done < $CONFG


I think you need "bash" and "xsltproc" installed by ipkg (I can't remember which package xsltproc is in!)

AND...
you need a bp.conf, the parse_enclosure.xsl file, the parse_link.xsl file (see the mldonkey page), a catch.log file (`touch catch.log`) and the downloadstation executable in the same folder as the script.

AND...
you need to check that the "torrentdir" is set (this is where the .torrent files are downloaded to).
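
The list of required files above can be checked before the first run with a small pre-flight test. This is only a sketch: the function name check_synocatch_files is made up for illustration and is not part of synocatch; the file names are the ones listed above.

```bash
# Hypothetical helper (not part of synocatch): report which of the files
# the script expects are missing from the given directory.
check_synocatch_files() {
    local dir="${1:-.}" missing=0 f
    for f in bp.conf parse_enclosure.xsl parse_link.xsl catch.log downloadstation; do
        if [ ! -e "$dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    # Exit status 0 only when every file was found
    return "$missing"
}
```

Run it as `check_synocatch_files /path/to/script/folder` (the path is a placeholder). It prints one line per missing file and returns a non-zero status if anything is absent.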

Now all I need to do is "cron" this to happen daily!

I'm sure this could be tidied up by someone who knows more about scripting and could maybe be added into the downloadstation CLI!
Last edited by h0me5k1n on Sun Mar 21, 2010 9:50 am, edited 2 times in total.
h0me5k1n
Beginner
Beginner
 
Posts: 20
Joined: Sun Jan 25, 2009 1:48 pm

Re: SynoCatch - Synology RSS Broadcatching Script

Postby devro » Wed Nov 18, 2009 2:58 am

Thanks! Works great for me so far. A couple things I found:

- xsltproc is in package libxslt
- I had to use #!/opt/bin/bash for the first line of synocatch, as /bin/sh points to the stock ash shell

Now I just need the final piece to my evil plans: some way to get the files automatically moved to a configured folder on download completion. ie. move from /volume1/download to /volume1/video/TV/<show-name> based on feed.
devro
I'm New!
I'm New!
 
Posts: 3
Joined: Wed Oct 22, 2008 3:20 am

Re: SynoCatch - Synology RSS Broadcatching Script

Postby blouz » Fri Nov 20, 2009 2:21 pm

Additionally, take care that no extra spaces are added when copying the script.
A space before #!/opt/bin/bash would be a problem. (I got these spaces when using the forum's "select all" function.)
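
A quick way to spot and repair this after pasting is to look at the first line of the saved file. The sketch below assumes the only damage is whitespace in front of the "#!" on line 1; both function names are invented for the example.

```bash
# Hypothetical helper: succeed only if line 1 of the file starts with "#!".
has_clean_shebang() {
    case "$(sed -n '1p' "$1")" in
        '#!'*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Hypothetical helper: print the script with leading whitespace removed
# from line 1 only; all other lines are passed through unchanged.
strip_leading_space() {
    sed '1s/^[[:space:]]*//' "$1"
}
```

For example: `has_clean_shebang synocatch || strip_leading_space synocatch > synocatch.fixed`, then check synocatch.fixed, move it into place and chmod +x it.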

I modified the script to subscribe to additional kinds of RSS feeds (added direct HTTP support).
I also removed the .torrent file downloading, as it was not useful for my purposes, and removed the rssdata file handling (I could not work out what it was useful for).
HTTP support lets you download from RSS feeds that link directly to files (software download sites, subtitle sites...).

I write the log of caught files to a _logs subdirectory (create it when installing the script).
I also manage my execution logs outside the script, which is why the debug code is no longer in the script.

Code: Select all
#!/opt/bin/bash
# SynoCatch - Torrent downloading by RSS (hacked by h0me5k1n)
#
# Script based on http://linc.homeunix.org:8080/scripts/bashpodder

#need bash and xsltproc installed by ipkg

###CONFIGURATION PARAMETERS
## directory to put the downloaded torrents,with trailing slash
# User Vars
CONFG="bp.conf"

# Make script crontab friendly:
cd $(dirname $0)
echo -e "\n- $(date) - $0 -"

# Read the bp.conf file and wget any url not already in the catch.log file:
while read subscription
     do
    #xmldata is the rss xml
     xmldata=$(wget $(echo "$subscription" | sed 's/[^@]*@\(.*\)/\1/') -q -t 5 -T 30 -O -)
    #expression is the filtering regexp
     expression=$(echo "$subscription" | sed 's/\([^@]*\)@.*/\1/')
     # If $expression is blank or is the same as the source then use a wildcard
     if [ "$expression" = "$subscription" ] || [ -z "$expression" ];
         then
         expression="."
     fi

     # Parsing xml depending on where the target url is located inside <link> tags or as the value for the attribute enclosure
     if  fgrep -iq enclosure <<< "$xmldata"
         then
         file=$(echo "$xmldata" | xsltproc parse_enclosure.xsl - 2> /dev/null)
     else
         file=$(echo "$xmldata" | xsltproc parse_link.xsl - 2> /dev/null)
     fi
     
     # Protect against the white space gotchas
     # file=$(tr ' \\\t\r' '_/__' <<< "$file") # doesn't work on Synology DS207
     file=${file//\ /_}
     file=${file//\\/}
   
     for url in $file
         do
         if  echo "$url" | egrep -i -- "$expression" &> /dev/null
             then
             target=$(sed 's/\([^#]*\)#.*/\1/' <<< "$url")
             if ! fgrep -i "$target" _logs/catch.log > /dev/null
                 then
                 # URL and Mininova fixer
                 target=$(echo "$target" | sed -e "s/mininova.org\/tor/mininova.org\/get/g")
                 echo -e "Submitting URL $target on $(date '+(%r %D)')\n$url"
                 ./downloadstation add "$target" && echo "$url" >> _logs/catch.log
             fi
         fi
         
     done
done < $CONFG
Last edited by blouz on Fri Dec 11, 2009 2:47 pm, edited 3 times in total.
blouz
Beginner
Beginner
 
Posts: 27
Joined: Fri Nov 20, 2009 2:15 pm

Re: SynoCatch - Synology RSS Broadcatching Script

Postby h0me5k1n » Sat Dec 05, 2009 12:05 pm

I've just noticed a problem - if the location which the wget command is trying to get the .torrent file from no longer exists (it's happened quite a lot recently) then the wget command will get stuck in a loop retrying the download... This can result in 0kb .torrent files being created and the script not being able to finish (or taking ages to do so!). I'm just testing it and adding some functionality to limit the number of retries and checking for 0kb files.
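
A minimal sketch of the kind of fix described here, assuming GNU wget's --tries and --connect-timeout options are available on the box. The function names are invented for illustration and this is not the code that ended up in the script.

```bash
# Hypothetical helper: remove the file when it is missing or zero bytes;
# succeed only if a file with content is left on disk.
discard_if_empty() {
    if [ -s "$1" ]; then
        return 0
    fi
    rm -f "$1"
    return 1
}

# Hypothetical helper: fetch one URL with bounded retries, then keep the
# result only if it is non-empty.
fetch_torrent() {
    local url="$1" dest="$2"
    # --tries and --connect-timeout stop wget from retrying a dead link for ages
    wget -q --tries=2 --connect-timeout=10 -O "$dest" "$url"
    discard_if_empty "$dest"
}
```

With this, `fetch_torrent "$torrent" "$torrentdir$torrentname" && echo "$url" >> catch.log` would only log an item once a non-empty .torrent file exists.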

I've updated the first line too - Thanks blouz

Re: SynoCatch - Synology RSS Broadcatching Script

Postby Fly » Mon Jan 25, 2010 6:26 pm

Hi,

Is there any possibility of getting some instructions on how this script should be installed? I guess there are more users who aren't Linux/Unix gurus.

Fly
Fly
I'm New!
I'm New!
 
Posts: 1
Joined: Mon Jan 25, 2010 6:23 pm

Re: SynoCatch - Synology RSS Broadcatching Script

Postby laser21 » Fri Mar 12, 2010 12:15 am

Guys, any help on this would be really appreciated! A guide of some kind for us (Linux) noobs! :)

I don't understand why RSS is still not fully supported in 2.3...
laser21
Rookie
Rookie
 
Posts: 38
Joined: Mon Feb 01, 2010 9:57 pm

Re: SynoCatch - Synology RSS Broadcatching Script

Postby kerryandjane » Sun Mar 21, 2010 3:15 am

Fly wrote: Hi,

Is there any possibility of getting some instructions on how this script should be installed? I guess there are more users who aren't Linux/Unix gurus.

Fly


Agreed, please help us noobs set up this RSS feed.

Reading all the entries, I assume this is what we want: just click on an RSS feed and it will automatically start downloading the file we subscribed to, right?
DS209+ 2 x 2TB HD's (2xWD20EARS-00S8B1)
running latest firmware DSM 4.2
SparkLAN CAS-371W IP Cam
iPhone 3GS with SynoDS
and slim PS3
Get tech or die trying
kerryandjane
Versed
Versed
 
Posts: 218
Joined: Tue Feb 03, 2009 3:39 pm

Re: SynoCatch - Synology RSS Broadcatching Script

Postby h0me5k1n » Sun Mar 21, 2010 9:47 am

kerryandjane wrote: Reading all the entries, I assume this is what we want: just click on an RSS feed and it will automatically start downloading the file we subscribed to, right?

You don't need to click on anything - the script (once configured and scheduled with cron) will automatically parse an RSS feed and download the torrents that are in the feed - it can either download all torrents in the feed or you can specify a Regular Expression to filter the torrents it actually gets. You just need to configure which RSS feeds you want to check and set the Regular Expressions as shown in the MLDonkey Page.
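
To make the bp.conf format concrete: each line is either a bare feed URL or regex@feed-url. The sketch below splits a line with the same sed expressions the script uses and applies the filter to one feed item. The function names and the example.com URLs are made up for illustration.

```bash
# Feed URL: everything after the first "@" (the whole line if there is no "@")
subscription_url() {
    printf '%s\n' "$1" | sed 's/[^@]*@\(.*\)/\1/'
}

# Filter regex: everything before the first "@"; falls back to "." (match
# everything) when the line has no "@" or nothing in front of it
subscription_filter() {
    local expression
    expression=$(printf '%s\n' "$1" | sed 's/\([^@]*\)@.*/\1/')
    if [ "$expression" = "$1" ] || [ -z "$expression" ]; then
        expression="."
    fi
    printf '%s\n' "$expression"
}

# Succeed when the item matches the filter (extended regex, case-insensitive)
item_matches() {
    printf '%s\n' "$1" | grep -Eiq -- "$2"
}
```

So a line such as `heroes.*720p@http://example.com/rss.xml` downloads only items matching heroes.*720p, while a line containing just `http://example.com/rss.xml` downloads everything in that feed.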

My requirements for the script have changed slightly.... I've split the "Broadcatching" (getting the torrents from the RSS feed) and the automatic loading of the torrents into two parts. I have a folder on my DiskStation where the synocatch script saves the torrent files and another script loads them from the folder into the DownloadStation... This means it doesn't matter whether the script puts the torrent files in the folder or if I do it myself - they'll automatically be picked up and downloaded.

Automating .torrent downloads by RSS

Put these files in the same folder

"synocatch.shell"
Code: Select all
#!/opt/bin/bash
# SynoCatch - Torrent downloading by RSS (hacked by h0me5k1n)
#
# Script based on http://linc.homeunix.org:8080/scripts/bashpodder
# discussion on http://forum.synology.com/enu/viewtopic.php?f=38&t=18039

#need bash and xsltproc installed by ipkg

###CONFIGURATION PARAMETERS
## directory to put the downloaded torrents,with trailing slash
torrentdir="/volume1/downloads/"
# User Vars
CONFG="bp.conf"

# Debug Log (set to /dev/null to turn off)
DEBUG="/dev/null"
#DEBUG="debug.log"

# Make script crontab friendly:
cd $(dirname $0)
echo -e "\nExecuting $0 on $(date)" >> $DEBUG

# feed dump reset
rm -f rssdata

# Read the bp.conf file and wget any url not already in the catch.log file:
while read subscription
 do
  xmldata=$(wget $(echo "$subscription" | sed 's/[^@]*@\(.*\)/\1/') -q -O -)
  expression=$(echo "$subscription" | sed 's/\([^@]*\)@.*/\1/')
  # If $expression is blank or is the same as the source then use a wildcard
  if [ "$expression" = "$subscription" ] || [ -z "$expression" ];
   then
   expression="."
  fi
  # Parsing xml depending on where the torrent url is located inside <link> tags or as the value for the attribute enclosure
  if  fgrep -iq enclosure <<< "$xmldata"
   then
   file=$(echo "$xmldata" | xsltproc parse_enclosure.xsl - 2> /dev/null)
  else
   file=$(echo "$xmldata" | xsltproc parse_link.xsl - 2> /dev/null)
  fi
  # Protect against the white space gotchas
  # file=$(tr ' \\\t\r' '_/__' <<< "$file") # doesn't work on Synology DS207
  file=${file//\ /_}
  file=${file//\\/}
 
  for url in $file
   do
#    echo "url is $url"
    if  echo "$url" | egrep -i -- "$expression" &> /dev/null
     then
     torrent=$(sed 's/\([^#]*\)#.*/\1/' <<< "$url")
     if ! fgrep -i "$torrent" catch.log > /dev/null
      then
      # URL and Mininova fixer
      torrent=$(echo "$torrent" | sed -e "s/mininova.org\/tor/mininova.org\/get/g")
      # Parse the filename from the end of the $url variable (after the #)
      torrentname=$(echo "$url" | sed 's/^.*#//')
      # Append .torrent on the end
      torrentname="$torrentname.torrent"
      # Get the torrent, name it correctly and put it in the right directory
      if wget --connect-timeout=10 --tries=2 -qncH -O "$torrentdir$torrentname" "$torrent"
       then
       echo "$torrentname downloaded from $torrent" >> "$DEBUG"
      else
       echo "failed to get $torrentname from $torrent" >> "$DEBUG"
      fi
      # Delete the torrent file if it's an empty file
      if [ -s "$torrentdir$torrentname" ]
       then
       echo "$url" >> catch.log
      else
       echo "$torrentname is 0kb... deleting..." >> "$DEBUG"
       rm -f "$torrentdir$torrentname"
      fi
     fi
    fi
     # rssdata is for test matching
     echo "$url" >> rssdata
  done
 done < $CONFG
[*] You need to configure the "torrentdir" variable with the folder that you want the torrent files downloaded to.
[*] You can set the "DEBUG" variable to troubleshoot the script.
[*] This version fixes the earlier issue where 0kb torrent files were downloaded and caused errors.
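
Each item the XSL files hand back has the form URL#title, and the script turns that into a download URL and a file name. A small sketch of that step using shell parameter expansion instead of sed; the helper names and the sample item are invented for illustration.

```bash
# Download URL: everything before the first "#"
item_url() {
    printf '%s\n' "${1%%#*}"
}

# File name: everything after the last "#", with .torrent appended
item_name() {
    printf '%s.torrent\n' "${1##*#}"
}

# Mininova fixer, as in the script: turn a /tor/ link into a /get/ link
fix_mininova() {
    printf '%s\n' "$1" | sed 's/mininova.org\/tor/mininova.org\/get/g'
}
```

Note that an item without any "#" comes back unchanged from both item_url and item_name, which is also what the sed expressions in the script do, so the file name would then contain the whole URL.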

the "bp.conf" file as shown in the MLDonkey Page
[*] Customise the links for your own torrent requirements

the "parse_enclosure.xsl" file as shown in the MLDonkey Page

the "parse_link.xsl" file as shown in the MLDonkey Page

You need "bash" and "libxslt" installed by ipkg too
Code: Select all
ipkg install bash libxslt

edit your "/etc/crontab" file and add entries to configure when the script will run... this is what mine looks like:
Code: Select all
# Synocatch script - load torrents from RSS feeds
5   10,11,23,2,4,6,7,8   *   *   *   root   /PATHTOSCRIPT/synocatch.shell >/dev/null 2>&1
[*] The first line is a "comment".
[*] You'll need to edit the "PATHTOSCRIPT" section
[*] This runs the script at 2.05am, 4.05am, 6.05am, 7.05am, 8.05am, 10.05am, 11.05am and 11.05pm every day
[*] The format of the crontab entry is important - IIRC you must use tabs between entries and NOT spaces
[*] you can restart crontab without rebooting using the following command
Code: Select all
/usr/syno/etc.defaults/rc.d/S04crond.sh stop && /usr/syno/etc.defaults/rc.d/S04crond.sh start
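
Since tabs are easy to lose when copying from a web page, one option is to let printf build the crontab line. This is only a sketch: cron_line is a made-up helper, and whether tabs are strictly required is taken from the note above rather than verified.

```bash
# Hypothetical helper: print a crontab entry with tab-separated fields.
# Arguments: minute hour day-of-month month day-of-week user command
cron_line() {
    printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4" "$5" "$6" "$7"
}
```

For example, `cron_line 5 '10,11,23,2,4,6,7,8' '*' '*' '*' root '/PATHTOSCRIPT/synocatch.shell >/dev/null 2>&1' >> /etc/crontab` appends the entry shown above with real tabs between the fields. Quote the '*' fields so the shell does not expand them.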


Loading Torrent Files Saved in a Directory
This script will look at the contents of a folder and if it finds any torrent files it will load them... I have a version which does nzb files at the same time too!

"syno_torrent_load.shell"
Code: Select all
#!/bin/sh
# Script to start .torrent files saved in a predefined folder

###CONFIGURATION PARAMETERS
## directory to scan for saved .torrent files (no trailing slash)
TORRENTDIR="/volume1/downloads"
SCRIPTNAME=`basename $0`
LOGFILE=/dev/null

# Make script crontab friendly:
cd $(dirname $0)

# Check for file presence
if ls $TORRENTDIR | grep torrent ;
 then
  # enter date in log file
  echo "-----------------------" >> $LOGFILE
  echo "LOG TIMESTAMP - $(date)" >> $LOGFILE

  find $TORRENTDIR -type f -name "*.torrent" | while read EACHFILE
  do

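  # TORRENT is the path with spaces and square brackets backslash-escaped (only used by the debug echo below)
  # NEWFILENAME is the path with spaces and square brackets removed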
  TORRENT=`echo $EACHFILE | sed 's/\([ ]\)/\\\ /g' | sed 's/\([[]\)/\\\[/g' |sed 's/\([]]\)/\\\]/g'`
  NEWFILENAME=`echo $EACHFILE | sed 's/[ ]*//g' | sed 's/[[]*//g' | sed 's/[]]*//g'`

  # Echo the variables (for debugging)
  #echo TORRENT is $TORRENT
  #echo NEWFILENAME is $NEWFILENAME
  mv "$EACHFILE" "$NEWFILENAME"

  # load into the downloadstation
  if downloadstation torrent "$NEWFILENAME"
   then
    rm "$NEWFILENAME"
    echo -e "$NEWFILENAME loaded" >> $LOGFILE
   else
    echo -e "failed to load $NEWFILENAME using downloadstation cli utility" >> $LOGFILE
  fi 
  done
 else
  echo "No .torrent files found" >> $LOGFILE
fi
[*] You need to set "TORRENTDIR" to the folder where the torrent files are saved (the folder that synocatch downloads them to).
[*] You can set the "LOGFILE" variable to troubleshoot the script.

Install DownloadStation CLI by Wawe
[*] I recommend editing the DownloadStation CLI script as per the post by blouz. This will ensure that the torrents added use the seeding configuration you have set up.

edit your "/etc/crontab" file and add entries to configure when the script will run... this is what mine looks like:
Code: Select all
# Synology Torrent Loader - loads torrents saved in a folder
10   0,1,2,3,4,5,6,7,22,23   *   *   *   root   /PATHTOSCRIPT/syno_torrent_load.shell >/dev/null 2>&1
[*] The first line is a "comment".
[*] You'll need to edit the "PATHTOSCRIPT" section
[*] This runs the script at 0.10am, 1.10am, 2.10am, 3.10am, 4.10am, 5.10am, 6.10am, 7.10am, 10.10pm and 11.10pm every day
[*] The format of the crontab entry is important - IIRC you must use tabs between entries and NOT spaces
[*] you can restart crontab without rebooting using the following command
Code: Select all
/usr/syno/etc.defaults/rc.d/S04crond.sh stop && /usr/syno/etc.defaults/rc.d/S04crond.sh start


Combining the Scripts
Using the information in this thread - the first post and these additional instructions - the scripts could easily be combined to automate the Broadcatching and the loading into DownloadStation into a single script if that's what you want... It's just not what I want :D

Other Scripts
I have another script which runs by crontab to "manage" the torrent downloads using the DownloadStation CLI... Once torrents are finished they won't disappear from the DownloadStation until they are manually removed. That script checks for "COMPLETED" downloads and removes any it finds. I'll post that another day... ^^ that lot took me ages to write! :D


Thanks to Wawe, blouz and linc

Re: SynoCatch - Synology RSS Broadcatching Script

Postby laser21 » Sun Mar 21, 2010 10:48 am

I think this should be added to the wiki!!

It's important for people who don't install rtorrent and want that extra functionality!!

Thanks!!

Re: SynoCatch - Synology RSS Broadcatching Script

Postby kerryandjane » Sun Mar 21, 2010 11:01 am

Homeskin you're a legend. i'll have a crack at it tomorrow. you need to add this to the wiki.

Thanks for your quick and detailed response mate.

Re: SynoCatch - Synology RSS Broadcatching Script

Postby h0me5k1n » Sun Mar 21, 2010 3:08 pm

laser21 wrote: It's important for people who don't install rtorrent and want that extra functionality!!

It's important to note that this script/configuration is not specific to Synology devices... I've been using this script (or a slight variation of it) on a linux server at home for a few years to "broadcatch" torrents.

Re: SynoCatch - Synology RSS Broadcatching Script

Postby John_P » Sat Apr 10, 2010 4:18 pm

Thanks for a great script and an incentive to learn something new.

Apart from upgrading the firmware, I've never done anything non-standard with my DS207+.

I had a play with this and came across a couple of problems which meant it didn't fulfil my needs:

1 I use private torrent sites that require logins and cookies, so the basic script won't work.
2 Some of the RSS feeds fail to parse with xsltproc as they are not encoded in UTF-8.

Bear in mind that I haven't done anything with Linux before, let alone any bash. I did study programming at uni, but I haven't done any programming since America's Army came out, which must be 6+ years ago.
Back then I ran a UK-based server before there were any official servers in the UK, and wrote a web site front end from scratch (HTML, JavaScript, PHP, MySQL) that dealt with member logins, displayed game stats and the last completed games, and maintained a database of who killed who and when, etc. But I digress.

Problems I had during the modifications:
1 Couldn't get the files to run: I forgot to chmod!
2 Got sick of vi and forgetting its commands, so I installed joe.
3 Soon found I needed iconv to convert some of the feeds.
4 Some checking and cleaning up of torrent names was needed.
5 Getting the individual info from the torrent sites to be able to log in and download the RSS feed.

iconv was the biggest hurdle. I spent an evening going round in circles trying to find a definitive way of getting it installed.
4 days later I had toolchains installed and libiconv-1.13.tar.gz configured, made and installed.
xsltproc still would not parse until I removed the first line of xmldata and replaced it with the correct encoding info.

I changed the conf file and added extra parameters to hold the login details: username, password and login URL.

When synocatch runs it checks whether the extra details are present. If they are, it processes them and checks whether the cookie is available; if not, it logs in and downloads the cookie, then downloads the RSS feed.

As some of the sites I frequent use windows-1251 encoded feeds, I have to check the encoding, re-encode as necessary and then parse.
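
The check-and-re-encode step can be sketched roughly as follows. This is not the posted script: the function names are invented, it assumes an iconv binary is on the PATH, and it assumes the XML declaration is on the first line with the encoding in double quotes. UTF-8 and ISO-8859-1 feeds are passed through untouched, in line with the observation later in this thread that converting ISO-8859-1 broke the parsing.

```bash
# Hypothetical helper: read XML on stdin and print the declared encoding in
# upper case; prints UTF-8 when line 1 declares none (the XML default).
detect_xml_encoding() {
    local enc
    enc=$(sed -n '1s/^.*<?xml[^>]*encoding="\([^"]*\)".*$/\1/p' | tr '[:lower:]' '[:upper:]')
    printf '%s\n' "${enc:-UTF-8}"
}

# Hypothetical helper: read XML on stdin and print it as UTF-8, fixing the
# declaration on line 1 so that it matches the new encoding.
to_utf8() {
    local xml enc
    xml=$(cat)
    enc=$(printf '%s\n' "$xml" | detect_xml_encoding)
    case "$enc" in
        UTF-8|ISO-8859-1)
            printf '%s\n' "$xml" ;;
        *)
            printf '%s\n' "$xml" | iconv -f "$enc" -t UTF-8 \
                | sed '1s/encoding="[^"]*"/encoding="UTF-8"/' ;;
    esac
}
```

In the script this would sit between the wget and the xsltproc call, along the lines of `file=$(echo "$xmldata" | to_utf8 | xsltproc parse_link.xsl -)`.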

synoload was left untouched apart from redirecting the log file to the same one as synocatch.

The new version has been running for 3 days with no errors yet :-)

I just sat down and hacked this without any real plan and learnt as I went along, so it's not polished.

I still need to do the tidying up and fix a couple of bodges, but it seems OK.

The main thing I'm not happy with is the error handling: writing a message to a log and hoping things will be OK on the next run is not what I'd normally do.

I now know enough to be dangerous, but I don't know enough to know I'm being dangerous.

John
John_P
I'm New!
I'm New!
 
Posts: 4
Joined: Wed Apr 07, 2010 2:30 pm

Re: SynoCatch - Synology RSS Broadcatching Script

Postby h0me5k1n » Sat Apr 10, 2010 9:09 pm

Your journey sounds a lot like mine, John! I don't claim to be skilled in scripting/programming/Linux, but I seem to be able to find my own way to make my computer usage easier when I think of something that I want! In this case I've ported my broadcatching script to my Synology box!

I don't use private torrents so I'd never thought about user logins, and I've never had problems parsing RSS feeds, although I've really only used Mininova, TVRSS (now EZTV) and ShowRSS (the only one I'm using now!). Post your modified script once you're happy with it! I'm interested to see it.

The "error handling writing a message to a log" and "the hoping things will be ok on the next run" may not be ideal but this script (or a slight variation) has pretty much been running unattended on my home server for a few years without any problems (originally with torrentflux-b4rt) - My aim was always to create something that just runs and doesn't need to be checked. I added some error handling around 0KB .torrent files and included the ability to debug what's going on in case of problems but I'm open to suggestions!

Re: SynoCatch - Synology RSS Broadcatching Script

Postby John_P » Tue Apr 13, 2010 11:15 pm

Sorry for not posting an update. I've been busy with other things, including getting the car ready for its service and MOT today.

The error handling comment was not a slight on you or the posted script but an indictment of my own inability to do anything better :mrgreen:

I have been working on automating demonoid torrents and finally sorted it tonight.

I had to write in a couple of exceptions to handle it.

First, the data posted to log in is different: "username" is "nickname", so the cookies I had been getting were actually empty.

Second, the demonoid page is encoded as iso-8859-1; at first the script was re-encoding it to utf-8 and breaking the parsing.

Third, once the initial xmldata had been downloaded from demonoid and the torrents to be downloaded had been identified, I had to download the guide page for each one and extract the file download URL from it (sed did not want to handle it; large file or long line?), then download the torrent file using the cookie and enter the data into the catch file.
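
The first exception (the differing form field name) can be sketched like this. The helper names are invented for illustration; the wget options are the same ones used in the script posted further down, plus -O /dev/null so the login page itself is not left behind. Passwords containing characters such as & or = would need URL-encoding, which this sketch does not do.

```bash
# Hypothetical helper: build the POST body for a login form. demonoid expects
# "nickname" where the other sites mentioned here use "username".
login_post_data() {
    local site="$1" user="$2" password="$3" field="username"
    case "$site" in
        *demonoid*) field="nickname" ;;
    esac
    printf '%s=%s&password=%s\n' "$field" "$user" "$password"
}

# Hypothetical helper: log in once and save the session cookie, unless a
# cookie file already exists. A file can exist without holding a valid
# session, so delete it to force a fresh login.
ensure_cookie() {
    local site="$1" user="$2" password="$3" login_url="$4" cookie_file="$5"
    [ -f "$cookie_file" ] && return 0
    wget -q -O /dev/null --save-cookies="$cookie_file" --keep-session-cookies \
         --post-data="$(login_post_data "$site" "$user" "$password")" "$login_url"
}
```

The feed itself would then be fetched with `wget --load-cookies="$cookie_file" -q -O - "$feed_url"`.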

It will make sense when I post it after cleaning it up a bit, hopefully later this week.

John

Re: SynoCatch - Synology RSS Broadcatching Script

Postby John_P » Wed Apr 14, 2010 9:58 pm

OK, here it is, with 1 day of testing since the modifications to cope with demonoid, so you may encounter some unexpected results along the way.

Some of the comments I have left in are to remind me what I'm doing or what I did; they can be removed.

I still need to add something to tidy up some of the files downloaded during logins and cookie handling.

Some sites need you to log in before you can download torrents. An example is demonoid: you can only download a few new torrents before you get an error page.

Some sites have specific download pages with all the information and full instructions on how to implement wget logins. For other sites you need to examine the login page source code, looking at the form post data and action to determine the required procedure.

Most private torrent sites allow you to use passkeys and authkeys to eliminate the need to log in; the original version of the script will cope with those.

I used a file called 'feed.conf' to hold the configuration parameters as I was running both the original version and my own at the same time.

The first 2 parameters are the same as before, except that the line can be commented out with a leading # (hash).
The next 4 parameters are:
Name for cookie = just a name for the cookie file; I tend to use the site name, e.g. demonoid
User name = the name you would use to log in at that site
Password = the password used to log in at that site
URL to login page = the specific web page used to log in
Code: Select all
Search Pattern.@RSS Feed URL#Name for cookie@user name@password@URL to login page
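
As an illustration of how such a line breaks down into its six fields, here is a sketch using shell parameter expansion rather than the sed expressions in the script below. The function name and the variable names it sets are made up for the example, and it assumes that neither the feed URL nor the login fields contain extra "#" or "@" characters.

```bash
# Hypothetical helper: parse one feed.conf line into global variables.
# Returns non-zero for blank lines and lines commented out with "#".
parse_feed_line() {
    local line="$1" public login
    expression= ; feed_url= ; site= ; user= ; password= ; login_url=
    case "$line" in
        ''|'#'*) return 1 ;;
    esac
    public="${line%%#*}"                 # regex@url part, before the first "#"
    expression="${public%%@*}"
    feed_url="${public#*@}"
    case "$line" in
        *'#'*)                           # optional login fields are present
            login="${line#*#}"
            site="${login%%@*}";     login="${login#*@}"
            user="${login%%@*}";     login="${login#*@}"
            password="${login%%@*}"; login_url="${login#*@}"
            ;;
    esac
    return 0
}
```

After `parse_feed_line "$torrent_private_feed"`, an empty $site means the line has no login details and can be handled as an ordinary feed.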

Real examples. Unfortunately, because some of the lines are so long, they get wrapped when I view them in preview.
Code: Select all
.@http://www.videoseed.com/feeds.php?feed=torrents_notify_1234qwertyuiui567890&user=1234567890&auth=1234567890abcdefghijklm&passkey=abcdefghijklm1234567890&authkey=1234567890abcdefghi#videoseed@user@password@http://www.videoseed.com/login.php
#.ice.@http://www.acetorrents.net/rss.php?feed=dl&cat=68,65,54,91,89,48,58,49,44,40,84,81,55,52,51,46,86,73,39,83,74,88,43,90&passkey=abcdefghi1234567890a#acetorrents@user@password@www.acetorrents.net/takelogin.php
.ice.@http://www.acetorrents.net/rss.php?feed=dl&cat=68,65,54,91,89,48,58,49,44,40,84,81,55,52,51,46,86,73,39,83,74,88,43,90&passkey=abcdefghi1234567890a
.tv.@http://static.demonoid.com/rss/0.xml#demonoid@user@password@http://www.demonoid.com/account_handler.php

I have removed the original authkeys, passkeys, login names and passwords for obvious reasons, but if you don't have an account with that particular site feel free to use the ones I have posted; they absolutely will not work.

The first site, videoseed, lets you tailor the RSS feed to your own requirements, so I can download everything from it; it also offers the choice of passkeys or login.
The second and third lines are for acet. The second line is commented out (useful when you consider the length of some of these lines; they all have shortened passkeys and authkeys) and contains the correct login URL. The third line is for acet again, but uses only the passkey for authorisation.
The last example is demonoid and will download anything with tv in the name, be it the category or the title.

All the login URLs are correct at this time.
You don't need to use authkeys etc. with logins. To skip the login, just don't add the last four parameters, as per example 3.

the script
Code: Select all
#!/opt/bin/bash
# SynoFeedCatch - Torrent downloading by RSS (based on a script by h0me5k1n)
#
# Script based on http://linc.homeunix.org:8080/scripts/bashpodder
# discussion on http://forum.synology.com/enu/viewtopic.php?f=38&t=18039

# Needs bash and xsltproc installed by ipkg
# gcc and ruby(?) were also installed during my travels but are probably not needed
# Install toolchains via ipkg: 'ipkg install optware-devel'
# iconv: see http://www.gnu.org/software/libiconv/
# Download ftp://ftp.gnu.org/gnu/libiconv/libiconv-1.13.tar.gz
# tar xvfz libiconv-1.13.tar.gz
# ./configure --prefix=/usr/local / make / make install

###CONFIGURATION PARAMETERS
## directory to put the downloaded torrents,with trailing slash
torrentdir="/volume1/torrents/rss/"
# User Vars
# feed.conf contains lines with 6 strings separated by '@' and '#':
# 'title regex'@'RSS feed url'#'name for cookie, i.e. site name'@'login name'@'user password'@'login url / script'
# Lines can be commented out by placing a # as the first character
PRIVATEFEED="feed.conf"
#   Path to store cookies in
COOKIEPATH="/volume1/torrents/rss/cookies/"
rm -f DEBUG # separate debug file I used for xsltproc
# Debug Log (set to /dev/null to turn off)
#DEBUG="/dev/null"
DEBUG=$torrentdir"debug.log"
echo "Debug log set to $DEBUG"
# Make script crontab friendly:
cd $(dirname $0)
echo -e "\nExecuting $0 on $(date)" >> $DEBUG
# feed dump reset
rm -f rssdata
# Read the feed.conf file and wget any url not already in the catch.log file:
#    subscription is the regex@url of the rss feed, taken from feed.conf line by line.
while read torrent_private_feed
do
   if test "$(echo "$torrent_private_feed" | sed 's/\(^#\).*$/\1/' )" = "#"
      then
# line is hashed out skip processing
         echo "Line HASHED out not processing $torrent_private_feed" >> $DEBUG
         continue
   fi
#<title regex>@<rss feed urL>#<name for cookie>@<user>@<password>@<login url>
#\(^[^@]\+@[^@#]\+\)#\+\([^#@]*@[^@]*@[^@]*@[^@]*\)
   subscription=$(echo "$torrent_private_feed" | sed 's/\(^[^@]\+@[^@#]\+\)#\+\([^#@]\+@[^@]\+@[^@]\+@[^@]\+\)/\1/' )
   private_feed=$(echo "$torrent_private_feed" | sed 's/\(^[^@]\+@[^@#]\+\)#\+\([^#@]\+@[^@]\+@[^@]\+@[^@]\+\)/\2/' )

# In the conf file the login data may be absent; I use that as a flag not to process
# the line as a private torrent site needing a login. When the pattern does not match,
# sed prints the line unchanged, so both substitutions above return the full string
# and subscription equals private_feed.
   if [ "$subscription" = "$private_feed" ]
      then
         private_feed=
   fi
   expression=$(echo "$subscription" | sed 's/\([^@]*\)@.*/\1/' )
#   and check if the subscription contain a site that needs a login??
# -z string = True if the length of the string is 0.
# -n string = True if the length of the string is non-zero.
   if ! [ -z "$private_feed" ]
      then
#   Private site requiring login found
#   Extract the torrent site name data from the line using sed
         private_site=$(echo "$private_feed" | sed 's/^\([^@]*\)@\([^@]*\)@\([^@]*\)@\([^@]*\)$/\1/' )
         private_user=$(echo "$private_feed" | sed 's/^\([^@]*\)@\([^@]*\)@\([^@]*\)@\([^@]*\)$/\2/' )
         private_password=$(echo "$private_feed" | sed 's/^\([^@]*\)@\([^@]*\)@\([^@]*\)@\([^@]*\)$/\3/' )
         private_login=$(echo "$private_feed" | sed 's/^\([^@]*\)@\([^@]*\)@\([^@]*\)@\([^@]*\)$/\4/' )
         echo -e "$LINENO Private RSS feed processing login cookie for $private_site" >> $DEBUG
#   Do we have a cookie?
         cookie_file="$COOKIEPATH$private_site.txt"
         if ! [ -f "$cookie_file" ]
            then
#   Check for directory / file path and create as necessary
               if ! [ -d "$COOKIEPATH" ]
                  then
                     mkdir "$COOKIEPATH" 2>> $DEBUG
               fi
#   Get the cookie from site
# Demonoid Exception posted data username is nickname!!
# wget --save-cookies=$cookie_file --keep-session-cookies --post-data="nickname=$private_user&password=$private_password" "http://www.demonoid.com/account_handler.php"
               if echo "$subscription" | egrep -i "demonoid" &> /dev/null # gotcha for demonoid
                  then
                     post_data="nickname=$private_user&password=$private_password"
                     echo "$LINENO demonoid login (nickname field)" >> $DEBUG
                  else
                     post_data="username=$private_user&password=$private_password"
                     echo "$LINENO standard login (username field)" >> $DEBUG
               fi
               wget --save-cookies="$cookie_file" --keep-session-cookies --post-data="$post_data" "$private_login"
            fi
            xmldata=$(wget --load-cookies="$cookie_file" $(echo "$subscription" | sed 's/[^@]*@\(.*\)/\1/') -q -O -)
      else
            xmldata=$(wget $(echo "$subscription" | sed 's/[^@]*@\(.*\)/\1/') -q -O -)
   fi
   if  [ -z "$xmldata" ]
      then
         echo -e "$LINENO Failed to get data from "$(echo "$subscription" | sed 's/[^@]*@\(.*\)/\1/') >> $DEBUG
      else
# 1st line contains version and encoding
         version=$(echo "$xmldata" | sed '2,$d') #>> $DEBUG
# extract encoding <?xml version="1.0" encoding="utf-8"?>
         encoding=$(echo "$version" | sed 's/^.*<?xml version="[^"]*".*encoding="\([^"]*\)".*?>.*$/\1/' ) #>> $DEBUG
# convert to upper case
         encoding=$(echo "$encoding" | tr "[:lower:]" "[:upper:]") #>> $DEBUG
         if [ "$encoding" = "UTF-8" ] || [ "$encoding" = "ISO-8859-1" ] #converting 8859-1 breaks the parser
            then
               echo -e "$LINENO RSS Encoding is $encoding" >> $DEBUG
            else
               echo -e "$LINENO Reencoding from $encoding to UTF-8" >> $DEBUG
#   xsltproc throws a wobbly if you don't change the file declaration of the encoding
# Insert the new declaration before line 1, then delete the old first line.
# The insert has to come first: 'd' ends processing of line 1, so nothing after it runs.
               xmldata1=$(echo "$xmldata" | sed -e '1i\<?xml version="1.0" encoding="UTF-8"?>' -e '1d') #>> $DEBUG
# Re encode file to UTF-8; any iconv errors go to the debug log
               xmldata=$(echo "$xmldata1" | iconv -c -s -f "$encoding" -t UTF-8 2>> $DEBUG)
#examples   <?xml version="1.0" encoding="utf-8"?>
               #<?xml version="1.0" encoding="windows-1251" ?>
               #<?xml version="1.0" encoding="iso-8859-1"?>
# Wanted to do some validation of the file here but could not get it to do what I wanted
               #valid=$( echo "$utf8data" | xmllint --xmlout - ) 2>> $DEBUG
               #echo -e "\n$LINENO Valid=$Valid*" >> $DEBUG
         fi
# If $expression is blank or is the same as the source then use a wildcard
         if [ "$expression" = "$subscription" ] || [ -z "$expression" ]
            then
               expression="."
         fi
# Parsing xml depending on where the torrent url is located inside <link> tags or as the value for the attribute enclosure
         if  fgrep -iq enclosure <<< "$xmldata"
            then
                file=$(echo "$xmldata" | xsltproc -v --novalid parse_enclosure.xsl - 2>> $DEBUG )
           else
              file=$(echo "$xmldata" | xsltproc -v --novalid parse_link.xsl - 2>> $DEBUG)
        fi
        file=$(tr ' \\\t\r' '_/__' <<< "$file") # doesn't work on Synology DS207, hence the two fallbacks below
         file=${file//\ /_}
        file=${file//\\/}
        for url in $file
        do
            if  echo "$url" | egrep -i "$expression" &> /dev/null
              then
                 torrent=$(sed 's/\([^#]*\)#.*/\1/' <<< "$url")
# Another case to cope with demonoid
                 if echo "$subscription" | egrep -i "demonoid" &> /dev/null #should be able to do this without naming an exception
                    then
#                       echo "$LINENO Demonoid Torrent" >> $DEBUG
# the parsed feed has no download link, just a link to an html page with an embedded download link and login.
# got a match download the html page
                        downloadpage=$(wget $(echo "$url" | sed 's/\([^#]*\)#.*/\1/') -q -O -)
# the line we need is   example <a href="/files/download/2208330/775222">Click here to download the torrent</a>
                                          #   <a href="/files/download/2208289/6976998">Click here to download the torrent</a>
# had problems here: sed seemed to only return the entire webpage, so had to break it down into a smaller chunk first
                        downloadurl=$(echo "$downloadpage" | grep -i '<a href="/files/download/[0-9]*/[0-9]*">Click here to download the torrent</a>')
                        torrenturl=$(echo "$downloadurl" | sed 's/.*href="\(\/files\/download\/[^"]*\)">Click here to download the torrent<\/a>.*/\1/' )
                        torrent="http://www.demonoid.com$torrenturl"
# Put the new torrent file url back into the $url catch file entry will be correct.
                        torrentname=$( echo "$url" | sed -e 's/^.*\#//' -e :a -e 's/[\/]/-/;ta' -e 's/[\_]//;ta' -e 's/[]]//;ta' -e 's/[[]//;ta')
                        url="$torrent#$torrentname"
                  fi
# if not in the catch.log get the torrent
                 if ! fgrep -i "$torrent" catch.log > /dev/null
                  then
#                     echo -e "$LINENO Not in catch.log" >> $DEBUG
# URL and Mininova fixer
                        torrent=$(echo "$torrent" | sed -e "s/mininova.org\/tor/mininova.org\/get/g")
# parse the filename from the end of the $url variable (after the #) and get rid of some of the crap
                     torrentname=$( echo "$url" | sed -e 's/^.*\#//' -e :a -e 's/[\/]/-/;ta' -e 's/[\_]//;ta' -e 's/[]]//;ta' -e 's/[[]//;ta')
# append .torrent on the end
                     torrentname=$torrentname.torrent
# File may not be in Catch.log but still remain on disk
                        if [ -f "$torrentdir$torrentname" ]
                           then
                              echo "File with name $torrentname exists NOT DOWNLOADING!!" >> $DEBUG
                           else
# Get the torrent, name it correctly and put it in the right directory
# First check if we need to load the cookie
                              if ! [ -z "$private_feed" ]
                                 then
                                    echo "$LINENO loading cookie for wget $torrent file" >> $DEBUG
                                    if wget --connect-timeout=10 --tries=2 -qncH --load-cookies="$cookie_file" -O "$torrentdir$torrentname" "$torrent"
                                       then
                                          echo -e "$torrentname downloaded using $cookie_file from $torrent" >> $DEBUG
                                       else
                                          echo -e "Failed to get $torrentname using $cookie_file from $torrent" >> $DEBUG
                                 fi
                                 else
                                    if wget --connect-timeout=10 --tries=2 -qncH -O "$torrentdir$torrentname" "$torrent"
                                       then
                                          echo -e "$torrentname downloaded from $torrent" >> $DEBUG
                                       else
                                          echo -e "Failed to get $torrentname from $torrent" >> $DEBUG
                                 fi
                           fi
                     fi # dropping out here so that the check for 0kb and adding to the catch.log still fire
# Delete the torrent file if it's an empty file
                     if [ -s "$torrentdir$torrentname" ]
                         then
# ***to fix or leave alone name change from sanitization but will break if ! fgrep -i "$torrent" catch.log
                           echo "$url" >> catch.log
                        else
                            echo -e "\n$torrentname is 0kb... deleting..." >> $DEBUG
                            rm -f "$torrentdir$torrentname"
                     fi
                 fi
           fi
           echo "$url" >> rssdata
         done
   fi
done < "$PRIVATEFEED"
echo -e "Run Completed Check for errors above $(date)" >> $DEBUG
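
If the sed expressions in the script are hard to follow, here is a rough sketch of how one config line could be split using plain shell parameter expansion instead. It is only an illustration, not part of SynoCatch: parse_feed_line is a made-up helper, the example.org line is invented, and it assumes a single '#' between the feed and the login details.

```bash
# Sketch only, not part of SynoCatch: split one config line of the form
#   <title regex>@<rss feed url>#<name for cookie>@<user>@<password>@<login url>
# parse_feed_line is a hypothetical helper. It assumes a single '#' separator
# and returns 1 for blank or hashed-out lines.
parse_feed_line() {
    line=$1
    expression= feed_url= private_feed=
    private_site= private_user= private_password= private_login=

    case $line in
        ''|'#'*) return 1 ;;
    esac

    # Public part: everything before the first '#'
    subscription=${line%%\#*}
    # Private part: everything after the first '#', if there is one
    case $line in
        *'#'*) private_feed=${line#*\#} ;;
    esac

    # regex@url; an empty or missing regex becomes the wildcard "."
    case $subscription in
        *@*) expression=${subscription%%@*}; feed_url=${subscription#*@} ;;
        *)   feed_url=$subscription ;;
    esac
    [ -n "$expression" ] || expression=.

    # name@user@password@login-url
    if [ -n "$private_feed" ]; then
        private_site=${private_feed%%@*};  rest=${private_feed#*@}
        private_user=${rest%%@*};          rest=${rest#*@}
        private_password=${rest%%@*};      private_login=${rest#*@}
    fi
    return 0
}

# Example with made-up values
if parse_feed_line 'House@http://example.org/rss.xml#mysite@john@secret@http://example.org/login.php'; then
    echo "regex=$expression url=$feed_url site=$private_site user=$private_user login=$private_login"
fi
```

A public feed line simply leaves private_feed empty, so the same "is it empty?" test the script already uses would still decide whether a login cookie is needed.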


I ended up building iconv from the libiconv sources. Along the way I had also installed gcc, ruby and a few other packages via ipkg, because some posters had indicated they would allow the use of iconv. They do not need to be installed.

The only versions of libiconv I could find were source tarballs, so I needed to install a toolchain. The method I chose was
Code: Select all
ipkg install optware-devel

This takes some time to run and seems to install everything including the kitchen sink.

I settled on this page http://www.gnu.org/software/libiconv/ for the information on libiconv and downloaded libiconv-1.13.tar.gz from ftp://ftp.gnu.org/gnu/libiconv/libiconv-1.13.tar.gz, then ran the usual commands, as you would with any source tarball:
Code: Select all
tar xvfz libiconv-1.13.tar.gz
cd libiconv-1.13
./configure --prefix=/usr/local
make
make install
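
Once iconv is installed, the re-encoding step from the script can be tried by hand. The following is a minimal sketch rather than the script's exact code: xml_encoding and to_utf8 are made-up helper names, the sample feed is invented, and only double-quoted encoding attributes are handled. Like the script, it leaves UTF-8 and ISO-8859-1 feeds alone.

```bash
# Sketch only: report the encoding named in an XML declaration, upper-cased.
# xml_encoding is a hypothetical helper; it falls back to UTF-8 (the XML
# default) when the first line carries no encoding attribute.
xml_encoding() {
    first_line=$(printf '%s\n' "$1" | sed -n '1p')
    enc=$(printf '%s\n' "$first_line" | sed -n 's/^.*<?xml[^>]*encoding="\([^"]*\)".*$/\1/p')
    [ -n "$enc" ] || enc=UTF-8
    printf '%s\n' "$enc" | tr '[:lower:]' '[:upper:]'
}

# Re-encode a feed to UTF-8 and rewrite its declaration to match.
# Assumes iconv is on the PATH (for example the libiconv build above).
to_utf8() {
    enc=$(xml_encoding "$1")
    case $enc in
        UTF-8|ISO-8859-1)
            # Passed through untouched, as in the script
            printf '%s\n' "$1" ;;
        *)
            printf '%s\n' "$1" \
                | sed '1s/encoding="[^"]*"/encoding="UTF-8"/' \
                | iconv -c -f "$enc" -t UTF-8 ;;
    esac
}

# Example with a made-up feed
feed='<?xml version="1.0" encoding="windows-1251" ?>
<rss version="2.0"></rss>'
echo "declared encoding: $(xml_encoding "$feed")"
if command -v iconv > /dev/null 2>&1; then
    to_utf8 "$feed"
fi
```

Rewriting only the encoding attribute on line 1 keeps the rest of the declaration as the feed sent it, which is a slightly gentler variation on replacing the whole first line.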

Hope this is of use to some of you or inspires you to try something yourself as it did me.

Have fun
John
John_P
