Block Search Engines

Questions about the Synology Photo Station can be placed here.
Goner
Seasoned
Posts: 590
Joined: Tue Mar 06, 2012 2:27 pm
Location: Rotterdam, Netherlands

Re: Block Search Engines

Unread post by Goner » Sat Apr 21, 2012 1:42 pm

gwp wrote:I then went to http://tool.motoricerca.info/robots-checker.phtml to test it, but it can only find it if I have port 80 forwarded on my router and enabled on my DS.
I have port 80 open and forwarded to my DS and it still doesn't find it ... :?

NAS : DS212j with 2 ST2000DL003 in SHR / DSM 5.2-5644 Update 5
LAN : Fritz!Box 7170, 5 Devolo 200/500Mbps homeplugs, 2 5-port switches, Maxxter ACT-WNP-RP-002
HW : Raspberry Pi 2B & 3B, Conceptronic CHD3NET, ACRyan Playon!HD, Eminent EM7075dts, Wii, Wii U, PS2, D-Link DCS-930L
OS : Linux Mint 16 Cinnamon

myCloud
Skilled
Posts: 648
Joined: Fri Mar 23, 2012 11:28 am

Re: Block Search Engines

Unread post by myCloud » Sat Apr 21, 2012 5:08 pm

@gwp, are you using https in the URL you give the robots checker when you don't forward port 80? https should make the checker use port 443 (I can't try this myself because all my DS ports are behind a VPN).

Try giving it https://yourdomain/photo/robots.txt

In fact, just put this in the URL field of a browser from a machine outside your router.
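The same check can be done from a command line. This is only a sketch: "yourdomain" is a placeholder for your real DDNS name, and -k is needed only with a self-signed certificate.

```shell
# Fetch the file the way the checker would -- a sketch, with
# "yourdomain" standing in for your real DDNS name:
#   curl -k https://yourdomain/photo/robots.txt
# The response body should be exactly these two lines:
printf 'User-agent: *\nDisallow: /\n'
```

If the curl command returns anything other than those two lines (an error page, a login page, an empty body), crawlers are not seeing the file either.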
DS 1512+ w/3GB, 5 x 3TB Seagate ST3000DM001 8.2TB RAID 6, half files/half Time Machine.
Icy Dock MB559U3S-1SB enclosure w/4TB Hitachi UltraStar via USB 3 for files backup
UVERSE to AirPort Extreme + 2 AirPort Express w/speakers. TRENDnet TV-IP312WN camera
CyberPower CP1500PFCLCD Sine Wave UPS
DSM 4.1-2661 w/SSH + SFTP, VPN Server, Syslog Server, Media Server, Mail Server, Mail Station,
Audio Station, Surveillance Station, Photo Station, Web Station - DS Apps on iPad & iPod Touch.

gwp
I'm New!
Posts: 3
Joined: Sun Mar 04, 2012 8:10 pm

Re: Block Search Engines

Unread post by gwp » Sun Apr 22, 2012 3:24 am

myCloud wrote:Try giving it https://yourdomain/photo/robots.txt [...] just put this in the URL field of a browser from a machine outside your router.
When I enter https in the robots checker, it just reloads the page and asks me to re-enter the URL. When I put the URL in a browser outside my router, is it just supposed to show the contents of the txt file?
User-agent: *
Disallow: /

Otherwise, does it seem like I have my security set up properly for https on photostation?

myCloud
Skilled
Posts: 648
Joined: Fri Mar 23, 2012 11:28 am

Re: Block Search Engines

Unread post by myCloud » Sun Apr 22, 2012 11:07 am

Being able to view the contents of robots.txt at that URL in a browser confirms that anyone, including crawlers, can read it.

Your security falls toward the fairly-easy-to-use, not-too-tight end of the spectrum. Your viewers need not worry about HTTP vs HTTPS -- they just need to know your domain name AND /photo. On the other hand, with plain HTTP available and no redirection, their usernames and passwords can be sniffed on an open, unencrypted Wi-Fi network -- at the coffee shop, for example.

One way to encourage the use of HTTPS during login is to have a main HTTP webpage for your domain that viewers access without a username and password, with an easy-to-find HTTPS link they click to get to the photos, where they have to log in (you need a certificate, of course). If they bookmark your photos page, the HTTPS link will be in the bookmark. Putting robots.txt in your main website ensures the bots see it at the root of your domain, where they expect it to be, rather than in /photo.
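The redirect idea can be sketched with Apache's mod_rewrite. Treat this as illustrative only: file locations and module availability vary by DSM version, and the rule below is an assumption, not a tested DSM config.

```apache
# Hypothetical rules for the plain-HTTP site: send /photo visitors
# to HTTPS so credentials are never sent in the clear.
RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule ^/?photo(/.*)?$ https://%{HTTP_HOST}/photo$1 [R=301,L]
```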

My personal preference is to have two DiskStations: an "expendable" one-bay for public data, such as a website, in a DMZ, and the main one for private data on the inside LAN, accessible only via a VPN through its firewall, with different usernames and passwords than on the one-bay. By expendable, I mean it does not hold the only copy of the data (which should never be the case anyway). Some would consider that overkill.

Redflee
Beginner
Posts: 29
Joined: Fri Feb 17, 2012 2:10 am

Re: Block Search Engines

Unread post by Redflee » Fri Apr 27, 2012 10:19 pm

deltaf508 wrote:Figured it out! I was telnetting in as admin, assuming that would be the highest-authority account on my Synology. Evidently not. I read "root" somewhere and it hit me. Telnet in as "root" using the admin password and bingo: it created it like it should. Since there does not seem to be a definitive explanation of what to create and where, I'll post what I've done and everyone can let me know if this is right or not. It seems there are still others wondering how to do this as well.
  • Enable telnet on NAS
  • Telnet in as root using admin password
  • Use command "cd /volume1/@appstore/PhotoStation/photo" to go to appropriate folder
  • Create the robots.txt file using the vi editor as follows:
    "vi robots.txt" at command prompt
    (enter i) to go into (i)nput mode then type:
    User-agent: *
    Disallow: /
    press (esc) to exit out of (i)nput mode
    ":wq" to save the file
  • enter "ls -la" at command prompt to confirm your robots.txt file is there.
Is this correct? If so, it seems Synology could put a simple checkbox in the GUI somewhere and create this file automatically.

Now on to my next question: does this prevent search engines from indexing just your "main" Photo Station, or does it also cover the per-user Photo Stations (i.e. photostation/~username/photo)? That is really what I'm after, since I will be sharing one of those with family.

Thanks everyone!
Thanks very much...drop-dead simple.
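For anyone uncomfortable with vi, the same file can be created non-interactively. A sketch: on the NAS you would first cd to the Photo Station folder given in the steps above; here it simply writes into the current directory.

```shell
# Create robots.txt without an interactive editor.
# (On the NAS, run this from /volume1/@appstore/PhotoStation/photo.)
cat > robots.txt <<'EOF'
User-agent: *
Disallow: /
EOF
# Confirm the contents:
cat robots.txt
```

The heredoc avoids vi's mode-switching entirely; `ls -la` afterwards confirms the file exists, exactly as in the original steps.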
DS212

johndigital
Beginner
Posts: 26
Joined: Thu May 05, 2011 8:20 am

Re: Block Search Engines

Unread post by johndigital » Sun May 06, 2012 11:04 am

Please excuse my lack of knowledge. How do I find out if Google is looking at my Photo Station?
Thanks
John
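One rough way to check is to look for crawler user-agents in the web server's access log. A sketch: the real log path on DSM varies by version, so this demonstrates the grep on a fabricated sample line rather than a real log file.

```shell
# Crawlers identify themselves in the User-Agent field of each request.
# Build a one-line sample log to show what a Googlebot visit looks like:
echo '66.249.66.1 - - [06/May/2012:10:00:00 +0000] "GET /photo/ HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"' > access_log.sample
# Count matching visits (on a real DiskStation, point this at the
# Apache access log instead of the sample file):
grep -c 'Googlebot' access_log.sample
```

Searching for 'bot' or 'spider' instead of 'Googlebot' catches most other crawlers too.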

myCloud
Skilled
Posts: 648
Joined: Fri Mar 23, 2012 11:28 am

Re: Block Search Engines

Unread post by myCloud » Sun May 06, 2012 12:26 pm


drash
I'm New!
Posts: 3
Joined: Sat Jun 09, 2012 12:06 am

Re: Block Search Engines

Unread post by drash » Sat Jun 09, 2012 12:38 am

deltaf508 wrote:Figured it out! [...] (the robots.txt creation steps quoted earlier in the thread)
Thank you for this!

FYI, I might add that after doing this I went to http://tool.motoricerca.info/robots-checker.phtml to check the validity of the robots.txt I had just created, and it says that robots.txt NEEDS to be at the root level of the domain, not in the photo folder.

So, after "cd /volume1/web" and creating a robots.txt there, http://tool.motoricerca.info/robots-checker.phtml stopped complaining and gave me its seal of approval.
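drash's fix can be sketched like this. "webroot" below stands in for /volume1/web; that path comes from this thread and has not been verified on every DSM version.

```shell
# Put robots.txt at the web root so crawlers find it at
# http://yourdomain/robots.txt rather than under /photo.
mkdir -p webroot                      # stands in for /volume1/web
printf 'User-agent: *\nDisallow: /\n' > webroot/robots.txt
ls -la webroot/robots.txt
```

Well-behaved crawlers only request /robots.txt from the root of the host, which is why the copy under /photo was ignored.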

deltaf508
Trainee
Posts: 10
Joined: Thu Jul 14, 2011 11:29 pm

Re: Block Search Engines

Unread post by deltaf508 » Sat Jun 23, 2012 10:26 pm

drash wrote:
Thank you for this!

FYI, I might add to this that after doing this I went to http://tool.motoricerca.info/robots-checker.phtml to check on the validity of the robots.txt I had just created, and it mentions that the robots.txt NEEDS to be at the root level of the domain, not in the photo folder.

So, after "cd /volume1/web" and creating a robots.txt there, http://tool.motoricerca.info/robots-checker.phtml stopped complaining and gave me its seal of approval.
This makes sense. When I ran the robots checker I got the same message: "robots.txt NEEDS to be at the root level of the domain". However, I'm not seeing the /volume1/web folder. Does anyone know why?

Thanks in advance.

Boss77
Beginner
Posts: 21
Joined: Wed Aug 22, 2012 3:57 am

Re: Block Search Engines

Unread post by Boss77 » Fri Sep 14, 2012 2:46 am

Has anyone done this from OS X using Terminal? The instructions below don't seem to work.

deltaf508 wrote:[...]
  • Enable telnet on NAS
  • Telnet in as root using admin password
  • Use command "cd /volume1/@appstore/PhotoStation/photo" to go to appropriate folder
  • Create the robots.txt file using the vi editor as follows:
    "vi robots.txt" at command prompt
    (enter i) to go into (i)nput mode then type:
    User-agent: *
    Disallow: /
    press (esc) to exit out of (i)nput mode
    ":wq" to save the file
  • enter "ls -la" at command prompt to confirm your robots.txt file is there.

klen
Versed
Posts: 200
Joined: Wed Oct 21, 2009 3:05 pm

Re: Block Search Engines

Unread post by klen » Mon Dec 03, 2012 12:30 am

The robots.txt file in the root of your web share is enough.
You can put it in the /var/services/web or /volume1/web folder.
Or, easier: temporarily share the web volume via AFP or Samba from the DSM console and copy the file onto it over the network.
If you wish, you can stop the share again afterwards.

But just a word of caution: if you have saved one of the RSS URLs in, for instance, Google Reader, Google will continue to visit your site just to see if there are new entries in your feed.

For instance, it is very easy to upload photos automatically from your mobile device to your NAS once you enter your home Wi-Fi, and the pictures might be added to a Public folder or a 'Mobile uploads' folder. If you happen to be subscribed to that folder's feed, you can expect crawlers to keep checking it for new entries.
DS-412+
DSM latest
DS-107+
DSM 2.3-1157
DS-112j
DSM latest

kingfloo
I'm New!
Posts: 8
Joined: Thu May 26, 2011 5:03 pm

Block Search Engines

Unread post by kingfloo » Sun Dec 23, 2012 10:18 pm

I have tried to follow all the advice here on blocking search engines from my Photo Station.
So far I have not succeeded, even after waiting a few days/weeks. Bing.com still finds it.

SITUATION
*********

I have this robots.txt file
User-agent: *
Disallow: /

... in the folder
/usr/syno/synoman/

I run DSM 4.1 on a DS210j.

PROBLEM
*******

I hear people refer to a 'web' directory, but I don't find it where references say it should be.
For example: /var/services, /volume1, /usr/syno/synoman and so on do *not* contain a 'web' directory. The only one I found is under /usr/syno/synoman/phpsrc, and tonight I copied the robots.txt file there as well.


I would appreciate any help with why I don't find the 'web' directory where I am apparently supposed to.
Thanks in advance!

/C
Last edited by kingfloo on Tue Dec 25, 2012 8:49 pm, edited 1 time in total.

Seed
Versed
Posts: 225
Joined: Sun Mar 21, 2010 6:28 pm

Re: Block Search Engines

Unread post by Seed » Mon Dec 24, 2012 12:36 am

You should be able to see the robots.txt file by going to http(s)://<your_nas_ip>/robots.txt

The content is served from /var/services/web:

Code: Select all

cat /usr/syno/apache/conf/httpd.conf-user
<Directory "/var/services/web">

Code: Select all

curl -I http://(yournasip)/test
HTTP/1.1 404 Not Found

Code: Select all

cd /var/services/web

Code: Select all

echo test >> test.html

Code: Select all

curl -I http://(yournasip)/test
HTTP/1.1 200 OK

If not, then perhaps you need to "Enable Web Station" under "Web Services".
DS412+ 5.0-4458 | 4x3TB Seagate
CloudStation | CrashPlan | Time Machine | Surveillance Station

kingfloo
I'm New!
Posts: 8
Joined: Thu May 26, 2011 5:03 pm

Re: Block Search Engines

Unread post by kingfloo » Tue Dec 25, 2012 8:49 pm

Thanks a bunch. Web Station was not enabled (I didn't think it was needed, given that Photo Station could be accessed from the web). Now that it is, I can find the /var/services/web directory.

Appreciated.

/C

Moezer
I'm New!
Posts: 7
Joined: Thu Aug 15, 2013 2:12 am

Re: Block Search Engines

Unread post by Moezer » Sun Aug 18, 2013 6:27 am

A couple of questions, and please bear with me: I am a networking noob.

1) I believe the previous poster answered my question, but I want to verify: can search engines (Google, Bing, etc.) index/see my Photo Station without Web Station being enabled?

2) If yes, is the robots.txt file mentioned on the previous pages the only way to stop this? Doesn't HTTPS stop it? Doesn't requiring a login stop it?

Appreciate the great info everyone - thank you!
