btrfs Deduplication on Synology with duperemove

grintor
I'm New!
Posts: 1
Joined: Thu Dec 29, 2016 4:03 pm

Unread post by grintor » Thu Dec 29, 2016 4:23 pm

After reading about btrfs block-level deduplication capabilities (https://btrfs.wiki.kernel.org/index.php/Deduplication), I expected to find support for this in my Synology NAS. When I didn't, I decided to make it so.

I thought I would share my findings with the community. I statically compiled duperemove (http://markfasheh.github.io/duperemove/) for the Synology, both the stable v0.10 and the latest v0.11 beta 4.

I ran it on my Synology and it works, but unfortunately it's much too memory-intensive to just point at the root of the filesystem (even using --hashfile).

I wish Synology would integrate btrfs deduplication into the OS. It could be done in a less resource-intensive way -- just a low-priority process that's always running in the background and looking for duplicate blocks -- saving its findings to a file rather than memory.
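The "findings in a file rather than memory" idea can be sketched roughly as follows. This is only an illustration of the approach, not how duperemove or DSM works: the table layout, the names `open_index`, `index_file` and `duplicate_digests`, and the 128 KiB block size are all assumptions made for the example. Only one block is held in RAM at a time; everything else lives in an SQLite file on disk.

```python
import hashlib
import os
import sqlite3

BLOCK_SIZE = 128 * 1024  # illustrative block size (assumption)


def lower_priority():
    """Ask the OS to run this process at the lowest CPU priority (Unix only)."""
    if hasattr(os, "nice"):
        os.nice(19)


def open_index(db_path):
    """Open (or create) the on-disk hash index."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blocks ("
        "digest TEXT NOT NULL, path TEXT NOT NULL, pos INTEGER NOT NULL, "
        "PRIMARY KEY (path, pos))"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_digest ON blocks (digest)")
    return conn


def index_file(conn, path, block_size=BLOCK_SIZE):
    """Hash a file block by block, holding only one block in memory at a time."""
    with open(path, "rb") as f:
        pos = 0
        while True:
            block = f.read(block_size)
            if not block:
                break
            digest = hashlib.sha256(block).hexdigest()
            # Re-indexing the same file replaces its old rows instead of adding new ones.
            conn.execute(
                "INSERT OR REPLACE INTO blocks (digest, path, pos) VALUES (?, ?, ?)",
                (digest, path, pos),
            )
            pos += len(block)
    conn.commit()


def duplicate_digests(conn):
    """Return the digests that occur at more than one (path, pos) location."""
    rows = conn.execute(
        "SELECT digest FROM blocks GROUP BY digest HAVING COUNT(*) > 1 ORDER BY digest"
    )
    return [row[0] for row in rows]
```

A background job would call `lower_priority()` once, then `index_file()` for each file it visits, and periodically query `duplicate_digests()` for candidates. The price of keeping the table on disk is that every lookup costs disk I/O, so the scan takes longer.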

Duplicate blocks -- that's the amazing thing to me. This isn't a file-level feature, it's block-level. So even if two files are 90% different, it can deduplicate away the 10% they share.
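To make the block-level point concrete, here is a minimal sketch of duplicate-block detection on two in-memory "files". It is not duperemove's code: the 4 KiB block size and the function names are made up for illustration. It only finds the shared blocks; the actual extent sharing on btrfs is done by the kernel at the tool's request and is not shown.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative; real tools pick their own block size


def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash each fixed-size block of data (the last block may be shorter)."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def shared_block_count(a, b, block_size=BLOCK_SIZE):
    """Count the blocks of b whose content also appears as a block of a."""
    seen = set(block_hashes(a, block_size))
    return sum(1 for h in block_hashes(b, block_size) if h in seen)


if __name__ == "__main__":
    common = bytes([7]) * BLOCK_SIZE  # the one block both files share
    file_a = common + b"".join(bytes([n]) * BLOCK_SIZE for n in range(10, 19))
    file_b = common + b"".join(bytes([n]) * BLOCK_SIZE for n in range(30, 39))
    # Each file has 10 blocks; only the first one is identical in both.
    print(shared_block_count(file_a, file_b))  # prints 1
```

Note that with fixed-size blocks like this, identical data is only detected when it sits at block-aligned positions in both files.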

Anyway, here are the binaries and my notes on how I statically compiled and ran it:

https://drive.google.com/open?id=0B211Q ... XdWTTNsUWM

Eideen
Enlightened
Posts: 416
Joined: Sat Jun 16, 2012 11:57 am
Location: Norway

Re: btrfs Deduplication on Synology with duperemove

Unread post by Eideen » Sun Jan 08, 2017 2:57 pm

Great work man.

I'll write to Synology support and ask whether they support it.
DS412+ with DX510/ DSM 6.1b / 2xWD 4TB red+2(+1)xWD6TB (RAID5), 240GB SSD
DS215j / DSM 6.0 / 2xSG3TB (RAID 1) / remote backup
cyberpower BS650E, 1x2TB external drive for backup.

Christian72D
I'm New!
Posts: 1
Joined: Fri Mar 21, 2014 10:21 am
Location: Germany

Re: btrfs Deduplication on Synology with duperemove

Unread post by Christian72D » Mon Apr 10, 2017 5:15 am

Can you tell me how to use the tool?
Do I need to run it every x hours via cron?
How do I start it?

THIS was THE killer feature of btrfs for ME...
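As a rough answer to the scheduling question, one option is a small wrapper script that a scheduler (cron or the DSM Task Scheduler) starts periodically. The sketch below is an assumption-laden example, not grintor's notes: the paths in `DUPEREMOVE` and `HASHFILE` and the function names are hypothetical. The `-d`, `-r` and `--hashfile` options are duperemove's own.

```python
import shlex
import subprocess

# Hypothetical paths; adjust to where the binary and hash file actually live.
DUPEREMOVE = "/volume1/tools/duperemove"
HASHFILE = "/volume1/tools/duperemove.hash"


def build_command(target, hashfile=HASHFILE, binary=DUPEREMOVE):
    """Build a duperemove command line for one directory tree.

    -d          submit the duplicates found for deduplication
                (without it, duperemove only reports them)
    -r          recurse into subdirectories
    --hashfile  keep block hashes in a file on disk instead of in RAM
    """
    return [binary, "-d", "-r", "--hashfile=" + hashfile, target]


def run_dedupe(target):
    """Run duperemove on target and return its exit code."""
    cmd = build_command(target)
    print("Running:", shlex.join(cmd))
    return subprocess.run(cmd, check=False).returncode
```

Calling `run_dedupe("/volume1/photo")` (a hypothetical shared folder) from a weekly scheduled task would be one way to use it. Given the memory problems reported in the first post, pointing it at one folder at a time rather than the filesystem root is probably the safer choice.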

xRoThx
Trainee
Posts: 11
Joined: Tue Sep 12, 2017 9:34 pm

Re: btrfs Deduplication on Synology with duperemove

Unread post by xRoThx » Sat Sep 30, 2017 8:25 pm

Hello Synology

I would like to bring this topic to your attention again.

Please consider this feature. For me, and indirectly for my customers, it could be a huge money-saver.

Maybe, as the need for personal use would be pretty low, you could offer it as a paid application in the application store.
I would be very interested.

Thank you for considering this.

Thomas :)

Eideen
Enlightened
Posts: 416
Joined: Sat Jun 16, 2012 11:57 am
Location: Norway

Re: btrfs Deduplication on Synology with duperemove

Unread post by Eideen » Sat Sep 30, 2017 9:11 pm

According to Synology (I can't find the statement) and the Deduplication info on Btrfs, [Deduplication] typically requires large amounts of RAM to store the lookup table of known block hashes.

xRoThx
Trainee
Posts: 11
Joined: Tue Sep 12, 2017 9:34 pm

Re: btrfs Deduplication on Synology with duperemove

Unread post by xRoThx » Sun Oct 01, 2017 7:54 am

Hello

Then they could offer it on all Synology Plus models with 8GB of RAM?
As most home users won't need this feature, they could aim it at the higher end of their product range.

Grtz
T

ByteSizedAlex
I'm New!
Posts: 1
Joined: Tue Oct 24, 2017 8:50 pm

Re: btrfs Deduplication on Synology with duperemove

Unread post by ByteSizedAlex » Tue Oct 24, 2017 8:56 pm

Eideen wrote: According to Synology (I can't find the statement) and the Deduplication info on Btrfs, [Deduplication] typically requires large amounts of RAM to store the lookup table of known block hashes.
In-line dedupe certainly would require a large memory set to minimise the performance impact of hash lookups. If it were implemented as a background/scheduled process, the memory requirement could be avoided, provided you accept that the task takes longer to complete - it really just comes down to how much write impact you are willing to accept. It would certainly be nice to have the option to run dedupe on a Synology, perhaps as a hidden/advanced option with caveats and disclaimers, so that people don't enable it without realising the consequences.
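For a feel of why an in-memory lookup table gets large, here is a back-of-the-envelope estimate: one table entry per block. The 64 bytes per entry (a 32-byte digest plus location and overhead) is purely an assumed figure, as are the block sizes and the 10 TiB of data; real implementations will differ.

```python
TIB = 1024 ** 4
GIB = 1024 ** 3


def hash_table_bytes(data_bytes, block_size, bytes_per_entry):
    """Estimate the size of a lookup table holding one entry per block."""
    blocks = -(-data_bytes // block_size)  # ceiling division
    return blocks * bytes_per_entry


if __name__ == "__main__":
    for block_kib in (4, 128):
        size = hash_table_bytes(10 * TIB, block_kib * 1024, 64)
        print(f"{block_kib} KiB blocks: {size / GIB:.2f} GiB")
    # prints:
    # 4 KiB blocks: 160.00 GiB
    # 128 KiB blocks: 5.00 GiB
```

Under these assumptions, larger blocks shrink the table by the same factor as the block size grows, at the cost of missing duplicates smaller than a block.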

Rhubarb
I'm New!
Posts: 7
Joined: Fri Mar 25, 2016 10:20 pm

Re: btrfs Deduplication on Synology with duperemove

Unread post by Rhubarb » Sun Apr 08, 2018 2:04 am

grintor wrote: After reading about btrfs block-level deduplication capabilities (https://btrfs.wiki.kernel.org/index.php/Deduplication), I expected to find support for this in my Synology NAS. When I didn't, I decided to make it so.

I thought I would share my findings with the community. I statically compiled duperemove (http://markfasheh.github.io/duperemove/) for the Synology, both the stable v0.10 and the latest v0.11 beta 4.

I ran it on my Synology and it works, but unfortunately it's much too memory-intensive to just point at the root of the filesystem (even using --hashfile).

I wish Synology would integrate btrfs deduplication into the OS. It could be done in a less resource-intensive way -- just a low-priority process that's always running in the background and looking for duplicate blocks -- saving its findings to a file rather than memory.

Duplicate blocks -- that's the amazing thing to me. This isn't a file-level feature, it's block-level. So even if two files are 90% different, it can deduplicate away the 10% they share.

Anyway, here are the binaries and my notes on how I statically compiled and ran it:

https://drive.google.com/open?id=0B211Q ... XdWTTNsUWM
Hi Grintor, I realise it's now some time since your above post (Thu Dec 29, 2016 3:23 pm). Just a couple of questions:
1) Are you still implementing this on your NAS?
I note that block-level dedupe using the 'extreme binning' method referred to in the quoted pdf gives about a 13.3-fold saving in storage space (4.5TB reduced to about .33TB). My guess is that this would apply to normal data files (emails, memos, documents, etc.) but not to media such as .mkv movies, .mp3s, etc. What read and write speeds are you getting now, or were you getting, with your dedupe method?
2) How much memory do you have in your Synology box? Which Synology model and CPU, and how much disk storage?
3) Given a Synology unit such as a DS3615xs or DS3617xs with 16GB RAM and, say, 50TB of storage space 40% populated (i.e. 20TB used) before dedupe: what would you expect in terms of dedupe performance, time to perform the initial dedupe, read and write speeds following implementation, etc.?

tks, Rhubarb

likeoff
I'm New!
Posts: 1
Joined: Fri Jun 15, 2018 9:34 am

Re: btrfs Deduplication on Synology with duperemove

Unread post by likeoff » Fri Jun 15, 2018 9:39 am

You can use Docker to run duperemove and get both file-level and block-level deduplication.

Here is a step-by-step guide (the page is in Russian, so use Google Translate, but the screenshots show everything):

https://www.hwp.ru/articles/Nastraivaem ... gy_148308/
