I've been using CrashPlan for the last 10 months: the combination of low price, a family pack of 10 computers, and Windows & Linux support meant it beat out Backblaze as my online backup of choice. However, I've run into multiple problems with it, all in one day.

CrashPlan was seemingly stalling when trying to back up my latest set of pictures. It'd be stuck at "Analysing 2012-07-29 > _MG_6076.xmp" for a long time before (seemingly) moving on to the next .xmp file. This was ridiculous, so I looked into why it was doing that, and found a symptom:

0. CrashPlan was starting & stopping for no rhyme or reason.

Spoiler: It was dying and being restarted by a watchdog of some sort. I went looking for what could cause it to start & stop with such regularity, and found the first problem:

1. Their software is memory hungry. Ridiculously so.

By default it's set up to use a maximum of 256MB of RAM. This is a hard limit imposed on the Java VM when it runs. I have it running on my media server, which has been specced with 1GB of RAM. It'd regularly hit the max, but it didn't seem to have problems, so I chalked it up to the use of Java and left it at that. I'm not the only guy who's noticed this: one guy has it hitting 1.5GB of RAM. However, the hard limit led me to another problem:

2. The CrashPlan program crashes and burns.

I've been getting weekly reports on my backups since I installed it. I quite enjoyed this because my media server is headless, so set-it-up-and-forget-it backups were awesome. I'd periodically go in via VNC and check up, and the desktop interface reported everything was just fine. Except it wasn't. It seems that beyond a certain number of files, Java hits the hard memory limit and dies:

[07.31.12 08:13:52.777 ERROR QPub-BackupMgr backup42.service.backup.BackupController] OutOfMemoryError occurred...RESTARTING! message=OutOfMemoryError in BackupQueue!
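If you want to check whether your install is hitting the same wall, grepping the engine log for that error works. A minimal sketch, assuming a default Linux install layout; the service.log.0 file name is the one mentioned in the comments below, so adjust the path for your system:

```shell
# Sketch: count OutOfMemoryError restarts in the CrashPlan engine log.
# The path is an assumption based on a default Linux install.
LOG=/usr/local/crashplan/log/service.log.0
if [ -f "$LOG" ]; then
  echo "OOM restarts logged: $(grep -c 'OutOfMemoryError' "$LOG")"
fi
```

A count that keeps climbing between weekly reports is the giveaway that the engine is dying and being silently restarted.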
Because the file set changes very irregularly, once it starts crashing, it'll continue to crash until you intervene and manually raise the memory limit. I raised the memory limit to 1.5GB, and Java's only using 932MB, so there's some headroom for growth if necessary.

In the configuration directory, I found a bunch of restart.log files while hunting for the file which defined the memory limits. Upwards of 260k of them. (I'm not kidding: I started a new SSH session to kill ls the first time I tried to list the directory because it was Taking. So. Long. I actually thought ls had crashed.) Each and every file seems to have been created when the CrashPlan engine restarts. So that means CrashPlan ran out of memory and restarted at least 260 thousand times without me knowing. Which leads me to the third problem:
I'm annoyed at Crashplan now - nTh among all - http://kyl191.net/2012/07/im-annoyed-at-crashplan-now/ - 1 of 6 - 10/09/14 8:56 am

3. Backups have at least one edge case where they'll fail silently.

This is a screenshot of the most recent CrashPlan report that I got sent. Spoiler: "Last Backup: 7 mins" is false. It might have connected 7 mins ago, but it hasn't been backing up. First is a laptop that hasn't been connected, so I'll ignore that. But helium is my media server, and that's running just fine, right? Nope. Backups have been failing since March 16, based on the first restart.log file I had.

I had an easy way to check: the main thing I was backing up was my pictures, and those are sorted automatically into folders based on the date they were taken. So I pulled up the web browser to look at the restorable files after restarting CrashPlan.

[Screenshot: my list of picture folders on the CrashPlan web interface]

And promptly had a mini freak-out. There was a gap between 2012-03-13 and 2012-07-29. And not just because I hadn't been taking photos that much. Which means one thing: backups weren't succeeding, but I was told everything was OK. This was Not. Good. And I was Not. Impressed. My "Last backup" times apparently meant "Last connected". Which means:

4. The backup status report is misleading.

Ironically, the backup status report sample that CrashPlan has in their docs (dated June 15th 2009) fixes this problem. I imagine people saw the two times and were confused, so CrashPlan merged the two times. In my case, that's an oops, since it covered up a serious problem. Now, none of this is truly serious for the simple fact that I haven't lost any data, and I'm thankful for that. I'm just glad I found this before any data loss did occur, because I've been lax in backing up to an alternate location. In fact,
it's vaguely amusing how an easily fixed root cause (CrashPlan running out of RAM, presumably used to stash file metadata) coupled with an over-simplified status report and ineffective monitoring created a much more serious effect.

And how CrashPlan can probably fix this:

Quick fix: Catch the OutOfMemory errors, and tell the user or resolve it automatically. It seems to be based on the size of the files that people are backing up, not the number of files, which makes me think that their block detection/hashing algorithm is what's chewing up memory. I'd guess that 99% of the clients will probably never hit, say, 1.5TB of files. But for the 1% that do, notify them instead of failing silently. I could have had the problem resolved quickly if I had been told about it. Missing a week (assuming weekly status reports) is far preferable to missing 4 and a half months. Alternatively, since it's just a flag on a command line, rewrite the flag automatically. UAC on Windows will cause some problems, but CrashPlan is writing to the Program Data folder perfectly fine, so moving the flags there should work. Some fancy scaling could be brought in (e.g. limit it to no more than 1/2 the system RAM, up the size by 32MB each time you hit an OOM error) to make it even better.

Long term fix: Drastically reduce memory usage. I'm backing up ~3TB of files and it's using just under a gig of RAM to hold everything. That sounds pretty unoptimal, especially considering that the error is a heap error (which tends to mean memory leaks somewhere). Can't speculate much without knowing the internals, so I won't. And as far as I can tell there are no docs on the CrashPlan website that explain how to up the memory limit.
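That "fancy scaling" could live entirely in the watchdog. A hypothetical sketch (this is not CrashPlan's actual watchdog; the run.conf path and -Xmx flag format are taken from a Linux install, and the 32MB step / half-of-RAM cap are the numbers suggested above): on each OOM-driven restart, bump the heap cap before relaunching.

```shell
# Hypothetical sketch of the proposed scaling: raise -Xmx by 32MB after an
# OOM restart, never exceeding half of system RAM. Paths are assumptions.
CONF="${1:-/usr/local/crashplan/bin/run.conf}"
STEP_MB=32
if [ -f "$CONF" ] && [ -r /proc/meminfo ]; then
  # MemTotal is in kB; /2048 converts to MB and halves it.
  half_ram_mb=$(( $(awk '/MemTotal/ {print $2}' /proc/meminfo) / 2048 ))
  cur_mb=$(grep -o 'Xmx[0-9]*[mM]' "$CONF" | head -n1 | tr -dc '0-9')
  new_mb=$(( ${cur_mb:-256} + STEP_MB ))
  if [ "$new_mb" -gt "$half_ram_mb" ]; then new_mb=$half_ram_mb; fi
  sed -i "s/-Xmx[0-9]*[mM]/-Xmx${new_mb}m/" "$CONF"
  echo "raised heap cap to ${new_mb}MB in $CONF"
fi
```

The same sed one-liner, minus the scaling, is all a manual fix takes once you know which file to edit.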
So for the record: under Windows the config file is at C:\Program Files\CrashPlan\CrashPlanService.ini, and at /usr/local/crashplan/bin/run.conf on Linux. You're looking for the option -Xmx256M. I changed it to -Xmx1536M; you can change it to something else and test it.

Tip for Windows users: Run your text editor as Administrator if you're using Vista/7. We're trying to edit a file in Program Files, and UAC will silently block saves, which is incredibly annoying until you know what causes it.

Also, I thought they might at least standardize on a file location, if not a name, given that they're using supposedly cross-platform Java. But nope!

By Kyle Lexmond. This entry was posted on July 31, 2012, 4:22 pm and is filed under Sysadmin. You can follow any responses to this entry through RSS 2.0. You can leave a response, or trackback from your own site.

#1 by Aubrey Bailey on August 2, 2012 - 10:27 pm
Agreed. Once you get up around 3TB everything goes to hell. If you get a little past that, it actually (repeatably, with a clean install on Ubuntu 10.04) will cease to communicate. Apparently the more expensive versions don't have these problems, as my university has a shared one with ~20TB.

#2 by Kyle Lexmond on August 3, 2012 - 2:15 am
Wish I could say "Nah, that's got to be some unrelated cause," but I can totally believe that it'd seize up at >3TB. I'm at 2.7TB right now, though I'm about to break it up into 2 separate systems, thankfully.

#3 by Aubrey Bailey on August 3, 2012 - 3:01 am
I can only reproduce it on the one system, but I only have one system with >3TB to test. I have 3 other nodes in the same cloud. Their tech support is baffled and gave me a line about unsupported modifications. It's a fairly vanilla 10.04 upgraded to 11.04 through the dist-upgrade command. Other than that I've configured the hell out of apache. The best part is that even archiving different subsets of the total 7TB, it breaks just before 3.5 every time, before finishing the initial sync.
Maybe I'll blow it all up and try again at some point, but put me down as unsatisfied.

#4 by Kyle Lexmond on August 6, 2012 - 4:56 pm
Jeez. That's annoying, particularly since I'm likely to use more storage in the near future: it'll be fairly easy for me to hit 3.5TB on one system now that I'm consolidating everything onto one system to move to uni. At least I can handle editing the config files; can't imagine what someone else would have to do. =|

#5 by Kyle on December 19, 2012 - 9:24 pm
I ran into this problem recently too. Under CrashPlan Settings > Backup > Advanced Settings there is an option to "Watch file system in real time". It seems to me that the problem either has to do with the number of files being backed up, or the total size of the backup (perhaps both). Any insight as to whether disabling real-time file system monitoring could curb this problem?

#6 by Kyle Lexmond on December 20, 2012 - 12:10 am
Well, disabling the real-time watch had some effect: my memory usage dropped by something like 100MB, I think. Total memory usage is still off-the-charts insane though; I was seeing ~2.5GB used, and sadly I didn't take a screenshot. I'm increasingly sure memory usage is a function of the number of files, not the size of files, based on the fact that the memory usage dropped a fair bit when I compressed a bunch (~50k) of old archive files into a single zip file. (By a few hundred MB, if I'm remembering correctly, since I remember being happy to bring it down to ~1.5GB used.)

#7 by xYZ on December 21, 2012 - 7:28 am
I've got around 2.7T uploaded. Notorious memory leaks. ~1Mbps upload speed (on a 50Mbps link). So I'd say a soft limit on this unlimited (hah!) service is ~2T.
#8 by AgentX on January 1, 2013 - 6:13 pm
Thank you so much for this blog post, it solved the issue I was having with CrashPlan silently dying and refusing to back up past 94% of my 1.5TB data set. I adjusted the memory limit and watched the memory usage in task manager jump over the previous limit of 512MB (presumably where it was dying before). Again, thanks. Is there any way to get this information to CrashPlan so the fix can be a little more accessible?

#9 by Kyle Lexmond on January 2, 2013 - 1:43 pm
I'm glad it helped! As for CrashPlan... well, they know about it. I opened a ticket with them, and they replied with practically the same information. I don't know why they don't have it on their support site, since so many people seem to be having the same problem, and their response seemed very copy-and-pasted too. Searching "Xmx site:support.crashplan.com" just gave a single link to a guide on stopping and restarting the backup engine. (Yep, it's copy-and-pasted. The email I got was the same as the email another guy got, with the exception of Windows vs Mac.)

#10 by trk on January 9, 2013 - 5:21 am
G'day. Glad I found this post. I was the same as you: every so often I'd get an email telling me how great the backups were going, so I figured things were going swimmingly. Then I happened to notice that the percentage done hadn't increased in months. Uh oh. The annoying part is that it's on a headless box sitting in a corner, so I never had the tray icon to do a quick check. Ended up using Xming and PuTTY with X11 forwarding to get a pretty GUI on my Windows desktop to check it over. Between that and tailing the service.log.0 file, I saw the error you mentioned as it restarted its scanning: Java was running out of memory and crashing. I already have the real-time file watch disabled (and other tweaks mentioned on the CrashPlan Pro support site), but recently made a backup of a website that handles lots of small files, which I think pushed the memory usage over the edge.
Used your suggestion (but with -Xmx2048m; I've got 8GB of RAM in that box to play with, so a bit more won't hurt) and it's happily syncing my ~2TB / ~2 million file count backup again. Thanks for taking the time to post this, it helped a lot. PS: My run.conf had -Xmx512m as the default (which clearly wasn't enough).

#11 by Pavel Uhliar on February 18, 2013 - 4:26 pm
Hi, guys. Thanks for the info in this thread. It seems to be the same problem for me, but I seem to hit it with just below 600GB, 92k files (circa 82% of my photos backup).

#12 by Torleif on March 10, 2013 - 3:56 pm
According to their customer support there is a native client on the way, which will hopefully have a lot better memory usage than the current Java thing.

#13 by Pavel Uhliar on March 13, 2013 - 6:26 pm
Great news, I am becoming tired of having 4G of RAM taken just after Windows startup :/

#14 by me on March 14, 2013 - 10:14 pm
I can confirm that increasing the memory allocation to 1.5G or more resolves most of the issues with CrashPlan stuck and/or crashing somewhere around 400K files (for me). The low memory issue also resulted in very high CPU usage (on a dual quad-core Xeon PowerEdge server with 16GB of RAM), more than most Java apps I've seen before.

#15 by Guillaume Paumier on May 2, 2013 - 2:30 pm
Hi. I found your post while re-investigating the issue. I encountered the same problem last year (CPU cycling as the engine was crashing and restarting over and over) and the CrashPlan support guys told me how to up the memory limit. It worked very well, so I was happy. Until now. I've reached ~4.4TB of data backed up, but now no matter how much I up the memory limit, it won't do. It crashes a few minutes at most after it starts. This is very annoying. If it's true that a more efficient native client is on the way, I'm looking forward to it, because right now I'm pretty much stuck.

#16 by Mark Laing on May 4, 2013 - 5:04 pm
Superb tip, worked like a charm. Thanks again for posting!
Mark

#17 by Don on September 9, 2013 - 9:24 am
Have done all that; my problem is slightly different: VERY high CPU usage, particularly when I'm away, but even when present. So much so that it interferes with my work by pausing almost everything for long periods. Have limited CPU to 10% when user is present, 60% when away. Backups run fine (though it's not big, < 100 MB; there are many files, accounting programs make a LOT of tiny files!). It's all about hogging the CPU and RAM (8 GB on the box, 2GB given to CrashPlan as above). When I connect via Terminal Services/Remote Desktop I have to "sleep" CrashPlan to make the machine usable. Any ideas?

#18 by Adrian N. on September 29, 2013 - 7:02 pm
What do you think is the approach to try to make CrashPlan work on a headless client (QNAP) with only 256MB of RAM installed, other than replacing it with another NAS? Thanks

#19 by Kyle Lexmond on September 29, 2013 - 10:40 pm
A lot of swap space would probably do the trick. I'd guess that CP doesn't actively touch the stuff (hashes, I guess) that it creates in memory, so the OS would be able to swap it out to disk.

#20 by Mike on November 5, 2013 - 2:16 pm
I'm running the Windows client with just over 346GB, and it was taking 400 megs of RAM. Contacted support, and was told a new native client is coming. So it's only been a year. I guess I'll stick with Carbonite.

#21 by yuri on December 25, 2013 - 6:59 pm
I have about 4TB to back up. About two weeks ago I started seeing the restarts. A support article recommends allocating 1GB of memory for every TB of backup, hence my Xmx value should be 4096MB. (I have 32GB RAM, BTW.) I opened the .ini file and changed the value. When I tried to start the service, it starts and immediately stops. I decreased the value until I could restart the service. For me it's 1536M.
I guess I will start to look for an alternative backup solution.

#22 by Jay Heiser on March 26, 2014 - 7:24 pm
Let's start by saying that I've got 8 gigs of RAM on my Windows 7 (64-bit) machine, and my net bandwidth usually measures at about 60Mbps uplink. I tested CrashPlan for a couple days, and it seemed to chunk right along at about 5Mbps, which wasn't taking advantage of my full bandwidth, but was fast enough to be useful. So I sent for the seed drive, filled it up to the max 1TB, and sent it back. Someone installed my data, notified me it was ready, and then left for 2 days. After her return, and after wasting 15 hours of my time doing a local chkdsk, they figured out what they hadn't done right, and it started in on the remaining 600 gigs of my data. Late Friday night, it came to a complete halt. So I asked for help, and they provided a suggestion to improve memory (remember, I've got 8 gigs). He goes away for 2 days, and I find that now I can't even run their client. We spend two days dicking around trying to get the client working, with the final solution being "reinstall it". Now I'm back to CrashPlan saying that it has completed the backup / 100,997 files left. It seems that my support guy is gone again. Maybe online backup of 1.6TB of pictures is unrealistic with today's technology? It's not for lack of bandwidth or system memory on my side. I've spent 22 days to get 2/3 of my data uploaded, I've spent hours dicking with this, and I've gotten half-assed support from 2 people who are either overworked or just don't pay attention. Has anybody failed with CrashPlan, but succeeded with Backblaze?

#23 by Cuong on August 20, 2014 - 7:41 pm
We've had CrashPlan installed by an IT company on an XP machine to back up our Synology NAS to their server, while they monitor status emails for us. We have only 60G of data. Everything went well and they got OK messages every day.
But recently our NAS got hijacked (SynoLocker; everything was encrypted) and we thought it would be a matter of just restoring from the backup. However, as it turns out from the backup log, the backup scan got messed up about two months ago and since then it backs up almost no data at all. Although all our shares are selected for backup, every rescan results in only 1.9G of data; the rest is simply ignored. So we have lost all data from the last 2 months, a lot of work down the drain. CrashPlan goes down the same drain, and we are looking for another solution.

#24 by Aubrey on August 20, 2014 - 8:56 pm
Well, to be fair, you had a 2-year-old error. Also, you are running an unsupported, unpatchable OS. This is mostly your fault. I say mostly because, honestly, the JRE memory thing is pretty silly. Seriously, go reconsider your IT choices.