I've been using CrashPlan for the last 10 months: the combination of low price, a family pack of 10 computers, and Windows & Linux support meant it beat out Backblaze as my online backup of choice. However, I've run into multiple problems with it, all in one day.

CrashPlan was seemingly stalling when trying to back up my latest set of pictures. It'd be stuck at "Analysing 2012-07-29 > _MG_6076.xmp" for a long time before (seemingly) moving on to the next .xmp file. This was ridiculous, so I looked into why it was doing that, and found a symptom:

0. CrashPlan was starting & stopping for no rhyme or reason.

Spoiler: It was dying and being restarted by a watchdog of some sort. I went looking for what could cause it to start & stop with such regularity, and found the first problem:

1. Their software is memory hungry. Ridiculously so.

By default it's set up to use a maximum of 256MB of RAM. This is a hard limit imposed on the Java VM when it runs. I have it running on my media server, which has been specced with 1GB of RAM. It'd regularly hit the max, but it didn't seem to have problems, so I chalked it up to the use of Java and left it at that. I'm not the only guy who's noticed this: one guy has it hitting 1.5GB of RAM. However, the hard limit led me to another problem:

2. The CrashPlan program crashes and burns.

I've been getting weekly reports on my backups since I installed it. I quite enjoyed this because my media server is headless, so set-it-up-and-forget-it backups were awesome. I'd periodically go in via VNC and check up, and the desktop interface reported everything was just fine. Except it wasn't. It seems that beyond a certain number of files, Java hits the hard memory limit and dies:

[07.31.12 08:13:52.777 ERROR QPub-BackupMgr backup42.service.backup.BackupController] OutOfMemoryError occurred...RESTARTING! message=OutOfMemoryError in BackupQueue!
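If you want to check whether your install is hitting the same wall, grepping the engine log for that error works. A minimal sketch, assuming a default Linux install layout; the service.log.0 file name is the one mentioned in the comments below, so adjust the path for your system:

```shell
# Sketch: count OutOfMemoryError restarts in the CrashPlan engine log.
# The path is an assumption based on a default Linux install.
LOG=/usr/local/crashplan/log/service.log.0
if [ -f "$LOG" ]; then
  echo "OOM restarts logged: $(grep -c 'OutOfMemoryError' "$LOG")"
fi
```

A count that keeps climbing between weekly reports is the giveaway that the engine is dying and being silently restarted.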
Because the file set changes very irregularly, once it starts crashing, it'll continue to crash until you intervene and manually raise the memory limit. I raised the memory limit to 1.5GB, and Java's only using 932MB, so there's some headroom for growth if necessary.

In the configuration directory, I found a bunch of restart.log files while hunting for the file which defined the memory limits. Upwards of 260k of them. (I'm not kidding: I started a new SSH session to kill ls the first time I tried to list the directory because it was Taking. So. Long. I actually thought ls had crashed.) Each and every file seems to have been created when the CrashPlan engine restarts. So that means CrashPlan ran out of memory and restarted at least 260 thousand times without me knowing. Which leads me to the third problem:
I'm annoyed at Crashplan now - nTh among all - http://kyl191.net/2012/07/im-annoyed-at-crashplan-now/ - 1 of 6 - 10/09/14 8:56 am

3. Backups have at least one edge case where they'll fail silently.

This is a screenshot of the most recent CrashPlan report that I got sent. Spoiler: "Last Backup: 7 mins" is false. It might have connected 7 mins ago, but it hasn't been backing up. First is a laptop that hasn't been connected, so I'll ignore that. But helium is my media server, and that's running just fine, right? Nope. Backups have been failing since March 16, based on the first restart.log file I had.

I had an easy way to check: the main thing I was backing up was my pictures, and those are sorted automatically into folders based on the date they were taken. So I pulled up the web browser to look at the restorable files after restarting CrashPlan.

[Screenshot: my list of picture folders on the CrashPlan web interface]

And promptly had a mini freak-out. There was a gap between 2012-03-13 and 2012-07-29. And not just because I hadn't been taking photos that much. Which means one thing: backups weren't succeeding, but I was told everything was OK. This was Not. Good. And I was Not. Impressed. My "Last backup" times apparently meant "Last connected". Which means:

4. The backup status report is misleading.

Ironically, the backup status report sample that CrashPlan has in their docs (dated June 15th 2009) fixes this problem. I imagine people saw the two times and were confused, so CrashPlan merged the two times. In my case, that's an oops, since it covered up a serious problem. Now, none of this is truly serious for the simple fact that I haven't lost any data, and I'm thankful for that. I'm just glad I found this before any data loss did occur, because I've been lax in backing up to an alternate location. In fact,
it's vaguely amusing how an easily fixed root cause (CrashPlan running out of RAM, presumably used to stash file metadata) coupled with an over-simplified status report and ineffective monitoring created a much more serious effect.

And how CrashPlan can probably fix this:

Quick fix: Catch the OutOfMemory errors, and tell the user or resolve it automatically. It seems to be based on the size of the files that people are backing up, not the number of files, which makes me think that their block detection/hashing algorithm is what's chewing up memory. I'd guess that 99% of the clients will probably never hit, say, 1.5TB of files. But for the 1% that do, notify them instead of failing silently. I could have had the problem resolved quickly if I had been told about it. Missing a week (assuming weekly status reports) is far preferable to missing 4 and a half months. Alternatively, since it's just a flag on a command line, rewrite the flag automatically. UAC on Windows will cause some problems, but CrashPlan is writing to the Program Data folder perfectly fine, so moving the flags there should work. Some fancy scaling could be brought in (e.g. limit it to no more than 1/2 the system RAM, up the size by 32MB each time you hit an OOM error) to make it even better.

Long term fix: Drastically reduce memory usage. I'm backing up ~3TB of files and it's using just under a gig of RAM to hold everything. That sounds pretty unoptimal, especially considering that the error is a heap error (which tends to mean memory leaks somewhere). Can't speculate much without knowing the internals, so I won't. And as far as I can tell there are no docs on the CrashPlan website that explain how to up the memory limit.
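That "fancy scaling" could live entirely in the watchdog. A hypothetical sketch (this is not CrashPlan's actual watchdog; the run.conf path and -Xmx flag format are taken from a Linux install, and the 32MB step / half-of-RAM cap are the numbers suggested above): on each OOM-driven restart, bump the heap cap before relaunching.

```shell
# Hypothetical sketch of the proposed scaling: raise -Xmx by 32MB after an
# OOM restart, never exceeding half of system RAM. Paths are assumptions.
CONF="${1:-/usr/local/crashplan/bin/run.conf}"
STEP_MB=32
if [ -f "$CONF" ] && [ -r /proc/meminfo ]; then
  # MemTotal is in kB; /2048 converts to MB and halves it.
  half_ram_mb=$(( $(awk '/MemTotal/ {print $2}' /proc/meminfo) / 2048 ))
  cur_mb=$(grep -o 'Xmx[0-9]*[mM]' "$CONF" | head -n1 | tr -dc '0-9')
  new_mb=$(( ${cur_mb:-256} + STEP_MB ))
  if [ "$new_mb" -gt "$half_ram_mb" ]; then new_mb=$half_ram_mb; fi
  sed -i "s/-Xmx[0-9]*[mM]/-Xmx${new_mb}m/" "$CONF"
  echo "raised heap cap to ${new_mb}MB in $CONF"
fi
```

The same sed one-liner, minus the scaling, is all a manual fix takes once you know which file to edit.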
So for the record: under Windows the config file is at C:\Program Files\CrashPlan\CrashPlanService.ini, and at /usr/local/crashplan/bin/run.conf on Linux. You're looking for the option -Xmx256M. I changed it to -Xmx1536M; you can change it to something else and test it.

Tip for Windows users: Run your text editor as Administrator if you're using Vista/7. We're trying to edit a file in Program Files, and UAC will silently block saves, which is incredibly annoying until you know what causes it.

Also, I thought they might at least standardize on a file location, if not a name, given that they're using supposedly cross-platform Java. But nope!

By Kyle Lexmond. This entry was posted on July 31, 2012, 4:22 pm and is filed under Sysadmin. You can follow any responses to this entry through RSS 2.0. You can leave a response, or trackback from your own site.

#1 by Aubrey Bailey on August 2, 2012 - 10:27 pm
Agreed. Once you get up around 3TB everything goes to hell. If you get a little past that, it actually (repeatably, with a clean install on Ubuntu 10.04) will cease to communicate. Apparently the more expensive versions don't have these problems, as my university has a shared one with ~20TB.

#2 by Kyle Lexmond on August 3, 2012 - 2:15 am
Wish I could say "Nah, that's got to be some unrelated cause," but I can totally believe that it'd seize up at >3TB. I'm at 2.7TB right now, though I'm about to break it up into 2 separate systems, thankfully.

#3 by Aubrey Bailey on August 3, 2012 - 3:01 am
I can only reproduce it on the one system, but I only have one system with >3TB to test. I have 3 other nodes in the same cloud. Their tech support is baffled and gave me a line about unsupported modifications. It's a fairly vanilla 10.04 upgraded to 11.04 through the dist-upgrade command. Other than that I've configured the hell out of apache. The best part is that even archiving different subsets of the total 7TB, it breaks just before 3.5 every time, before finishing the initial sync.
Maybe I'll blow it all up and try again at some point, but put me down as unsatisfied.

#4 by Kyle Lexmond on August 6, 2012 - 4:56 pm
Jeez. That's annoying, particularly since I'm likely to use more storage in the near future: it'll be fairly easy for me to hit 3.5TB on one system now that I'm consolidating everything onto one system to move to uni. At least I can handle editing the config files; can't imagine what someone else would have to do. =|

#5 by Kyle on December 19, 2012 - 9:24 pm
I ran into this problem recently too. Under CrashPlan Settings > Backup > Advanced Settings there is an option to "Watch file system in real time". It seems to me that the problem either has to do with the number of files being backed up, or the total size of the backup (perhaps both). Any insight as to whether disabling real-time file system monitoring could curb this problem?

#6 by Kyle Lexmond on December 20, 2012 - 12:10 am
Well, disabling the real-time watch had some effect: my memory usage dropped by something like 100MB, I think. Total memory usage is still off-the-charts insane though; I was seeing ~2.5GB used, and sadly I didn't take a screenshot. I'm increasingly sure memory usage is a function of the number of files, not the size of files, based on the fact that the memory usage dropped a fair bit when I compressed a bunch (~50k) of old archive files into a single zip file. (By a few hundred MB, if I'm remembering correctly, since I remember being happy to bring it down to ~1.5GB used.)

#7 by xYZ on December 21, 2012 - 7:28 am
I've got around 2.7T uploaded. Notorious memory leaks. ~1Mbps upload speed (on a 50Mbps link). So I'd say a soft limit on this unlimited (hah!) service is ~2T.
#8 by AgentX on January 1, 2013 - 6:13 pm
Thank you so much for this blog post, it solved the issue I was having with CrashPlan silently dying and refusing to back up past 94% of my 1.5TB data set. I adjusted the memory limit and watched the memory usage in task manager jump over the previous limit of 512MB (presumably where it was dying before). Again, thanks. Is there any way to get this information to CrashPlan so the fix can be a little more accessible?

#9 by Kyle Lexmond on January 2, 2013 - 1:43 pm
I'm glad it helped! As for CrashPlan... well, they know about it. I opened a ticket with them, and they replied with practically the same information. I don't know why they don't have it on their support site, since so many people seem to be having the same problem, and their response seemed very copy-and-pasted too. Searching "Xmx site:support.crashplan.com" just gave a single link to a guide on stopping and restarting the backup engine. (Yep, it's copy-and-pasted. The email I got was the same as the email another guy got, with the exception of Windows vs Mac.)

#10 by trk on January 9, 2013 - 5:21 am
G'day. Glad I found this post. I was the same as you: every so often I'd get an email telling me how great the backups were going, so I figured things were going swimmingly. Then I happened to notice that the percentage done hadn't increased in months. Uh oh. The annoying part is that it's on a headless box sitting in a corner, so I never had the tray icon to do a quick check. Ended up using Xming and PuTTY with X11 forwarding to get a pretty GUI on my Windows desktop to check it over. Between that and tailing the service.log.0 file, I saw the error you mentioned as it restarted its scanning: Java was running out of memory and crashing. I already have the real-time file watch disabled (and other tweaks mentioned on the CrashPlan Pro support site), but recently made a backup of a website that handles lots of small files, which I think pushed the memory usage over the edge.
Used your suggestion (but with -Xmx2048m; I've got 8GB of RAM in that box to play with, so a bit more won't hurt) and it's happily syncing my ~2TB / ~2 million file count backup again. Thanks for taking the time to post this, it helped a lot. PS: My run.conf had -Xmx512m as the default (which clearly wasn't enough).

#11 by Pavel Uhliar on February 18, 2013 - 4:26 pm
Hi, guys. Thanks for the info in this thread. It seems to be the same problem for me, but I seem to hit it with just below 600GB, 92k files (circa 82% of my photos backup).

#12 by Torleif on March 10, 2013 - 3:56 pm
According to their customer support there is a native client on the way, which will hopefully have a lot better memory usage than the current Java thing.

#13 by Pavel Uhliar on March 13, 2013 - 6:26 pm
Great news, I am becoming tired of having 4G of RAM taken just after Windows startup :/

#14 by me on March 14, 2013 - 10:14 pm
I can confirm that increasing the memory allocation to 1.5G or more resolves most of the issues with CrashPlan stuck and/or crashing somewhere around 400K files (for me). The low memory issue also resulted in very high CPU usage (on a dual quad-core Xeon PowerEdge server with 16GB of RAM), more than most Java apps I've seen before.

#15 by Guillaume Paumier on May 2, 2013 - 2:30 pm
Hi. I found your post while re-investigating the issue. I encountered the same problem last year (CPU cycling as the engine was crashing and restarting over and over) and the CrashPlan support guys told me how to up the memory limit. It worked very well, so I was happy. Until now. I've reached ~4.4TB of data backed up, but now no matter how much I up the memory limit, it won't do. It crashes a few minutes at most after it starts. This is very annoying. If it's true that a more efficient native client is on the way, I'm looking forward to it, because right now I'm pretty much stuck.

#16 by Mark Laing on May 4, 2013 - 5:04 pm
Superb tip, worked like a charm. Thanks again for posting!
Mark

#17 by Don on September 9, 2013 - 9:24 am
Have done all that; my problem is slightly different: VERY high CPU usage, particularly when I'm away, but even when present. So much so that it interferes with my work by pausing almost everything for long periods. Have limited CPU to 10% when user is present, 60% when away. Backups run fine (though it's not big, < 100 MB; there are many files, accounting programs make a LOT of tiny files!). It's all about hogging the CPU and RAM (8 GB on the box, 2GB given to CrashPlan as above). When I connect via Terminal Services/Remote Desktop I have to "sleep" CrashPlan to make the machine usable. Any ideas?

#18 by Adrian N. on September 29, 2013 - 7:02 pm
What do you think is the approach to try to make CrashPlan work on a headless client (QNAP) with only 256MB of RAM installed, other than replacing it with another NAS? Thanks

#19 by Kyle Lexmond on September 29, 2013 - 10:40 pm
A lot of swap space would probably do the trick. I'd guess that CP doesn't actively touch the stuff (hashes, I guess) that it creates in memory, so the OS would be able to swap it out to disk.

#20 by Mike on November 5, 2013 - 2:16 pm
I'm running the Windows client with just over 346GB, and it was taking 400 megs of RAM. Contacted support, and was told a new native client is coming. So it's only been a year. I guess I'll stick with Carbonite.

#21 by yuri on December 25, 2013 - 6:59 pm
I have about 4TB to back up. About two weeks ago I started seeing the restarts. A support article recommends allocating 1GB of memory for every TB of backup, hence my Xmx value should be 4096MB. (I have 32GB RAM, BTW.) I opened the .ini file and changed the value. When I tried to start the service, it starts and immediately stops. I decreased the value until I could restart the service. For me it's 1536M.
I guess I will start to look for an alternative backup solution.

#22 by Jay Heiser on March 26, 2014 - 7:24 pm
Let's start by saying that I've got 8 gigs of RAM on my Windows 7 (64-bit) machine, and my net bandwidth usually measures at about 60Mbps uplink. I tested CrashPlan for a couple days, and it seemed to chunk right along at about 5Mbps, which wasn't taking advantage of my full bandwidth, but was fast enough to be useful. So I sent for the seed drive, filled it up to the max 1TB, and sent it back. Someone installed my data, notified me it was ready, and then left for 2 days. After her return, and after wasting 15 hours of my time doing a local chkdsk, they figured out what they hadn't done right, and it started in on the remaining 600 gigs of my data. Late Friday night, it came to a complete halt. So I asked for help, and they provided a suggestion to improve memory (remember, I've got 8 gigs). He goes away for 2 days, and I find that now I can't even run their client. We spend two days dicking around trying to get the client working, with the final solution being "reinstall it". Now I'm back to CrashPlan saying that it has completed the backup / 100,997 files left. It seems that my support guy is gone again. Maybe online backup of 1.6TB of pictures is unrealistic with today's technology? It's not for lack of bandwidth or system memory on my side. I've spent 22 days to get 2/3 of my data uploaded, I've spent hours dicking with this, and I've gotten half-assed support from 2 people who are either overworked or just don't pay attention. Has anybody failed with CrashPlan, but succeeded with Backblaze?

#23 by Cuong on August 20, 2014 - 7:41 pm
We've had CrashPlan installed by an IT company on an XP machine to back up our Synology NAS to their server, while they monitor status emails for us. We have only 60G of data. Everything went well and they got OK messages every day.
But recently our NAS got hijacked (SynoLocker; everything was encrypted) and we thought it would be a matter of just restoring from the backup. However, as it turns out from the backup log, the backup scan got messed up about two months ago and since then it backs up almost no data at all. Although all our shares are selected for backup, every rescan results in only 1.9G of data; the rest is simply ignored. So we have lost all data from the last 2 months, a lot of work down the drain. CrashPlan goes down the same drain, and we are looking for another solution.

#24 by Aubrey on August 20, 2014 - 8:56 pm
Well, to be fair, you had a 2-year-old error. Also, you are running an unsupported, unpatchable OS. This is mostly your fault. I say mostly because, honestly, the JRE memory thing is pretty silly. Seriously, go reconsider your IT choices.