
Log Files

Log files are files that contain messages about the system, including the kernel, services, and applications running on it. There are different log files for different information. For example, there is a default system log file, a log file just for security messages, and a log file for cron tasks. Log files can be very useful when trying to troubleshoot a problem with the system, such as trying to load a kernel driver, or when looking for unauthorized login attempts to the system. This chapter discusses where to find log files, how to view log files, and what to look for in log files. Some log files are controlled by a daemon called syslogd. The list of log files maintained by syslogd, and which messages go to each, is defined in the /etc/syslog.conf configuration file.
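For illustration, a few entries of the sort found in /etc/syslog.conf are sketched below; the exact facilities and destination file names vary between distributions, so treat this as an example rather than your system's actual configuration.

# /etc/syslog.conf -- illustrative entries only
# Most informational messages, except mail and private authentication messages
*.info;mail.none;authpriv.none    /var/log/messages
# Authentication/authorization messages go to a restricted file
authpriv.*                        /var/log/secure
# Mail messages in one place; the leading dash means "do not sync after every write"
mail.*                            -/var/log/maillog
# Cron messages
cron.*                            /var/log/cron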

Locating Log Files


Most log files are located in the /var/log directory. Some applications, such as httpd and samba, have a directory within /var/log for their log files. Notice the multiple files in the log file directory with numbers after them. These are created when the log files are rotated. Log files are rotated so their file sizes do not become too large. The logrotate package contains a cron task that automatically rotates log files according to the /etc/logrotate.conf configuration file and the configuration files in the /etc/logrotate.d directory. By default, it is configured to rotate every week and keep four weeks' worth of previous log files.
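As an illustration of those defaults, the relevant part of a typical /etc/logrotate.conf looks roughly like the following; the exact contents differ between distributions.

# /etc/logrotate.conf -- typical global defaults (illustrative)
weekly                     # rotate log files weekly
rotate 4                   # keep four weeks' worth of rotated logs
create                     # create a new empty log file after rotating the old one
# compress                 # uncomment to compress rotated logs
include /etc/logrotate.d   # pull in per-package configuration files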

Viewing Log Files


Most log files are in plain text format. You can view them with any text editor such as Vi or Emacs. Some log files are readable by all users on the system; however, root privileges are required to read most log files. To view system log files in an interactive, real-time application, use the Log Viewer. To start the application, go to the Main Menu Button (on the Panel) => System Tools => System Logs, or type the command redhat-logviewer at a shell prompt.
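From a shell prompt, a few standard commands cover most day-to-day log viewing; the file names below match the list that follows and may differ on your system.

# Follow the system log as new messages arrive
tail -f /var/log/messages
# Page through a large log file
less /var/log/secure
# Search for failed login attempts (the pattern is only an example)
grep -i "failed" /var/log/secure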

System log files:

/var/log/messages - system messages
/var/log/secure - logging by PAM of network access attempts
/var/log/dmesg - log of the system boot (see also the dmesg command)
/var/log/boot.log - log of the system init process
/var/log/xferlog.1 - file transfer log
/var/log/lastlog - requires the lastlog command to examine its contents
/var/log/maillog - log from the sendmail daemon

Note: The lastlog command prints the time stamp of the last login of system users (it interprets the file /var/log/lastlog).

logrotate - rotate log files

Many system and server application programs, such as Apache, generate log files. If left unchecked, they would grow large enough to burden the system and application. The logrotate program periodically backs up a log file by renaming it. It also allows the system administrator to set a limit on the number of rotated logs or on their size, and it can compress the backed-up files.

Configuration file: /etc/logrotate.conf
Directory for logrotate configuration scripts: /etc/logrotate.d/

Example logrotate configuration script, /etc/logrotate.d/process-name:

/var/log/process-name.log {
    rotate 12
    monthly
    errors root@localhost
    missingok
    postrotate
        /usr/bin/killall -HUP process-name 2> /dev/null || true
    endscript
}

The configuration file lists the log file to be rotated, the process kill command used to momentarily shut down and restart the process, and some configuration parameters listed in the logrotate man page.

Linux is a stable and reliable environment. But any computing system can have unforeseen events, such as hardware failures. Having a reliable backup of critical configuration information and data is part of any responsible administration plan. There is a wide variety of approaches to doing backups in Linux. Techniques range from very simple script-driven methods to elaborate commercial software. Backups can be done to remote network devices, tape drives, and other removable media. Backups can be file-based or drive-image based. There are many options available, and you can mix and match your techniques to design the perfect backup plan for your circumstances.

What's your strategy?


There are many different approaches to backing up a system. For some perspectives on this, you may want to read the article "Introduction to Backing Up and Restoring Data" listed in the Resources section at the end of this article. What you back up depends a lot on your reason for backing up. Are you trying to recover from critical failures, such as hard drive problems? Are you archiving so that old files can be recovered if needed? Do you plan to start with a cold system and restore, or a preloaded standby system?

What to back up?


The file-based nature of Linux is a great advantage when backing up and restoring the system. In a Windows system, the registry is very system specific. Configurations and software installations are not simply a matter of dropping files on a system. Therefore, restoring a system requires software that can deal with these idiosyncrasies. In Linux, the story is different. Configuration files are text based and, except for when they deal directly with hardware, are largely system independent. The modern approach to hardware drivers is to have them available as modules that are dynamically loaded, so kernels are becoming more system independent. Rather than a backup having to deal with the intricacies of how the operating system is installed on your system and hardware, Linux backups are about packaging and unpackaging files. In general, there are some directories that you want to back up:

/etc

contains all of your core configuration files. This includes your network configuration, system name, firewall rules, users, groups, and other global system items.

/var contains information used by your system's daemons (services), including DNS configurations, DHCP leases, mail spool files, HTTP server files, db2 instance configuration, and others.

/home contains the default user home directories for all of your users. This includes their personal settings, downloaded files, and other information your users don't want to lose.

/root is the home directory for the root user.

/opt is where a lot of non-system software will be installed. IBM software goes in here. OpenOffice, JDKs, and other software are also installed here by default.

There are directories that you should consider not backing up.

/proc should never be backed up. It is not a real file system, but rather a virtualized view of the running kernel and environment. It includes files such as /proc/kcore, which is a virtual view of the entire running memory. Backing these up only wastes resources.

/dev contains the file representations of your hardware devices. If you are planning to restore to a blank system, then you can back up /dev. However, if you are planning to restore to an installed Linux base, then backing up /dev will not be necessary.

The other directories contain system files and installed packages. In a server environment, much of this information is not customized. Most customization occurs in the /etc and /home directories. But for completeness, you may wish to back them up. In a production environment where I wanted to be assured that no data would be lost, I would back up the entire system, except for the /proc directory. If I were mostly worried about users and configuration, I would back up only the /etc, /var, /home, and /root directories.
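As a concrete sketch of that minimal approach, a single tar command could capture just those directories into a date-stamped, compressed archive; the /backup destination path here is an assumption, not something prescribed by the article.

# Back up only configuration and user data to an illustrative destination
tar -cpzf /backup/config-and-home-$(date +%Y%m%d).tar.gz /etc /var /home /root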

Backup tools
As mentioned before, Linux backups are largely about packaging and unpackaging files. This allows you to use existing system utilities and scripting to perform your backups rather than having to purchase a commercial software package. In many cases, this type of backup will be adequate, and it provides a great deal of control for the administrator. The backup script can be automated using the cron command, which controls scheduled events in Linux.

tar

tar is a classic UNIX command that has been ported to Linux. tar is short for tape archive, and it was originally designed for packaging files onto tape. You have probably already encountered tar files if you have downloaded any source code for Linux. It is a file-based command that essentially stacks the files serially, end to end. Entire directory trees can be packaged with tar, which makes it especially suited to backups. Archives can be restored in their entirety, or files and directories can be expanded individually. Backups can go to file-based devices or tape devices. Files can be redirected upon restoration to a different directory (or system) from where they were originally saved. tar is file system independent. It can be used on ext2, ext3, JFS, Reiser, and other file systems.

Using tar is very much like using a file utility such as PKZip. You point it toward a destination, which is a file or a device, and then name the files that you want to package. You can compress archives on the fly with standard compression types, or specify an external compression program of your choice. To compress or uncompress files through gzip, use tar -z (for bzip2, use tar -j).

To back up the entire file system using tar to a SCSI tape drive, excluding the /proc directory:

tar -cpf /dev/st0 / --exclude=/proc

In the above example, the -c switch indicates that the archive is being created. The -p switch indicates that we want to preserve the file permissions, critical for a good backup. The -f switch points to the filename for the archive. In this case, we are using the raw tape device, /dev/st0. The / indicates what we want to back up. Since we wanted the entire file system, we specified the root. tar automatically recurses when pointed to a directory (ending in a /). Finally, we exclude the /proc directory, since it doesn't contain anything we need to save. If the backup will not fit on a single tape, we will add the -M switch (not shown), for multi-volume.
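Since the backup can be automated with cron, as mentioned above, here is a minimal sketch of how that full-system command might be wrapped in a script and scheduled; the script name, path, and schedule are assumptions, not part of the original article.

#!/bin/sh
# /usr/local/bin/full-backup.sh -- hypothetical wrapper around the tar command above
# Writes a full system backup to the first SCSI tape drive, excluding /proc
tar -cpf /dev/st0 / --exclude=/proc

# Example root crontab entry (crontab -e) to run the script every Sunday at 02:00:
# 0 2 * * 0 /usr/local/bin/full-backup.sh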

To restore a file or files, the tar command is used with the extract switch (-x):

tar -xpf /dev/st0 -C /

The -f switch again points to our file, and -p indicates that we want to restore archived permissions. The -x switch indicates an extraction of the archive. The -C / indicates that we want the restore to occur from /. tar normally restores to the directory from which the command is run; the -C switch makes our current directory irrelevant.

The two other tar switches that you will probably use often are -t and -d. The -t switch lists the contents of an archive. The -d switch compares the contents of the archive to current files on a system.

For ease of operation and editing, you can put the files and directories that you want to archive in a text file, which you reference with the -T switch. These can be combined with other directories listed on the command line. The following line backs up all the files and directories listed in MyFiles, the /root directory, and all of the iso files in the /tmp directory:

tar -cpf /dev/st0 -T MyFiles /root /tmp/*.iso

The file list is simply a text file with the list of files or directories. Here's an example:

/etc
/var
/home
/usr/local
/opt

Please note that the tar -T (or --files-from) option cannot accept wildcards. Files must be listed explicitly. The example above shows one way to reference files separately. You could also execute a script to search the system and then build a list. Here is an example of such a script:

#!/bin/sh
# Build a file list from a fixed list plus two find searches, then archive it
cat MyFiles > TempList
find /usr/share -iname "*.png" >> TempList
find /tmp -iname "*.iso" >> TempList
tar -cpzMf /dev/st0 -T TempList

The above script first copies the existing file list from MyFiles to TempList. It then executes a couple of find commands to search the file system for files that match a pattern and append them to TempList. The first search is for all files in the /usr/share directory tree that end in .png. The second search is for all files in the /tmp directory tree that end in .iso. Once the list is built, tar is run to create a new archive on the file device /dev/st0 (the first SCSI tape drive), which is compressed using the gzip format and retains all of the file permissions. The archive will span multiple volumes (-M). The file names to be archived are taken from the file TempList (-T).

Scripting can also be used to perform much more elaborate actions, such as incremental backups. An excellent script is listed by Gerhard Mourani in his book Securing and Optimizing Linux, which you will find listed in the Resources section at the end of this article. Scripts can also be written to restore files, though restoration is often done manually. As mentioned above, the -x switch for extract replaces the -c switch. Entire archives can be restored, or individual files or directories can be specified. Wildcards are okay to reference files in the archive. You can also use switches to dump and restore.

dump and restore

dump can perform functions similar to tar. However, dump tends to look at file systems rather than individual files. Quoting from the dump man page: "dump examines files on an ext2 filesystem and determines which files need to be backed up. These files are copied to the given disk, tape, or other storage medium for safe keeping. A dump that is larger than the output medium is broken into multiple volumes. On most media, the size is determined by writing until an end-of-media indication is returned."

The companion program to dump is restore, which is used to restore files from a dump image. The restore command performs the inverse function of dump. A full backup of a file system may be restored and subsequent incremental backups layered on top of it. Single files and directory subtrees may be restored from full or partial backups. Both dump and restore can be run across the network, so you can back up or restore from remote devices. dump and restore work with tape drives and file devices, providing a wide range of options. However, both are limited to the ext2 and ext3 file systems. If you are working with JFS, Reiser, or other file systems, you will need to use a different utility, such as tar.

Backing up with dump

Running a backup with dump is fairly straightforward. The following commands do a full backup of a Linux system with all ext2 and ext3 file systems to a SCSI tape device:

dump 0f /dev/nst0 /boot
dump 0f /dev/nst0 /

In this example, our system has two file systems: one for /boot and another for /, a common configuration. They must be referenced individually when a backup is executed. The /dev/nst0 refers to the first SCSI tape, but in non-rewind mode. This ensures that the volumes are put back-to-back on the tape.

An interesting feature of dump is its built-in incremental backup functionality. In the example above, the 0 indicates a level 0, or base-level, backup. This is the full system backup that you would do periodically to capture the entire system. On subsequent backups you can use other numbers (1-9) in place of the 0 to change the level of the backup. A level 1 backup would save all of the files that have changed since the level 0 backup was done. Level 2 would back up everything that has changed since level 1, and so on. The same function can be achieved with tar using scripting, but that requires the script creator to have a mechanism for determining when the last backup was done. dump has its own mechanism: it writes an update file (/etc/dumpdates) when it performs a backup. The update file is reset whenever a level 0 backup is run. Subsequent levels leave their mark until another level 0 is done. If you are doing a tape-based backup, dump will automatically track multiple volumes.

Restoring with restore

To restore information saved with dump, the restore command is used. Like tar, restore has the ability to list (-t) the contents of an archive and compare it to current files (-C). Where you must be careful with dump is in restoring data. There are two very different approaches, and you must use the correct one to have predictable results.

Rebuild (-r)

Remember that dump is designed with file systems in mind more than individual files. Therefore, there are two different styles of restoring files. To rebuild a file system, use the -r switch. Rebuild is designed to work on an empty file system and restore it back to the saved state. Before running rebuild, you should have created, formatted, and mounted the file system. You should not run rebuild on a file system that contains files. Here is an example of doing a full rebuild from the dump that we executed above.

restore -rf /dev/nst0

The above command needs to be run for each file system being restored. This process could be repeated to add the incremental backups if required.

Extract (-x)

If you need to work with individual files, rather than full file systems, you must use the -x switch to extract them. For example, to extract only the /etc directory from our tape backup, use the following command:

restore -xf /dev/nst0 /etc

Interactive restore (-i)

One more feature that restore provides is an interactive mode. The command:

restore -if /dev/nst0

will place you in an interactive shell, showing the items contained in the archive. Typing "help" will give you a list of commands. You can then browse and select the items you wish to extract. Bear in mind that any files you extract will go into your current directory.

dump vs. tar

Both dump and tar have their followings. Both have advantages and disadvantages. If you are running anything but an ext2 or ext3 file system, then dump is not available to you. However, if this is not the case, dump can be run with a minimum of scripting, and it has interactive modes available to assist with restoration. I tend to use tar, because I am fond of scripting for that extra level of control. There are also multi-platform tools for working with .tar files.

Other tools

Virtually any program that can copy files can be used to perform some sort of backup in Linux. There are references to people using cpio and dd for backups. cpio is another packaging utility along the lines of tar; it is much less common. dd is a file system copy utility that makes binary copies of file systems. dd might be used to make an image of a hard drive, similar to using a product like Symantec's Ghost. However, dd is not file based, so you can only restore data to an identical hard drive partition.
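As a sketch of that kind of raw image copy with dd, the device name and output path below are assumptions; be careful, since swapping if= and of= will destroy the source disk.

# Image an entire disk to a file
dd if=/dev/hda of=/backup/hda.img bs=1M
# Write the image back to an identically sized disk or partition
dd if=/backup/hda.img of=/dev/hda bs=1M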

Commercial backup products

There are several commercial backup products available for Linux. Commercial products generally provide a convenient interface and reporting system, whereas with tools such as dump and tar, you have to roll your own. The commercial offerings are broad and offer a range of features. The biggest benefit you will gain from using a commercial package is a pre-built strategy for handling backups that you can just put to work. Commercial developers have already made many of the mistakes that you are about to, and the cost of their wisdom is cheap compared to the loss of your precious data.

Tivoli Storage Manager

Probably the best commercial backup and storage management utility available now for Linux is the Tivoli Storage Manager. Tivoli Storage Manager Server runs on several platforms, including Linux, and the client runs on many more platforms. Essentially, a Storage Manager Server is configured with the devices appropriate to back up the environment. Any system that is to participate in the backups loads a client that communicates with the server. Backups can be scheduled, performed manually from the Tivoli Storage Manager client interface, or performed remotely using a Web-based interface. The policy-based nature of TSM means that central rules can be defined for backup behavior without having to constantly adjust a file list. Additionally, IBM Tivoli Storage Resource Manager can identify, evaluate, control, and predict the utilization of enterprise storage assets, and can detect potential problems and automatically apply self-healing adjustments. See the Tivoli Web site (see the link in the Resources section) for more details.

Figure 1. Tivoli Storage Manager menu

Backups and restores are then handled through the remote device.

Using rsync to make a backup

The rsync utility is a very well-known piece of GPL'd software, written originally by Andrew Tridgell and Paul Mackerras. If you have a common Linux or UNIX variant, then you probably already have it installed; if not, you can download the source code from rsync.samba.org. Rsync's specialty is efficiently synchronizing file trees across a network, but it works fine on a single machine too.

Basics

Suppose you have a directory called source, and you want to back it up into the directory destination. To accomplish that, you'd use:

rsync -a source/ destination/

(Note: I usually also add the -v (verbose) flag so that rsync tells me what it's doing.) This command is equivalent to:

cp -a source/. destination/

except that it's much more efficient if there are only a few differences. Just to whet your appetite, here's a way to do the same thing as in the example above, but with destination on a remote machine, over a secure shell:

rsync -a -e ssh source/ username@remotemachine.com:/path/to/destination/

Trailing Slashes Do Matter (Sometimes)

This isn't really an article about rsync, but I would like to take a momentary detour to clarify one potentially confusing detail about its use. You may be accustomed to commands that don't care about trailing slashes. For example, if a and b are two directories, then cp -a a b is equivalent to cp -a a/ b/. However, rsync does care about the trailing slash, but only on the source argument. For example, let a and b be two directories, with the file foo initially inside directory a. Then this command:

rsync -a a b

produces b/a/foo, whereas this command:

rsync -a a/ b

produces b/foo. The presence or absence of a trailing slash on the destination argument (b, in this case) has no effect.

Using the --delete flag

If a file was originally in both source/ and destination/ (from an earlier rsync, for example), and you delete it from source/, you probably want it to be deleted from destination/ on the next rsync. However, the default behavior is to leave the copy at destination/ in place. Assuming you want rsync to delete any file from destination/ that is not in source/, you'll need to use the --delete flag:

rsync -a --delete source/ destination/

Be lazy: use cron

One of the toughest obstacles to a good backup strategy is human nature; if there's any work involved, there's a good chance backups won't happen. (Witness, for example, how rarely my roommate's home PC was backed up before I created this system.) Fortunately, there's a way to harness human laziness: make cron do the work. To run the rsync-with-backup command from the previous section every morning at 4:20 AM, for example, edit the root cron table:

(as root) crontab -e

Then add the following line:

20 4 * * * rsync -a --delete source/ destination/

Finally, save the file and exit. The backup will happen every morning at precisely 4:20 AM, and root will receive the output by email. Don't copy that example verbatim, though; you should use full path names (such as /usr/bin/rsync and /home/source/) to remove any ambiguity.

Incremental backups with rsync

Since making a full copy of a large filesystem can be a time-consuming and expensive process, it is common to make full backups only once a week or once a month, and store only changes on the other days. These are called "incremental" backups, and are supported by the venerable old dump and tar utilities, along with many others. However, you don't have to use tape as your backup medium; it is both possible and vastly more efficient to perform incremental backups with rsync. The most common way to do this is by using the rsync -b --backup-dir= combination. I have seen examples of that usage here, but I won't discuss it further, because there is a better way. If you're not familiar with hard links, though, you should first start with the following review.

Review of hard links

We usually think of a file's name as being the file itself, but really the name is a hard link. A given file can have more than one hard link to itself. For example, a directory has at least two hard links: the directory name and . (for when you're inside it). It also has one hard link from each of its sub-directories (the .. file inside each one). If you have the stat utility installed on your machine, you can find out how many hard links a file has (along with a bunch of other information) with the command:


stat filename

Hard links aren't just for directories; you can create more than one link to a regular file too. For example, if you have the file a, you can make a link called b:

ln a b

Now, a and b are two names for the same file, as you can verify by seeing that they reside at the same inode (the inode number will be different on your machine):

ls -i a
  232177 a
ls -i b
  232177 b

So ln a b is roughly equivalent to cp a b, but there are several important differences:

1. The contents of the file are only stored once, so you don't use twice the space.
2. If you change a, you're changing b, and vice-versa.
3. If you change the permissions or ownership of a, you're changing those of b as well, and vice-versa.
4. If you overwrite a by copying a third file on top of it, you will also overwrite b, unless you tell cp to unlink before overwriting. You do this by running cp with the --remove-destination flag. Notice that rsync always unlinks before overwriting! (Note, added 2002.Apr.10: the previous statement applies to changes in the file contents only, not permissions or ownership.)

But this raises an interesting question. What happens if you rm one of the links? The answer is that rm is a bit of a misnomer; it doesn't really remove a file, it just removes that one link to it. A file's contents aren't truly removed until the number of links to it reaches zero. In a moment, we're going to make use of that fact, but first, here's a word about cp.

Using cp -al

In the previous section, it was mentioned that hard-linking a file is similar to copying it. It should come as no surprise, then, that the standard GNU coreutils cp command comes with a -l flag that causes it to create (hard) links instead of copies (it doesn't hard-link directories, though, which is good; you might want to think about why that is). Another handy switch for the cp command is -a (archive), which causes it to recurse through directories and preserve file owners, timestamps, and access permissions.
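To see the effect of -l for yourself, the following quick experiment creates a hard-linked copy of a small tree and confirms that both names point at the same inode; the file and directory names are hypothetical, and the format option assumes GNU stat.

mkdir tree && echo "hello" > tree/foo
cp -al tree tree-copy           # directories are created anew, files are hard-linked
ls -i tree/foo tree-copy/foo    # both names show the same inode number
stat -c '%h' tree/foo           # the file's hard link count is now 2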


Together, the combination cp -al makes what appears to be a full copy of a directory tree, but is really just an illusion that takes almost no space. If we restrict operations on the copy to adding or removing (unlinking) files, i.e., never changing one in place, then the illusion of a full copy is complete. To the end-user, the only differences are that the illusion-copy takes almost no disk space and almost no time to generate.

2002.05.15: Portability tip: If you don't have GNU cp installed (if you're using a different flavor of *nix, for example), you can use find and cpio instead. Simply replace cp -al a b with cd a && find . -print | cpio -dpl ../b. Thanks to Brage Førland for that tip.

Putting it all together

We can combine rsync and cp -al to create what appear to be multiple full backups of a filesystem without taking multiple disks' worth of space. Here's how, in a nutshell:

rm -rf backup.3
mv backup.2 backup.3
mv backup.1 backup.2
cp -al backup.0 backup.1
rsync -a --delete source_directory/ backup.0/

If the above commands are run once every day, then backup.0, backup.1, backup.2, and backup.3 will appear to each be a full backup of source_directory/ as it appeared today, yesterday, two days ago, and three days ago, respectively: complete, except that permissions and ownerships in old snapshots will get their most recent values (thanks to J.W. Schultz for pointing this out). In reality, the extra storage will be equal to the current size of source_directory/ plus the total size of the changes over the last three days, exactly the same space that a full plus daily incremental backup with dump or tar would have taken.

Update (2003.04.23): As of rsync-2.5.6, the --link-dest flag is now standard. Instead of the separate cp -al and rsync lines above, you may now write:

mv backup.0 backup.1
rsync -a --delete --link-dest=../backup.1 source_directory/ backup.0/

This method is preferred, since it preserves original permissions and ownerships in the backup. However, be sure to test it; as of this writing some users are still having trouble getting --link-dest to work properly. Make sure you use version 2.5.7 or later.
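For clarity, here is a minimal sketch of how the --link-dest variant fits into the four-snapshot rotation shown earlier; the directory names follow the example above, and the script is an untested outline rather than a finished tool.

#!/bin/sh
# Rotate four daily snapshots using rsync --link-dest (assumes rsync 2.5.7 or later
# and that the backup.* directories already exist from previous runs)
rm -rf backup.3
mv backup.2 backup.3
mv backup.1 backup.2
mv backup.0 backup.1
# backup.0 no longer exists, so rsync recreates it, hard-linking unchanged files
# against backup.1 and copying only the files that have changed
rsync -a --delete --link-dest=../backup.1 source_directory/ backup.0/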


Update (2003.05.02): John Pelan writes in to suggest recycling the oldest snapshot instead of recursively removing and then re-creating it. This should make the process go faster, especially if your file tree is very large:

mv backup.3 backup.tmp
mv backup.2 backup.3
mv backup.1 backup.2
mv backup.0 backup.1
mv backup.tmp backup.0
cp -al backup.1/. backup.0
rsync -a --delete source_directory/ backup.0/

2003.06.02: OOPS! Rsync's link-dest option does not play well with J. Pelan's suggestion; the approach I previously had written above will result in unnecessarily large storage, because old files in backup.0 will get replaced and not linked. Please only use Dr. Pelan's directory recycling if you use the separate cp -al step; if you plan to use --link-dest, start with backup.0 empty and pristine. Apologies to anyone I've misled on this issue. Thanks to Kevin Everets for pointing out the discrepancy to me, and to J.W. Schultz for clarifying --link-dest's behavior. Also note that I haven't fully tested the approach written above; if you have, please let me know. Until then, caveat emptor!

I'm used to dump or tar! This seems backward!

The dump and tar utilities were originally designed to write to tape media, which can only access files in a certain order. If you're used to their style of incremental backup, rsync might seem backward. I hope that the following example will help make the differences clearer. Suppose that on a particular system, backups were done on Monday night, Tuesday night, and Wednesday night, and now it's Thursday. With dump or tar, the Monday backup is the big ("full") one. It contains everything in the filesystem being backed up. The Tuesday and Wednesday "incremental" backups would be much smaller, since they would contain only changes since the previous day. At some point (presumably next Monday), the administrator would plan to make another full dump.

With rsync, in contrast, the Wednesday backup is the big one. Indeed, the "full" backup is always the most recent one. The Tuesday directory would contain data only for those files that changed between Tuesday and Wednesday; the Monday directory would contain data for only those files that changed between Monday and Tuesday. A little reasoning should convince you that the rsync way is much better for network-based backups, since it's only necessary to do a full backup once, instead of once per week. Thereafter, only the changes need to be copied. Unfortunately, you can't rsync to a tape, and that's probably why the dump and tar incremental backup models are still so popular. But in your author's opinion, these should never be used for network-based backups now that rsync is available.

Isolating the backup from the rest of the system

If you take the simple route and keep your backups in another directory on the same filesystem, then there's a very good chance that whatever damaged your data will also damage your backups. In this section, we identify a few simple ways to decrease your risk by keeping the backup data separate.

The easy (bad) way

In the previous section, we treated /destination/ as if it were just another directory on the same filesystem. Let's call that the easy (bad) approach. It works, but it has several serious limitations:

- If your filesystem becomes corrupted, your backups will be corrupted too.
- If you suffer a hardware failure, such as a hard disk crash, it might be very difficult to reconstruct the backups.
- Since backups preserve permissions, your users (and any programs or viruses that they run) will be able to delete files from the backup. That is bad. Backups should be read-only.
- If you run out of free space, the backup process (which runs as root) might crash the system and make it difficult to recover.
- The easy (bad) approach offers no protection if the root account is compromised.

Fortunately, there are several easy ways to make your backup more robust.

Keep it on a separate partition

If your backup directory is on a separate partition, then any corruption in the main filesystem will not normally affect the backup. If the backup process runs out of disk space, it will fail, but it won't take the rest of the system down too. More importantly, keeping your backups on a separate partition means you can keep them mounted read-only; we'll discuss that in more detail in the next chapter.

Keep that partition on a separate disk

If your backup partition is on a separate hard disk, then you're also protected from hardware failure. That's very important, since hard disks always fail eventually, and often take your data with them. An entire industry has formed to service the needs of those whose broken hard disks contained important data that was not properly backed up.
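To illustrate keeping that partition mounted read-only between runs, an /etc/fstab entry along these lines could be used; the device name and mount point are assumptions.

# Illustrative /etc/fstab entry for a dedicated backup partition, mounted read-only;
# the backup script can briefly remount it read-write, as discussed later
/dev/hdb1   /snapshot   ext3   ro   0 0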


Important: Notice, however, that in the event of hardware failure you'll still lose any changes made since the last backup. For home or small office users, where backups are made daily or even hourly as described in this document, that's probably fine, but in situations where any data loss at all would be a serious problem (such as where financial transactions are concerned), a RAID system might be more appropriate. RAID is well supported under Linux, and the methods described in this document can also be used to create rotating snapshots of a RAID system.

Keep that disk on a separate machine

If you have a spare machine, even a very low-end one, you can turn it into a dedicated backup server. Make it standalone, and keep it in a physically separate place: another room or even another building. Disable every single remote service on the backup server, and connect it only to a dedicated network interface on the source machine. On the source machine, export the directories that you want to back up via read-only NFS to the dedicated interface. The backup server can mount the exported network directories and run the snapshot routines discussed in this article as if they were local. If you opt for this approach, you'll only be remotely vulnerable if:

1. a remote root hole is discovered in read-only NFS, and
2. the source machine has already been compromised.

I'd consider this "pretty good" protection, but if you're (wisely) paranoid, or your job is on the line, build two backup servers. Then you can make sure that at least one of them is always offline.

If you're using a remote backup server and can't get a dedicated line to it (especially if the information has to cross somewhere insecure, like the public internet), you should probably skip the NFS approach and use rsync -e ssh instead. It has been pointed out to me that rsync operates far more efficiently in server mode than it does over NFS, so if the connection between your source and backup server becomes a bottleneck, you should consider configuring the backup machine as an rsync server instead of using NFS. On the downside, this approach is slightly less transparent to users than NFS; snapshots would not appear to be mounted as a system directory, unless NFS is used in that direction, which is certainly another option (I haven't tried it yet, though). Thanks to Martin Pool, a lead developer of rsync, for making me aware of this issue.

Here's another example of the utility of this approach, one that I use. If you have a bunch of Windows desktops in a lab or office, an easy way to keep them all backed up is to share the relevant files, read-only, and mount them all from a dedicated backup server using SAMBA. The backup job can treat the SAMBA-mounted shares just like regular local directories.
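A sketch of that SAMBA-based arrangement is shown below; the host name, share name, mount point, and credentials file are all hypothetical, and on newer systems the filesystem type would be cifs rather than smbfs.

# On the backup server: mount a desktop's read-only share, then snapshot it like a local directory
mount -t smbfs -o ro,credentials=/root/.smbcred //desktop01/shared /mnt/desktop01
rsync -a --delete /mnt/desktop01/ /root/snapshot/desktop01/backup.0/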

Making the backup as read-only as possible

In the previous section, we discussed ways to keep your backup data physically separate from the data they're backing up. In this section, we discuss the other side of that coin: preventing user processes from modifying backups once they're made. We want to avoid leaving the snapshot backup directory mounted read-write in a public place. Unfortunately, keeping it mounted read-only the whole time won't work either; the backup process itself needs write access. The ideal situation would be for the backups to be mounted read-only in a public place, but at the same time, read-write in a private directory accessible only by root, such as /root/snapshot.

There are a number of possible approaches to the challenge presented by mounting the backups read-only. After some amount of thought, I found a solution which allows root to write the backups to the directory but only gives the users read permissions. I'll first explain the other ideas I had and why they were less satisfactory. It's tempting to keep your backup partition mounted read-only as /snapshot most of the time, but unmount it and remount it read-write as /root/snapshot during the brief periods while snapshots are being made. Don't give in to temptation!

Bad: mount/umount

A filesystem cannot be unmounted if it's busy, that is, if some process is using it. The offending process need not be owned by root to block an unmount request. So if you plan to umount the read-only copy of the backup and mount it read-write somewhere else, don't; any user can accidentally (or deliberately) prevent the backup from happening. Besides, even if blocking unmounts were not an issue, this approach would introduce brief intervals during which the backups would seem to vanish, which could be confusing to users.

Better: mount read-only most of the time

A better but still-not-quite-satisfactory choice is to remount the directory read-write in place:

mount -o remount,rw /snapshot
[ run backup process ]
mount -o remount,ro /snapshot

Now any process that happens to be in /snapshot when the backups start will not prevent them from happening. Unfortunately, this approach introduces a new problem: there is a brief window of vulnerability, while the backups are being made, during which a user process could write to the backup directory. Moreover, if any process opens a backup file for writing during that window, it will prevent the backup from being remounted read-only, and the backups will stay vulnerable indefinitely.


Tempting but doesn't seem to work: the 2.4 kernel's mount --bind

Starting with the 2.4-series Linux kernels, it has been possible to mount a filesystem simultaneously in two different places. "Aha!" you might think, as I did. "Then surely we can mount the backups read-only in /snapshot, and read-write in /root/snapshot at the same time!" Alas, no. Say your backups are on the partition /dev/hdb1. If you run the following commands,

mount /dev/hdb1 /root/snapshot
mount --bind -o ro /root/snapshot /snapshot

then (at least as of the 2.4.9 Linux kernel; updated, still present in the 2.4.20 kernel), mount will report /dev/hdb1 as being mounted read-write in /root/snapshot and read-only in /snapshot, just as you requested. Don't let the system mislead you! It seems that, at least on my system, read-write vs. read-only is a property of the filesystem, not the mount point. So every time you change the mount status, it will affect the status at every point the filesystem is mounted, even though neither /etc/mtab nor /proc/mounts will indicate the change. In the example above, the second mount call will cause both of the mounts to become read-only, and the backup process will be unable to run. Scratch this one.

Update: I have it on fairly good authority that this behavior is considered a bug in the Linux kernel, which will be fixed as soon as someone gets around to it. If you are a kernel maintainer and know more about this issue, or are willing to fix it, I'd love to hear from you!

My solution: using NFS on localhost

This is a bit more complicated, but until Linux supports mount --bind with different access permissions in different places, it seems like the best choice. Mount the partition where backups are stored somewhere accessible only by root, such as /root/snapshot. Then export it, read-only, via NFS, but only to the same machine. That's as simple as adding the following line to /etc/exports:

/root/snapshot 127.0.0.1(secure,ro,no_root_squash)

then start nfs and portmap from /etc/rc.d/init.d/. Finally, mount the exported directory, read-only, as /snapshot:

mount -o ro 127.0.0.1:/root/snapshot /snapshot

And verify that it all worked:

mount
...
/dev/hdb1 on /root/snapshot type ext3 (rw)
127.0.0.1:/root/snapshot on /snapshot type nfs (ro,addr=127.0.0.1)

At this point, we'll have the desired effect: only root will be able to write to the backup (by accessing it through /root/snapshot). Other users will see only the read-only /snapshot directory. For a little extra protection, you could keep it mounted read-only in /root/snapshot most of the time, and only remount it read-write while backups are happening.

Damian Menscher pointed out this CERT advisory which specifically recommends against NFS exporting to localhost, though since I'm not clear on why it's a problem, I'm not sure whether exporting the backups read-only as we do here is also a problem. If you understand the rationale behind this advisory and can shed light on it, would you please contact me? Thanks!

Extensions: hourly, daily, and weekly snapshots

With a little bit of tweaking, we can make multiple-level rotating snapshots. On my system, for example, I keep the last four "hourly" snapshots (which are taken every four hours) as well as the last three "daily" snapshots (which are taken at midnight every day). You might also want to keep weekly or even monthly snapshots too, depending upon your needs and your available space.

Keep an extra script for each level

This is probably the easiest way to do it. I keep one script that runs every four hours to make and rotate hourly snapshots, and another script that runs once a day to rotate the daily snapshots. There is no need to use rsync for the higher-level snapshots; just cp -al from the appropriate hourly one.

Run it all with cron

To make the automatic snapshots happen, I have added the following lines to root's crontab file:

0 */4 * * * /usr/local/bin/make_snapshot.sh
0 13 * * * /usr/local/bin/daily_snapshot_rotate.sh

They cause make_snapshot.sh to be run every four hours on the hour and daily_snapshot_rotate.sh to be run every day at 13:00 (that is, 1:00 PM). I have included those scripts in the appendix. If you tire of receiving an email from the cron process every four hours with the details of what was backed up, you can tell it to send the output of make_snapshot.sh to /dev/null, like so:

0 */4 * * * /usr/local/bin/make_snapshot.sh >/dev/null 2>&1


Understand, though, that this will prevent you from seeing errors if make_snapshot.sh cannot run for some reason, so be careful with it. Creating a third script to check for any unusual behavior in the snapshot periodically seems like a good idea, but I haven't implemented it yet. Alternatively, it might make sense to log the output of each run, by piping it through tee, for example. mRgOBLIN wrote in to suggest a better (and obvious, in retrospect!) approach, which is to send stdout to /dev/null but keep stderr, like so:

0 */4 * * * /usr/local/bin/make_snapshot.sh >/dev/null

Presto! Now you only get mail when there's an error.

Backup Scheduling

tar is quite useful for copying directory trees, and is much more powerful than cp. To copy the directory /home/myhome/myimportantfiles to /share/myhome/myimportantfiles:

cd /home/myhome/myimportantfiles
tar -cvf - . | tar -C /share/myhome/myimportantfiles/ -xv

To schedule this to happen every day at 1 AM:

crontab -e

vi will run. If you are unfamiliar with vi, press i to insert, and enter:

0 1 * * 0-6 cd /home/myhome/myimportantfiles;tar -cvf - * | tar -C /share/myhome/myimportantfiles/ -xv

Press Escape, then :wq (or Escape, ZZ) to save your crontab entry. Verify your new job with crontab -l:

$ crontab -l
# DO NOT EDIT THIS FILE - edit the master and reinstall.
# (/tmp/crontab.17295 installed on Thu May 3 07:58:27 2001)
# (Cron version -- $Id: crontab.c,v 2.13 1994/01/17 03:20:37 vixie Exp $)
0 1 * * 0-6 cd /home/myhome/myimportantfiles;tar -cvf - * | tar -C /share/myhome/myimportantfiles/ -xv

