People outside the storage profession commonly treat backup and archiving as the same thing. Little could be further from the truth. They are distinct technology practices with sharply differing requirements.
The payoff of each technology is also different. As organizations slowly adopt archiving, cost reduction, improved performance, and verifiable regulatory compliance become achievable business goals.
Archiving can mean several things, but for the purposes of this summary, it describes the process of consolidating and migrating data from a primary online storage medium (such as Fibre Channel disk arrays or enterprise SATA arrays) to less costly nearline or offline storage media. In some cases, archiving emphasizes data longevity and authenticity, especially for emails, instant message texts, document files, and other semi-structured or unstructured data.
Under any definition, archiving assumes comparatively fast file-level access to data, along with robust search and retrieval software. Ultimately, an archive is a large, well-indexed repository that users can search and access. However, archives are routinely identified with less frequently used data, so those searches and retrievals are expected to be infrequent.
Archiving, then, is fundamentally different from backup, which involves making point-in-time copies of data to protect against routine hardware failures or catastrophic data loss. Backup typically covers not only transactional or operational data, but operating systems and application packages as well. As a rule, the life expectancy of backup volumes is only a matter of days, at which point they are replaced by new incremental or differential volumes.
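The difference between incremental and differential volumes can be illustrated with a short sketch. It assumes the common definitions (a differential volume holds everything changed since the last full backup, an incremental volume holds only what changed since the most recent backup of any kind) and uses modification timestamps as the change signal. The function names and the sample data are hypothetical, chosen only for illustration.

```python
from typing import Dict, List


def changed_since(mtimes: Dict[str, float], reference: float) -> List[str]:
    """Return the paths whose modification time is later than `reference`."""
    return sorted(path for path, mtime in mtimes.items() if mtime > reference)


def differential(mtimes: Dict[str, float], last_full: float) -> List[str]:
    """Differential volume: everything changed since the last FULL backup."""
    return changed_since(mtimes, last_full)


def incremental(mtimes: Dict[str, float], last_backup: float) -> List[str]:
    """Incremental volume: only what changed since the most recent backup,
    whether that backup was full, differential, or incremental."""
    return changed_since(mtimes, last_backup)


if __name__ == "__main__":
    # Hypothetical modification times (arbitrary time units).
    mtimes = {"a.txt": 10.0, "b.txt": 25.0, "c.txt": 35.0}
    # Assumed schedule: full backup at t=20, an incremental at t=30.
    print(differential(mtimes, last_full=20.0))
    print(incremental(mtimes, last_backup=30.0))
```

In both cases the data is copied and the originals stay where they are; only the choice of reference point differs.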
Archived data and backed-up data are also handled very differently. In backup, the IT staffer is imaging or copying data. In an archive, the data is actually moved, perhaps leaving a stub file or an index reference behind. So although the data looks like it is on tier-one disk, it actually resides elsewhere on less expensive, longer-term media.
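The copy-versus-move distinction can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any particular archiving product works: the stub is a small JSON file, the `.stub.json` suffix is an invented convention, and the helper names (`backup_file`, `archive_file`, `resolve_stub`) are hypothetical. Real products typically make the stub transparent to applications, which this sketch does not attempt.

```python
import json
import shutil
import tempfile
from pathlib import Path

STUB_SUFFIX = ".stub.json"  # hypothetical naming convention for stub files


def backup_file(src: Path, backup_dir: Path) -> Path:
    """Backup: COPY the file. The original stays on primary storage."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    dest = backup_dir / src.name
    shutil.copy2(src, dest)  # copies contents and metadata
    return dest


def archive_file(src: Path, archive_dir: Path) -> Path:
    """Archive: MOVE the file to cheaper storage and leave a stub behind."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    dest = archive_dir / src.name
    size = src.stat().st_size
    shutil.move(str(src), str(dest))  # the original no longer exists at src
    stub = src.with_name(src.name + STUB_SUFFIX)
    stub.write_text(json.dumps({"archived_to": str(dest), "size_bytes": size}))
    return stub


def resolve_stub(stub: Path) -> Path:
    """Follow a stub to the archived file's real location."""
    return Path(json.loads(stub.read_text())["archived_to"])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        src = root / "primary" / "report.txt"
        src.parent.mkdir()
        src.write_text("quarterly numbers")

        backup_file(src, root / "backup")
        print("after backup, original present:", src.exists())

        stub = archive_file(src, root / "archive")
        print("after archive, original present:", src.exists())
        print("stub points to:", resolve_stub(stub))
```

After `backup_file` there are two copies of the data; after `archive_file` there is still one, just in a different place, with a small pointer left on the primary tier.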
In part because of the confusion between backup and archiving, some users confine archiving to a niche in their environments and take less advantage of it than they should. Some observers believe that people either let their data sit around or delete it outright. Instead, organizations should establish a less expensive tier of storage with "good enough" recovery performance. This is not backup.