I think it depends on which scheme you're talking about.

Basic de-duplication techniques might focus only on blocks being identical. That would work for eliminating actual duplicated files, but would be nearly useless for eliminating portions of files unless those files happen to be block-structured themselves (e.g. two disk images that contain mostly the same files at mostly the same offsets).

De-duplicating the boilerplate content in two Word documents, however, requires not only discovering that the content is the same, but also dealing with the fact that the content in question likely spans multiple blocks, and more to the point, dealing with the fact that the content will almost always span those blocks differently in different files. Thus, I would expect the better de-duplication schemes to treat files as glorified streams, and to de-duplicate stream fragments rather than operating at the block level. Block-level de-duplication is at best a good start.
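To make the contrast between block-level and stream-fragment de-duplication concrete, here is a minimal Python sketch. It is not taken from any particular product: the function names, the 64-byte block size, the 16-byte window, and the simple polynomial rolling hash are all assumptions chosen for illustration. Real systems typically use a tuned rolling hash and enforce minimum and maximum fragment sizes. The sketch builds two "documents" that share the same boilerplate at different offsets, then counts how many bytes each scheme recognizes as shared.

```python
import hashlib


def pseudo_bytes(label: str, n: int) -> bytes:
    """Deterministic filler data (hypothetical helper, stands in for file content)."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(f"{label}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:n])


def fixed_chunks(data: bytes, size: int = 64) -> list:
    """Block-level view: cut at fixed offsets, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def content_defined_chunks(data: bytes, window: int = 16, divisor: int = 64) -> list:
    """Stream view: cut wherever a rolling hash of the last `window` bytes
    hits a chosen value, so cut points depend on content, not on offsets."""
    base, mod = 257, 1_000_000_007
    top = pow(base, window - 1, mod)  # weight of the byte leaving the window
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        if i >= window:
            h = (h - data[i - window] * top) % mod  # drop the oldest byte
        h = (h * base + byte) % mod                 # add the newest byte
        if i + 1 >= window and h % divisor == divisor - 1:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing fragment
    return chunks


def shared_bytes(chunks_a: list, chunks_b: list) -> int:
    """Bytes in b whose chunk fingerprint was already seen in a."""
    seen = {hashlib.sha256(c).digest() for c in chunks_a}
    return sum(len(c) for c in chunks_b if hashlib.sha256(c).digest() in seen)


# Same boilerplate, preceded by unrelated headers of different lengths,
# so the shared content straddles block boundaries differently in each file.
boilerplate = pseudo_bytes("boilerplate", 4096)
doc_a = pseudo_bytes("header-a", 100) + boilerplate
doc_b = pseudo_bytes("header-b", 137) + boilerplate

fixed_shared = shared_bytes(fixed_chunks(doc_a), fixed_chunks(doc_b))
cdc_shared = shared_bytes(content_defined_chunks(doc_a), content_defined_chunks(doc_b))

print("shared bytes, fixed blocks:     ", fixed_shared)
print("shared bytes, content-defined:  ", cdc_shared)
```

Because the cut points in the second scheme depend only on the last few bytes seen, they land in the same places inside the boilerplate in both documents, and the fragments between them hash identically. The fixed-block scheme sees the same bytes shifted by a different amount in each file, so its blocks rarely line up.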