Class FixedChunker


  • public class FixedChunker
    extends Chunker
    The fixed chunker is an implementation of the Chunker that performs simple fixed-offset chunking, i.e. it breaks files at multiples of the given chunk size.

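    Because it is offset-based rather than content-based, it is very fast. It performs very badly, however, when bytes are added to or removed from the beginning of a file: every subsequent chunk boundary shifts, so the resulting chunks no longer match previously known ones.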

    Details can be found in chapter 3.4 of the thesis at blog.philippheckel.com. The FixedChunker implements the chunker described in chapter 3.4.2.
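
    The behavior described above can be illustrated with a minimal, self-contained sketch. It does not use the FixedChunker class itself; the class FixedOffsetSketch and its methods are hypothetical names introduced for illustration, and SHA-1 is requested through the standard JDK name "SHA-1". The sketch splits a byte array at multiples of the chunk size, checksums each chunk, and counts how many chunks survive an edit.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class; not part of the documented API.
class FixedOffsetSketch {

    /** Splits data at multiples of chunkSize and returns one hex checksum per chunk. */
    static List<String> chunkChecksums(byte[] data, int chunkSize, String algorithm) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        try {
            List<String> checksums = new ArrayList<>();
            for (int offset = 0; offset < data.length; offset += chunkSize) {
                // The last chunk may be shorter than chunkSize.
                int length = Math.min(chunkSize, data.length - offset);
                MessageDigest digest = MessageDigest.getInstance(algorithm);
                digest.update(data, offset, length);
                checksums.add(toHex(digest.digest()));
            }
            return checksums;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown checksum algorithm: " + algorithm, e);
        }
    }

    /** Counts how many chunks of 'modified' have a checksum that also occurs in 'original'. */
    static int sharedChunks(byte[] original, byte[] modified, int chunkSize) {
        Set<String> known = new HashSet<>(chunkChecksums(original, chunkSize, "SHA-1"));
        int shared = 0;
        for (String checksum : chunkChecksums(modified, chunkSize, "SHA-1")) {
            if (known.contains(checksum)) {
                shared++;
            }
        }
        return shared;
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "abcdefghijklmnopqrstuvwxyz0123456789"; // 36 bytes, 9 chunks of 4
        byte[] original = text.getBytes(StandardCharsets.US_ASCII);
        byte[] appended = (text + "X").getBytes(StandardCharsets.US_ASCII);
        byte[] prepended = ("X" + text).getBytes(StandardCharsets.US_ASCII);

        // Appending leaves all existing boundaries in place.
        System.out.println("shared after append:  " + sharedChunks(original, appended, 4));
        // Prepending one byte shifts every boundary, so no chunk is recognized.
        System.out.println("shared after prepend: " + sharedChunks(original, prepended, 4));
    }
}
```

    With this input, appending a byte keeps all 9 original chunks recognizable, while prepending a single byte leaves none, which is the weakness of offset-based chunking noted above.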

    • Constructor Summary

      Constructors 
      Constructor Description
      FixedChunker(int chunkSize)
      Creates a new fixed offset chunker with the default file/chunk checksum algorithm SHA1.
      FixedChunker(int chunkSize, java.lang.String checksumAlgorithm)
      Creates a new fixed offset chunker.
    • Method Summary

      Modifier and Type Method Description
      Chunker.ChunkEnumeration createChunks(java.io.File file)
      Opens the given file and creates an enumeration of chunks.
      java.lang.String getChecksumAlgorithm()
      Returns the checksum algorithm used by the chunker to calculate the chunk and file checksums.
      java.lang.String toString()
      Returns a string representation of the chunker implementation.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Constructor Detail

      • FixedChunker

        public FixedChunker(int chunkSize)
        Creates a new fixed offset chunker with the default file/chunk checksum algorithm SHA1.
        Parameters:
        chunkSize - Size of a chunk in bytes
      • FixedChunker

        public FixedChunker(int chunkSize,
                            java.lang.String checksumAlgorithm)
        Creates a new fixed offset chunker.
        Parameters:
        chunkSize - Size of a chunk in bytes
        checksumAlgorithm - Algorithm used to calculate the chunk and file checksums (e.g. SHA1, MD5)
    • Method Detail

      • createChunks

        public Chunker.ChunkEnumeration createChunks(java.io.File file)
                                              throws java.io.IOException
        Description copied from class: Chunker
        Opens the given file and creates an enumeration of chunks. This method should not read the whole file into memory at once, but instead read and emit each new chunk only when it is requested via nextElement().

        The enumeration must be closed with the close() method to release any locks held on the file.

        Specified by:
        createChunks in class Chunker
        Parameters:
        file - The file to be chunked
        Returns:
        An enumeration of individual chunks; it must be closed at the end of processing
        Throws:
        java.io.IOException - If any file exceptions occur
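
        The lazy-reading contract described above can be sketched as follows. This is not the actual Chunker.ChunkEnumeration implementation; LazyChunkSketch is a hypothetical stand-in that assumes an enumeration which reads one chunk ahead from the file and holds the stream open until close() is called.

```java
import java.io.Closeable;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a chunk enumeration; not the documented class.
class LazyChunkSketch implements Enumeration<byte[]>, Closeable {
    private final InputStream in;
    private final int chunkSize;
    private byte[] next; // chunk read ahead, or null at end of file

    LazyChunkSketch(File file, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
        this.in = new FileInputStream(file);
        this.next = readChunk();
    }

    /** Reads at most chunkSize bytes; returns null once the file is exhausted. */
    private byte[] readChunk() throws IOException {
        byte[] buffer = new byte[chunkSize];
        int filled = 0;
        while (filled < chunkSize) {
            int n = in.read(buffer, filled, chunkSize - filled);
            if (n < 0) {
                break; // end of file
            }
            filled += n;
        }
        return filled == 0 ? null : Arrays.copyOf(buffer, filled);
    }

    @Override
    public boolean hasMoreElements() {
        return next != null;
    }

    @Override
    public byte[] nextElement() {
        if (next == null) {
            throw new NoSuchElementException();
        }
        byte[] current = next;
        try {
            next = readChunk(); // only one chunk is held in memory ahead of the caller
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return current;
    }

    @Override
    public void close() throws IOException {
        in.close(); // releases the file handle
    }

    /** Writes content to a temporary file, chunks it lazily, and returns the chunk lengths. */
    static int[] chunkLengthsOf(byte[] content, int chunkSize) {
        try {
            File file = File.createTempFile("chunk-sketch", ".bin");
            try {
                Files.write(file.toPath(), content);
                List<Integer> lengths = new ArrayList<>();
                // try-with-resources guarantees close() even if chunking fails
                try (LazyChunkSketch chunks = new LazyChunkSketch(file, chunkSize)) {
                    while (chunks.hasMoreElements()) {
                        lengths.add(chunks.nextElement().length);
                    }
                }
                return lengths.stream().mapToInt(Integer::intValue).toArray();
            } finally {
                file.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(chunkLengthsOf(new byte[10], 4))); // prints [4, 4, 2]
    }
}
```

        Closing in a try-with-resources block is one way to satisfy the requirement that the enumeration be closed at the end of processing; whether the real ChunkEnumeration implements Closeable is not stated here, so callers may need an explicit finally block instead.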
      • getChecksumAlgorithm

        public java.lang.String getChecksumAlgorithm()
        Description copied from class: Chunker
        Returns the checksum algorithm used by the chunker to calculate the chunk and file checksums. For the deduplication process to function properly, the checksum algorithms of all chunkers must be equal.
        Specified by:
        getChecksumAlgorithm in class Chunker
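
        Why the checksum algorithms must match can be seen in a small sketch using the JDK's MessageDigest directly. ChecksumMismatchSketch and its checksum method are hypothetical names for illustration; the sketch assumes that deduplication compares chunks by checksum value.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration class; not part of the documented API.
class ChecksumMismatchSketch {

    /** Returns the hex checksum of a chunk for the given JDK algorithm name. */
    static String checksum(byte[] chunk, String algorithm) {
        try {
            byte[] raw = MessageDigest.getInstance(algorithm).digest(chunk);
            StringBuilder sb = new StringBuilder();
            for (byte b : raw) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown checksum algorithm: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        byte[] chunk = "identical chunk content".getBytes(StandardCharsets.US_ASCII);

        String sha1 = checksum(chunk, "SHA-1");
        String md5 = checksum(chunk, "MD5");

        // Same bytes, same algorithm: the checksums match, so the chunk is recognized.
        System.out.println(sha1.equals(checksum(chunk, "SHA-1"))); // prints true
        // Same bytes, different algorithms: the checksums differ, so the duplicate is missed.
        System.out.println(sha1.equals(md5)); // prints false
    }
}
```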
      • toString

        public java.lang.String toString()
        Description copied from class: Chunker
        Returns a string representation of the chunker implementation.
        Specified by:
        toString in class Chunker