Class Chunker

  • Direct Known Subclasses:
    FixedChunker, TttdChunker

    public abstract class Chunker
    extends java.lang.Object
    The chunker implements a core part of the deduplication process by breaking files into individual Chunks. A chunker emits an enumeration of chunks, allowing the application to process one chunk after the other.

    Note: Implementations should never read the entire file into memory at once, but instead use an input stream for processing.
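
    The streaming requirement in the note above can be illustrated with a minimal sketch. It is not the implementation of FixedChunker or TttdChunker: the class name StreamingChunkSketch, the use of byte[] in place of Chunk, and the hasMoreElements() method are assumptions made for illustration. The sketch keeps a single chunk-sized buffer in memory and reads from the stream only when a chunk is requested.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a chunk enumeration; not the library's
// Chunker.ChunkEnumeration. A byte[] stands in for a Chunk.
final class StreamingChunkSketch implements Closeable {
    private final InputStream in;
    private final byte[] buffer;   // the only chunk-sized buffer kept around
    private int filled = -1;       // -1 means "next chunk not read yet"

    StreamingChunkSketch(InputStream in, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.in = in;
        this.buffer = new byte[chunkSize];
    }

    // Reads ahead by at most one chunk; returns false at end of stream.
    boolean hasMoreElements() throws IOException {
        if (filled < 0) {
            filled = 0;
            int n;
            while (filled < buffer.length
                    && (n = in.read(buffer, filled, buffer.length - filled)) > 0) {
                filled += n;
            }
        }
        return filled > 0;
    }

    // Returns a copy of the buffered chunk; the last chunk may be shorter.
    byte[] nextElement() throws IOException {
        if (!hasMoreElements()) {
            throw new NoSuchElementException("no more chunks");
        }
        byte[] chunk = Arrays.copyOf(buffer, filled);
        filled = -1;
        return chunk;
    }

    // Closes the underlying stream.
    @Override
    public void close() throws IOException {
        in.close();
    }
}
```

    With a 10-byte stream and a chunk size of 4, this sketch yields chunks of 4, 4, and 2 bytes.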

    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static interface  Chunker.ChunkEnumeration
      The chunk enumeration is implemented by the actual chunkers and emits a new chunk when nextElement() is called.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static java.lang.String PROPERTY_SIZE
      Property used by the config to indicate the exact or approximate size of a chunk.
    • Constructor Summary

      Constructors 
      Constructor Description
      Chunker()  
    • Method Summary

      All Methods Instance Methods Abstract Methods 
      Modifier and Type Method Description
      abstract Chunker.ChunkEnumeration createChunks(java.io.File file)
      Opens the given file and creates an enumeration of Chunks.
      abstract java.lang.String getChecksumAlgorithm()
      Returns the checksum algorithm used by the chunker to calculate the chunk and file checksums.
      abstract java.lang.String toString()
      Returns a string representation of the chunker implementation.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Field Detail

      • PROPERTY_SIZE

        public static final java.lang.String PROPERTY_SIZE
        Property used by the config to indicate the exact or approximate size of a chunk, in bytes.
        See Also:
        Constant Field Values
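
        As a rough illustration of how a size property of this kind might be read, the sketch below parses a chunk size in bytes from a string map. Everything in it is an assumption: the key "size" is a placeholder (the real value of PROPERTY_SIZE is listed under Constant Field Values), and the map-based settings, the default value, and the class name ChunkSizePropertySketch are hypothetical.

```java
import java.util.Map;

// Hypothetical sketch; not the library's configuration handling.
final class ChunkSizePropertySketch {
    // Placeholder key. The real value of Chunker.PROPERTY_SIZE is
    // documented under "Constant Field Values".
    static final String PROPERTY_SIZE = "size";

    // Returns the configured chunk size in bytes, or defaultBytes if unset.
    static int chunkSizeBytes(Map<String, String> settings, int defaultBytes) {
        String raw = settings.get(PROPERTY_SIZE);
        if (raw == null) {
            return defaultBytes;
        }
        int size = Integer.parseInt(raw.trim());
        if (size <= 0) {
            throw new IllegalArgumentException("chunk size must be positive: " + size);
        }
        return size;
    }
}
```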
    • Constructor Detail

    • Method Detail

      • createChunks

        public abstract Chunker.ChunkEnumeration createChunks(java.io.File file)
                                                       throws java.io.IOException
        Opens the given file and creates an enumeration of Chunks. Implementations should not read the whole file into memory at once; instead, they should read from the file and emit each new chunk only when it is requested through nextElement().

        The caller must close the enumeration with its close() method to release any locks held on the file.

        Parameters:
        file - The file to be chunked
        Returns:
        An enumeration of individual chunks; it must be closed at the end of processing
        Throws:
        java.io.IOException - If an I/O error occurs while accessing the file
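
        The close() requirement above suggests a try/finally pattern on the caller's side. The following is a minimal sketch under stated assumptions: ChunkEnumerationSketch is a hypothetical stand-in for Chunker.ChunkEnumeration (only nextElement() and close() are named in this documentation; hasMoreElements() is assumed), byte[] stands in for Chunk, and InMemoryChunks is a fake used only to make the example runnable.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.List;

final class CloseEnumerationSketch {

    // Hypothetical stand-in for Chunker.ChunkEnumeration.
    interface ChunkEnumerationSketch {
        boolean hasMoreElements();               // assumed, not named in the docs
        byte[] nextElement() throws IOException; // byte[] stands in for Chunk
        void close();
    }

    // Sums the chunk sizes; close() runs even if nextElement() throws.
    static long consume(ChunkEnumerationSketch chunks) throws IOException {
        long total = 0;
        try {
            while (chunks.hasMoreElements()) {
                total += chunks.nextElement().length;
            }
        } finally {
            chunks.close();
        }
        return total;
    }

    // In-memory fake that records whether close() was called.
    static final class InMemoryChunks implements ChunkEnumerationSketch {
        private final Iterator<byte[]> it;
        boolean closed = false;

        InMemoryChunks(List<byte[]> chunks) {
            this.it = chunks.iterator();
        }

        @Override
        public boolean hasMoreElements() {
            return it.hasNext();
        }

        @Override
        public byte[] nextElement() {
            return it.next();
        }

        @Override
        public void close() {
            closed = true;
        }
    }
}
```

        With a real chunker, the argument to consume would be the enumeration returned by createChunks(file); the finally block is what guarantees that file locks are released on every path.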
      • toString

        public abstract java.lang.String toString()
        Returns a string representation of the chunker implementation.
        Overrides:
        toString in class java.lang.Object
      • getChecksumAlgorithm

        public abstract java.lang.String getChecksumAlgorithm()
        Returns the checksum algorithm used by the chunker to calculate the chunk and file checksums. For the deduplication process to function properly, all chunkers must use the same checksum algorithm.
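
        The consistency requirement can be checked up front. The sketch below is one possible way to do so, not part of the library: the ChecksumSource interface is a hypothetical stand-in exposing only getChecksumAlgorithm(), and the helper name commonChecksumAlgorithm is made up for illustration.

```java
import java.util.List;

final class ChecksumAlgorithmCheckSketch {

    // Hypothetical stand-in for the part of Chunker used here.
    interface ChecksumSource {
        String getChecksumAlgorithm();
    }

    // Returns the shared algorithm name, or fails if the chunkers disagree.
    static String commonChecksumAlgorithm(List<? extends ChecksumSource> chunkers) {
        if (chunkers.isEmpty()) {
            throw new IllegalArgumentException("no chunkers given");
        }
        String expected = chunkers.get(0).getChecksumAlgorithm();
        for (ChecksumSource chunker : chunkers) {
            String actual = chunker.getChecksumAlgorithm();
            if (!expected.equals(actual)) {
                throw new IllegalStateException(
                        "checksum algorithms differ: " + expected + " vs " + actual);
            }
        }
        return expected;
    }
}
```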