Friday, May 7, 2010

tee and digest output stream

I was working on storing files in Happy Archive. As it is currently constructed, I store a file by writing its contents to an output stream that splits the data into chunks, then encodes, encrypts, and stores each chunk.

I also wanted a hash of the whole data stream to add to the file meta-data.

I considered making a decorator output stream that would calculate the hash of all the data passing through it, but decided it would be simpler to build a sink that does the hashing and a tee that sends the data to both sinks (hashing and encoding). Another argument against using a filter is that hashing does not change the data passing through.

The tee looks like this:
package org.yi.happy.archive;

import java.io.IOException;
import java.io.OutputStream;

/**
 * An output stream that writes to two output streams.
 */
public class TeeOutputStream extends OutputStream {
    private final OutputStream out1;
    private final OutputStream out2;

    /**
     * create an output stream that writes to two output streams.
     * 
     * @param out1
     *            the first stream to write to.
     * @param out2
     *            the second stream to write to.
     */
    public TeeOutputStream(OutputStream out1, OutputStream out2) {
        // plain field assignments cannot throw, so no try/finally is needed
        this.out1 = out1;
        this.out2 = out2;
    }

    @Override
    public void write(int b) throws IOException {
        try {
            out1.write(b);
        } finally {
            out2.write(b);
        }
    }

    @Override
    public void write(byte[] b) throws IOException {
        try {
            out1.write(b);
        } finally {
            out2.write(b);
        }
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        try {
            out1.write(b, off, len);
        } finally {
            out2.write(b, off, len);
        }
    }

    @Override
    public void flush() throws IOException {
        try {
            out1.flush();
        } finally {
            out2.flush();
        }
    }

    @Override
    public void close() throws IOException {
        try {
            out1.close();
        } finally {
            out2.close();
        }
    }
}
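
To see the tee in action, here is a minimal usage sketch. It is not part of Happy Archive: the names TeeDemo, Tee, and copyToBoth are made up for illustration, Tee is a condensed stand-in for the TeeOutputStream above so that the example compiles on its own, and two ByteArrayOutputStream sinks stand in for the hashing and encoding sinks.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class TeeDemo {
    /** Condensed stand-in for the TeeOutputStream above (hypothetical name). */
    static class Tee extends OutputStream {
        private final OutputStream out1;
        private final OutputStream out2;

        Tee(OutputStream out1, OutputStream out2) {
            this.out1 = out1;
            this.out2 = out2;
        }

        @Override
        public void write(int b) throws IOException {
            try {
                out1.write(b);
            } finally {
                out2.write(b); // runs even if the first sink threw
            }
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            try {
                out1.write(b, off, len);
            } finally {
                out2.write(b, off, len);
            }
        }
    }

    /** Writes the text through a tee and returns what each sink received. */
    static String[] copyToBoth(String text) {
        ByteArrayOutputStream first = new ByteArrayOutputStream();
        ByteArrayOutputStream second = new ByteArrayOutputStream();
        try (OutputStream tee = new Tee(first, second)) {
            // write(byte[]) is inherited and delegates to write(b, 0, b.length)
            tee.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String[] {
            new String(first.toByteArray(), StandardCharsets.UTF_8),
            new String(second.toByteArray(), StandardCharsets.UTF_8)
        };
    }

    public static void main(String[] args) {
        String[] seen = copyToBoth("hello");
        System.out.println(seen[0] + " / " + seen[1]);
    }
}
```

One consequence of the try/finally pattern is worth keeping in mind: the second sink is always attempted, and if both sinks throw, the exception from the second one is the one the caller sees.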

The hashing output stream looks like this:
package org.yi.happy.archive;

import java.io.IOException;
import java.io.OutputStream;
import java.security.MessageDigest;

import org.yi.happy.archive.key.HashValue;

/**
 * An output stream that calculates the digest of whatever is written to it.
 */
public class DigestOutputStream extends OutputStream {
    private MessageDigest md;
    private HashValue hash;
    private long size;

    /**
     * set up an output stream that calculates the digest of whatever is written
     * to it.
     * 
     * @param md
     *            the {@link MessageDigest} to use. The {@link MessageDigest} is
     *            assumed to be freshly created and not shared.
     */
    public DigestOutputStream(MessageDigest md) {
        this.md = md;
        this.hash = null;
        this.size = 0;
    }

    @Override
    public void write(int b) throws IOException {
        if (md == null) {
            throw new ClosedException();
        }

        md.update((byte) b);
        size += 1;
    }

    @Override
    public void write(byte[] b) throws IOException {
        if (md == null) {
            throw new ClosedException();
        }

        md.update(b);
        size += b.length;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        if (md == null) {
            throw new ClosedException();
        }

        md.update(b, off, len);
        size += len;
    }

    @Override
    public void close() throws IOException {
        if (md == null) {
            return;
        }

        hash = new HashValue(md.digest());
        md = null;
    }

    /**
     * Get the final digest value.
     * 
     * @return the final digest value.
     * @throws IllegalStateException
     *             if {@link #close()} has not been called.
     */
    public HashValue getHash() throws IllegalStateException {
        if (hash == null) {
            throw new IllegalStateException();
        }
        return hash;
    }

    /**
     * @return the number of bytes written to this stream.
     */
    public long getSize() {
        return size;
    }
}

HashValue is a value object that represents a hash as a string of bytes. ClosedException is an IOException.
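
Neither class is listed in this post, so the following is only a guess at their rough shape, assuming nothing beyond the description above: an immutable wrapper around a byte array, and an IOException subclass. The hex toString, the toByteArray accessor, and the exception message are assumptions added for illustration, and the real classes in Happy Archive may well differ.

```java
import java.io.IOException;
import java.util.Arrays;

/** Hypothetical sketch: signals a write to a stream that is already closed. */
class ClosedException extends IOException {
    ClosedException() {
        super("stream is closed"); // message text is an assumption
    }
}

/** Hypothetical sketch: an immutable value object holding the bytes of a hash. */
final class HashValue {
    private final byte[] bytes;

    HashValue(byte[] bytes) {
        this.bytes = bytes.clone(); // defensive copy keeps the value immutable
    }

    /** Returns a copy of the hash bytes. */
    byte[] toByteArray() {
        return bytes.clone();
    }

    @Override
    public boolean equals(Object o) {
        // value semantics: two hashes are equal when their bytes are equal
        return o instanceof HashValue
                && Arrays.equals(bytes, ((HashValue) o).bytes);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(bytes);
    }

    @Override
    public String toString() {
        // lower-case hex, two digits per byte
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```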

I also noticed that the Java standard library has its own DigestOutputStream (java.security.DigestOutputStream), which is an output stream filter.
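
For comparison, here is a small sketch of the filter approach using the standard library class. The class and method names (FilterDigestDemo, writeAndDigest, plainDigest) are made up for this example, and SHA-256 is an assumption, since the post does not say which algorithm Happy Archive uses.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.security.DigestOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class FilterDigestDemo {
    /** Creates a fresh digest; SHA-256 is an assumed choice of algorithm. */
    static MessageDigest newDigest() {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /**
     * Writes data to the sink through the JDK filter and returns the digest
     * of everything that passed through.
     */
    static byte[] writeAndDigest(byte[] data, ByteArrayOutputStream sink) {
        MessageDigest md = newDigest();
        try (DigestOutputStream out = new DigestOutputStream(sink, md)) {
            out.write(data); // updates md, then forwards the bytes to sink
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return md.digest();
    }

    /** Hashes the data directly, for comparison with the filter result. */
    static byte[] plainDigest(byte[] data) {
        return newDigest().digest(data);
    }
}
```

Unlike the tee and sink pair, the filter has to wrap the stream that carries the data onward, so the hashing sits in the write path rather than beside it.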
