Cugu's blog

IT security & forensics


Fast Python file entropy

I was looking for a fast and simple algorithm to calculate the entropy of some bytes in Python. I found Ero Carrera's post, which contains a simple Python implementation. Using the collections module, I could speed it up by a factor of 4.5 :)

I tested it against scipy.stats.entropy and a few other variations, but this was the fastest version I found:

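    import collections
    import math


    def entropy(data):
        # Shannon entropy in bits per symbol.
        # Needs Python 3 (true division and math.log2).
        e = 0

        # count how often each distinct value occurs
        counter = collections.Counter(data)
        total = len(data)
        for count in counter.values():
            # count is always > 0, so log2 is always defined
            p_x = count / total
            e += - p_x * math.log2(p_x)

        return e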
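To check that the function behaves as expected, here is a minimal usage sketch. It repeats the Counter-based function so that it runs on its own, and adds `file_entropy`, a hypothetical helper that is not part of the original code and simply reads the whole file into memory (an assumption that only suits files that fit in RAM). For byte data the result lies between 0 (all bytes identical) and 8 bits per byte (all 256 values equally frequent).

```python
import collections
import math


def entropy(data):
    """Shannon entropy of data in bits per symbol (Counter-based)."""
    total = len(data)
    e = 0.0
    for count in collections.Counter(data).values():
        # count is always > 0, so log2 is always defined
        p_x = count / total
        e -= p_x * math.log2(p_x)
    return e


def file_entropy(path):
    """Hypothetical helper: entropy of a whole file, read in binary mode."""
    with open(path, "rb") as f:
        return entropy(f.read())


if __name__ == "__main__":
    print(entropy(b""))                 # empty input: the loop never runs
    print(entropy(b"aaaa"))             # one distinct byte value
    print(entropy(b"abab"))             # two equally likely values
    print(entropy(bytes(range(256))))   # every byte value exactly once
```

High values, close to 8, are typical for compressed or encrypted data, while plain text usually scores noticeably lower.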