Compare two files to see if they are the same

tom_saunders
New Contributor

Does anyone have experience with, or recommendations for, comparing two files to check whether they are identical? I don’t mean data files; I’m thinking of media files here, so something along the lines of creating a hash of each file and comparing the hashes?

1 ACCEPTED SOLUTION

tom_saunders
New Contributor

With some assistance from our good friends at SnapLogic, I now have a solution to this. To get a hash of a file, you can use a Binary to Document snap and then a Mapper on the resulting document. The expression “Digest.sha256($content)” will generate a SHA-256 hash from the file. This hash can then be compared with other hashes to recognise duplicate files.
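For anyone who wants to sanity-check the same idea outside a pipeline, here is a minimal Python sketch of hashing a file with SHA-256 and comparing two hashes. It is not SnapLogic code; the function names `sha256_of_file` and `files_match` and the chunk size are my own assumptions for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`.

    The file is read in chunks (1 MiB by default, an arbitrary choice)
    so that large media files are not loaded into memory all at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(path_a, path_b):
    """Treat two files as identical when their SHA-256 digests are equal."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

Calling `files_match("a.mp4", "b.mp4")` (hypothetical file names) returns `True` only when both files hash to the same value, which is the same comparison the Mapper expression enables.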

In my use case, I ran this process on the files in two different directories and then did an outer Join on the hash values to produce a report showing which files match and which do not.

Hash Compare
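The two-directory report can be sketched in Python as well. This is only an illustration of the outer-join idea, not what the pipeline does internally; `hash_directory`, `outer_join_report`, and the report field names are hypothetical, and the sketch reads each file fully into memory for brevity.

```python
import hashlib
from pathlib import Path


def hash_directory(directory):
    """Map each SHA-256 hex digest to the relative paths that produce it."""
    hashes = {}
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            # read_bytes() loads the whole file; fine for a sketch,
            # but chunked reading is kinder to very large media files.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            relative = str(path.relative_to(directory))
            hashes.setdefault(digest, []).append(relative)
    return hashes


def outer_join_report(dir_a, dir_b):
    """Outer-join the two directories on hash value.

    Every hash seen on either side yields one row; `match` is True
    only when the hash appears in both directories.
    """
    left = hash_directory(dir_a)
    right = hash_directory(dir_b)
    report = []
    for digest in sorted(left.keys() | right.keys()):
        report.append({
            "hash": digest,
            "left": left.get(digest, []),
            "right": right.get(digest, []),
            "match": digest in left and digest in right,
        })
    return report
```

Rows with `match` set to `False` have an empty `left` or `right` list, which shows which directory is missing the file. Because the join is on the hash rather than the name, renamed copies still count as matches.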

