tom_saunders
5 years ago · New Contributor
Compare two files to see if they are the same
Does anyone have any experience or recommendations about how to compare two files and check if they are identical? I don’t mean data files; I’m thinking about media files here, so something along the...
- 5 years ago
With some assistance from our good friends at SnapLogic, I now have a solution. To get a hash of a file, you can use a Binary to Document snap and then a Mapper on the resulting document. The expression “Digest.sha256($content)” generates a SHA-256 hash of the file's content. This hash can then be compared with other hashes to recognise duplicate files.
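For anyone who wants to sanity-check the idea outside a pipeline, here is a rough Python sketch of the same hash-and-compare approach. It is not SnapLogic code; the function names `sha256_of_file` and `files_identical` are made up for illustration, and it simply assumes both files are readable from local disk.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`.

    The file is read in chunks so large media files do not
    have to fit in memory all at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_identical(path_a, path_b):
    """Treat two files as identical when their SHA-256 digests match."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

Matching digests mean the file contents are the same byte for byte (barring a hash collision, which is not a practical concern with SHA-256); file names and timestamps play no part in the comparison.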
In my use case, I ran this process on the files in two different directories and then did an outer Join on the hash values to produce a report of which files match and which do not.
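The two-directory report can be sketched in plain Python as well. This is only an approximation of what the outer Join produces, not the pipeline itself; `index_by_hash` and `outer_join_report` are hypothetical helper names, and the sketch assumes both directories are on local disk.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_by_hash(directory):
    """Map each digest to the relative paths of the files that have it."""
    root = Path(directory)
    index = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            index[sha256_of_file(path)].append(str(path.relative_to(root)))
    return index


def outer_join_report(dir_a, dir_b):
    """Full outer join of two directories on file hash.

    Each row is (digest, files_in_a, files_in_b). An empty list on
    one side means the content exists only in the other directory.
    """
    left = index_by_hash(dir_a)
    right = index_by_hash(dir_b)
    return [
        (digest, left.get(digest, []), right.get(digest, []))
        for digest in sorted(left.keys() | right.keys())
    ]
```

Rows with files on both sides are the matches; rows with one side empty are the files that have no counterpart. Because the join key is the hash, renamed copies still pair up.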