609. Find Duplicate File in System

Description

Here

Solution
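A minimal sketch of the common hash-map approach, assuming the usual LeetCode signature: parse each directory string of the form "root/a 1.txt(abcd) 2.txt(efgh)", map each file's content to the full paths that contain it, and return only the groups with more than one path.

```python
from collections import defaultdict
from typing import List


def findDuplicate(paths: List[str]) -> List[List[str]]:
    # content -> list of full paths holding that content
    groups = defaultdict(list)
    for entry in paths:
        directory, *files = entry.split()
        for file_info in files:
            # "1.txt(abcd)" -> name "1.txt", content "abcd"
            name, _, rest = file_info.partition("(")
            content = rest[:-1]  # drop the trailing ")"
            groups[content].append(f"{directory}/{name}")
    # Only contents shared by two or more files are duplicates.
    return [group for group in groups.values() if len(group) > 1]
```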

Follow-ups

  • Imagine you are given a real file system, how will you search files? DFS or BFS?
    • DFS consumes less memory: the traversal stack grows only with directory depth, while a BFS queue can hold every entry at the widest level of the tree.
  • If the file content is very large (GB level), how will you modify your solution?
    • First compare sizes, then hash only the files whose sizes match, and finally compare the remaining candidates byte by byte (see the sketch after this list).
  • If you can only read the file by 1kb each time, how will you modify your solution?
    • Essentially no modification: the hash is computed incrementally from 1 KB chunks, and the byte-by-byte check also compares one chunk at a time.
  • What is the time complexity of your modified solution? What is the most time-consuming part and memory consuming part of it? How to optimize?
    • Worst case O(N ^ 2 * k), where N is the number of files and k is the file size; this happens when all the files are identical, so every candidate pair must be confirmed.
    • Hash computation is the most time-consuming part; memory is dominated by the map from content hash to file paths. Grouping by size first reduces how many files ever need to be hashed.
  • How to make sure the duplicated files you find are not false positive?
    • A final byte-by-byte comparison guarantees the matches are not false positives caused by hash collisions.
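
A rough sketch of the points above for a real file system: a depth-first walk collects the files, sizes are compared first, size-matched files are hashed incrementally in 1 KB chunks, and candidate groups are confirmed byte by byte. The root path, chunk-size constant, and helper names are illustrative assumptions, not part of the original problem.

```python
import hashlib
import os
from collections import defaultdict
from typing import Dict, List

CHUNK = 1024  # read 1 KB at a time, per the follow-up constraint


def file_hash(path: str) -> str:
    """Hash a file incrementally, one 1 KB chunk at a time."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK):
            h.update(chunk)
    return h.hexdigest()


def same_bytes(a: str, b: str) -> bool:
    """Chunk-by-chunk comparison to rule out hash collisions."""
    with open(a, "rb") as fa, open(b, "rb") as fb:
        while True:
            ca, cb = fa.read(CHUNK), fb.read(CHUNK)
            if ca != cb:
                return False
            if not ca:  # both files exhausted at the same point
                return True


def find_duplicates(root: str) -> List[List[str]]:
    # 1. Walk the tree (os.walk recurses depth-first) and group files by size.
    by_size: Dict[int, List[str]] = defaultdict(list)
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            by_size[os.path.getsize(full)].append(full)

    # 2. Hash only the files whose sizes collide.
    by_hash: Dict[str, List[str]] = defaultdict(list)
    for files in by_size.values():
        if len(files) > 1:
            for f in files:
                by_hash[file_hash(f)].append(f)

    # 3. Confirm each candidate group byte by byte before reporting it.
    duplicates = []
    for files in by_hash.values():
        if len(files) > 1 and all(same_bytes(files[0], f) for f in files[1:]):
            duplicates.append(files)
    return duplicates
```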
