by Guest on 2024/12/28 05:11:52 PM
The thing is, there's literally no need to be copying file chunks at all. The current implementation of a single cache folder on a single drive is highly wasteful of bandwidth and presents a significant system bottleneck, as I've argued before.
If the "empty" target files are pre-allocated on the target drive, then collating the pieces to transfer into them from somewhere else, whether on a different drive or not, means more system handles for all those small piece files and more disk accesses, since each completed piece is copied into place in the target and then deleted from the cache. Critically, ALL torrent activity (every write, every validation and copy read, the subsequent cache deletions, and every update to the completion bitfield, which is also stored in the cache) has to go through that SINGLE cache drive interface. The advantage of having the target files on different physical drives, spreading the bandwidth load across many drives, which can matter with cheaper "consumer" drives that aren't designed for the heavy request loads server drives handle, is COMPLETELY lost with a single cache on a single drive.
The far simpler implementation would be to open one handle per target file and write all incoming data DIRECTLY into the target file(s) at its final location, so there is no central bandwidth bottleneck. There is no functional difference, except: you don't have to copy the data all over again; you can still verify in situ from the target, since the byte offset of every piece within the files is known; the system doesn't have to handle thousands of small files; and (most importantly) if there is a crash or other SNAFU, ALL the downloaded data is already IN PLACE in the target files, not sitting in piece files elsewhere that may not be usable by other clients, or by Tixati on restart, and so wasted. The files may not be complete, but in the very worst case a partial file might still be viewable or usable instead of whole chunks going missing. And you still have your completion bitfield in cache to track which pieces should have been downloaded.
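To illustrate the core of this: once the target file is pre-allocated, writing a piece and later verifying it in situ is just offset arithmetic. This is a minimal Python sketch of the idea, not Tixati's actual code; the piece size, function names, and single-file assumption are mine.

```python
import hashlib

# Assumed piece size for this sketch; real torrents declare theirs in the metainfo.
PIECE_LENGTH = 16 * 1024


def write_piece(path, piece_index, data):
    """Write a completed piece directly into the target file at its final offset."""
    with open(path, "r+b") as f:
        f.seek(piece_index * PIECE_LENGTH)
        f.write(data)


def verify_piece(path, piece_index, expected_sha1):
    """Re-read the piece from the target file and hash it in place.

    No cache copy is needed: the piece's location in the file is fully
    determined by its index, so validation reads straight from the target.
    """
    with open(path, "rb") as f:
        f.seek(piece_index * PIECE_LENGTH)
        data = f.read(PIECE_LENGTH)
    return hashlib.sha1(data).digest() == expected_sha1
```

The "empty" target is just a file truncated out to its final size (sparse on most filesystems), so pre-allocation costs almost nothing and every write lands exactly where it belongs.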
Now, I agree there might still need to be some cache mechanism for pieces that cross into files that are not being downloaded, or for pieces that arrive before the empty target files have been initialised, and the initial piece-validation algorithm would need a slight tweak where a piece crosses file boundaries (instead of the entire chunk of data conveniently sitting in its own file parcel in the cache), but these are minuscule programming inconveniences compared to the huge reduction in system resources this would provide. I'd suggest it might also improve the end-user experience on some systems.
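The boundary-crossing tweak is mostly bookkeeping: a piece's global byte range just needs to be mapped onto per-file spans, and the validator reads each span in turn instead of one cache file. A rough Python sketch, assuming an ordered file list like a multi-file torrent's (function name and layout are mine):

```python
def piece_spans(files, piece_index, piece_length):
    """Map a piece's global byte range onto (path, file_offset, length) spans.

    `files` is an ordered list of (path, size) pairs, laid end to end as in a
    multi-file torrent. A piece that crosses a file boundary simply yields
    more than one span; the validator hashes the spans in sequence.
    """
    start = piece_index * piece_length
    end = start + piece_length
    spans, cursor = [], 0
    for path, size in files:
        file_start, file_end = cursor, cursor + size
        lo, hi = max(start, file_start), min(end, file_end)
        if lo < hi:  # this file overlaps the piece's byte range
            spans.append((path, lo - file_start, hi - lo))
        cursor = file_end
    return spans
```

A span whose file is deselected is the one case that still wants a small side cache; everything else reads and writes straight through the target handles.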