If a process is reading a file named fileA on HDFS, and another process then moves a different file over the top of it under the same name, will the read of the original file break?
Here is what I think will happen: Say the original fileA is made up of blocks 102, 103, and 104. When the read is initiated, fileA will be dereferenced to those blocks, and those blocks will be read. While the read is happening, the move operation will redirect fileA to point to, say, blocks 502, 503, and 504. The old blocks are still around, though, so the read will continue against them. But what if another operation is writing to the cluster at the same time? What if it overwrites blocks 100, 101, 102, and 103? Or does HDFS remember that someone is reading those blocks, and avoid allocating them to hold the new data being written?
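The mental model in the question can be sketched as a small toy simulation. This is not HDFS code and makes no claim about how the NameNode is implemented: `ToyNamespace`, `open_for_read`, and `rename_over` are hypothetical names invented for illustration, and the path `fileA.new` is an assumed name for the incoming file. It only shows the "dereference at open time" idea: the reader holds its own copy of the block list, so redirecting the name does not change what the reader is holding.

```python
class ToyNamespace:
    """Toy stand-in for a name-to-blocks table; not real HDFS code."""

    def __init__(self):
        self.files = {}  # path -> list of block IDs

    def create(self, path, blocks):
        self.files[path] = list(blocks)

    def open_for_read(self, path):
        # Assumption: the reader gets its own copy of the block list at open time.
        return list(self.files[path])

    def rename_over(self, src, dst):
        # Point dst at src's blocks and drop the src name.
        self.files[dst] = self.files.pop(src)


ns = ToyNamespace()
ns.create("fileA", [102, 103, 104])
ns.create("fileA.new", [502, 503, 504])

reader_blocks = ns.open_for_read("fileA")  # the read starts
ns.rename_over("fileA.new", "fileA")       # the move happens mid-read

print(reader_blocks)      # block IDs the in-flight reader still holds
print(ns.files["fileA"])  # block IDs a reader opening fileA now would get
```

The sketch deliberately leaves out the second half of the question: whether the data behind blocks 102 to 104 stays available until the reader finishes depends on how the real system handles deletion and allocation, which the toy model does not attempt to represent.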
Answer:
No. The new file is copied as a duplicate. You can avoid that in your load process, though.
Raviteja Chirala at Quora