Use Hadoop Pig to load data from text file w/ each record on multiple lines?
-
I have my data file in the following format:

    U: john
    T: 2011-03-03 12:12:12
    L: san diego, CA
    U: john
    T: 2011-03-03 12:12:12
    L: san diego, CA

What's the best way to read this file w/ Hadoop/Pig/whatever for analysis?
-
Answer:
Is there any way you can control how the data is written? Writing a process that converts it to a tab-separated format, with one record per line, would let you load it out of the box. Otherwise, writing a custom record reader (in Pig or Java MapReduce) might be your only option. Neither is very hard.
dolan at Stack Overflow
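The conversion step suggested in the answer could look roughly like the sketch below. It is not part of the original answer. It assumes each record consists of `U:`, `T:` and `L:` lines in the layout shown in the question, and that a repeated key marks the start of the next record. The function name `records_to_tsv` and the column order are made up for illustration.

```python
# Assumed column order, taken from the one-letter prefixes in the question's sample.
FIELDS = ("U", "T", "L")


def records_to_tsv(lines):
    """Yield one tab-separated row per multi-line record (hypothetical helper)."""
    record = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines between records
        # Split on the first colon only, so timestamps keep their own colons.
        key, sep, value = line.partition(":")
        key = key.strip()
        if not sep or key not in FIELDS:
            continue  # skip lines that do not look like "X: value"
        if key in record:
            # A repeated key means the previous record is complete.
            yield "\t".join(record.get(f, "") for f in FIELDS)
            record = {}
        record[key] = value.strip()
    if record:
        # Flush the last record in the input.
        yield "\t".join(record.get(f, "") for f in FIELDS)


sample = """U: john
T: 2011-03-03 12:12:12
L: san diego, CA
U: john
T: 2011-03-03 12:12:12
L: san diego, CA
"""

rows = list(records_to_tsv(sample.splitlines()))
for row in rows:
    print(row)
```

For a real file, the same generator can be fed an open file object instead of `sample.splitlines()`. Once the data is one record per line with tab separators, Pig's default `PigStorage` loader, which splits on tabs, should be able to read it without a custom loader.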