How to use hadoop for text file?

How can I use CShell and awk to find the words that are not repeated in a text file?

  • I use CShell+Awk to process a text file, and I want to find the words that are not repeated. Could you give me some suggestions?

  • Answer:

    fmt -1 FILE | tr A-Z a-z | tr -dc 'a-z\n' | sort | uniq -u

    This produces a list of all the words that appear exactly once in the file FILE, which I think is what you're asking for. It strips out punctuation and excess whitespace and lower-cases everything; you can adjust that to taste.

    PS: stop using csh. If you try to do anything more than very simple single commands, it will bite you. The classic "Csh Programming Considered Harmful" (http://www.faqs.org/faqs/unix-faq/shell/csh-whynot/) explains why, and despite the title it applies all the way down to very simple one-liners. Try bash, which is what most shell users use in 2012, or in any event something derived from the Bourne shell (sh, ksh, zsh, bash, and some others).

Greg Price at Quora
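For anyone who wants to try the pipeline above on a small input, here is a minimal sketch in Bourne-style sh. It is a variant and not the exact command from the answer. It assumes `tr -s '[:space:]' '\n'` as a stand-in for `fmt -1`, and it adds a `sed` step to drop empty lines left behind by tokens that were pure punctuation. The function name `unique_words` and the sample text are made up for illustration.

```shell
#!/bin/sh
# Sketch: list the words that occur exactly once in a text file.
# unique_words is a hypothetical helper name, not part of the original answer.
unique_words() {
    # tr -s '[:space:]' '\n' : put each whitespace-separated token on its own line
    # tr 'A-Z' 'a-z'         : fold everything to lower case
    # tr -dc 'a-z\n'         : delete every character that is not a letter or newline
    # sed '/^$/d'            : drop lines left empty after stripping punctuation
    # sort | uniq -u         : keep only the lines that appear exactly once
    tr -s '[:space:]' '\n' < "$1" \
        | tr 'A-Z' 'a-z' \
        | tr -dc 'a-z\n' \
        | sed '/^$/d' \
        | sort \
        | uniq -u
}

# Demo on a throwaway sample file (the contents are invented for this example).
sample=$(mktemp)
printf 'The cat saw the dog.\nThe dog ran!\n' > "$sample"
unique_words "$sample"
rm -f "$sample"
```

With this sample, "the" and "dog" repeat once case and punctuation are removed, so the demo prints cat, ran and saw, one per line.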


Other answers

awk '{for (i=1;i<=NF;i++) c[$i]++ } END { for (i in c) if (c[i]==1) print i }' yourfile.txt

Ronald Loui
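Note that the awk one-liner above counts whitespace-separated fields exactly as written, so "The", "the" and "the." are treated as three different words. If that is not what you want, a possible variant (a sketch only, assuming words are made of the letters a-z once lower-cased) normalizes each field before counting. The function name `awk_unique_words` and the sample text are hypothetical.

```shell
#!/bin/sh
# Sketch: awk version that folds case and strips punctuation before counting.
# awk_unique_words is a hypothetical helper name used for illustration.
awk_unique_words() {
    awk '{
        for (i = 1; i <= NF; i++) {
            w = tolower($i)            # fold the field to lower case
            gsub(/[^a-z]/, "", w)      # remove every character that is not a letter
            if (w != "") count[w]++    # skip fields that were only punctuation
        }
    }
    END {
        # Traversal order of an awk array is unspecified, so sort afterwards.
        for (w in count)
            if (count[w] == 1) print w
    }' "$1" | sort
}

# Demo on a throwaway sample file (the contents are invented for this example).
sample=$(mktemp)
printf 'The cat saw the dog.\nThe dog ran!\n' > "$sample"
awk_unique_words "$sample"
rm -f "$sample"
```

On this sample the demo prints cat, ran and saw, one per line. Without the normalization step, "The", "the" and "dog." would each be counted as separate, seemingly unique words.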
