How to delete duplicate phrases in a text file?
-
Okay so I have about 1000 duplicated phrases in this file, so doing this manually is not an option. Note that these are PHRASES, not lines or words, and each "phrase" is about 10 lines long. I am trying to get rid of duplicate phrases, yet the only thing that marks an "item" (or phrase) as a duplicate is its position value. For example:

    class Item0 { position[]={4347.6001,0,3214.6399}; azimut=128.81599; special="NONE"; id=1; side="EMPTY"; vehicle="Land_fortified_nest_small"… lock="UNLOCKED"; skill=0.2; init="this setPos [4347.6, 3214.64, 0]; this setDir 128.816;"; };
    class Item1 { position[]={4347.6001,0,3214.6399}; azimut=128.81599; special="NONE"; id=2; side="EMPTY"; vehicle="Land_fortified_nest_small"… lock="UNLOCKED"; skill=0.2; init="this setPos [4347.6, 3214.64, 0]; this setDir 128.816;"; };

The two phrases above are duplicates, yet their ID and ITEM# are different, so the only way to identify duplicate phrases is through the position[]={} parameter. When two phrases have the same position, they ARE duplicates, regardless of the ID or ITEM#.

So my goal is to use some type of code, script, program, or regular expression to delete all duplicate phrases while leaving the first occurrence untouched. If there are three duplicates, one phrase is left and the other two are deleted. How would I go about doing this?
An example of the desired input/output:

Input:

    class Item0 { position[]={4347.6001,0,3214.6399}; azimut=128.81599; special="NONE"; id=1; side="EMPTY"; vehicle="Land_fortified_nest_small"… lock="UNLOCKED"; skill=0.2; init="this setPos [4347.6, 3214.64, 0]; this setDir 128.816;"; };
    class Item1 { position[]={2156.6001,0,7513.6399}; azimut=128.81599; special="NONE"; id=2; side="EMPTY"; vehicle="Land_fortified_nest_small"… lock="UNLOCKED"; skill=0.2; init="this setPos [2156.6, 7531.64, 0]; this setDir 128.816;"; };
    class Item2 { position[]={4347.6001,0,3214.6399}; azimut=128.81599; special="NONE"; id=3; side="EMPTY"; vehicle="Land_fortified_nest_small"… lock="UNLOCKED"; skill=0.2; init="this setPos [4347.6, 3214.64, 0]; this setDir 128.816;"; };

Output:

    class Item0 { position[]={4347.6001,0,3214.6399}; azimut=128.81599; special="NONE"; id=1; side="EMPTY"; vehicle="Land_fortified_nest_small"… lock="UNLOCKED"; skill=0.2; init="this setPos [4347.6, 3214.64, 0]; this setDir 128.816;"; };
    class Item1 { position[]={2156.6001,0,7513.6399}; azimut=128.81599; special="NONE"; id=2; side="EMPTY"; vehicle="Land_fortified_nest_small"… lock="UNLOCKED"; skill=0.2; init="this setPos [2156.6, 7531.64, 0]; this setDir 128.816;"; };
-
Answer:
I'd do it in Java: read each class block into a linked list (using the LinkedList class), then traverse the list comparing the position values. Remove any block whose position has already appeared earlier in the list and you're done.
Dylan at Yahoo! Answers
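
The answer above gives no code, so here is a minimal Java sketch of one way to do it. It is not the answerer's implementation: instead of storing the blocks in a LinkedList and comparing them pairwise, it scans the text once and keeps a HashSet of the positions already seen, which gives the same "keep the first, drop the rest" result. The class name DedupeByPosition and its methods are made up for illustration. The sketch assumes every block starts with "class Item<number>", that braces inside a block are balanced, and that two positions count as equal only when their text matches exactly (ignoring whitespace).

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: removes "class ItemN { ... };" blocks whose
// position[]={...} value has already appeared earlier in the text.
final class DedupeByPosition {
    // Assumption: every block starts with "class Item<digits>".
    private static final Pattern HEADER = Pattern.compile("class\\s+Item\\d+\\b");
    // Captures the text between the braces of position[]={...}.
    private static final Pattern POSITION =
            Pattern.compile("position\\[\\]\\s*=\\s*\\{([^}]*)\\}");

    static String dedupe(String text) {
        Set<String> seen = new HashSet<>();
        StringBuilder out = new StringBuilder();
        Matcher header = HEADER.matcher(text);
        int copiedUpTo = 0; // everything before this index is already handled

        while (header.find(copiedUpTo)) {
            int end = blockEnd(text, header.end());
            if (end < 0) {
                break; // unbalanced braces: leave the rest of the text untouched
            }
            String block = text.substring(header.start(), end);
            Matcher pos = POSITION.matcher(block);
            // Whitespace is stripped so "1, 0, 2" and "1,0,2" compare equal.
            String key = pos.find() ? pos.group(1).replaceAll("\\s+", "") : null;

            // Keep whatever sits between the previous block and this one.
            out.append(text, copiedUpTo, header.start());
            if (key == null || seen.add(key)) {
                out.append(block); // first occurrence, or no position to compare
            } else {
                // Duplicate: drop the block and the whitespace that follows it.
                while (end < text.length() && Character.isWhitespace(text.charAt(end))) {
                    end++;
                }
            }
            copiedUpTo = end;
        }
        out.append(text.substring(copiedUpTo));
        return out.toString();
    }

    // Returns the index just past the block's closing "}" (and a following ";"),
    // or -1 if the braces never balance. Braces inside double quotes are ignored.
    private static int blockEnd(String text, int from) {
        int depth = 0;
        boolean inString = false;
        for (int i = from; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '"') {
                inString = !inString;
            } else if (inString) {
                continue;
            } else if (c == '{') {
                depth++;
            } else if (c == '}' && --depth == 0) {
                int end = i + 1;
                if (end < text.length() && text.charAt(end) == ';') {
                    end++;
                }
                return end;
            }
        }
        return -1;
    }
}
```

To use it, read the whole file into a String, pass it to DedupeByPosition.dedupe, and write the result back out. Note that the sketch does not renumber the remaining Item# or id values, and it leaves any item count stored elsewhere in the file unchanged, so those may still need fixing afterwards.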