Parallel Computing: Do pipelined unix commands run faster on multicore?

  • If yes, how exactly does it happen, and which commands/pipes would run faster than others? Also, if it applies only to certain OS distributions or hardware platforms, which ones? This is a follow-up question to .

  • Answer:

    I did a fairly extensive test of this, and the proper way to characterize it is: yes, they run in parallel, but "getting it right" is very hard. So pragmatically, the answer is "no". But, of course, it depends on the programs involved in the pipe.

    Take the UNIX sort command, for example. It runs in a single process, and doesn't start sorting anything until it has read an EOF from its input. Thus, if you have multiple sort commands in a pipe, they won't run in parallel. Similarly, sort doesn't write its output until it's completely done. So a command like this:

      sort somefile.txt | gzip > somefile_sorted.txt.gz

    won't run any faster than:

      sort somefile.txt > somefile_sorted.txt
      gzip somefile_sorted.txt

    I'm sure there are some programs that can "stream" their input in chunks, and there may be a small benefit to using pipes in this case. At the very least, using pipes saves the disk I/O of having to write the file out twice.

    In addition, the UNIX/Linux pipe buffer size is *very* small, so there isn't much room for parallelism. Processes (even well-coded streaming ones) early in the pipe block as they're writing to stdout if the processes further down the line are stalled and not reading their input. So, if you have something like this:

      fast_io_bound_program input.txt | slow_cpu_bound | slower_cpu_bound

    they'll all end up blocking on the last item in the pipe, since it won't be reading its stdin.

    You can do a fairly simple test with something like this:

      time cat /dev/urandom | hexdump -n 10000000 -e '1/ "%u:%u\n"' | sort -t: -k1n | gzip > /dev/null

    and compare it with:

      time (cat /dev/urandom | hexdump -n 10000000 -e '1/ "%u:%u\n"' | sort -t: -k1n > /tmp/file; cat /tmp/file | gzip > /dev/null)

    Both of these take almost exactly the same time on my machine (approx. 7.7 seconds).

    An example that does show a speedup via parallelism is something simple like this. First, we create an example random input file:

      cat /dev/urandom | hexdump -e '1/ "%x:%x\n"' | head -c 256M > /tmp/hex.txt

    Then, we run a big chain of seds:

      time (cat /tmp/hex.txt | sed -e "s/0/M/g" | sed -e "s/1/N/g" | sed -e "s/2/N/g" | sed -e "s/3/O/g" | sed -e "s/4/P/g" | sed -e "s/5/Q/g" > /dev/null)

      real    0m34.710s
      user    0m59.050s
      sys     0m2.270s

    and compare that with the non-parallel version:

      time (cat /tmp/hex.txt | sed -e "s/0/M/g;s/1/N/g;s/2/N/g;s/3/O/g;s/4/P/g;s/5/Q/g" > /dev/null)

      real    0m44.219s
      user    0m43.230s
      sys     0m0.490s

    So the piped version is somewhat faster. But if you add a complex CPU-bound step at the end, like bzip2 -9, both examples will be gated by that last step, and you'll lose most of the parallelism because everyone ends up blocked waiting for that last guy in the pipe to read his input.
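    A quick sanity check for whether any given pipeline's stages actually overlapped is to compare wall-clock time with total CPU time, as the sed measurements above do. Here is a minimal sketch; input.txt stands in for any large text file, and both stages stream, so some overlap is expected:

      # If user + sys noticeably exceeds real, the stages ran on more than one
      # core at the same time; if they are roughly equal, the pipe was
      # effectively serial.
      time (grep -v '^#' input.txt | gzip > /dev/null)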

Steve Lacy at Quora

Other answers

Yes, as long as the processes can work simultaneously, it should be faster. The simplest way to make piped processes work simultaneously is to ensure that the units of work being passed between the processes are smaller than the fifo buffers in between them. Consider that in the bash examples to which this question is following up, the units of work are lines in a text file, and Linux fifo buffers are typically 65k in size. Even though this introduces latency in the overall time it takes for a single item to get processed, it ensures that every process in the queue almost always has some input to work on, as long as one of the processes isn't an obvious bottleneck.
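A rough way to watch that buffer limit in action is the sketch below; it assumes Linux's default pipe capacity of about 64 KiB (the exact size varies by kernel):

  # The writer pushes 1 KiB chunks into the pipe and reports its progress on
  # stderr, while the reader sleeps before draining.  The progress counter
  # stalls at roughly 64 KiB until the reader starts consuming.
  for i in $(seq 1 256); do
    head -c 1024 /dev/zero
    echo "wrote ${i} KiB" >&2
  done | { sleep 10; cat > /dev/null; }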

Erik Frey

The separate processes will only sometimes run on different cores, and whether they do depends on the kernel you are using. The thing to realize is that it may not always be better to run the processes on different cores, since you get better data locality when running on the same core. The kernel has heuristics to determine whether it is better to run the processes on the same core versus putting them on separate cores, and these heuristics have changed over time.

A few years ago, I experimented with running "tar -z" versus "tar | gzip", since the latter could possibly have better performance due to parallelization that I assume tar itself wasn't doing. (This was for archiving hundreds of gigabytes of data, so even though tar itself takes maybe 10% of the CPU time of gzip, the 10% improvement would be useful.) On the old kernel I was using, it would always put tar and gzip on the same core, which gave worse performance than tar -z; I ended up manually pinning the processes to separate cores, which improved performance.

I tried this experiment again on a newer kernel, and the kernel would always put the processes on different cores. I believe modern Linux kernels are pretty good at making this decision now, so the answer is that yes, when using Unix pipes, your computer will automatically take advantage of the natural parallelism (though, as the other answers point out, there may not be any parallelism to take advantage of).
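For reference, pinning pipeline stages by hand can be sketched with util-linux's taskset; the directory and archive names here are just placeholders:

  # Pin tar to core 0 and gzip to core 1 so the kernel cannot co-schedule
  # them on the same core.
  taskset -c 0 tar cf - /some/dir | taskset -c 1 gzip > archive.tar.gz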

Kevin Modzelewski

If done correctly, yes. It saves you a lot of time otherwise spent on file I/O. Here's an article that shows the use of named pipes, unnamed pipes, and process substitution, and the performance improvements they provide. http://www.vincebuffalo.com/2013/08/08/the-mighty-named-pipe.html
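A minimal sketch of the two techniques the article covers, with placeholder file names:

  # Named pipe: gzip consumes sort's output as it arrives, with no
  # intermediate file written to disk.
  mkfifo /tmp/sorted.fifo
  sort big.txt > /tmp/sorted.fifo &
  gzip < /tmp/sorted.fifo > big_sorted.gz
  rm /tmp/sorted.fifo

  # Process substitution (bash) does the same wiring without an explicit fifo.
  diff <(sort a.txt) <(sort b.txt)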

TeckYian Lim
