Why does uniq -c put whitespace at the beginning of each line?

I have this code in a shell script:

sort input | uniq -c | sort -nr > output

The input file has no leading whitespace, but the output does. How do I fix this? I am using bash.


4 Answers

The default behavior of uniq is to right-justify the frequency in a line 7 spaces wide, then separate the frequency from the item with a single space.

Source : (Wayback Machine)

Remove the leading spaces with sed:

$ sort input | uniq -c | sort -nr | sed 's/^\s*//' > output
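To make the layout concrete, here is a minimal Python sketch of the format described above. It assumes the 7-character, right-justified count field this answer describes; other uniq implementations may use a different width. The helper name uniq_c_line is hypothetical and only for illustration.

```python
def uniq_c_line(count: int, item: str) -> str:
    # Assumed layout (taken from the description above): the count is
    # right-justified in a field 7 characters wide, followed by one
    # space and then the line itself.
    return f"{count:7d} {item}"

line = uniq_c_line(1, "test")
print(repr(line))           # '      1 test'
# Stripping leading whitespace has the same effect as the sed command.
print(repr(line.lstrip()))  # '1 test'
```

The padding is purely cosmetic, so removing it loses no information.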

uniq -c adds leading whitespace. E.g.

$ echo test
test
$ echo test | uniq -c
      1 test

You could add a command at the end of the pipeline to remove it. E.g.

$ echo test | uniq -c | sed 's/^\s*//'
1 test

FWIW you can use a different sorting tool for more flexibility. Python is one such tool.

Source

#!/usr/bin/python3
import sys, operator, collections
counter = collections.Counter(map(operator.methodcaller('rstrip', '\n'), sys.stdin))
for item, count in counter.most_common():
    print(count, item)

In theory this should even be faster than the sort tool for large inputs, since the above program identifies duplicate lines with a hash table rather than a sorted list. (Alas, it places lines with identical counts in an arbitrary order rather than a natural one; this can be amended and should still be faster than two sort invocations.)
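One possible amendment for the tie order is sketched below. It assumes "natural order" means ascending order of the line text among lines with equal counts; that is my reading, not something the original program does. The function name sorted_counts is hypothetical.

```python
import collections

def sorted_counts(lines):
    """Return (count, item) pairs: highest count first, ties by ascending item."""
    counter = collections.Counter(line.rstrip("\n") for line in lines)
    # Sort on (-count, item): negating the count gives descending counts
    # while the item text stays in ascending order within a tie.
    ordered = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(count, item) for item, count in ordered]

for count, item in sorted_counts(["b\n", "a\n", "b\n", "c\n", "a\n"]):
    print(count, item)
```

This sorts only the distinct lines, not every input line, so the extra cost depends on the number of unique items.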

Output Format

If you want more control over the output format, look into the print() and format() built-in functions.

For instance, to print the count in octal, zero-padded to 8 digits (so up to 7 leading zeros), followed by a tab instead of a space, and with a NUL instead of a newline as the line terminator, replace the last line with:

    print(format(count, '08o'), item, sep='\t', end='\0')
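To see what that call emits for one record, the sketch below captures the output in a string instead of writing to stdout. The render helper and the sample values are hypothetical, chosen only to show the result.

```python
import io

def render(count: int, item: str) -> str:
    # Same print() call as above, redirected into a buffer so the exact
    # characters can be inspected.
    buf = io.StringIO()
    print(format(count, '08o'), item, sep='\t', end='\0', file=buf)
    return buf.getvalue()

# 10 decimal is 12 octal, zero-padded to a width of 8.
print(repr(render(10, "test")))  # '00000012\ttest\x00'
```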

Usage

Store the script in a file, say sort_count.py, and invoke it with Python:

python3 sort_count.py < input

uniq -c -i | tr -s ' ' | cut -c 2-

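Squeeze each run of spaces into a single space with tr -s, then print each line from the 2nd character onward with cut -c. Note that tr -s also squeezes runs of spaces inside the lines themselves, and that -i makes uniq ignore case when comparing lines.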
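The difference between squeezing every run of spaces and stripping only the leading ones can be shown with a small Python sketch. Both helpers are hypothetical emulations written for illustration; they do not call tr or cut.

```python
import re

def squeeze_and_cut(line: str) -> str:
    # Emulates `tr -s ' ' | cut -c 2-` on a single line: collapse every
    # run of spaces to one space, then drop the first character.
    return re.sub(r" +", " ", line)[1:]

def strip_leading(line: str) -> str:
    # Removes only the leading spaces and leaves the rest untouched.
    return line.lstrip(" ")

line = "      2 foo   bar"
print(repr(squeeze_and_cut(line)))  # '2 foo bar'
print(repr(strip_leading(line)))    # '2 foo   bar'
```

If the input lines can contain consecutive spaces that matter, stripping only the leading whitespace is the safer choice.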
