I have a file like this.
M00425_ght_cgd2_2212_B_0_2 (newline)
ATGCCGTTAGAGCTAG
M00425_ght_cgd2_2213_B_0_3_1 (newline)
GTACATTGACATAGAGTACATAGCGA
I want a file like this:
M00425_ght_cgd2_2212_B_0_2(tab)ATGCCGTTAGAGCTAG
M00425_ght_cgd2_2213_B_0_3_1(tab)GTACATTGACATAGAGTACATAGCGA
Can anybody help?
12 Answers
Simple sed command:
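sed '$!N;s/\n/\t/' inputfile.txt > outputfile.txt
This joins every pair of lines with a tab.
The N command appends the next input line to the current one, with a \n character between them; the $! prefix skips this on the last line, where there is no next line to read. The substitute command then replaces that newline with a tab, so every pair of lines ends up on one line separated by a tab. Note that \t in the replacement is understood by GNU sed; other sed implementations may need a literal tab character instead.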
The command can also be written as sed '/$/N;s/\n/\t/' inputfile.txt.
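For readers less familiar with sed, the pair-joining logic can be sketched in Python. This is only an illustration of what the command does, not a replacement for it; the function name join_pairs is made up for this example, and the sample lines are the ones from the question.

```python
def join_pairs(lines):
    # Join each pair of consecutive lines with a tab, mirroring the N + s/\n/\t/ steps.
    out = []
    for i in range(0, len(lines), 2):
        pair = lines[i:i + 2]  # a lone final line has no partner and is kept as-is
        out.append("\t".join(pair))
    return out


sample = [
    "M00425_ght_cgd2_2212_B_0_2",
    "ATGCCGTTAGAGCTAG",
    "M00425_ght_cgd2_2213_B_0_3_1",
    "GTACATTGACATAGAGTACATAGCGA",
]
for row in join_pairs(sample):
    print(row)
```

With an odd number of lines, the last line is printed on its own, which matches what $!N does in the sed command.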
If your file contains empty lines, you can delete them first with the following sed command:
sed -i '/^$/d' inputfile.txt
This is similar to Jacob's answer, but different enough that I thought it warranted mention. Instead of searching for the presence of a string, you could check whether the line contains characters other than "GACT".
#!/usr/bin/env python
with open('input.txt', 'r') as f:
    lines = f.readlines()
for i in range(len(lines)):
    # A header line still has characters left after stripping G, A, C, T
    if len(lines[i].strip('GACT\n')) > 0:
        lines[i] = lines[i].replace('\n', '\t')
with open('output.txt', 'w') as f:
    f.writelines(lines)
If the header lines in your file do not contain a distinguishing identifier, you can make a different assumption instead. The following assumes that the newline should be replaced with a tab on every odd-numbered line and left alone on every even-numbered line.
#!/usr/bin/env python
with open('input.txt', 'r') as f:
    lines = f.readlines()
for i in range(len(lines)):
    # Indexes 0, 2, 4, ... are the odd-numbered lines (1st, 3rd, 5th, ...)
    if i % 2 == 0:
        lines[i] = lines[i].replace('\n', '\t')
with open('output.txt', 'w') as f:
    f.writelines(lines)
You could save either of these as, for example, lines2tabs.py, then navigate to its directory in a terminal using cd and run it with python lines2tabs.py. Note that you'll have to change input.txt to the name of your file.
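If the file is large, or may contain blank lines, the same odd/even idea can be written so that it streams the input instead of loading everything with readlines(). This is a minimal sketch, assuming that headers and sequences strictly alternate once blank lines are dropped; the function name pairs_to_tsv is a placeholder, not something from the answers above.

```python
def pairs_to_tsv(in_path, out_path):
    """Write one 'header<TAB>sequence' row per pair; return the number of rows."""
    count = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        # Lazily drop trailing newlines and skip blank lines.
        stripped = (line.rstrip("\n") for line in src)
        records = (line for line in stripped if line.strip())
        # Pulling twice from the same iterator pairs line 1 with 2, 3 with 4, ...
        for header in records:
            sequence = next(records, None)
            if sequence is None:
                # Assumption: a header with no sequence means the input is malformed.
                raise ValueError("header without a sequence: " + header)
            dst.write(header + "\t" + sequence + "\n")
            count += 1
    return count
```

It could be called as pairs_to_tsv('input.txt', 'output.txt'). Raising an error on an unpaired header is a design choice for this sketch; writing the lone line unchanged would be equally reasonable.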