Help editing FASTA headers
Posted: 2013-04-03 17:45 @781
I'm new to this, but I know that when there are many sequences it is easier to use Perl to edit the FASTA headers, which helps avoid errors in downstream analyses. My problem is that I don't know how to write the command line. I have input files like this:
>gi|343457071|gb|JF424046.1| Streptomyces abikoensis strain KCTC 9662 recombinase A (recA) gene, partial cds
TCCACCGGGTCGACCGCTCTCGACGTCGCGCTCGGTGTCGGCGGCCTGCCGCGCGGCCGCGTGGTGGAGA
TCTACGGACCGGAGTCCTCCGGTAAGACGACCCTGACGTTGCACGCCGTGGCCAACGCCCAGCGGGCCGG
CGGCACCGTCGCCTTCGTGGACGCCGAGCACGCCCTCGACCCCGAGTACGCCAGAAAGCTCGGCGTCGAC
ATCGACAACCTGATCCTTTCCCAGCCGGACAACGGCGAGCAGGCTCTCGAGATCGTCGACATGCTGGTCC
>gi|78173104|gb|DQ234054.1| Streptomyces argillaceus strain ATCC 12956 RecA (recA) gene, complete cds
CCATGGCAGGCACCGACCGCGAGAAGGCCCTGGACGCCGCACTCGCACAGATTGAACGGCAATTCGGCAA
GGGCGCGGTCATGCGCATGGGCGACCGCTCGAAGGAGCCCATCGAGGTCATCCCGACCGGGTCGACCGCG
CTCGACGTGGCCCTCGGCGTCGGCGGTCTGCCGCGCGGCCGTGTCATCGAGGTCTACGGACCCGAGTCCT
CCGGCAAGACGACCCTGACCCTGCACGCGGTGGCGAACGCCCAGAGGGCCGGCGGCCAGGTGGCGTTCGT
GGACGCCGAGCACGCCCTCGACCCCGAGTACGCGCAGAAGCTCGGCGTGGACATCGACAACCTGATCCTG
TCCCAGCCGGACAACGGCGAGCAGGCCCTGGAGATCGTGGACATGCTCGTCCGCTCCGGGGCCCTCGACC
TGATCGTCATCGACTCCGTCGCCGCGCTCGTCCCGCGTGCGGAGATCGAGGGCGAGATGGGCGACAGCCA
CGTGGGTCTGCAGGCCCGTCTGATGAGCCAGGCCCTGCGGAAGATCACCAGCGCGCTCAACCAGTCCAAG
ACCACCGCGATCTTCATCAACCAGCTCCGCGAGAAGATCGGTGTGATGTTCGGCTCCCCGGAGACCACGA
CCGGTGGCCGGGCGCTGAAGTTCTACGCCTCCGTGCGGCTCGACATCCGGCGCATCGAGACGCTGAAGGA
>gi|388462170|gb|JQ738389.1| Streptomyces labedae recombinase A (recA) gene, partial cds
GCACTCGCACAGATTGAACGCCAATTCGGCAAGGGCGCGGTCATGCGCATGGGCGAGCGGTCGAAGGAGC
CCATCGAGGTCATCCCGACCGGGTCGACCGCGCTCGACGTGGCCCTCGGCGTCGGCGGCCTGCCGCGTGG
CCGTGTGGTGGAGATCTACGGGCCGGAGTCCTCCGGTAAGACGACCCTGACCCTGCACGCGGTGGCGAAC
GCGCAGAAGGCCGGCGGCCAGGTCGCGTTCGTGGACGCGGAGCACGCCCTCGACCCCGAGTACGCGAAGA
TCCACCGGGTCGACCGCTCTCGACGTCGCGCTCGGTGTCGGCGGCCTGCCGCGCGGCCGCGTGGTGGAGA
TCTACGGACCGGAGTCCTCCGGTAAGACGACCCTGACGTTGCACGCCGTGGCCAACGCCCAGCGGGCCGG
CGGCACCGTCGCCTTCGTGGACGCCGAGCACGCCCTCGACCCCGAGTACGCCAGAAAGCTCGGCGTCGAC
ATCGACAACCTGATCCTTTCCCAGCCGGACAACGGCGAGCAGGCTCTCGAGATCGTCGACATGCTGGTCC
>gi|78173104|gb|DQ234054.1| Streptomyces argillaceus strain ATCC 12956 RecA (recA) gene, complete cds
CCATGGCAGGCACCGACCGCGAGAAGGCCCTGGACGCCGCACTCGCACAGATTGAACGGCAATTCGGCAA
GGGCGCGGTCATGCGCATGGGCGACCGCTCGAAGGAGCCCATCGAGGTCATCCCGACCGGGTCGACCGCG
CTCGACGTGGCCCTCGGCGTCGGCGGTCTGCCGCGCGGCCGTGTCATCGAGGTCTACGGACCCGAGTCCT
CCGGCAAGACGACCCTGACCCTGCACGCGGTGGCGAACGCCCAGAGGGCCGGCGGCCAGGTGGCGTTCGT
GGACGCCGAGCACGCCCTCGACCCCGAGTACGCGCAGAAGCTCGGCGTGGACATCGACAACCTGATCCTG
TCCCAGCCGGACAACGGCGAGCAGGCCCTGGAGATCGTGGACATGCTCGTCCGCTCCGGGGCCCTCGACC
TGATCGTCATCGACTCCGTCGCCGCGCTCGTCCCGCGTGCGGAGATCGAGGGCGAGATGGGCGACAGCCA
CGTGGGTCTGCAGGCCCGTCTGATGAGCCAGGCCCTGCGGAAGATCACCAGCGCGCTCAACCAGTCCAAG
ACCACCGCGATCTTCATCAACCAGCTCCGCGAGAAGATCGGTGTGATGTTCGGCTCCCCGGAGACCACGA
CCGGTGGCCGGGCGCTGAAGTTCTACGCCTCCGTGCGGCTCGACATCCGGCGCATCGAGACGCTGAAGGA
>gi|388462170|gb|JQ738389.1| Streptomyces labedae recombinase A (recA) gene, partial cds
GCACTCGCACAGATTGAACGCCAATTCGGCAAGGGCGCGGTCATGCGCATGGGCGAGCGGTCGAAGGAGC
CCATCGAGGTCATCCCGACCGGGTCGACCGCGCTCGACGTGGCCCTCGGCGTCGGCGGCCTGCCGCGTGG
CCGTGTGGTGGAGATCTACGGGCCGGAGTCCTCCGGTAAGACGACCCTGACCCTGCACGCGGTGGCGAAC
GCGCAGAAGGCCGGCGGCCAGGTCGCGTTCGTGGACGCGGAGCACGCCCTCGACCCCGAGTACGCGAAGA
and I want to edit them to get an output file like this:
>S_abikoensis
TCCACCGGGTCGACCGCTCTCGACGTCGCGCTCGGTGTCGGCGGCCTGCCGCGCGGCCGCGTGGTGGAGA
TCTACGGACCGGAGTCCTCCGGTAAGACGACCCTGACGTTGCACGCCGTGGCCAACGCCCAGCGGGCCGG
CGGCACCGTCGCCTTCGTGGACGCCGAGCACGCCCTCGACCCCGAGTACGCCAGAAAGCTCGGCGTCGAC
ATCGACAACCTGATCCTTTCCCAGCCGGACAACGGCGAGCAGGCTCTCGAGATCGTCGACATGCTGGTCC
>S_argillaceus
CCATGGCAGGCACCGACCGCGAGAAGGCCCTGGACGCCGCACTCGCACAGATTGAACGGCAATTCGGCAA
GGGCGCGGTCATGCGCATGGGCGACCGCTCGAAGGAGCCCATCGAGGTCATCCCGACCGGGTCGACCGCG
CTCGACGTGGCCCTCGGCGTCGGCGGTCTGCCGCGCGGCCGTGTCATCGAGGTCTACGGACCCGAGTCCT
CCGGCAAGACGACCCTGACCCTGCACGCGGTGGCGAACGCCCAGAGGGCCGGCGGCCAGGTGGCGTTCGT
GGACGCCGAGCACGCCCTCGACCCCGAGTACGCGCAGAAGCTCGGCGTGGACATCGACAACCTGATCCTG
TCCCAGCCGGACAACGGCGAGCAGGCCCTGGAGATCGTGGACATGCTCGTCCGCTCCGGGGCCCTCGACC
TGATCGTCATCGACTCCGTCGCCGCGCTCGTCCCGCGTGCGGAGATCGAGGGCGAGATGGGCGACAGCCA
CGTGGGTCTGCAGGCCCGTCTGATGAGCCAGGCCCTGCGGAAGATCACCAGCGCGCTCAACCAGTCCAAG
ACCACCGCGATCTTCATCAACCAGCTCCGCGAGAAGATCGGTGTGATGTTCGGCTCCCCGGAGACCACGA
CCGGTGGCCGGGCGCTGAAGTTCTACGCCTCCGTGCGGCTCGACATCCGGCGCATCGAGACGCTGAAGGA
>S_labedae
GCACTCGCACAGATTGAACGCCAATTCGGCAAGGGCGCGGTCATGCGCATGGGCGAGCGGTCGAAGGAGC
CCATCGAGGTCATCCCGACCGGGTCGACCGCGCTCGACGTGGCCCTCGGCGTCGGCGGCCTGCCGCGTGG
CCGTGTGGTGGAGATCTACGGGCCGGAGTCCTCCGGTAAGACGACCCTGACCCTGCACGCGGTGGCGAAC
GCGCAGAAGGCCGGCGGCCAGGTCGCGTTCGTGGACGCGGAGCACGCCCTCGACCCCGAGTACGCGAAGA
TCCACCGGGTCGACCGCTCTCGACGTCGCGCTCGGTGTCGGCGGCCTGCCGCGCGGCCGCGTGGTGGAGA
TCTACGGACCGGAGTCCTCCGGTAAGACGACCCTGACGTTGCACGCCGTGGCCAACGCCCAGCGGGCCGG
CGGCACCGTCGCCTTCGTGGACGCCGAGCACGCCCTCGACCCCGAGTACGCCAGAAAGCTCGGCGTCGAC
ATCGACAACCTGATCCTTTCCCAGCCGGACAACGGCGAGCAGGCTCTCGAGATCGTCGACATGCTGGTCC
>S_argillaceus
CCATGGCAGGCACCGACCGCGAGAAGGCCCTGGACGCCGCACTCGCACAGATTGAACGGCAATTCGGCAA
GGGCGCGGTCATGCGCATGGGCGACCGCTCGAAGGAGCCCATCGAGGTCATCCCGACCGGGTCGACCGCG
CTCGACGTGGCCCTCGGCGTCGGCGGTCTGCCGCGCGGCCGTGTCATCGAGGTCTACGGACCCGAGTCCT
CCGGCAAGACGACCCTGACCCTGCACGCGGTGGCGAACGCCCAGAGGGCCGGCGGCCAGGTGGCGTTCGT
GGACGCCGAGCACGCCCTCGACCCCGAGTACGCGCAGAAGCTCGGCGTGGACATCGACAACCTGATCCTG
TCCCAGCCGGACAACGGCGAGCAGGCCCTGGAGATCGTGGACATGCTCGTCCGCTCCGGGGCCCTCGACC
TGATCGTCATCGACTCCGTCGCCGCGCTCGTCCCGCGTGCGGAGATCGAGGGCGAGATGGGCGACAGCCA
CGTGGGTCTGCAGGCCCGTCTGATGAGCCAGGCCCTGCGGAAGATCACCAGCGCGCTCAACCAGTCCAAG
ACCACCGCGATCTTCATCAACCAGCTCCGCGAGAAGATCGGTGTGATGTTCGGCTCCCCGGAGACCACGA
CCGGTGGCCGGGCGCTGAAGTTCTACGCCTCCGTGCGGCTCGACATCCGGCGCATCGAGACGCTGAAGGA
>S_labedae
GCACTCGCACAGATTGAACGCCAATTCGGCAAGGGCGCGGTCATGCGCATGGGCGAGCGGTCGAAGGAGC
CCATCGAGGTCATCCCGACCGGGTCGACCGCGCTCGACGTGGCCCTCGGCGTCGGCGGCCTGCCGCGTGG
CCGTGTGGTGGAGATCTACGGGCCGGAGTCCTCCGGTAAGACGACCCTGACCCTGCACGCGGTGGCGAAC
GCGCAGAAGGCCGGCGGCCAGGTCGCGTTCGTGGACGCGGAGCACGCCCTCGACCCCGAGTACGCGAAGA
I hope you can help me.
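To make the requested transformation concrete, here is a minimal sketch in Python (not Perl, so it is only an illustration of the logic, not the one-liner asked for). It assumes every header looks like the ones above: the species name is the first two words after the last `|`, and the new header is the genus initial, an underscore, and the species epithet. The function names `short_header` and `rename_headers` are made up for this example.

```python
def short_header(header):
    """Turn '>gi|...|gb|...| Genus species ...' into '>G_species'.

    Assumption: the text after the last '|' starts with 'Genus species'.
    Headers that do not fit this layout are returned unchanged.
    """
    words = header[1:].split("|")[-1].split()
    if len(words) < 2:
        return header
    genus, species = words[:2]
    return f">{genus[0]}_{species}"


def rename_headers(lines):
    """Yield FASTA lines with shortened headers; sequence lines pass through."""
    for line in lines:
        line = line.rstrip("\n")
        yield short_header(line) if line.startswith(">") else line


# First record from the input shown above (header and first sequence line).
sample = [
    ">gi|343457071|gb|JF424046.1| Streptomyces abikoensis strain KCTC 9662 recombinase A (recA) gene, partial cds",
    "TCCACCGGGTCGACCGCTCTCGACGTCGCGCTCGGTGTCGGCGGCCTGCCGCGCGGCCGCGTGGTGGAGA",
]
print("\n".join(rename_headers(sample)))
```

For a real file, `rename_headers` can be given an open file handle instead of `sample`, and the yielded lines written to a new file. Note that two records of the same species would end up with identical headers under this scheme, which some downstream tools may reject.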