Foro - Perl en Español

by **abraham03** » 2017-01-24 18:50 @826

Hi, how's it going?

I have this script that converts a FASTQ file, or a folder containing several FASTQ files, to FASTA format.
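
For readers unfamiliar with the two formats: a FASTQ record spans four lines (an `@` header, the sequence, a `+` separator line and a quality string), while FASTA keeps only a `>` header and the sequence. The following is a minimal sketch of that conversion, reading from an in-memory string instead of a real file. The sub name `fastq_to_fasta_string` and the sample record are made up for illustration and are not part of the script.

```perl
use strict;
use warnings;

# Hypothetical helper: convert FASTQ text to FASTA text.
sub fastq_to_fasta_string {
    my ($fastq) = @_;
    my $fasta = '';

    # Open a read handle on the string itself (in-memory file).
    open my $in, '<', \$fastq or die "can't open in-memory handle: $!\n";

    # A FASTQ record is four lines: header, sequence, '+' line, quality.
    while (defined(my $head = <$in>)) {
        my $seq     = <$in>;
        my $qhead   = <$in>;
        my $quality = <$in>;
        last unless defined $quality;   # incomplete record: stop

        substr($head, 0, 1, '>');       # the leading '@' becomes '>'
        $fasta .= $head . $seq;         # the '+' and quality lines are dropped
    }
    close $in;
    return $fasta;
}

my $sample = "\@read1\nACGTACGT\n+\nIIIIIIII\n";
print fastq_to_fasta_string($sample);
```

The script itself does the same thing line by line, but writes each record to the output file as it goes.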

One of the options I added sets a threshold for runs of the letter 'N' (as in NNNNNNNNNNNNNNNNNNATAGTGAAGAATGCGACGTACAGGATCATCTA): sequences containing at least that many consecutive N's are excluded. For example, with -n 15 the script excludes every sequence in which the letter 'N' is repeated 15 or more times in a row.
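
The rule can be sketched with a single regular expression: `N{15}` matches as soon as the sequence contains a run of 15 N's, so longer runs match too. This is only an illustration. The sub name `has_n_run` and the test sequences are assumptions, and it treats the repeats as consecutive, which is how the script's own regex behaves.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $seq contains $min or more consecutive N's
# (case-insensitive).
sub has_n_run {
    my ($seq, $min) = @_;
    return $seq =~ /N{$min}/i ? 1 : 0;
}

my $limit = 15;                          # as in "-n 15"
my @seqs  = (
    ('N' x 18) . 'ATAGTGAAGAATGC',       # 18 N's in a row: excluded
    ('N' x 14) . 'ATAGTGAAGAATGC',       # 14 N's in a row: kept
    'ACGT' . ('N' x 15),                 # exactly 15 N's: excluded
);

for my $seq (@seqs) {
    printf "%-8s %s\n", has_n_run($seq, $limit) ? 'exclude' : 'keep', $seq;
}
```

Note that N's scattered through a sequence do not count towards the limit in this sketch; only an unbroken run does.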

The problem is that some of the resulting FASTA files end up with no sequences at all, because every sequence in them reached the specified threshold and was excluded. I want those empty files (with no sequences) to be deleted, but I haven't managed to add code that does it.

Code:

#!/usr/bin/env perl
use strict;
use warnings;
use Getopt::Long;
 
my ($infile, $file_name, $file_format, $N_repeat, $help, $help_descp,
    $options, $options_descrp, $nofile, $new_file, $count);
 
my $fastq_extension = "\\.fastq";
 
GetOptions (
    'in=s'      => \$infile,
    'N|n=i'     => \$N_repeat,
    'h|help'    => \$help,
    'op'        => \$options
);
 
 # Help
 
 $help_descp =(qq(              
              Usage:
              fastQF -in fastq_folder/ -n 15
                      or
              fastQF -in file.fastq -n 15
              ));
 
 $options_descrp =(qq(
                   
            -in      infile.fastq or fastq_folder/                  required
            -n       exclude sequences with N or more N repeats     optional
            -h       Help description                               optional
            -op      option section                                 optional
                   ));
 
 $nofile =(qq(
            ERROR:  "No File or Folder Were Chosen !"
            
                Usage:
                    fastQF -in folder/
                    
                Or See -help or -op section
           ));
 
 # Check Files 
 
    if ($help){
        print "$help_descp\n";
        exit;
    }
    elsif ($options){
        print "$options_descrp\n";
        exit;
    }
 
    elsif (!$infile){
        print "$nofile\n";
        exit;
    }
 
 
 #Subroutine to convert from fastq to fasta
 
    sub fastq_fasta {
        
        my $file = shift;
        ($file_name = $file) =~ s/(.*)$fastq_extension.*/$1/;
 
# remove any old output file (the output is opened in append mode below)
 
        my $oldfile = $file_name.".fasta";
    
        if (-e $oldfile){
            unlink $oldfile;
        }
    
        open LINE,    '<',   $file             or die "can't read or open $file\n";
        open OUTFILE, '>>', "$file_name.fasta" or die "can't write $file_name\n";
 
        while (
            defined(my $head    = <LINE>)       &&
            defined(my $seq     = <LINE>)       &&
            defined(my $qhead   = <LINE>)       &&
            defined(my $quality = <LINE>)
        ) {
                substr($head, 0, 1, '>');
                
                
                # Without -n every record is kept; with -n, skip records
                # that contain $N_repeat or more consecutive N's.
                next if $N_repeat && $seq =~ m/n{$N_repeat}/i;
 
                print OUTFILE $head, $seq;
        }
        
        close OUTFILE;
        close LINE;
    }
 
 # run the subroutine to extract the sequences
 
    if (-f $infile) {           # -f is true for a plain file, not a folder
        fastq_fasta($infile);
    }
    else {
        foreach my $file (glob("$infile/*.fastq")) {
            fastq_fasta($file);
        }
    }
    
exit;

I have tried adding an array both inside and outside the subroutine, but it hasn't worked; it always deletes only the last file:

@new_file =$file_name.".fasta";
        foreach (@new_file){
            
            if (-z $_) {
                $count++;
                if ($count==1){
                    print "\n\"The choosen File present not sequences\"\n";
                    print " \"or was excluded due to -n $N_repeat\"\n\n";
                
                }
                elsif ($count >=1){
                    print "\n\"$count Files present not sequences\"\n";
                    print " \" or were excluded due to -n $N_repeat\"\n\n";
                    
                }
                
                unlink $new_file;
            }
        }

Thanks a lot.

Re: Deleting empty files in a subroutine

sub fastq_fasta {
 
    my $file = shift;
    (my $file_new = $file) =~ s/^(.*)$fastq_extension.*$/$1/;
    $file_new = "$file_new.fasta";
 
    # remove any old output file
    if (-f $file_new) {
        unlink $file_new;
    }
 
    my $result = '';
 
    open LINE, '<', $file               or die "can't read file $file\n";
 
    while ( defined(my $head    = <LINE>)
        and defined(my $seq     = <LINE>)
        and defined(my $qhead   = <LINE>)
        and defined(my $quality = <LINE>)
    ) {
        # with -n, skip records that have $N_repeat or more consecutive N's
        next if $N_repeat and $seq =~ m/n{$N_repeat}/i;
 
        substr($head, 0, 1, '>');               # the FASTQ '@' header becomes a FASTA '>' header
        $result .= "$head$seq";                 # otherwise, keep accumulating the result
    }
 
    close LINE;
 
    if ($result) {                              # only write the file if there is something to save
        open  OUTFILE, '>', $file_new   or die "can't write file $file_new\n";
        print OUTFILE $result;
        close OUTFILE;
    }
}
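
An alternative, sketched under assumptions rather than taken from the thread: keep writing each record as it is read, and delete the output right after closing it if it turned out empty. The `-z` file test is true for a file that exists and has zero size. Returning the name of each deleted file lets the caller collect them with `push`. Assigning with `@new_file = ...` replaces the whole list every time, which may be why only the last file was being deleted in the attempt above. The names `convert_and_prune` and `@removed` are hypothetical, and the sample files are written to a temporary directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical sub: convert one FASTQ file and delete the FASTA if it is empty.
# Returns the name of the deleted file, or an empty list if the file was kept.
sub convert_and_prune {
    my ($file, $n_repeat) = @_;
    (my $out = $file) =~ s/\.fastq$/.fasta/;
    die "not a .fastq file: $file\n" if $out eq $file;

    open my $in, '<', $file or die "can't read $file: $!\n";
    open my $fh, '>', $out  or die "can't write $out: $!\n";
    while (defined(my $head = <$in>)) {
        my $seq = <$in>; my $qhead = <$in>; my $quality = <$in>;
        last unless defined $quality;                     # incomplete record
        next if $n_repeat && $seq =~ /n{$n_repeat}/i;     # run of N's too long
        substr($head, 0, 1, '>');
        print {$fh} $head, $seq;
    }
    close $in;
    close $fh or die "can't close $out: $!\n";

    if (-z $out) {                                        # exists, size 0
        unlink $out or warn "can't delete $out: $!\n";
        return $out;
    }
    return;
}

# Demo on two small files in a temporary directory.
my $dir = tempdir(CLEANUP => 1);
my %content = (
    'good.fastq' => "\@r1\nACGTACGT\n+\nIIIIIIII\n",
    'bad.fastq'  => "\@r2\n" . ('N' x 20) . "\n+\n" . ('I' x 20) . "\n",
);
for my $name (keys %content) {
    open my $fh, '>', "$dir/$name" or die "can't write $dir/$name: $!\n";
    print {$fh} $content{$name};
    close $fh;
}

my @removed;
for my $file (sort glob("$dir/*.fastq")) {
    push @removed, convert_and_prune($file, 15);          # push, not assignment
}
print scalar(@removed), " empty file(s) removed\n";
```

Compared with accumulating everything in `$result`, this version never holds a whole file in memory, at the cost of briefly creating an output file that may then be deleted.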
 

Deleting empty files in a subroutine
