Help with generation of PSI results #184

chandaayan · 2024-02-15T17:19:17Z

I have been trying to perform alternative splicing using SUPPA2. Briefly, I have 28 human samples on which RNAseq was performed. From those, I have generated the count files with Kallisto using gencode.v44 indices. Using the same gencode.v44 gtf file, I have generated the different .ioe files for alternate splicing events. Now, my original abundance.tsv file generated by kallisto looked like this:

$ head abundance_preprocessed.tsv
ENST00000456328.2 0.000000
ENST00000450305.2 0.000000
ENST00000488147.1 0.000000
ENST00000619216.1 0.000000
ENST00000473358.1 0.000000
ENST00000469289.1 0.000000
ENST00000607096.1 0.000000
ENST00000417324.1 0.000000
ENST00000461467.1 0.000000
ENST00000606857.1 0.000000
One of the generated ioe files looks like this:

$ head gencode.v44_A3_strict.ioe
seqname gene_id event_id alternative_transcripts total_transcripts
chr1 ENSG00000237491.10 ENSG00000237491.10;A3:chr1:779092-803919:779092-803951:+ ENST00000655765.1,ENST00000670700.1,ENST00000657896.1,ENST00000658648.1,ENST00000656571.1,ENST00000669749.1,ENST00000666217.1,ENST00000665719.1 ENST00000665719.1,ENST00000412115.2,ENST00000666217.1,ENST00000658648.1,ENST00000669749.1,ENST00000657896.1,ENST00000670700.1,ENST00000655765.1,ENST00000656571.1
chr1 ENSG00000237491.10 ENSG00000237491.10;A3:chr1:779092-803919:779092-803922:+ ENST00000655765.1,ENST00000670700.1,ENST00000657896.1,ENST00000658648.1,ENST00000656571.1,ENST00000669749.1,ENST00000666217.1,ENST00000665719.1 ENST00000665719.1,ENST00000443772.2,ENST00000666217.1,ENST00000658648.1,ENST00000669749.1,ENST00000657896.1,ENST00000670700.1,ENST00000655765.1,ENST00000656571.1
chr1 ENSG00000237491.10 ENSG00000237491.10;A3:chr1:807323-809622:807323-809658:+ ENST00000412115.2,ENST00000670700.1,ENST00000657896.1,ENST00000658648.1 ENST00000665719.1,ENST00000658648.1,ENST00000588951.5,ENST00000669749.1,ENST00000657896.1,ENST00000670700.1,ENST00000412115.2,ENST00000656571.1

So now I am trying to generate Psi results using the slurm:
#!/bin/bash
#SBATCH --job-name=suppa_psi_calculation
#SBATCH --output=/home/achanda/%x_%j.out
#SBATCH --error=/home/achanda/%x_%j.err
#SBATCH --time=04:00:00
#SBATCH --cpus-per-task=16
#SBATCH --mem=128G

Load Python or any required modules

module load python3

Directories

SUPPA_DIR=/software/SUPPA-2.3
EVENT_DIR=/AS_events
PROCESSED_DIR=/Processed # Correct directory for processed files
OUT_DIR=/psi_results

Create the output directory

mkdir -p ${OUT_DIR}

Iterate over each event type

for EVENT_FILE in ${EVENT_DIR}/*.ioe; do
EVENT_TYPE=$(basename ${EVENT_FILE} .ioe) # Correct extraction of the event type

# Iterate over each sample's processed file in the Processed directory
for PROCESSED_FILE in ${PROCESSED_DIR}/*_abundance_preprocessed.tsv; do
    SAMPLE_NAME=$(basename ${PROCESSED_FILE} _abundance_preprocessed.tsv)  # Correct extraction of the sample name
    OUTPUT_PREFIX="${OUT_DIR}/${SAMPLE_NAME}_${EVENT_TYPE}"

    echo "Starting processing of ${SAMPLE_NAME} for event type ${EVENT_TYPE}..."
    python3 ${SUPPA_DIR}/suppa.py psiPerEvent -i ${EVENT_FILE} -e ${PROCESSED_FILE} -o ${OUTPUT_PREFIX}

    if [ $? -eq 0 ]; then
        echo "Successfully processed ${SAMPLE_NAME} for event type ${EVENT_TYPE}."
    else
        echo "Failed to process ${SAMPLE_NAME} for event type ${EVENT_TYPE}. Check error logs for details."
    fi
done

done

However,

$ head suppa_psi_calculation_28165525.err
INFO:lib.tools:File ~/Sample1_abundance_preprocessed.tsv opened in reading mode.
INFO:psiCalculator:Buffering transcript expression levels.
ERROR:lib.tools:1, in line 2. Skipping line...
ERROR:lib.tools:2, in line 3. Skipping line...
ERROR:lib.tools:3, in line 4. Skipping line...
ERROR:lib.tools:4, in line 5. Skipping line...
ERROR:lib.tools:5, in line 6. Skipping line...
ERROR:lib.tools:6, in line 7. Skipping line...
ERROR:lib.tools:7, in line 8. Skipping line...
ERROR:lib.tools:8, in line 9. Skipping line...

Can someone help with what is going wrong? I can provide further details if needed.

The text was updated successfully, but these errors were encountered:

EduEyras · 2024-03-03T23:29:27Z

Hi Chandaayan,
your abundance file seems to have many zeroes.
Do you get non-zero abundances for your transcripts? That could trigger some of the errors.
E.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Help with generation of PSI results #184

Help with generation of PSI results #184

chandaayan commented Feb 15, 2024

EduEyras commented Mar 3, 2024

Help with generation of PSI results #184

Help with generation of PSI results #184

Comments

chandaayan commented Feb 15, 2024

Load Python or any required modules

module load python3

Directories

Create the output directory

Iterate over each event type

EduEyras commented Mar 3, 2024