I have been trying to run an alternative splicing analysis with SUPPA2. Briefly, I have 28 human samples on which RNA-seq was performed. From those, I generated transcript abundance files with Kallisto using gencode.v44 indices. Using the same gencode.v44 GTF file, I generated the .ioe files for the different alternative splicing events. My original abundance.tsv file generated by Kallisto looked like this:
$ head abundance.tsv
target_id length eff_length est_counts tpm
ENST00000456328.2|ENSG00000290825.1|-|OTTHUMT00000362751.1|DDX11L2-202|DDX11L2|1657|lncRNA| 1657 1658 0 0
ENST00000450305.2|ENSG00000223972.6|OTTHUMG00000000961.2|OTTHUMT00000002844.2|DDX11L1-201|DDX11L1|632|transcribed_unprocessed_pseudogene| 632 633 0 0
ENST00000488147.1|ENSG00000227232.5|OTTHUMG00000000958.1|OTTHUMT00000002839.1|WASH7P-201|WASH7P|1351|unprocessed_pseudogene| 1351 1352 0 0
ENST00000619216.1|ENSG00000278267.1|-|-|MIR6859-1-201|MIR6859-1|68|miRNA| 68 69 0 0
ENST00000473358.1|ENSG00000243485.5|OTTHUMG00000000959.2|OTTHUMT00000002840.1|MIR1302-2HG-202|MIR1302-
I then processed it to look like this:
$ head abundance_preprocessed.tsv
ENST00000456328.2 0.000000
ENST00000450305.2 0.000000
ENST00000488147.1 0.000000
ENST00000619216.1 0.000000
ENST00000473358.1 0.000000
ENST00000469289.1 0.000000
ENST00000607096.1 0.000000
ENST00000417324.1 0.000000
ENST00000461467.1 0.000000
ENST00000606857.1 0.000000
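For illustration, the preprocessing step could be sketched as below. This is not the poster's actual script. It assumes the second column is meant to hold the `tpm` values, and that SUPPA expects a tab-separated expression file whose first line holds only the sample name(s), with no label for the transcript ID column (this is my reading of the SUPPA README). The function name `kallisto_to_suppa` and the sample name `Sample1` are hypothetical.

```python
import csv
import io


def kallisto_to_suppa(abundance_lines, sample_name):
    """Convert kallisto abundance rows into SUPPA-style expression rows.

    Assumption: the first output line is a header holding only the sample
    name; each following line is <transcript_id><TAB><tpm>.
    """
    reader = csv.DictReader(abundance_lines, delimiter="\t")
    rows = [sample_name]
    for row in reader:
        # GENCODE FASTA headers are pipe-delimited; keep only the ENST ID.
        transcript_id = row["target_id"].split("|", 1)[0]
        rows.append(f"{transcript_id}\t{float(row['tpm']):.6f}")
    return rows


# Two rows copied from the abundance.tsv excerpt above.
demo = io.StringIO(
    "target_id\tlength\teff_length\test_counts\ttpm\n"
    "ENST00000456328.2|ENSG00000290825.1|-|OTTHUMT00000362751.1|"
    "DDX11L2-202|DDX11L2|1657|lncRNA|\t1657\t1658\t0\t0\n"
    "ENST00000619216.1|ENSG00000278267.1|-|-|MIR6859-1-201|"
    "MIR6859-1|68|miRNA|\t68\t69\t0\t0\n"
)
rows = kallisto_to_suppa(demo, "Sample1")
print("\n".join(rows))
```

Keeping the full versioned ID (for example `ENST00000456328.2`) matters here because the .ioe files built from the same GTF use versioned transcript IDs.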
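One of the generated ioe files looks like this:
$ head gencode.v44_A3_strict.ioe
seqname gene_id event_id alternative_transcripts total_transcripts
chr1 ENSG00000237491.10 ENSG00000237491.10;A3:chr1:779092-803919:779092-803951:+ ENST00000655765.1,ENST00000670700.1,ENST00000657896.1,ENST00000658648.1,ENST00000656571.1,ENST00000669749.1,ENST00000666217.1,ENST00000665719.1 ENST00000665719.1,ENST00000412115.2,ENST00000666217.1,ENST00000658648.1,ENST00000669749.1,ENST00000657896.1,ENST00000670700.1,ENST00000655765.1,ENST00000656571.1
chr1 ENSG00000237491.10 ENSG00000237491.10;A3:chr1:779092-803919:779092-803922:+ ENST00000655765.1,ENST00000670700.1,ENST00000657896.1,ENST00000658648.1,ENST00000656571.1,ENST00000669749.1,ENST00000666217.1,ENST00000665719.1 ENST00000665719.1,ENST00000443772.2,ENST00000666217.1,ENST00000658648.1,ENST00000669749.1,ENST00000657896.1,ENST00000670700.1,ENST00000655765.1,ENST00000656571.1
chr1 ENSG00000237491.10 ENSG00000237491.10;A3:chr1:807323-809622:807323-809658:+ ENST00000412115.2,ENST00000670700.1,ENST00000657896.1,ENST00000658648.1 ENST00000665719.1,ENST00000658648.1,ENST00000588951.5,ENST00000669749.1,ENST00000657896.1,ENST00000670700.1,ENST00000412115.2,ENST00000656571.1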
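As a rough illustration of how an .ioe row is used: each event lists the transcripts that support one form of the event (`alternative_transcripts`) and all transcripts that support either form (`total_transcripts`). The PSI value is the summed abundance of the first set divided by the summed abundance of the second. The sketch below is not SUPPA's code. The TPM values are invented, and treating a missing transcript as 0 is a simplification made only for this example.

```python
import math


def psi(alternative, total, tpm):
    """PSI = sum(TPM of alternative transcripts) / sum(TPM of total transcripts).

    Returns NaN when the denominator is zero, i.e. when none of the
    event's transcripts is expressed in the sample.
    """
    alt_sum = sum(tpm.get(t, 0.0) for t in alternative)
    total_sum = sum(tpm.get(t, 0.0) for t in total)
    return alt_sum / total_sum if total_sum > 0 else math.nan


# Hypothetical TPM values for a few GENCODE transcript IDs.
tpm = {
    "ENST00000412115.2": 2.0,
    "ENST00000670700.1": 1.0,
    "ENST00000665719.1": 3.0,
}
alternative = ["ENST00000412115.2", "ENST00000670700.1"]
total = ["ENST00000665719.1", "ENST00000412115.2", "ENST00000670700.1"]

expressed = psi(alternative, total, tpm)   # (2 + 1) / (3 + 2 + 1)
unexpressed = psi(alternative, total, {})  # denominator is 0
print(expressed, unexpressed)
```

This also shows why the transcript IDs in the expression file have to match the IDs in the .ioe file exactly, including the version suffix.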
I am now trying to generate the PSI results with the following Slurm script:
#!/bin/bash
#SBATCH --job-name=suppa_psi_calculation
#SBATCH --output=/home/achanda/%x_%j.out
#SBATCH --error=/home/achanda/%x_%j.err
#SBATCH --time=04:00:00
#SBATCH --cpus-per-task=16
#SBATCH --mem=128G
# Load Python or any required modules
module load python3

# Directories
SUPPA_DIR=/software/SUPPA-2.3
EVENT_DIR=/AS_events
PROCESSED_DIR=/Processed   # directory holding the preprocessed abundance files
OUT_DIR=/psi_results

# Create the output directory
mkdir -p "${OUT_DIR}"

# Iterate over each event type
for EVENT_FILE in "${EVENT_DIR}"/*.ioe; do
    # Event file name without the .ioe extension
    EVENT_TYPE=$(basename "${EVENT_FILE}" .ioe)
    # Iterate over each sample's processed file in the Processed directory
    for PROCESSED_FILE in "${PROCESSED_DIR}"/*_abundance_preprocessed.tsv; do
        # File name without the _abundance_preprocessed.tsv suffix
        SAMPLE_NAME=$(basename "${PROCESSED_FILE}" _abundance_preprocessed.tsv)
        OUTPUT_PREFIX="${OUT_DIR}/${SAMPLE_NAME}_${EVENT_TYPE}"
        echo "Starting processing of ${SAMPLE_NAME} for event type ${EVENT_TYPE}..."
        # Test the exit status of suppa.py directly instead of inspecting $? afterwards
        if python3 "${SUPPA_DIR}/suppa.py" psiPerEvent -i "${EVENT_FILE}" -e "${PROCESSED_FILE}" -o "${OUTPUT_PREFIX}"; then
            echo "Successfully processed ${SAMPLE_NAME} for event type ${EVENT_TYPE}."
        else
            echo "Failed to process ${SAMPLE_NAME} for event type ${EVENT_TYPE}. Check error logs for details."
        fi
    done
done
However, the job's error log shows the following:
$ head suppa_psi_calculation_28165525.err
INFO:lib.tools:File ~/Sample1_abundance_preprocessed.tsv opened in reading mode.
INFO:psiCalculator:Buffering transcript expression levels.
ERROR:lib.tools:1, in line 2. Skipping line...
ERROR:lib.tools:2, in line 3. Skipping line...
ERROR:lib.tools:3, in line 4. Skipping line...
ERROR:lib.tools:4, in line 5. Skipping line...
ERROR:lib.tools:5, in line 6. Skipping line...
ERROR:lib.tools:6, in line 7. Skipping line...
ERROR:lib.tools:7, in line 8. Skipping line...
ERROR:lib.tools:8, in line 9. Skipping line...
Can someone help with what is going wrong? I can provide further details if needed.
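One thing that may be worth checking, offered as a possibility rather than a confirmed diagnosis: as far as I understand the SUPPA README, the expression file should start with a header line that contains only the sample name(s), so every data line has exactly one more field than the header. The preprocessed file shown above has no such header, so its first data line would be read as the header. The sketch below (the helper `check_expression_layout` is hypothetical, not part of SUPPA) reports which lines would break that rule.

```python
def check_expression_layout(lines):
    """Return (n_header_fields, bad_line_numbers) for a tab-separated file.

    Assumed rule: each data line must have len(header) + 1 fields,
    because the header names the samples but not the ID column.
    """
    header = lines[0].rstrip("\n").split("\t")
    bad = [
        line_no
        for line_no, line in enumerate(lines[1:], start=2)
        if len(line.rstrip("\n").split("\t")) != len(header) + 1
    ]
    return len(header), bad


data = [
    "ENST00000456328.2\t0.000000\n",
    "ENST00000450305.2\t0.000000\n",
    "ENST00000488147.1\t0.000000\n",
]

# Without a header, the first data line is taken as the header (2 fields),
# so the remaining lines (also 2 fields) do not have header + 1 fields.
without_header = check_expression_layout(data)

# With a sample-name header (hypothetical name "Sample1"), all lines pass.
with_header = check_expression_layout(["Sample1\n"] + data)
print(without_header, with_header)
```

Under this assumption, a headerless file would fail from line 2 onward, which is at least consistent with the pattern in the log above.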
Hi Chandaayan,
Your abundance file seems to contain many zeroes.
Do you get non-zero abundances for your transcripts? That could trigger some of the errors.
E.
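A quick way to answer the question above is to count how many transcripts have a non-zero abundance in a preprocessed file. This is a minimal sketch that assumes two tab-separated columns (transcript ID, value) and no header line; the function name `count_nonzero` and the value 12.5 are made up for the example.

```python
def count_nonzero(lines):
    """Return (nonzero, total) for lines of the form <id><TAB><value>."""
    nonzero = 0
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        _, value = line.split("\t")
        total += 1
        if float(value) > 0:
            nonzero += 1
    return nonzero, total


# In practice: with open("Sample1_abundance_preprocessed.tsv") as fh: count_nonzero(fh)
demo_lines = [
    "ENST00000456328.2\t0.000000\n",
    "ENST00000450305.2\t0.000000\n",
    "ENST00000488147.1\t12.500000\n",  # hypothetical non-zero value
]
nonzero, total = count_nonzero(demo_lines)
print(f"{nonzero} of {total} transcripts have non-zero abundance")
```

The first rows of a GENCODE-ordered file are often lowly expressed pseudogenes and lncRNAs, so zeroes in `head` output alone do not show that the whole file is empty.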