Countmatrix and sample metadata table not uploading properly #16
Comments
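Hi @Justin1609,
did you check whether the files are really in CSV format? Despite the extension, a file saved from Excel is sometimes not comma-delimited. You can check by opening the files in any text editor.
If that works: try to read them in offline (before calling the app), and call the app by specifying the count matrix and the metadata table in the respective parameters.
HTH,
Federico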
Hi there Federico
Thanks so much, I managed to get it sorted out. I didn't realize that you
don't have to transpose the count matrix before inputting it. Why is it
that you don't transpose the data in your tool? By transpose I mean having
samples as objects and genes as variables. I am doing PCA on count data
from an RNA-seq analysis of Saccharomyces cerevisiae. Would the output look
different if the count matrix was transposed as I described? I am also
having issues with the gene annotation file for S. cerevisiae, as there is
no entry for it in the usual databases that are used in the examples for
pcaExplorer. Do you maybe know of another database I can use for the
annotation of S. cerevisiae? I have a CSV file with the gene IDs of
S. cerevisiae in column 1 and the standard gene name for each gene ID in
column 2. If you could help out with these issues I would really
appreciate it.
Kind regards
Justin
…On Wed, Nov 24, 2021 at 1:47 AM Federico Marini ***@***.***> wrote:
Hi @Justin1609 <https://github.com/Justin1609>,
did you check whether the files are really in CSV format? Despite the
extension, a file saved from Excel is sometimes not comma-delimited. You
can check by opening the files in any text editor.
If that works: try to read them in offline (before calling the app), and
call the app by specifying the count matrix and the metadata table in the
respective parameters.
HTH,
Federico
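
The advice above could be sketched in R roughly as follows. This is a minimal sketch, not the maintainer's own procedure: the helper names `guess_sep` and `read_table_auto` and the toy file contents are made up for illustration, while `countmatrix` and `coldata` are the argument names of `pcaExplorer()`. The demo file is semicolon-delimited, which is what Excel writes under some locale settings even when the extension is `.csv`.

```r
# Hypothetical helper: guess the delimiter from the header line of a text file.
guess_sep <- function(path) {
  header <- readLines(path, n = 1)
  seps <- c(comma = ",", semicolon = ";", tab = "\t")
  n <- vapply(
    seps,
    function(s) lengths(regmatches(header, gregexpr(s, header, fixed = TRUE))),
    integer(1)
  )
  seps[[unname(which.max(n))]]
}

# Hypothetical helper: read a table with the first column as row names.
read_table_auto <- function(path) {
  read.table(path, header = TRUE, sep = guess_sep(path),
             row.names = 1, check.names = FALSE)
}

# Toy stand-in for a file exported from Excel (semicolon-delimited).
tmp <- tempfile(fileext = ".csv")
writeLines(c("gene;s1;s2", "YAL001C;10;20", "YAL002W;5;7"), tmp)

# Genes in rows, samples in columns, as the app expects.
countmatrix <- as.matrix(read_table_auto(tmp))

# One row per sample; row names must match the columns of the count matrix.
coldata <- data.frame(condition = c("ctrl", "trt"),
                      row.names = colnames(countmatrix))

# Launch the app with the objects already loaded (needs pcaExplorer installed).
if (interactive() && requireNamespace("pcaExplorer", quietly = TRUE)) {
  pcaExplorer::pcaExplorer(countmatrix = countmatrix, coldata = coldata)
}
```

Reading the files first makes a wrong delimiter visible immediately: a matrix with a single column, or with non-numeric entries, usually means the separator was not what was assumed.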
Hi Federico
Could you also please tell me how I can edit the title and the legend name
of the plot? I would also like to remove the sample labels from the plot and
change the color scheme of the different sample groups. I tried to do this
in R, but pcaplot doesn't seem to generate the object that ggplot requires
to be able to edit these details.
Kind regards
Justin
Well, the reason is mostly "historical": in bioinformatics, it is more common to see genes as features on the rows and samples on the columns. So I stuck to the "classical" version.
Sure, it would look different - stick to the expected format, and it will be fine.
I don't have much experience with yeast, TBH. Some annotation packages are available in Bioconductor; have a look at org.Sc.sgd.db.
Federico
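
For the annotation question, a two-column table of gene IDs and standard names like the one described in the thread may already be enough. The sketch below assumes that the app's `annotation` argument takes a data frame with gene IDs as row names and a `gene_name` column; the three yeast genes are illustrative values, not the poster's data. The commented lines show how org.Sc.sgd.db could be queried instead; the `ORF` key type and `GENENAME` column are assumptions to verify with `columns(org.Sc.sgd.db)`.

```r
# Illustrative two-column table: systematic gene ID, then standard gene name.
# In practice this would come from something like read.csv("my_annotation.csv").
anno_raw <- data.frame(
  gene_id   = c("YAL001C", "YAL002W", "YAL003W"),
  gene_name = c("TFC3", "VPS8", "EFB1"),
  stringsAsFactors = FALSE
)

# Reshape: gene IDs become row names, names go into a gene_name column.
annotation <- data.frame(
  gene_name = anno_raw$gene_name,
  row.names = anno_raw$gene_id,
  stringsAsFactors = FALSE
)

# Fall back to the systematic ID wherever no standard name is available.
missing_name <- is.na(annotation$gene_name) | annotation$gene_name == ""
annotation$gene_name[missing_name] <- rownames(annotation)[missing_name]

# Alternative via Bioconductor (assumed key type and column, please verify):
# library(org.Sc.sgd.db)
# gene_names <- AnnotationDbi::mapIds(org.Sc.sgd.db,
#                                     keys = rownames(annotation),
#                                     column = "GENENAME", keytype = "ORF")
```

The row names of `annotation` need to use the same identifiers as the row names of the count matrix, otherwise the names cannot be matched to the genes.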
I guess for these types of requests you are probably best served by building the ggplot object from scratch.
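
Building the plot from scratch could look roughly like this. It is a sketch under assumptions: the expression matrix is simulated, and in practice `expr` would be the transformed counts (for example from a variance-stabilizing or rlog transformation), with `coldata` being the sample metadata table. The group names, colours and titles are placeholders.

```r
library(ggplot2)

set.seed(1)
# Simulated stand-in for transformed counts: genes in rows, samples in columns.
expr <- matrix(rnorm(200 * 6), nrow = 200,
               dimnames = list(paste0("gene", 1:200), paste0("sample", 1:6)))
coldata <- data.frame(group = rep(c("control", "treated"), each = 3),
                      row.names = colnames(expr))

# prcomp() expects observations in rows, so the matrix is transposed here.
pca <- prcomp(t(expr))
pct_var <- round(100 * pca$sdev^2 / sum(pca$sdev^2), 1)

# One row per sample: coordinates on the first two components plus the group.
pca_df <- data.frame(PC1 = pca$x[, 1], PC2 = pca$x[, 2],
                     group = coldata[rownames(pca$x), "group"])

p <- ggplot(pca_df, aes(x = PC1, y = PC2, colour = group)) +
  geom_point(size = 3) +                    # points only, no sample labels
  scale_colour_manual(
    name = "Treatment",                     # legend title
    values = c(control = "#1b9e77", treated = "#d95f02")  # custom colours
  ) +
  labs(title = "PCA of yeast RNA-seq samples",
       x = paste0("PC1 (", pct_var[1], "%)"),
       y = paste0("PC2 (", pct_var[2], "%)")) +
  theme_bw()
print(p)
```

Because `p` is an ordinary ggplot object, the title, legend name and colours can be changed by editing the `labs()` and `scale_colour_manual()` calls, and labels can be added back with `geom_text()` if wanted.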
Thanks very much @federicomarini, I
really appreciate it. I did have a chat with a co-supervisor of mine who is
a biostatistician, but why would he have suggested that I transpose the
count matrix data? Apologies, I am very new to the realm of bioinformatics,
so I
would just like to understand how it would affect the PCA output, if at
all? For example, would the sample view in your program look different if
the data were transposed? Why would it need to be transposed? Is there any
difference between transposing the data versus the format that you use?
Thanks I will definitely check the database out in R.
Regarding editing the plot I receive an error when trying to insert pcaplot
object as the required "pcobj" for ggplot. I am not too familiar with the
coding side either, could you maybe direct me to resources for how to go
about altering the code, like that which is available on the user guide for
pcaplot, where I could for example change the colours of the circles and
edit the legend name and the title? This would be extremely helpful.
Finally, I realise this is not the intended purpose of the program but I
was wondering if you could perhaps give me some advice on how to perform an
OPLS analysis, perhaps something that is as user friendly as your program
and uses a similar input and method? I have tried the ropls package in R,
but I honestly cannot seem to figure it out too well. If you have any
advice I would truly appreciate it.
Kind regards
Justin
…On Wed, Nov 24, 2021 at 3:08 PM Federico Marini ***@***.***> wrote:
> Hi there Federico Thanks so much, I managed to get it sorted out. I didn't
> realize that you don't have to transpose the count matrix before inputting
> it. Why is it that you don't transpose the data in your tool? By transpose
> I mean having samples as objects and genes as variables.
Well, the reason is mostly "historical": in bioinformatics, it is more
common to see genes as features on the rows and samples on the columns. So
I stuck to the "classical" version.
Yes, an even more classical biostatistics-tailored view would indeed be
the transposed one.
But hey... 🤷
> I am doing PCA on counts data from RNA seq analysis for Saccharomyces
> cerevisiae. Would the output look different if the count matrix was
> transposed as I described?
Sure - stick to the expected format, and it will be fine.
> I am also having issues with the gene annotation file for S. cerevisiae as
> there is no entry for this on the normal databases that are used in the
> examples for using pcaExplorer. Do you maybe know of another database I can
> use for the annotations of S. cerevisiae? I have a CSV file where I have
> the Gene IDs of S.cerevisiae in column 1 and then the Standard gene names
> for each Gene ID in the second column. If you could help out with these
> issues I would really appreciate it. Kind regards Justin
I don't have much experience with yeast, TBH. Some annotation packages are
available in Bioconductor; have a look at org.Sc.sgd.db
<https://bioconductor.org/packages/release/data/annotation/html/org.Sc.sgd.db.html>
Federico
No problem, I am aware that we in bioinformatics do things in a transposed way by default 😉

If you transpose it: well, in the end you change the point of view on the data, so you no longer have samples as linear combinations of the genes but the other way around!

For editing the ggplot object: a generic resource like a ggplot tutorial should do. I have none specific to recommend at the moment, but do check out https://datavizm20.classes.andrewheiss.com/, which I used to recommend for many other reasons!

If you want to do an OPLS analysis: this is outside pcaExplorer's scope per se, but very much within the broader dimensionality reduction business. Do have a look at the MSMB book by Holmes & Huber, available online. IIRC it introduces a couple of these alternatives to PCA.

Federico
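
The change in point of view can be seen with base R's `prcomp()`, which treats rows as observations and columns as variables. The sketch below uses simulated data and is only meant to show what each orientation produces; it makes no claim about how pcaExplorer computes its sample view internally.

```r
set.seed(42)
# Simulated matrix in the bioinformatics layout: 50 genes (rows) x 4 samples (columns).
X <- matrix(rnorm(50 * 4), nrow = 50,
            dimnames = list(paste0("gene", 1:50), paste0("sample", 1:4)))

# Samples as observations: transpose so that each row is a sample.
pca_samples <- prcomp(t(X))
dim(pca_samples$x)         # 4 x 4: one point per sample
dim(pca_samples$rotation)  # 50 x 4: each component is a weighted sum of genes

# Genes as observations: the same call on the untransposed matrix.
pca_genes <- prcomp(X)
dim(pca_genes$x)           # 50 x 4: one point per gene
dim(pca_genes$rotation)    # 4 x 4: each component is a weighted sum of samples
```

So the layout of the input file is a convention, while the orientation handed to the PCA routine decides whether the plotted points are samples or genes. A tool that expects genes in rows and shows a sample view has to do the transposition itself, which is why supplying an already transposed matrix would give a different, and probably unintended, result.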
Hi there
I am trying to upload my own count matrix and sample metadata table using the interactive version of the tool, but it doesn't seem to be reading my input tables correctly. I made my tables in Excel and modeled them on the "airway" demo data. I saved each Excel file as a CSV file, but it still doesn't work. I urgently need to plot this data, so any help would be greatly appreciated. I can send you my count matrix and sample metadata table on request.
Many thanks
J