Videos: 69
Views: 881,354

2024 updated single-cell guide - Part 2: RNA Integration and annotation

In this video I integrate the single-cell RNA data together with scVI and use multiple methods of label transfer from reference datasets. I then verify and annotate the individual clusters using known marker genes. This video covers advanced analysis steps, such as tuning hyperparameters in our scVI model, making custom reference datasets, and more.
Main notebook:
github.com/mousepixels/sanbomics_scripts/blob/main/sc2024/annotation_integration.ipynb
Example of bad mapping:
github.com/mousepixels/sanbomics_scripts/blob/main/sc2024/bad_mapping.ipynb
Part 1:
ruclips.net/video/cmOlCTGX4Ik/видео.html
0:00 Celltypist transfer
13:49 scVI transfer
21:13 Integration
29:22 Dim reduction
34:00 Annotation

Videos

2024 updated single-cell guide - Part 1: RNA preprocessing and quality control

40:12

7K views · 2 months ago

This is a comprehensive tutorial on the most up-to-date recommendations for single-cell sequencing. This is part 1 of a multi-part series. Here I download a dataset, remove background RNA, perform quality control, and remove low-quality cells. Part 2 will cover dimension reduction and cell annotation. We will eventually get to in-depth analysis and scATAC analysis. Notebook: github.com/mousepix...

Single-cell pseudotime and gene regulatory analysis with CellOracle

21:21

3.5K views · 6 months ago

CellOracle is a powerful suite of tools that can perform pseudotime analysis, gene regulatory network analysis, and in silico perturbation analysis on single-cell data in python. This is a simple tutorial covering the basics of CellOracle while analyzing a developmental pancreas dataset. Notebook: github.com/mousepixels/sanbomics_scripts/blob/main/celloracle_pseudotime_GRN.ipynb Reference: www....

Processing single-cell RNAseq counts with simpleaf (alevin-fry)

9:36

2.2K views · 9 months ago

Simpleaf is a faster and more efficient alternative to other counters, such as cellranger, and it works with other single-cell chemistries. It is a wrapper for Alevin-fry and is made by the same lab that created the Salmon RNAseq aligner. Simpleaf is still in development as of making this video. Do not be surprised if the workflow changes slightly in the future. Github notes: github.com/mousepi...

RNAseq mapping with Salmon for differential expression

13:13

7K views · 11 months ago

How do you process raw read data for the purpose of differential expression? In this video I map raw RNAseq reads using Salmon and follow up with differential expression analysis in R with Deseq2. notebook: github.com/mousepixels/sanbomics_scripts/blob/main/salmon_to_deseq.Rmd references: salmon.readthedocs.io/en/latest/salmon.html index preparation: combine-lab.github.io/alevin-tutorial/2019/s...

Pseudobulk single-cell analysis in Python with Scanpy and pyDeseq2

9:59

7K views · 11 months ago

It is now possible to do pseudobulk analysis directly in python on your scanpy object. I create the pseudobulk from single-cell data then analyze it with the python port of Deseq2. Notebook: github.com/mousepixels/sanbomics_scripts/blob/main/pseudobulk_pyDeseq2.ipynb

Differential expression in Python with pyDESeq2

16:19

18K views · 1 year ago

Analyze RNAseq counts data with a Python implementation of DESeq2. I cover basic differential expression analysis, PCA plots, GSEA, heatmaps, and volcano plots. Github: github.com/mousepixels/sanbomics_scripts/blob/main/PyDeseq2_DE_tutorial.ipynb The samples include normal human cell control and replicative senescence cells from NCBI accession GSE171663 0:00 Intro 0:30 Differential expression 7...

Single-cell background decontamination in R and Python with SoupX

13:55

4.8K views · 1 year ago

SoupX is an essential tool for ambient RNA decontamination in single-cell RNA sequencing data. Ambient RNA in solution is partitioned into droplets and confounds downstream analyses. This concise tutorial covers SoupX implementation for both R (Seurat objects) and Python (Scanpy objects), offering step-by-step guidance and expert insights to improve data quality and accuracy in your single-cell...

Can chatGPT do single-cell bioinformatic analysis?

17:51

26K views · 1 year ago

Here I test if chatGPT with the GPT-4 model can do basic single-cell RNA analysis. In short, the results are impressive.

Comparing single-cell RNA integration methods | Which is the best?

20:09

9K views · 1 year ago

Which single-cell integration method is the best? In this video I compare 5 different methods using 3 different challenging integration problems. I test Seurat CCA, Seurat RPCA, SCVI-tools, and Scanorama. I measure time and memory usage and also examine integration outcomes. Github: github.com/mousepixels/sanbomics_scripts/tree/main/integration_comparison Datasets: www.cell.com/cell/fulltext/S0...

Single-cell trajectory and pseudotime analysis with Monocle3 and Seurat in R

16:24

14K views · 1 year ago

In this video I perform trajectory analysis in R on a large dataset of cells undergoing dedifferentiation into iPSCs. I use Seurat to load, merge, and preprocess the data. I use Monocle3 to calculate pseudotime and create the trajectories. I go over basic analyses and plotting using Monocle3. Notebook: github.com/mousepixels/sanbomics_scripts/blob/main/monocle3_tutorial.Rmd References: www.cell...

Easy RNAseq volcano plot with one line of code

5:28

5K views · 1 year ago

Make a super easy and PRETTY volcano plot from differentially expressed genes with only one line of code. Plotting aesthetic figures can be challenging and/or time consuming. Here I show you how to make a pretty volcano plot without needing much prior coding knowledge. The plots are also highly customizable for more advanced users.

Applying random forest classifiers to single-cell RNAseq data

15:14

6K views · 1 year ago

Learn how to apply machine learning to single-cell data. Random forest is a powerful machine learning classifier and a great tool for analyzing single-cell RNAseq data. In addition to predicting classifications, you can extract the gene importance from the model as a way to identify genes that describe your populations. Here I use several examples to show you how to use the random forest model ...

Introduction to single cell ATAC data analysis in R

17:36

14K views · 1 year ago

This is a primer for single cell/nuclei ATAC-seq data analysis. What is single cell ATACseq? How do you perform basic scATAC-seq analysis in R? I describe what scATACseq is. Then I use Seurat and Signac to do data analysis using a recent Nature communications paper. I do preprocessing, clustering, differential accessibility analysis, RNA activity estimation, and I make various plots. Notebook: ...

Complete single-cell RNAseq analysis walkthrough | Advanced introduction

1:18:40

75K views · 1 year ago

This is a comprehensive introduction into single-cell analysis in python. I recreate the main single cell analyses from a recent Nature publication. I explain the basics of single-cell sequencing analysis and also introduce more advanced topics. I cover doublet removal, preprocessing, integration, clustering, cell identification, differential expression, gene-set enrichment, non-parametric stat...

Introduction to spatial sequencing data analysis

9:56

8K views · 1 year ago

Single-cell gene co-expression | single-cell RNAseq methods

5:59

4K views · 1 year ago

3 minute GSEA tutorial in R | RNAseq tutorials

3:05

24K views · 1 year ago

Simple guide to GSEA and plotting in python

6:40

9K views · 1 year ago

Convert h5ad anndata to a Seurat single-cell R object

4:36

9K views · 1 year ago

Guide to filtering and subsetting single-cell anndata and pandas objects | basic and advanced

11:38

6K views · 1 year ago

Label single-cells automatically in python | scVI label transfer

8:58

4.2K views · 1 year ago

Beautiful and customizable RNAseq volcano plots

9:47

11K views · 2 years ago

How to remove single-cell doublets in python

6:55

3.1K views · 2 years ago

Single-cell analysis with scVI machine-learning toolkit

13:00

8K views · 2 years ago

Single-cell integration in python with scanpy

7:21

8K views · 2 years ago

RNAseq volcano plot of differentially expressed genes

4:16

27K views · 2 years ago

How to do gene ontology analysis in python

8:07

15K views · 2 years ago

RNAseq analysis | Gene ontology (GO) in R

5:16

53K views · 2 years ago

Single-cell gene set activity with AUCell

11:53

4.9K views · 2 years ago

@asshimul1168 19 hours ago
Hello, would you please make a tutorial on an advanced workflow (based on a good paper) for scRNA-seq using R?
@asshimul1168 5 days ago
Please make this same tutorial for R🙏
@JUNPENGYOU-us7mk 9 days ago
Thank you for the informative tutorial video; it has been immensely beneficial to my scientific research!😄
@sanbomics 9 days ago
Glad it was helpful!
@yanshixiong434 10 days ago
Great content. The languages are just different; one doesn't have to be good or bad.
@sanbomics 10 days ago
Exactly!
@sapienthought1103 12 days ago
Do I have to do this for each sample, or what?
@sanbomics 10 days ago
You will have to run it for every sample, but you only need to install it once.
@freenergy777 13 days ago
1:12, why do we have to predict doublets for each sample separately?
@sanbomics 10 days ago
Because every sample is a little different. If one sample has more cells, the doublet rate will be higher for that sample. Samples also have different cell types, etc.
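The dependence of doublet rate on cell number can be sketched with simple arithmetic. This is only an illustration: it assumes the commonly quoted rule of thumb for droplet-based 10x-style runs of roughly 0.8% doublets per 1,000 cells recovered, and both the function name and the default rate are hypothetical choices rather than anything taken from the video.

```python
def expected_doublet_rate(n_cells_recovered, rate_per_thousand=0.008):
    """Rough expected doublet fraction for one sample.

    Assumption: the rate grows linearly with the number of recovered cells,
    at about 0.8% per 1,000 cells (an approximate rule of thumb).
    """
    return n_cells_recovered / 1000 * rate_per_thousand


# Two hypothetical samples of different sizes get different expected rates,
# which is one reason to run doublet prediction per sample.
for n_cells in (3000, 10000):
    print(n_cells, round(expected_doublet_rate(n_cells), 4))
```

Most doublet detectors accept an expected rate per run, so a per-sample estimate like this is only a starting point.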
@aytacoksuzoglu2975 15 days ago
Well, it was really awesome. I'm still an undergraduate, so the bioinformatics part was a little hard to understand, but the Python code part was clear. Have you covered other integration methods, or could you record another video?
@sanbomics 10 days ago
Yeah I actually have a video that compares multiple integration methods: ruclips.net/video/NFA2YGshATs/видео.html
@frutitadelosmares 16 days ago
Hi! Thanks so much for such a great tutorial! I have a naive question from someone who just started in this world: when raw data is not available, for example when you can only download normalised, filtered values, do you skip the pre-processing step? Is it correct to pre-process normalised values, let's say TMM? Again, thanks so much for all the videos!
@sanbomics 10 days ago
Yeah if there are no raw counts then you will have to skip the ambient removal. Unfortunately, this is the only way sometimes.
@hrisivanov3150 17 days ago
Man, you absolutely saved me! Suddenly, everything makes sense now. Subscribed!
@sanbomics 10 days ago
:)
@TheXu122 18 days ago
Thank you so much for your videos! I am a grad student who recently started a single-cell project, and since I found your channel, your explanations and code have been getting me through this tough time. I was wondering if you are planning on doing cNMF in the future? It is something that I and our lab have had difficulty with. Thanks again!
@sanbomics 10 days ago
I can definitely keep that in mind for a future video!
@gregj3913 20 days ago
Thank you for this video! I am confused about why you can use Ensembl version 109 for TxImport when you ran salmon with the Gencode transcripts fasta. Doesn't Gencode differ from Ensembl in which transcripts are included? Or does this not really matter?
@MKShams 20 days ago
Hi there, thanks. But I am getting this error. I changed it to 6, but I still get the error: WARNING: --genomeSAindexNbases 14 is too large for the genome size=154478, which may cause seg-fault at the mapping step. Re-run genome generation with recommended --genomeSAindexNbases 7
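The recommended value in the warning above is consistent with the rule of thumb given in the STAR manual for small genomes, min(14, log2(GenomeLength)/2 - 1). A minimal sketch of that arithmetic follows; the function name is hypothetical, and rounding down is an assumption that happens to reproduce the value in the warning.

```python
import math


def recommended_sa_index_nbases(genome_length):
    """STAR manual rule of thumb: min(14, log2(GenomeLength) / 2 - 1).

    Assumption: the result is rounded down to an integer.
    """
    return min(14, int(math.log2(genome_length) / 2 - 1))


# Genome size taken from the quoted warning.
print(recommended_sa_index_nbases(154478))
```

The quoted warning still reports 14, so the edited value may not have reached the genome generation step.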
@caspase888 24 days ago
Your videos are amazing. Thanks a lot. Could I use a 3050 with 64 GB RAM for this kind of analysis? Thanks a lot.
@sanbomics 10 days ago
You can do a decent number of cells with 64 GB of RAM. I would think you could handle around 200k cells in memory at the same time without too many issues. Some steps/algorithms use a lot more memory though, so it is highly dependent on what you do. In my experience 64 GB won't be enough for large datasets/atlases, but you can definitely do small numbers of samples.
@JmandudE888 25 days ago
While this is technically correct, it is not very statistically sound. When adjusting with the Bonferroni and BH procedures, we typically change the cutoff point, not the actual p-value. Multiplying the p-value by the number of tests can lead to p-values greater than one (specifically for the Bonferroni method; the BH method already accounts for this by dividing by the rank), which is impossible since a p-value is a probability between 0 and 1. While the end conclusion is the same, it doesn't make sense from a statistical standpoint. You can absolutely use this method to get the right significance, but if you are presenting this to a statistician or publishing this work, you would need to adjust the p-value cutoff instead of the actual p-value, or change any p-value that is greater than 1 to be exactly 1, and even this is a little more nuanced than just multiplication.
@sanbomics 10 days ago
This is a bit pedantic: of course probabilities don't go above 1. You can simply clip the data frame column to have a max value of 1.
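The two views in this exchange can be reconciled in a few lines. The sketch below is an illustration in plain Python, not the code from the video: it computes Bonferroni-adjusted p-values capped at 1 (the clipping suggested in the reply) and Benjamini-Hochberg adjusted p-values with the usual monotonicity step. The function names and example p-values are hypothetical.

```python
def bonferroni(pvals):
    """Multiply each p-value by the number of tests and cap the result at 1."""
    m = len(pvals)
    return [min(p * m, 1.0) for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, capped at 1 and kept monotone."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each adjusted value is the minimum
    # of its own p * m / rank and all adjusted values at higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


pvals = [0.01, 0.04, 0.03, 0.5]  # hypothetical raw p-values
print(bonferroni(pvals))
print(benjamini_hochberg(pvals))
```

Comparing an adjusted p-value with the original cutoff gives the same decisions as comparing the raw p-value with an adjusted cutoff, which is why both descriptions lead to the same set of significant genes.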
@danielpintard7382 26 days ago
This man is an absolute godsend. I can't even begin to count the number of times he has come in clutch with a solution to issues I encounter in my personal projects and during my internship!
@sanbomics 10 days ago
Glad I could help!
@AbelDavid-qc5xy 26 days ago
Thank you for this very helpful tutorial. Is there a function for plotting the Euclidean distance map for each sample?
@azxcf2912 27 days ago
@Sanbomics... good content but you really need to learn how to talk!
@sanbomics 27 days ago
I done gone learned how to talk real good like enough. No idea what u r meaning. Such rude
@mehdiraouine2979 28 days ago
We're still looking forward to the future part ;D
@sanbomics 28 days ago
I know, I know xD. I was going to start working on it this weekend. I have been very busy!
@sanbomics 10 days ago
someday soon...
@Dumbo-eo5ps 9 days ago
@sanbomics we're all hoping for this series to be completed so we can implement it, we're rooting for you! We're grateful for anything you can share :D
@phuchoanglevn 29 days ago
Thanks for your work. It's really useful.
@AbelDavid-qc5xy 29 days ago
Can you cover batch correction in Python?
@user-cj1sh8qu5h 29 days ago
I love this, thank you!
@benjaminwehnert1893 1 month ago
Thank you very much. Great work. Maybe a dumb question, but how would you process BAM files that contain single-cell data from several cells? Essentially what I need is a table similar to yours, with genes in the rows and cells in the columns (instead of whole BAM files).
@sanbomics 28 days ago
You can convert it back to fastq and then run it through various single-cell counters. E.g., if it is 10x data you can use cellranger bamtofastq and then cellranger count.
@young-kookkim5031 1 month ago
Thank you so much! This video is perfect for those who want to analyze scRNA-seq data!
@celiagonzalezgil57 1 month ago
Firstly, thank you so much for your videos; they are very useful. I used featureCounts to generate the count table, but the percentage of Unassigned_NoFeatures is too high (around 50%). I checked that the same annotation file was used for the alignment and for generating the count table, and I also checked the strandedness of the assay, but I still have the same problem. I tried changing GTF.featureType from exon to gene, and the percentage of Unassigned_NoFeatures decreased to about 15%. These results suggest that I have a high content of intronic or intergenic reads, but when I checked in IGV I didn't observe that. I don't know if you can help me with this or tell me whether these results are normal for human data. Thank you so much!!
@sanbomics 28 days ago
Hmm. Sounds weird. It's hard for me to diagnose from here. See what happens if you use a pseudoaligner like salmon instead.
@sanbomics 28 days ago
You are definitely using the right annotation? xD
@celiagonzalezgil57 24 days ago
@sanbomics Thank you so much for your feedback. I am trying now with Salmon.
@celiagonzalezgil57 24 days ago
@sanbomics yes, I checked that several times 😅
@fsh9134 1 month ago
Thanks for making very useful videos. I was wondering if you would like to make a video on single-cell analysis using Julius AI, a data analysis AI.
@yaseminsucu416 1 month ago
Hello! I am curious if you will have a much expanded version of the Spatial Seq Data analysis, and also curious if there is any proteomic analysis tutorial coming up! Thanks for your great work!
@sanbomics 28 days ago
I might do a Visium HD and/or a Xenium tutorial soon, but I have a lot of things in the queue and not enough time xD
@tamaterinha 23 days ago
@sanbomics Visium HD would be amazing! I'm new to spatial transcriptomics and my first project is on Visium HD data; I am desperately looking for a nice analysis workflow.
@ParthShah-hc8pw 1 month ago
Hey mate, I am getting this error. I followed all your steps except the filter one. Please help. I have 5 columns in my matrix (M10, M11, M12, M3, and M5) with Ensembl gene IDs. The dds step is not working: Error in checkForExperimentalReplicates(object, modelMatrix) : The design matrix has the same number of samples and coefficients to fit, so estimation of dispersion is not possible. Treating samples as replicates was deprecated in v1.20 and no longer
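The error quoted above is raised when the design matrix has as many coefficients as there are samples, which leaves no residual degrees of freedom for estimating dispersion. A minimal sketch of that check for a one-factor design follows; the condition labels are hypothetical, since the comment does not say how the five samples were grouped.

```python
def residual_degrees_of_freedom(conditions):
    """Samples minus coefficients for a one-factor design.

    A one-factor design has one coefficient per distinct level
    (an intercept plus levels - 1 contrasts).
    """
    n_samples = len(conditions)
    n_coefficients = len(set(conditions))
    return n_samples - n_coefficients


# Hypothetical: every sample is its own condition, so nothing is replicated.
unreplicated = ["M10", "M11", "M12", "M3", "M5"]
# Hypothetical: the same five samples grouped into two replicated conditions.
replicated = ["treated", "treated", "treated", "control", "control"]

print(residual_degrees_of_freedom(unreplicated))
print(residual_degrees_of_freedom(replicated))
```

A result of zero corresponds to the situation described in the error message; grouping samples into conditions with replicates leaves a positive value.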
@Kelly-gg8eq 1 month ago
I never comment on YouTube videos, but thank you so much for this. It was so simple and straight to the point. I am new to coding and needed to get all of the outputs from 20 featureCounts runs into one output file, and this was the only thing I could find that not only made sense, but also worked. Thank you thank you thank you!!!
@sanbomics 1 month ago
Wooo glad it helped you!
@davidstivenarboledaprado8731 1 month ago
Do you have code for doing the same in RStudio? I've been trying to build a Seurat object with publicly available data, using the counts, positions, and images, with no success. Thanks
@sanbomics 1 month ago
Sorry :( only code for this in Python atm. What step specifically are you having trouble with? The initial loading of the data?
@Amanda-re2vt 1 month ago
Hi Sam, do you have a video of how you're downloading the data from NCBI (papers)? That is the part I don't understand.
@sanbomics 1 month ago
I don't have a video. But I tweet about it sometimes if you follow me on Twitter. I may make something like this in the future. You can check out my most recent video series for another example from a different dataset.
@mhmmdbduh 1 month ago
Great video, I learned a lot. But I was wondering what device you use? I did the same analysis but I couldn't load the model because I'm out of memory.
@sanbomics 1 month ago
This computer has 128 GB of memory. But you can try the analysis with fewer samples if you still want to follow along.
@ykoy1577 1 month ago
Dear sanbomics! I am wondering what your first options are for gene regulatory network analysis, when you have only single-cell RNA-seq data or when you also have single-cell ATAC data?
@sanbomics 1 month ago
I like SCENIC, and SCENIC+ if you have paired ATAC data. SCENIC and CellOracle are also good choices.
@ykoy1577 1 month ago
@sanbomics Thank you so much
@ionutiordachi695 1 month ago
Thank you, very useful!
@saraalidadiani5881 1 month ago
Thank you for the nice video. Regarding the part for making the cell type fraction plot (from this part of the code until the end of that section: adata.obs.groupby(['sample']).count()), could you please also explain how to do it in R with a Seurat object? Thanks
@ykoy1577 1 month ago
Your older videos are also very helpful. Thank you for everything
@sanbomics 1 month ago
They never got much love though xD
@lizheltamon 1 month ago
Hi! Thanks for this! Just wondering if you have tried comparing the results of CellOracle with SCENIC?
@sanbomics 1 month ago
I haven't, but there is someone in the lab working on that at the moment.
@catalyst1918 1 month ago
Hello, I'm using STAR to align RNA .fq files with --quantMode. When I cat ReadsPerGene.out.tab, it indicates that my reads only map to rRNAs and ncRNAs, but when I check in IGV it shows that my reads map to coding genes. How do I fix this? I'm trying to use ReadsPerGene.out.tab for further analysis.
@qwerty11111122 1 month ago
Hi! I'm memory limited, so I can only load my dataset using the backed = 'r' option. How would I subset in this scenario?
@sanbomics 1 month ago
I avoid backed at all costs haha. I know I have had to do this before, but it is so infrequent that I don't remember how off the top of my head, and I don't remember where I can find an example. Hope you figured it out, sorry for the slow response.
@gracegregory4846 1 month ago
Not sure if the DeseqDataSet parameters have changed since this tutorial, but I had to change clinical to metadata when running: dds = DeseqDataSet( counts = counts, metadata=pb.obs, design_factors="tumour")
@sanbomics 1 month ago
Yup, it's changed a lot. I'll be remaking it soon!
@SamipSapkota-zg8hy 1 month ago
There is no complex heat map package.
@zamhazri6240 1 month ago
Hi, may I know if it runs on a .tsv file?
@sanbomics 1 month ago
Yes, if you import the TSV first as a pandas DataFrame.
@divyamishra2641 1 month ago
Could you please help with how to make a violin plot using the AUCell package, like you did for the dimplot?
@MrQiushenfeng 1 month ago
Your tutorials are always incredibly helpful! Do you have a script that implements the UMAP-to-spatial transition animation for the Xenium dataset? We have a beautiful new dataset that we can share. sqiuatarizonadotedu. Thanks so much.
@aayushinotra7945 1 month ago
Hey! Could you please explain what these computed counts are?
@issanmitro 1 month ago
Men U R a real hero
@sanbomics 1 month ago
No you are the real hero! (The Boys reference)
@duadpeada5068 1 month ago
Very cool video! Could you please tell us how to do something similar to your introduction, with the UMAP transforming into the logo?
@sanbomics 1 month ago
I have the video where I turn my cat into a UMAP. Let me know if that helps; if not, I can maybe post the code.
@zeinabbahari 1 month ago
My name is Zeinab Bahari. You can find me on research gat... I need help with RNA-seq data analysis.
@sanbomics 1 month ago
If you need help you can check out sanbomics.com
@zeinabbahari 1 month ago
Hi, thanks for your good video. How can I reach you, Dr.? I need some urgent help with my data analysis. Please help me.
@sanbomics 1 month ago
Hi, you can reach me through sanbomics.com
@cold_hardfacts 1 month ago
I'm not a scientist. I came here from cancer research. I gave it a thumbs up. Clear and informative.
@georgieb1326 1 month ago
Is there a reason you used CellTypist before integration? It means that the overclustering done by CellTypist is different from the overclustering done post-integration when annotating (which is making annotation a bit confusing in my case).
@sanbomics 1 month ago
You can do it after, depending on how many cells you have. With this many cells it becomes almost impossible because it requires a dense matrix.
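The dense-matrix constraint in the reply above can be estimated with simple arithmetic. This is a rough sketch only: it assumes float32 storage (4 bytes per value), and the cell and gene counts are hypothetical numbers, not the dimensions of the dataset in the video.

```python
def dense_matrix_gib(n_cells, n_genes, bytes_per_value=4):
    """Memory needed to hold a dense cells x genes matrix, in GiB.

    Assumption: every entry is stored explicitly, at 4 bytes per value.
    """
    return n_cells * n_genes * bytes_per_value / 1024**3


# Hypothetical dataset: 200,000 cells by 20,000 genes.
print(round(dense_matrix_gib(200_000, 20_000), 1))
```

Sparse storage keeps only the nonzero counts, which is why the same data can fit in memory until a step asks for the dense form.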

Sanbomics

Comments