What is RNAseq?

One of the biggest questions we have here at TESS is, “how does SLC13A5 Epilepsy lead to seizures and other neurological problems?”

One way researchers are currently tackling this question is with a technique called RNA sequencing, or “RNAseq”. In this article, we’ll talk about what RNAseq is, how it’s performed, and why it’s relevant to SLC13A5 Epilepsy. Let’s get started!

RNAseq is a research tool that identifies all the RNA molecules in a cell or tissue. But what are RNA molecules and why would we want to identify them all?

If you recall from a previous blog post, “What is DNA?”, genes in our DNA encode instructions for making proteins. Proteins are an essential component of every cell of our body. The proteins that are part of our different tissues and organs determine all of their characteristics, from their physical shape and features, to how they behave and respond to their environments, and ultimately, to their overall health. But before genes can be made into proteins, they must first be converted into intermediate molecules called RNA. RNA molecules are essentially copies of the DNA that relay the instructions from our genes to the cellular machinery that makes protein.

Figure 1 shows DNA converting to RNA, and then RNA converting to a protein. The Figure 1 caption reads, "Figure 1. DNA encodes instructions for building proteins. Remember, every cell of our body contains the same DNA instructions, but what gives different cells their unique characteristics are the genes that are active and turned into protein. To be made into protein, the DNA instructions that code for individual genes must first be converted into RNA. This means that each cell can have different RNA expressed!"

Our bodies are energy-efficient so the only genes that are “activated” and made into RNA and protein are the ones that are actually needed. Because of this, identifying all the RNA molecules can give us a snapshot of EVERY GENE that is actively turned on and being made into protein. This information can give us a pretty good picture of the health of the cells or tissue we’re looking at.

Let’s walk through an example using a hypothetical Gene X and Gene Y. In a disease, we see that the RNA of Gene X is increased and the RNA of Gene Y is decreased in the brain. Gene X produces a protein that is involved in unhealthy inflammation while Gene Y produces a protein that helps the brain clear cellular waste. Knowing that unhealthy Gene X is increased and healthy Gene Y is decreased in this disease can help scientists understand the disease better and ask new questions. For example, how does this disease cause Gene X-related inflammation? Or, does the decrease in Gene Y contribute to the disease’s symptoms?

In this example, we’re only looking at two genes, but in our own neurons (cells of the brain), there are THOUSANDS of genes that are actively being turned into protein at the same time. 

This is one big reason why RNAseq is so valuable. It identifies and quantifies all the RNA in a sample, which gives us information on these thousands of genes and can tell us a lot about the health of the cells or tissue in that sample. RNAseq can be used to answer questions such as, “how does SLC13A5 Epilepsy affect the genes that are being made into protein in the brain?” The answers can then lead to new questions. Using RNAseq can help scientists understand the effects of SLC13A5 Epilepsy on our brain’s physiology and find new drug targets to work with.

How does RNAseq work?

RNAseq is a long, time-intensive process, but it can be distilled into four main steps.

Header says, "What are the steps in RNAseq?"

Step 1 shows cells being converted to pure RNA and says, "1) Looking for active genes: Isolating RNA
RNA is isolated from the sample (cells). This is a purification process that includes clearing unwanted cellular material (DNA, proteins, etc.) from our sample, leaving us with pure RNA. This is important for accurate, contaminant-free results."

Step 2 shows RNA being converted to DNA, and then the DNA being converted into fragmented DNA. It also says, "2) Making the genes easier to work with: Changing isolated RNA to DNA
RNA is converted back into DNA because DNA is more stable and easier to work with. We only want to sequence DNA from the genes that are being actively turned into protein (not the total DNA because it’s the same in every cell), so we must convert the RNA to DNA. The DNA is then fragmented into smaller pieces to avoid overwhelming the sequencing machine."

Step 3 shows the fragmented DNA being put into a sequencing machine and then the sequencing machine outputting a data file on a computer. It also says, "3) Moving from a test tube to a computer file: Sequencing the DNA
DNA fragments are 'read', or 'sequenced', in a sequencing machine, which produces a large data file on a computer."

Step 4 shows graphs on a computer and then a magnification of the volcano graph. It also reads, "4) Making sense of the data: Identifying and analyzing gene expression
The large data file is analyzed. Specific genes are identified within the data, quantified, and compared between treatment groups. This time-intensive step also involves several layers of quality control that check data quality and evaluate potential contamination."
  1. Looking for active genes—RNA isolation: In order to identify all the RNA, we first need to isolate the pure RNA. This usually involves a multi-step process of physically breaking open cells, and subsequent washes with different solutions to eliminate unwanted cellular material.
  2. Making the genes easier to work with—DNA preparation: Because RNA is a very unstable molecule, it is converted back into DNA, which is more stable and easier to work with. The DNA product is then fragmented into smaller pieces that the sequencing machine can handle.
  3. Moving from a test tube to a computer file—Sequencing the DNA: The DNA fragments are then “read”, or “sequenced”, in the sequencing machine which generates raw data on a computer to be analyzed.
  4. Making sense of the data—Gene identification and analysis: Specific genes are identified within the data and quantified.
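For readers who like to see ideas as code, steps 2 through 4 can be sketched as a toy example in Python. This is only an illustration under heavy assumptions: the two gene sequences are made up, the fragments are cut at a fixed size, and a fragment is matched to a gene by simple text search. Real RNAseq software is far more sophisticated, and none of the names below come from an actual analysis tool.

```python
from collections import Counter

# Hypothetical reference: short made-up sequences standing in for real genes.
REFERENCE = {
    "GeneX": "ATGGCGTACCTGAAGTTCGA",
    "GeneY": "ATGCCCTTAGGCTAACGTCA",
}

def rna_to_dna(rna):
    """Step 2a: write the RNA message in DNA letters (U becomes T)."""
    return rna.replace("U", "T")

def fragment(dna, size):
    """Step 2b: cut the DNA into pieces of a fixed size (a simplification)."""
    return [dna[i:i + size] for i in range(0, len(dna), size)]

def assign_read(read, reference):
    """Step 4a: return the gene whose sequence contains the read, or None."""
    for gene, sequence in reference.items():
        if read in sequence:
            return gene
    return None

def count_reads(reads, reference):
    """Step 4b: count how many reads were assigned to each gene."""
    counts = Counter({gene: 0 for gene in reference})
    for read in reads:
        gene = assign_read(read, reference)
        if gene is not None:
            counts[gene] += 1
    return dict(counts)

# A pretend sample: three RNA copies of Gene X and one RNA copy of Gene Y.
gene_x_rna = REFERENCE["GeneX"].replace("T", "U")
gene_y_rna = REFERENCE["GeneY"].replace("T", "U")
rna_sample = [gene_x_rna] * 3 + [gene_y_rna]

# Step 3 (sequencing) is imitated by simply collecting every fragment as a "read".
reads = []
for rna in rna_sample:
    reads.extend(fragment(rna_to_dna(rna), size=10))

counts = count_reads(reads, REFERENCE)
print(counts)
```

In this pretend sample, Gene X ends up with three times as many reads as Gene Y, mirroring the fact that the sample held three times as many Gene X RNA copies.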

For the results to be clinically relevant, scientists compile RNAseq data and compare it between a disease group and a healthy control group to identify the genes affected by the disease. RNAseq data is then usually visualized as graphs that showcase these genes.
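One common way to express this comparison is a "log2 fold change", which describes how many times a gene's RNA level doubled (positive numbers) or halved (negative numbers) in the disease group relative to the control group. The sketch below is a minimal illustration using the hypothetical Gene X and Gene Y from earlier. The read counts are invented, and the small `pseudocount` added to avoid dividing by zero is an assumption of this example. Real analyses also adjust for how much RNA was sequenced in each sample, which is skipped here.

```python
import math

def log2_fold_change(disease_counts, control_counts, pseudocount=1.0):
    """Compare the average read count in the disease group to the control group.

    Returns a positive number when the gene's RNA is higher in disease,
    a negative number when it is lower, and roughly 0 when it is unchanged.
    """
    mean_disease = sum(disease_counts) / len(disease_counts)
    mean_control = sum(control_counts) / len(control_counts)
    return math.log2((mean_disease + pseudocount) / (mean_control + pseudocount))

# Invented read counts for three disease samples and three control samples.
gene_x = log2_fold_change(disease_counts=[70, 62, 57], control_counts=[15, 14, 16])
gene_y = log2_fold_change(disease_counts=[7, 8, 6], control_counts=[31, 30, 32])

print(f"Gene X log2 fold change: {gene_x:+.1f}")
print(f"Gene Y log2 fold change: {gene_y:+.1f}")
```

With these made-up numbers, Gene X comes out at about +2 (roughly four times higher in the disease group) and Gene Y at about -2 (roughly four times lower).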

Header says, "Reading an RNAseq Plot"

The graphic shows a volcano plot, with dots in blue, red, and gray. Colored dots indicate genes that were changed (genes whose RNA was increased or decreased) by the disease. The blue dots represent genes that are downregulated (genes whose RNA is decreased by the disease), the red dots represent genes that are upregulated (genes whose RNA is increased by the disease), and the gray dots represent genes that are unchanged (genes whose RNA is neither increased nor decreased by the disease). There's a horizontal dotted line across the graph, with colored dots above it and gray dots below it. Changes above the horizontal line are statistically significant, meaning they likely reflect an actual change in RNA levels due to the disease.
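The coloring rule described above can be written as a short Python sketch. It assumes each gene already has a log2 fold change (how much its RNA went up or down) and a p-value (how likely the change is due to chance). The 0.05 cutoff standing in for the horizontal dotted line is a common convention but an assumption here, and the function names are made up for this example.

```python
import math

def volcano_height(p_value):
    """Vertical position of a dot: smaller p-values sit higher on the plot."""
    return -math.log10(p_value)

def classify_gene(log2_fc, p_value, p_cutoff=0.05):
    """Decide a dot's category using only the horizontal significance line."""
    if p_value >= p_cutoff:
        return "unchanged"      # gray dot, below the dotted line
    if log2_fc > 0:
        return "upregulated"    # red dot, RNA increased by the disease
    if log2_fc < 0:
        return "downregulated"  # blue dot, RNA decreased by the disease
    return "unchanged"          # significant p-value but no change in level

# Invented results for three hypothetical genes.
examples = {
    "GeneX": (2.0, 0.001),
    "GeneY": (-2.0, 0.001),
    "GeneZ": (0.3, 0.4),
}

for gene, (log2_fc, p_value) in examples.items():
    label = classify_gene(log2_fc, p_value)
    print(gene, label, round(volcano_height(p_value), 2))
```

Many published volcano plots also require a minimum size of change before coloring a dot, which would add vertical cutoff lines to the plot. That extra rule is left out here to match the graphic described above.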

Are there logistical limitations?

RNAseq is a thorough way to assess the genes that are active, but it does have some limitations. Although it is becoming increasingly affordable and accessible, it is still a time-intensive, expensive technique: a single test can cost over $1,000 to run.

How is RNAseq relevant to SLC13A5 Epilepsy now?

RNAseq is a powerful tool that is currently being used by TESS Research Foundation grantees and other researchers trying to understand SLC13A5 Epilepsy. Thanks to our TESS superheroes and TESS families, we now have cells with SLC13A5 pathogenic (disease-causing) variants, as well as healthy controls. This is an important research tool! Researchers are now using these cells (called induced Pluripotent Stem Cells, or iPSCs) to investigate SLC13A5 Epilepsy. They are currently turning the iPSCs into neurons (the cells in the brain) and looking for changes in genes using RNAseq. RNAseq has enormous potential for helping us understand how SLC13A5 mutations affect neuron function, thus helping us identify new targets for drug therapies.

Conclusion

  1. RNAseq is a powerful tool used to identify all active genes.
  2. RNAseq can help scientists understand how SLC13A5 deficiency affects the brain and has great potential for helping scientists develop new drugs.

Thank you to Erin Saito for writing this blog about RNAseq! Erin is currently a Cell Biology and Physiology PhD candidate at Brigham Young University. Her research is focused on understanding the effects of ketogenic diets on brain metabolism and cognition.

We want to hear from you! If you want to add to our list of topics for Science Simplified, please email Tanya Brown, PhD: tanya@tessfoundation.org.

Figures were created using BioRender.com.