Volume 4 / Issue 3

Facility News

Hey all!

We are officially heading into national holiday season. Between Canada, the USA, France and then of course Switzerland, there's no shortage of parties, fireworks and drinks to celebrate where we come from and where we live! But we all know that our hearts and souls ultimately belong to Science, don't we? And more importantly, to Flow Cytometry! Come to think of it, how come we don't have an official Flow Cytometry Day? I can already imagine people happily debating which fluorochrome should be the color of the day and which machine should be honored this year ❤️. Year after year, celebrating every color of the Flow Rainbow! Now, which day should have the honor of being our day? Please reply to this newsletter and let us know!
In this month's FACS Tips, we're investigating the implications of data transformation for downstream applications such as tSNE and UMAP.
Each month, we will ask you 3 questions about the newsletter topic. If you win, you can enter the lottery to win a unique mug designed by the FCF team!

Please take a few minutes to answer the quiz HERE.

FACS Tips

Data Transformations

When we see a plot like this, of course it doesn't look quite right with its negative population squished, but how much damage can it really do to our analysis? This is surely a problem we've all run into. The default scaling in FlowJo can send samples crashing against the axis, obscuring our analysis. But if we just throw gates around it, we can press on. After all, it's just the presentation of the plot in FlowJo, right? Unfortunately, when it comes to dimension reduction analysis such as tSNE, it can affect the result quite a lot.
Above are two tSNE plots. If we focus on the orange population, the left plot represents it as one clean population, while the other has it broken apart into 4. Just by sight, we might think that this is a fascinating population discovery. Yet this is the exact same sample, and the orange population consists only of CD14+ cells, which should exist as a single population. So, what causes such a large tSNE difference in the same data set? It's simply an artifact of poor scaling of the negative population. Unless the scaling is adjusted to put everything on scale, the algorithm treats the negative as two separate populations and clusters accordingly.
Certainly, this is quite a basic topic for many of you familiar with flow cytometry, but there can be some interesting little bits of information that can be discovered when exploring the data scaling of your experiment.
As a quick overview, data transformation (or scaling) is a display adjustment that changes how much space is given to each region of the data. Adjusting the data transformation in no way changes your recorded fluorescence data. What does change is how the data is presented, both to your eyes and, as the tSNE example shows, to the algorithm. While there are three main types of scaling (linear, logarithmic, and biexponential), we almost exclusively use biexponential scaling for fluorescent parameters, as it best allows us to visualize data at both the high and low ends of fluorescence intensity.
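To make the difference between the scale types concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the event values are made up, and arcsinh is used as a simple stand-in for a biexponential-style scale (linear near zero, log-like further out). It is not FlowJo's actual biexponential implementation, and the cofactor of 150 is an arbitrary illustrative value.

```python
import math

# Made-up compensated fluorescence values for seven events (illustration only).
raw = [-250.0, -20.0, 0.0, 35.0, 900.0, 45000.0, 180000.0]
raw_before = list(raw)  # copy, to confirm later that scaling never edits the data

def log_scale(value, floor=1.0):
    # A log axis cannot place zero or negative values, so they are clamped
    # to the floor. This is what piles events up against the axis.
    return math.log10(max(value, floor))

def asinh_scale(value, cofactor=150.0):
    # Stand-in for a biexponential-style scale (assumption, not FlowJo's formula):
    # close to linear within +/- cofactor, close to logarithmic beyond it.
    return math.asinh(value / cofactor)

log_view = [log_scale(v) for v in raw]
asinh_view = [asinh_scale(v) for v in raw]

for r, lg, bi in zip(raw, log_view, asinh_view):
    print(f"raw={r:>9.1f}  log={lg:6.3f}  asinh={bi:7.3f}")
```

On the log scale the three lowest events all land on the same axis position, while the arcsinh scale keeps them in order and still compresses the bright end. The list of raw values is the same before and after.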
Adjusting the data scaling in FlowJo is done by pressing the T button to the right of your fluorescent parameter on a dot plot and choosing the item labeled Customize Axis from the drop-down menu. There are 3 transformation settings to optimize: positive decades, extra negative decades, and width basis. When performing a data transformation, it's best to start by adjusting the width basis, as that is usually enough to fix any data scaling problems. This value determines the number of channels compressed into the linear space around zero. It should be set high enough to ensure all the histogram data is on screen, but without including extra white space to the left of the histogram. It is with this setting that we can avoid the bifurcated negative population seen in the earlier example.
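The bifurcated negative can be reproduced with a small simulation. The sketch below is hedged in the same way as any toy model: the negative population is simulated as a single Gaussian centred on zero, the arcsinh cofactor plays the role of the width of the linear region (it is an assumed stand-in, not FlowJo's width basis value), and center_to_peak_ratio is a hypothetical helper written for this example.

```python
import math
import random

def asinh_scale(values, cofactor):
    """Arcsinh display scale: roughly linear within +/- cofactor, log-like beyond."""
    return [math.asinh(v / cofactor) for v in values]

def center_to_peak_ratio(scaled, n_bins=21):
    """Histogram the scaled values on a symmetric axis and compare the bin that
    contains zero with the tallest bin. A ratio near 1 means one peak centred on
    zero; a ratio near 0 means the events sit in two lobes either side of zero."""
    limit = max(abs(v) for v in scaled)
    bin_width = 2 * limit / n_bins
    counts = [0] * n_bins
    for v in scaled:
        idx = min(int((v + limit) / bin_width), n_bins - 1)
        counts[idx] += 1
    return counts[n_bins // 2] / max(counts)

# One truly negative population: a single Gaussian centred on zero (simulated).
rng = random.Random(0)
negative_population = [rng.gauss(0.0, 200.0) for _ in range(20000)]

# Linear region far too narrow for the spread of the negative population.
too_narrow = center_to_peak_ratio(asinh_scale(negative_population, cofactor=1.0))
# Linear region about as wide as the spread of the negative population.
wide_enough = center_to_peak_ratio(asinh_scale(negative_population, cofactor=200.0))

print(f"narrow linear region: centre/peak = {too_narrow:.3f}")
print(f"wide linear region:   centre/peak = {wide_enough:.3f}")
```

With the narrow linear region, the bin around zero is nearly empty and the same events appear as two lobes. A clustering algorithm fed those scaled values has every reason to treat them as two populations.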

Although you can adjust the positive and negative decades as well, this should be done with greater care. This is where exploring the data transformations can serve as an added quality check on your samples. If your sample needs a large Extra Negative Decades adjustment, this can indicate a spillover error or spreading issues. The same is true of adding positive decades, which can point to spreading error and unmixing distortions. If heavily tailing data pushes the biexponential adjustment to include much more negative space than would normally be anticipated, you might be better off fixing your panel design or single-stain controls.

If your eventual downstream goal is a dimension reduction analysis, such as tSNE or UMAP, we recommend adjusting the transformation so that the data is displayed across the whole plot without too many events piled on the edges. This can also mean decreasing the positive decades to reduce blank space on the right of the plot. Doing this can improve the separation of your populations, as demonstrated in the UMAP figures below.
Once a compensation matrix is applied to a file, any transformations made to the axis will be linked to that matrix and carried through the experiment.
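The advice about filling the plot without crowding the edges can be expressed as two numbers per parameter. The following is a rough sketch only: axis_report is a hypothetical helper, the scaled values and axis limits are invented, and FlowJo does not expose such a report as far as this example assumes.

```python
def axis_report(scaled, axis_min, axis_max):
    """Return (edge_fraction, used_span) for one parameter.

    edge_fraction: share of events sitting on or beyond the axis limits.
    used_span: share of the axis length covered by the on-scale events.
    """
    on_edge = sum(1 for v in scaled if v <= axis_min or v >= axis_max)
    inside = [v for v in scaled if axis_min < v < axis_max]
    span = (max(inside) - min(inside)) / (axis_max - axis_min) if inside else 0.0
    return on_edge / len(scaled), span

# Invented, already-scaled values for eight events (illustration only).
scaled = [0.5, 0.8, 1.0, 1.5, 2.0, 2.5, 2.9, 3.0]

wide = axis_report(scaled, axis_min=-1.0, axis_max=6.0)     # lots of blank space
tight = axis_report(scaled, axis_min=0.4, axis_max=3.1)     # data fills the plot
clipped = axis_report(scaled, axis_min=1.0, axis_max=2.5)   # events on the edges

for name, (edge, span) in [("wide", wide), ("tight", tight), ("clipped", clipped)]:
    print(f"{name:>7}: edge fraction = {edge:.2f}, used span = {span:.2f}")
```

The goal described above corresponds to the "tight" case: an edge fraction near zero together with a used span near one.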

While certainly not the most exciting topic to focus on, it’s the little details that can make or break our experiments. With the complexity and added room for confusion in analysis relying heavily on dimension reduction, remembering to properly scale your data before running the algorithm can make a huge difference in proper clustering of your data. As always, feel free to reach out to the FCF staff if you have any questions.