In today’s information era, data can be collected from anywhere, from mining Twitter feeds to traffic counters. How about video images? They are essentially data on the wavelengths of light that enter a camera, recorded frame by frame. The implications are highly versatile, but only once we develop algorithms that can quantify the observations in a video.
This study explores the use of video images to determine the effects of genetic mutations in C. elegans worms, also known as lab worms. Mutations can affect a mutant worm’s adaptation to higher temperatures, and their effects are classified in two ways. First, the effect may be in the response mechanism, which reduces the ability to detect a change in temperature. Second, the effect may be in the physiological mechanism, which reduces the ability of organs to function normally when temperature increases.
Mutations at 59 locations were included in the study. The worms’ performance was recorded in 80-second videos. The contrast was adjusted so that changes in pixel coloration could be counted by software. The number of pixels that changed color between frames tells us how much movement the lab worms were exhibiting (Figure 1).
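The pixel-counting idea can be illustrated with a minimal sketch. It assumes each frame is a grid of grayscale intensities stored as nested lists; the function names, the `threshold` parameter, and the toy frames are hypothetical and are not taken from the software used in the study.

```python
def changed_pixel_count(prev_frame, next_frame, threshold=0):
    """Count pixels whose intensity differs by more than `threshold`."""
    count = 0
    for prev_row, next_row in zip(prev_frame, next_frame):
        for a, b in zip(prev_row, next_row):
            if abs(a - b) > threshold:
                count += 1
    return count


def movement_series(frames, threshold=0):
    """Return one movement score per pair of consecutive frames."""
    return [
        changed_pixel_count(f0, f1, threshold)
        for f0, f1 in zip(frames, frames[1:])
    ]


# Toy 2x3 frames (hypothetical data): one bright pixel moves, then stays still.
frames = [
    [[0, 0, 1], [0, 1, 0]],
    [[0, 1, 0], [0, 1, 0]],
    [[0, 1, 0], [0, 1, 0]],
]
activity = movement_series(frames)
```

Here `activity` holds one score per frame transition, so a worm that stops moving shows up as a run of zeros at the end of the series.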
A mutation in the physiological mechanism results in a gradual decrease in movement that ends in paralysis. A mutation in the ability to respond to an increase in temperature, on the other hand, appears as a rapid decrease in movement to paralysis, because the lab worms do not change their swimming behaviour until it is too late. Figure 2 shows the thermal performance curves expected under our hypothesis.
Twenty-seven videos of different worms were recorded to improve accuracy in determining the true effects of the mutations. The resulting data set was still very noisy and contained many outliers, and simple statistical summaries and methods could not reveal what was happening. Our solution was Functional Data Analysis (FDA), which is available as an R package. FDA solved our problem by treating the observations as a continuum over time. A polynomial function was fitted over each interval of the data, and the polynomials were joined to form a smooth continuum.
The next step in quantification was to determine numerically the rates of decrease in movement by analysing the derivatives of the thermal performance curves. Linking back to our initial hypothesis, a gradual decrease in movement appears as a roughly constant derivative, while a sudden paralysis produces a rapid change in the derivative (Figure 3).
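The contrast between the two derivative patterns can be sketched with finite differences. This is an illustrative stand-in, not the study's procedure: it assumes an already smoothed, evenly sampled movement curve, and the names `derivative` and `slope_spread` as well as the two toy curves are hypothetical.

```python
def derivative(values, dt=1.0):
    """Finite-difference derivative: central inside, one-sided at the ends."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    result = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        result.append((values[hi] - values[lo]) / ((hi - lo) * dt))
    return result


def slope_spread(values, dt=1.0):
    """Range of the derivative: near zero for a steady, gradual decline."""
    d = derivative(values, dt)
    return max(d) - min(d)


# Toy curves (hypothetical data), both falling from 100 to 0.
gradual = [100, 80, 60, 40, 20, 0]      # steady decline
abrupt = [100, 100, 100, 100, 20, 0]    # flat, then sudden collapse

gradual_spread = slope_spread(gradual)
abrupt_spread = slope_spread(abrupt)
```

A small spread corresponds to the near-constant derivative of a gradual decline, while a large spread reflects the sharp change in slope expected from sudden paralysis.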
Clustering was used to group the 59 strains into two classes. Many strains did not completely match the hypothesis, which was expected because most mutations do not affect just one mechanism exclusively. However, for the strains that behaved as predicted, the results of the analysis are similar to the expectations of our initial hypothesis (Figure 4).
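The text does not say which clustering algorithm was used. As one possible illustration, the sketch below applies a simple two-cluster k-means to a single hypothetical feature per strain (for example, a measure of how much the derivative varies). The choice of k-means, the feature, and the sample values are all assumptions.

```python
def two_means(features, iterations=20):
    """Split 1-D features into two clusters with a basic k-means loop."""
    # Deterministic start: the smallest and largest values as initial centers.
    centers = [min(features), max(features)]
    labels = [0] * len(features)
    for _ in range(iterations):
        # Assign each value to its nearest center.
        labels = [
            0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            for x in features
        ]
        # Recompute each center as the mean of its members.
        new_centers = []
        for k in (0, 1):
            members = [x for x, lab in zip(features, labels) if lab == k]
            new_centers.append(
                sum(members) / len(members) if members else centers[k]
            )
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical per-strain feature values, not data from the study.
spreads = [1.0, 2.0, 1.5, 40.0, 50.0, 45.0]
labels, centers = two_means(spreads)
```

Under this sketch, label 0 would correspond to strains with a steady decline and label 1 to strains with an abrupt change. Strains that affect both mechanisms would fall between the two centers, which is consistent with the imperfect matches noted above.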
Videos in our study were limited to black and white, yet multiplying the data by the number of frames and the sample size quickly turned it into big data. To extract the information from the noise, we needed powerful smoothing methods such as those in the FDA package. We also considered Functional Principal Component Analysis (FPCA), which is more powerful, but the FDA procedure was sufficient. Quantifying color images will require a powerful algorithm that mines the elementary data generated by the charge-coupled device.