HL Mathematics: Analysis and Approaches Exploration

Modelling and Investigating the Properties of B-DNA Double Helix

Number of Pages: 18

Examination Session May 2021

In my biology course last semester, I was introduced to one of the most vital structures in science: the DNA double helix. This structure was proposed by James Watson and Francis Crick in 1953, and consists of two intertwined strands in a right‑handed helix, coiled around the same axis (Figure 1)[1]. DNA replication is arguably one of the most crucial and complex processes of life. Each new cell formed through cell division must receive an exact copy of the DNA in the old cell. Reducing the complexity of every living organism to the single structure of a helix was fascinating to me. I was even more intrigued to learn how precisely living organisms carry out the replication process, with minimal error. As an aspiring biomedical engineer, I want to investigate the properties of the helix structure through a mathematical lens to better understand the size and structure of DNA molecules, and ultimately to gain more insight into the reasons for their behaviour within cells.

In mathematics, a helix is defined as a curve in 3-dimensional space whose tangent forms a constant angle with a fixed line[2]. It is sometimes referred to as a space curve or coil. Since B-DNA is the most common double helical structure in nature[3], I decided to explore this form of DNA. B-DNA is a right-handed double helix containing a wider “major groove” and a narrower “minor groove” (Figure 1). It has a diameter of approximately 20 Å (1 angstrom = $1 \times 10^{-10}$ meter) and the length of one full twist, or its pitch, is 34 Å. Parametrization is the use of parametric equations, by means of one or more variables, to represent a curve or a surface[4]. The parametrization of B-DNA will first be established through 3D modelling and the observation of patterns. Then, the relationship between the 3D parametrization and its corresponding 2D parametrization will be explored. Finally, using this 2D function, the properties of an individual B-DNA helix strand, including its arc length and curvature, will be determined. These properties will ultimately allow me to calculate the dimensions of nucleotide molecules in DNA and provide insight into its structure to complement scientific knowledge.

I originally planned to recreate the helix shape by wrapping a right-angled triangular piece of cardstock around a uniformly cylindrical can. This physical model seemed worthwhile as it would allow me to discover connections between the dimensions of a 2D triangle (Figure 2.1) and the 3D helix (Figure 2.2). I hypothesized that the base (C) would serve as the circumference of the helix, the hypotenuse (S) as the arc length, and the height (h) as the length of one helical twist. As shown in Figure 2.2, the diameter (d) could then be calculated by dividing the circumference C by π. The model would also be a good visual representation of the actual dimensions of B-DNA, as pictures found online give very different impressions of the structure and are most likely not drawn to scale. However, I recognized shortly after that this model would not yield high precision due to the limitations of the materials and building methods used, such as the ruler measurements and imprecise cutting of the paper. More sophisticated equipment, such as springs and an adjustable ruler, along with carpentry skills would be needed to build a more precise model.

After recognizing that the physical model was not an accurate reference for modelling or measurement analysis, I decided to recreate the model in GeoGebra, a digital 3D calculator, instead. In the three-dimensional Cartesian coordinate system, the x- and y-axes create a horizontal plane, and the z-axis, which is perpendicular to both x and y, adds a vertical component.[5] A helix can be broken down into its three components along the three axes: a cosine function in the x-axis, a sine function in the y-axis, and a linear function in the z-axis. The following is the simplest parametric definition of a helix[6]:

$$x = \cos t, \quad y = \sin t, \quad z = t \qquad \text{①}$$

As the parameter t increases, the point (x, y, z) traces a right-handed helix (Figure 3.3). I will let each coordinate represent distance in nanometers (nm) from the helix’s initial point and be a function of t, which represents time in seconds. I chose to work in nanometers because it keeps the numbers short, which makes patterns easier to observe. Similar to the physical model, I began with a cylinder with a radius of 1 nm (Figure 3.1) and a height of 3.4 nm (Figure 3.2), so that it is to scale with one helical twist of an actual B-DNA structure.

I will apply various transformations to each parameter and find the values that best fit the B-DNA helix shape, to ultimately find the final parametrization of a scaled DNA helix. The parametric equations with transformations can be defined as:

$$x = a\cos(kt), \quad y = a\sin(kt), \quad z = t + c \qquad \text{②}$$

where a represents vertical dilation, k represents horizontal dilation, and c is vertical translation.
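The effect of the three constants can also be tried out numerically. The following is a minimal Python sketch, assuming the transformed equations take the form x = a cos(kt), y = a sin(kt), z = t + c; the function name `helix_point` is only illustrative and is not part of GeoGebra.

```python
import math


def helix_point(t, a=1.0, k=1.0, c=0.0):
    """Return (x, y, z) on the helix x = a cos(kt), y = a sin(kt), z = t + c.

    The form of the equations is an assumption based on the description of
    a (vertical dilation), k (horizontal dilation) and c (vertical shift).
    """
    return (a * math.cos(k * t), a * math.sin(k * t), t + c)


# With a = k = 1 and c = 0 this reduces to the basic helix (cos t, sin t, t).
start = helix_point(0.0)
quarter_turn = helix_point(math.pi / 2)

# Changing a only rescales x and y; changing c only moves z.
wide_start = helix_point(0.0, a=2.0)
shifted_start = helix_point(0.0, c=1.2)
```

Calling `helix_point` with different values of a, k, and c shows which coordinates each constant affects before any graph is drawn.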

Starting with vertical dilation, I manipulated the variable a of the x- and y-parameters by setting it equal to 0.5, 1, and 2. Including the original value of 1 as well as a lesser and a greater value allows me to explore a range sufficient for noticing a pattern. Since one helical twist corresponds to one period of a sinusoidal function, which equals $2\pi$, I worked within the domain $0 \le t \le 2\pi$, so that the curve created is exactly one complete twist. From the three manipulations, I noticed the curve became wider as the magnitude of a increased (Table 1).

Table 1. Applying Transformations of Vertical Dilations to the Helix. Panels: a = 0.5, a = 1, a = 2.

From this observation, I hypothesized that a corresponds to the radius of the helix. I decided to evaluate this hypothesis by changing the radius of the cylinder to match the chosen a. Table 2 presents the initial, middle, and final coordinate points of the helix at the three different a-values, which were collected by setting the z-parameter equal to 0, 1.7, and 3.4 respectively.

Table 2. Initial, Middle, and Final Coordinates of Helices with Various ‘a’ Values.

|                         | a = 0.5        | a = 1        | a = 2        |
| visual representation   | [figure]       | [figure]     | [figure]     |
| initial point (z = 0)   | (0.5, 0, 0)    | (1, 0, 0)    | (2, 0, 0)    |
| middle point (z = 1.7)  | (-0.5, 0, 1.7) | (-1, 0, 1.7) | (-2, 0, 1.7) |
| final point (z = 3.4)   | (0.5, 0, 3.4)  | (1, 0, 3.4)  | (2, 0, 3.4)  |

The magnitudes of the x-values of each coordinate, which represent the radius of the cylinder, are equal to their corresponding a-values, thus supporting my hypothesis. The hypothesis is also supported visually, as the curve wraps around the cylinder when a equals the radius.

Next, I manipulated the variable k by setting it equal to 0.5, 1, and 2, and noticed that this transformation stretches or compresses the helix in the direction parallel to the z-axis. The visual representation of these three manipulations can be seen in Table 3 below. Since the aim is to manipulate a single helical twist so that its height corresponds to the height of the cylinder, I chose to work within the domain $0 \le t \le 3.4$, restricting t to the height of the cylinder to more easily manipulate the curve. Since one helical twist corresponds to one cycle of a sinusoidal function, I recognized that when t = 3.4, the product $kt$ in $\cos(kt)$ and $\sin(kt)$ must equal the period of $2\pi$ for the curve to have a height of exactly one cycle. Thus:

$$3.4k = 2\pi \quad\Rightarrow\quad k = \frac{2\pi}{3.4} \approx 1.85$$

The k-value of 1.85 makes sense: of the three helices tested (k = 0.5, k = 1, k = 2), the helix with k = 2 has a height visibly closest to the height of the cylinder (Table 3).
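The value of k can be checked with a short calculation. This Python sketch assumes the condition 3.4k = 2π and the form x = a cos(kt), y = a sin(kt), z = t; the names `PITCH_NM` and `point_on_helix` are illustrative. Under the assumption that this fitted k is used, it also reproduces the pattern of Table 2, where the curve is on the opposite side of the cylinder at z = 1.7 and back at its starting side at z = 3.4.

```python
import math

PITCH_NM = 3.4  # height of one helical twist of B-DNA, from the text

# One full cycle must be completed when t reaches the pitch: k * 3.4 = 2*pi.
k = 2 * math.pi / PITCH_NM


def point_on_helix(t, a=1.0):
    """Point on x = a cos(kt), y = a sin(kt), z = t (assumed form)."""
    return (a * math.cos(k * t), a * math.sin(k * t), t)


# Initial, middle and final points for the three radii used in Table 2.
for a in (0.5, 1.0, 2.0):
    for t in (0.0, PITCH_NM / 2, PITCH_NM):
        x, y, z = point_on_helix(t, a)
        print(f"a = {a}, t = {t}: x = {x:.3f}, y = {y:.3f}, z = {z:.1f}")
```

Rounding k to two decimal places gives the 1.85 used in the rest of the exploration.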

Table 3. Applying Transformations of Horizontal Dilations to the Helix. Panels: k = 0.5, k = 1, k = 2, k = 2π/3.4 ≈ 1.85.

The final transformation, which produces the DNA’s second strand, is a vertical shift. It was created by replicating the helix previously found and then adding a value of c to the z-parameter. The second helix, illustrated in green in Figure 7, is shifted up by 1.2 nm, in accordance with the actual distance between DNA strands (refer to Figure 1).

The complete parametrization of the B-DNA double helix, with a vertical dilation (a) by a factor of 1, a horizontal dilation (k) by a factor of $\frac{2\pi}{3.4} \approx 1.85$, and a vertical shift (c) of 1.2 nm, can now be applied to the parametric equations defined above (②). This is summarized in Table 4.

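Table 4. Parametrization of B-DNA Double Helix.

Original Strand (black): $x = \cos(1.85t), \quad y = \sin(1.85t), \quad z = t$

Second Strand (green): $x = \cos(1.85t), \quad y = \sin(1.85t), \quad z = t + 1.2$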

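Since a helix is a vector-valued function, the parametric equations established above for a B-DNA helix can be rewritten as a sum over the three base vectors $\mathbf{i} = (1, 0, 0)$, $\mathbf{j} = (0, 1, 0)$, and $\mathbf{k} = (0, 0, 1)$, where the function is $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}$. Letting $\mathbf{r}_1(t)$ represent the curve of the first strand and $\mathbf{r}_2(t)$ represent the second strand, I get:

$$\mathbf{r}_1(t) = \cos(1.85t)\,\mathbf{i} + \sin(1.85t)\,\mathbf{j} + t\,\mathbf{k}$$

$$\mathbf{r}_2(t) = \cos(1.85t)\,\mathbf{i} + \sin(1.85t)\,\mathbf{j} + (t + 1.2)\,\mathbf{k}$$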
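The two strands can be written as small Python functions to confirm that they differ only in height. This is a sketch under the assumptions a = 1, k = 2π/3.4, and c = 1.2 nm taken from the parametrization above; the names `strand1` and `strand2` are illustrative.

```python
import math

K = 2 * math.pi / 3.4  # horizontal dilation, about 1.85
SHIFT_NM = 1.2         # vertical shift of the second strand


def strand1(t):
    """First strand: (cos(Kt), sin(Kt), t)."""
    return (math.cos(K * t), math.sin(K * t), t)


def strand2(t):
    """Second strand: the first strand moved up by SHIFT_NM."""
    x, y, z = strand1(t)
    return (x, y, z + SHIFT_NM)


# Sample both strands along one twist, 0 <= t <= 3.4.
samples = [i * 0.1 for i in range(35)]
offsets = [strand2(t)[2] - strand1(t)[2] for t in samples]
radii = [math.hypot(*strand1(t)[:2]) for t in samples]
```

Every entry of `offsets` equals the 1.2 nm shift and every entry of `radii` equals the 1 nm radius, up to floating-point rounding.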

By applying transformations to the parent parametric equations of a helix (①) in GeoGebra, I was able to find the parametrization of B-DNA accurately and without measurement bias. This technological approach avoided the limitations of using physical materials, such as equipment uncertainties or human error. Moreover, since the 3D graphing calculator provides an immediate visual representation of any inputted function, the effects of each transformation could be illustrated, ultimately aiding me in parametrizing B-DNA. Without technology, a physical model would only provide one specific model with fixed dimensions, and even if flexible material were used, adjusting the model would introduce new uncertainties, making it less accurate than a digital model. I am satisfied with the digital model I have created and believe it has allowed me to determine an accurate parametrization, which I can now use to further explore the helix function and find its properties.

With the two functions $r_1(t) = \cos(1.85t) + \sin(1.85t) + t$ and $r_2(t) = \cos(1.85t) + \sin(1.85t) + t + 1.2$, obtained by adding the three components of each strand, a graphical representation on a 2D Cartesian plane can be illustrated using Desmos (Graph 1). This graph was interesting to observe as, unlike the 3D graphical model which illustrates a symmetrical helical structure, the functions $r_1(t)$ and $r_2(t)$ have an asymmetrical shape about the t-axis. The composition of $r_1(t)$ can be broken down further to understand the reason for its shape and properties. Since $r_1(t)$ and $r_2(t)$ are the same curve, differing only in vertical translation, only the first function will be explored to avoid redundancy.

The function $r_1(t)$ is the sum of a sine function, a cosine function, and a linear function. First, the x- and y-parameters of the B-DNA helix will be investigated. When a sine and a cosine curve of the same period are added together, their waves are superimposed. Letting y1 represent the x-parameter and y2 represent the y-parameter, it can be seen in Graph 2 that their sum produces another trigonometric function, represented by y3.

From Graph 2 and the table of values (Table 5), I noticed that the new function $y_3 = y_1 + y_2$ is structured very similarly to the sine function, y2, differing only in amplitude and by a phase shift. Since y3 has a maximum value of about 1.41, it can be interpreted that, from the parent sinusoid $y_2 = \sin(1.85t)$ to y3, the amplitude has increased to 1.41 ($= \sqrt{2}$) nm. Additionally, when rounded to three significant digits, y3 has a zero at t = -0.425, meaning a phase shift of 0.425 to the left has also been applied. The period of y3 remains the same as that of y2. Therefore, using $y = a\sin(k(t + c))$ as the original function, where a is the amplitude, k equals 1.85, and c is the horizontal shift, the equation of the image function, y3, can be rewritten as: $y_3 = 1.41\sin(1.85(t + 0.425))$.

This equation and its accompanying transformations can be confirmed and explained algebraically. Since a single sine function (y3) expressed as the sum of a cosine (y1) and a sine (y2) function resembles the trigonometric addition formula stated below, I will try to manipulate the sum of y1 and y2 into the form of the addition formula by comparison.

$$\sin A\cos B + \cos A\sin B = \sin(A + B) \qquad \text{③}$$

Since y3 has a coefficient of $\sqrt{2}$, I multiplied both sides of equation ③ by $\sqrt{2}$ to get: $\sqrt{2}\sin A\cos B + \sqrt{2}\cos A\sin B = \sqrt{2}\sin(A + B)$. Now, I can compare the two terms on the left-hand side of this equation to the equations of y1 and y2 to find the values of A and B.

As displayed in Table 6, the values of A and B will be first calculated using the x-parameter, and then confirmed using the y-parameter.

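Table 6. Finding the Sum of y1 and y2 Algebraically through Comparison.

x-parameter (y1): $\sqrt{2}\cos A\sin B = \cos(1.85t)$. Separating cosine and sine into two equations: $\cos A = \cos(1.85t)$, so $A = 1.85t$; $\sqrt{2}\sin B = 1$, so $\sin B = \frac{1}{\sqrt{2}}$ and $B = \frac{\pi}{4}$.

y-parameter (y2): $\sqrt{2}\sin A\cos B = \sin(1.85t)$. Separating sine and cosine into two equations: $\sin A = \sin(1.85t)$, so $A = 1.85t$; $\sqrt{2}\cos B = 1$, so $\cos B = \frac{1}{\sqrt{2}}$ and $B = \frac{\pi}{4}$.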

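From working with both y1 and y2, it is calculated that $A = 1.85t$ and $B = \frac{\pi}{4}$. These values can now be substituted into the expression $\sqrt{2}\sin(A + B)$ from the addition formula ③:

$$y_3 = \sqrt{2}\sin\left(1.85t + \frac{\pi}{4}\right) \approx 1.41\sin(1.85(t + 0.425))$$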

This expression is also confirmed graphically on Desmos, which reproduced a curve identical to that of the original sum $y_3 = \cos(1.85t) + \sin(1.85t)$. Thus, graphically and algebraically, it can be concluded that the two curves of the x-parameter (y1) and y-parameter (y2) together produce the function $y_3 = \sqrt{2}\sin\left(1.85t + \frac{\pi}{4}\right)$.
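The same comparison can be made numerically. The sketch below assumes y1 = cos(kt) and y2 = sin(kt) with k = 2π/3.4, and checks their sum against the single sinusoid √2 sin(kt + π/4) at many values of t; the function names are illustrative.

```python
import math

K = 2 * math.pi / 3.4  # about 1.85


def y3_as_sum(t):
    """y1 + y2 = cos(Kt) + sin(Kt)."""
    return math.cos(K * t) + math.sin(K * t)


def y3_as_single_sine(t):
    """sqrt(2) * sin(Kt + pi/4), the form given by the addition formula."""
    return math.sqrt(2) * math.sin(K * t + math.pi / 4)


# Largest disagreement between the two forms on -3.4 <= t <= 3.4.
ts = [i * 0.01 for i in range(-340, 341)]
max_gap = max(abs(y3_as_sum(t) - y3_as_single_sine(t)) for t in ts)

# Writing K*t + pi/4 as K*(t + c) gives the phase shift c = (pi/4) / K.
phase_shift = (math.pi / 4) / K
amplitude = math.sqrt(2)
```

Because k = 2π/3.4, the phase shift (π/4)/k simplifies to 3.4/8 = 0.425, which agrees with the zero of y3 read from the graph.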

Now, when the z-parameter $z = t$ is introduced, it can be seen in Graph 3 that $r_1(t)$, a diagonally increasing sine wave, is produced. This graph is the result of the sum of a sinusoid and a linear function. The oscillation of the sine wave y3 occurs about $y = t$, a line with a gradient of 1. The asymmetry of the final curve $r_1(t)$ about the t-axis can be explained by comparing two points on the graph (Graph 3).

Given two points on the sinusoid, A and B, which share the same y-value and differ by one period, let A’ and B’ be the corresponding points on the linear function with the same t-values. The height of B’ will always be greater than that of A’. This means that the sum of B and B’ will always be greater than that of A and A’, creating a diagonally increasing composite function.
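This argument can be illustrated with numbers. The sketch below assumes the 2D function is cos(kt) + sin(kt) + t with k = 2π/3.4, so that one period is 3.4, and compares the function at pairs of points one period apart.

```python
import math

K = 2 * math.pi / 3.4
PERIOD = 3.4  # one period of the sinusoidal part, since K * 3.4 = 2*pi


def sinusoid(t):
    """The oscillating part, cos(Kt) + sin(Kt)."""
    return math.cos(K * t) + math.sin(K * t)


def r1(t):
    """The composite 2D function: sinusoid plus the line y = t."""
    return sinusoid(t) + t


# Points A (at t) and B (at t + PERIOD) have the same sinusoid value,
# so the composite function differs only by the rise of the line.
starts = [0.0, 0.5, 1.0, 2.5]
sinusoid_gaps = [sinusoid(t + PERIOD) - sinusoid(t) for t in starts]
composite_gaps = [r1(t + PERIOD) - r1(t) for t in starts]
```

Each entry of `sinusoid_gaps` is zero up to rounding, while each entry of `composite_gaps` equals the period, 3.4, which is the diagonal rise seen in Graph 3.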

Looking at the function $r_1(t)$ in the context of a helix structure, the line $y = t$, about which the function oscillates over time, can be seen as equivalent to the z-axis of the 3D model produced on GeoGebra, which also represents the variable time. Whereas in the 3D model the helix spirals upwards over time, in the 2D function the helix spirals diagonally upwards over time. With the relationship between a 3D helix and its 2D representation as a function over time understood, the 2D function can now be used to determine the properties of B-DNA, including its arc length and curvature.

A key property of space curves such as a DNA helix is arc length, which can be defined as the distance along a curve[7]. Given a space curve defined by the function $\mathbf{r}(t)$ within the interval [a, b], the arc length can be found using the following definite integral, where S is the arc length and t is time[8]:

$$S = \int_a^b \left|\mathbf{r}'(t)\right| \, dt \qquad \text{④}$$

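First, $\mathbf{r}_1'(t)$ will be found by applying the chain rule to find the first derivative of $\mathbf{r}_1(t)$.

$$\mathbf{r}_1'(t) = -1.85\sin(1.85t)\,\mathbf{i} + 1.85\cos(1.85t)\,\mathbf{j} + \mathbf{k}$$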

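Next, to find the magnitude of $\mathbf{r}_1'(t)$, I will use the following formula: $|\mathbf{v}| = \sqrt{x^2 + y^2 + z^2}$, where v = (x, y, z). The base vectors i, j, and k drop out as magnitude is a scalar quantity.

$$\left|\mathbf{r}_1'(t)\right| = \sqrt{1.85^2\sin^2(1.85t) + 1.85^2\cos^2(1.85t) + 1} = \sqrt{1.85^2 + 1} \approx 2.10$$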

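The expression for $|\mathbf{r}_1'(t)|$ can be substituted into the integral ④ to find the arc length. The domain used in the parametrization of the helix is $0 \le t \le 3.4$, so [0, 3.4] will be the interval used for the lower and upper bounds. Keeping the unrounded value $k = \frac{2\pi}{3.4}$:

$$S = \int_0^{3.4} \sqrt{\left(\tfrac{2\pi}{3.4}\right)^2 + 1} \; dt = 3.4\sqrt{\left(\tfrac{2\pi}{3.4}\right)^2 + 1} \approx 7.144117693$$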
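The integral can also be evaluated numerically as a check. The sketch below assumes the first strand is (cos(kt), sin(kt), t) with the unrounded k = 2π/3.4 and uses composite Simpson's rule, which is my own choice of method here and not one used in the exploration itself.

```python
import math

K = 2 * math.pi / 3.4  # unrounded horizontal dilation


def speed(t):
    """Magnitude of the derivative of (cos(Kt), sin(Kt), t)."""
    dx = -K * math.sin(K * t)
    dy = K * math.cos(K * t)
    dz = 1.0
    return math.sqrt(dx * dx + dy * dy + dz * dz)


def simpson(f, lo, hi, n=100):
    """Composite Simpson's rule with n (even) equal subintervals."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        weight = 4 if i % 2 == 1 else 2
        total += weight * f(lo + i * h)
    return total * h / 3


# Arc length of one twist, 0 <= t <= 3.4, in nanometers.
arc_length_nm = simpson(speed, 0.0, 3.4)
```

Since `speed` is the same at every t, the numerical value matches the exact product of 3.4 and the constant speed.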

Therefore, the arc length of one B-DNA helix twist is around 7.14 nm. This means that if one twist of a B-DNA strand were stretched into a line, it would have a length of 7.14 nm, 2.1 times its coiled height of 3.4 nm. With this value, I can revisit my earlier hypothesis regarding the relationship between the arc length of a helix and the hypotenuse of the right-angled triangle created by an unraveled helix (Figure 2). The dimensions of the right-angled triangle created by one B-DNA helix strand are defined in Figure 8 below, where the base (C) refers to the circumference of the helix, the height (h) refers to the height of one helical twist, and the hypotenuse (S) refers to the arc length. Using the Pythagorean theorem, $S^2 = C^2 + h^2$, I will calculate the value of S.

$$S = \sqrt{C^2 + h^2} = \sqrt{(2\pi \cdot 1)^2 + 3.4^2} \approx 7.144117693$$

Therefore, the arc length of one B-DNA twist as determined using the Pythagorean theorem is 7.144117693 nm, which rounds to 7.14 nm. I was fascinated to learn, after comparing the result found through the Pythagorean theorem with that of integration, that both methods gave the same arc length of 7.144117693 nm. This finding supports my hypothesis about the correspondence between an unraveled helix with a diameter of d, height of h, and arc length of S and a right-angled triangle with a base of $\pi d$, height of h, and hypotenuse of S. With this relationship, a formula for the arc length of a helix can be written, where arc length $S = \sqrt{(\pi d)^2 + h^2}$. This formula is valuable as it only requires the height and circumference of a helix to be known to find the arc length. It is also a much simpler equation as it does not require calculus. Investigating my hypothesis has allowed me to understand how a concept from one branch of mathematics, such as trigonometry, can be applied to a different branch, such as calculus. The established relationship between a triangle and a helix opens opportunities for further investigations of the properties of a helix using trigonometric laws. At the same time, since this conclusion was obtained using only one case, a more rigorous proof would be necessary to confirm the accuracy of this formula and the scope of its application.
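As a step towards that more general argument, the integration can be repeated with symbols in place of the B-DNA values. The following is a sketch only, assuming a helix of radius a and height h per twist written as x = a cos(kt), y = a sin(kt), z = t with k = 2π/h, and with C = 2πa as the circumference.

```latex
\begin{align*}
\mathbf{r}(t) &= a\cos(kt)\,\mathbf{i} + a\sin(kt)\,\mathbf{j} + t\,\mathbf{k},
  \qquad k = \frac{2\pi}{h}, \quad 0 \le t \le h \\
|\mathbf{r}'(t)| &= \sqrt{a^2k^2\sin^2(kt) + a^2k^2\cos^2(kt) + 1}
  = \sqrt{a^2k^2 + 1} \\
S &= \int_0^h \sqrt{a^2k^2 + 1}\,dt
  = h\sqrt{a^2k^2 + 1}
  = \sqrt{(akh)^2 + h^2}
  = \sqrt{(2\pi a)^2 + h^2}
  = \sqrt{C^2 + h^2}
\end{align*}
```

Under these assumptions the triangle relationship holds for any radius and height, not only for a = 1 nm and h = 3.4 nm.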

The arc length of B-DNA is valuable for biological applications as well. A B-DNA helix is composed of rows of chemical base pairs, similar to the structure of a ladder, and is most stable with around 10.5 base pairs per turn[9] (Figure 9)[10]. Each strand of B-DNA also contains a chain of nucleotides, whose length corresponds to the arc length of one helical twist. Thus, assuming the nucleotides are connected end to end, the arc length of 7.144117693 nm can be divided by 10.5 to find the average length of one nucleotide.

$$\frac{7.144117693}{10.5} \approx 0.680 \text{ nm}$$

Therefore, each nucleotide has a length of around 0.680 nm. According to Lin et al. (2017)[11], the length of one nucleotide is experimentally determined to be 0.6 nm. The overestimate in the calculated value of 0.680 nm is likely due to the end-to-end assumption, which counts the empty space between adjacent nucleotides, where electrons move freely in the covalent bond, as part of each nucleotide. As this calculation shows, using mathematics alone to calculate properties such as the length of a nucleotide may be inaccurate; experimental data is needed to support the values or increase their precision. However, I consider the calculated value of 0.680 nm to be fairly close to the experimental value. Thus, mathematics can provide a good theoretical estimate of the measurements of biological molecules, which is especially useful when experimental methods at this scale cannot be conducted.
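The size of the overestimate can be put into numbers. This sketch uses only the values quoted above (radius 1 nm, pitch 3.4 nm, 10.5 nucleotides per turn, and the experimental length of 0.6 nm); the variable names are illustrative.

```python
import math

RADIUS_NM = 1.0
PITCH_NM = 3.4
NUCLEOTIDES_PER_TURN = 10.5
EXPERIMENTAL_LENGTH_NM = 0.6  # value attributed to Lin et al. (2017) in the text

# Arc length of one twist from the triangle relationship S^2 = C^2 + h^2.
arc_length_nm = math.hypot(2 * math.pi * RADIUS_NM, PITCH_NM)

# Average length per nucleotide under the end-to-end assumption.
nucleotide_length_nm = arc_length_nm / NUCLEOTIDES_PER_TURN

# Relative excess of the calculated value over the experimental one.
relative_excess = (nucleotide_length_nm - EXPERIMENTAL_LENGTH_NM) / EXPERIMENTAL_LENGTH_NM

print(f"calculated length: {nucleotide_length_nm:.3f} nm")
print(f"relative excess:   {relative_excess:.1%}")
```

The relative excess gives a single figure for how far the end-to-end model sits above the experimental value.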

Curvature is another property of helices, which refers to the degree to which a curve deviates from a straight line[12] (Figure 10). Mathematically, it can be defined as the rate at which the unit tangent vector (represented by $\mathbf{T}$) changes with respect to arc length (S)[13]. This relationship is given in the following expression, where κ represents curvature:

$$\kappa = \left|\frac{d\mathbf{T}}{dS}\right|$$

This expression can be broken down into two components: the magnitude of the derivative of the unit tangent vector with respect to time ($|\mathbf{T}'(t)|$), divided by the magnitude of the derivative of arc length with respect to time ($\left|\frac{dS}{dt}\right|$). Thus, a new expression can be made:

$$\kappa = \frac{|\mathbf{T}'(t)|}{\left|\frac{dS}{dt}\right|}$$

Since the arc length formula is $S = \int_a^b |\mathbf{r}'(t)|\,dt$ (④), $\frac{dS}{dt}$ is simply $|\mathbf{r}'(t)|$. Thus: $\kappa = \frac{|\mathbf{T}'(t)|}{|\mathbf{r}'(t)|}$.

First, I will find the unit tangent vector $\mathbf{T}(t)$ (Figure 11) by dividing the tangent vector, which is equal to the derivative of the helix function $\mathbf{r}_1(t)$, by its magnitude, leaving only its direction. This is expressed in the following formula:

$$\mathbf{T}(t) = \frac{\mathbf{r}_1'(t)}{|\mathbf{r}_1'(t)|}$$

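It was previously determined that $\mathbf{r}_1'(t) = -1.85\sin(1.85t)\,\mathbf{i} + 1.85\cos(1.85t)\,\mathbf{j} + \mathbf{k}$ and $|\mathbf{r}_1'(t)| = \sqrt{1.85^2 + 1} \approx 2.10$. These two expressions can be substituted into the unit tangent vector equation.

$$\mathbf{T}(t) = \frac{-1.85\sin(1.85t)\,\mathbf{i} + 1.85\cos(1.85t)\,\mathbf{j} + \mathbf{k}}{\sqrt{1.85^2 + 1}}$$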

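Then, to find the first derivative of the unit tangent vector, I will apply the chain rule.

$$\mathbf{T}'(t) = \frac{-1.85^2\cos(1.85t)\,\mathbf{i} - 1.85^2\sin(1.85t)\,\mathbf{j}}{\sqrt{1.85^2 + 1}}$$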

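Now, to find the magnitude of $\mathbf{T}'(t)$, the equation $|\mathbf{v}| = \sqrt{x^2 + y^2}$ where v = (x, y) will be used. The base vectors i and j drop out since magnitude is a scalar quantity.

$$|\mathbf{T}'(t)| = \frac{\sqrt{1.85^4\cos^2(1.85t) + 1.85^4\sin^2(1.85t)}}{\sqrt{1.85^2 + 1}} = \frac{1.85^2}{\sqrt{1.85^2 + 1}} \approx 1.63$$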

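Finally, the expressions for $|\mathbf{T}'(t)|$ and $|\mathbf{r}_1'(t)|$ can be substituted into the curvature formula.

$$\kappa = \frac{|\mathbf{T}'(t)|}{|\mathbf{r}_1'(t)|} = \frac{1.85^2 / \sqrt{1.85^2 + 1}}{\sqrt{1.85^2 + 1}} = \frac{1.85^2}{1.85^2 + 1} \approx 0.774$$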

Therefore, the curvature of a B-DNA helix is around 0.774, which is less than the curvature of 1 of a circle with the same radius of 1 nm. This makes sense because of the vertical component of a helix, which a circle lacks; the vertical stretch causes the helix to deviate from being a perfect circle. The value of the curvature can be useful in increasing understanding of the structure of a B-DNA helix. A helix whose curvature approaches that of the circle, 1, would be unrealistic because adjacent base pairs in DNA cannot lie directly on top of one another. On the other hand, a helix whose curvature approaches that of a line, 0, would be very improbable due to the hydrophilic and hydrophobic interactions between base pairs and water in a cell, which cause DNA to twist[14]. A curvature of 0.774 signifies that the most stable structure for B-DNA, the one that allows interactions and bonding in DNA to occur properly, is more circular than linear.
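The curvature can be cross-checked without differentiating by hand. The sketch below assumes the strand (a cos(kt), a sin(kt), t) with a = 1 and k = 2π/3.4, approximates T'(t) with a central difference (a numerical shortcut that is not used in the exploration), and compares the result with the closed form ak²/(a²k² + 1) that follows from the steps above for this parametrization.

```python
import math

K = 2 * math.pi / 3.4  # horizontal dilation
A = 1.0                # radius in nanometers


def velocity(t):
    """Derivative of (A cos(Kt), A sin(Kt), t)."""
    return (-A * K * math.sin(K * t), A * K * math.cos(K * t), 1.0)


def norm(v):
    """Euclidean length of a vector given as a tuple."""
    return math.sqrt(sum(c * c for c in v))


def unit_tangent(t):
    """T(t) = r'(t) / |r'(t)|."""
    v = velocity(t)
    n = norm(v)
    return tuple(c / n for c in v)


def curvature(t, h=1e-5):
    """kappa = |T'(t)| / |r'(t)|, with T'(t) from a central difference."""
    ahead = unit_tangent(t + h)
    behind = unit_tangent(t - h)
    d_tangent = tuple((p - m) / (2 * h) for p, m in zip(ahead, behind))
    return norm(d_tangent) / norm(velocity(t))


kappa_numeric = curvature(1.0)
kappa_closed_form = A * K**2 / (A**2 * K**2 + 1)
```

Evaluating `curvature` at other values of t gives the same number, reflecting that a helix has constant curvature along its length.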

By building a digital 3D model of a B-DNA helix on GeoGebra in this investigation and manipulating each parameter of the helix, I was able to understand visually the effect of each transformation on the structure as a whole. Additionally, through exploring the relationship between a 3D and a 2D parametrization, I have gained a deeper understanding of the three components which make up a helix. Breaking down the 2D parametrization further strengthened my understanding of composite functions and the addition of functions. Lastly, calculating the properties of a helix using the 2D parametrization allowed me to find meaningful expressions and values which better explain the structure of a helix, and confirmed my hypothesis regarding the connection between a right-angled triangle and a helix. Using the arc length, I was able to calculate the theoretical length of a nucleotide, which was reasonably close to its literature value. The calculated curvature also provided insight into the degree of curving which creates the most chemically stable B-DNA helix.

As an extension to this investigation, the other forms of DNA could be explored to understand the effect of dimensions on the arc length and curvature, and to gain more insight into the reason for the potential chemical instability of these uncommon forms. The relationship between a helix and a helicoid could also be explored to calculate the surface area of a B-DNA double helix, which would give an approximation of the area the nucleotides and base pairs occupy. A new question inspired by this investigation is how the mathematical properties of DNA helices differ from those of abnormal DNA structures. Abnormal DNA structures refer to knots, hairpins, or loops, which can cause permanent mutations, potentially leading to cancer[15]. I would be interested in exploring these abnormal structures and comparing their properties to those of a standard DNA helix to understand the potential biological harms that result from their structure.

Bibliography

Abnormal DNA Structures May Hold Key to Early Cancer Detection, Treatment. (1998, February 1). Retrieved November 13, 2020, from https://www.cancernetwork.com/view/abnormal-dna-structures-may-hold-key-early-cancer-detection-treatment

Bailey, R. (n.d.). Why Is DNA Twisted? Retrieved November 14, 2020, from https://www.thoughtco.com/double-helix-373302

tes.html

Curvature: Definition of Curvature by Oxford Dictionary on Lexico.com also meaning of Curvature. (n.d.). Retrieved November 13, 2020, from https://www.lexico.com/definition/curvature

E. (2007). DNA sequencing. Retrieved November 13, 2020, from https://www.britannica.com/science/DNA-sequencing

Graphing Calculator. (n.d.). Retrieved November 13, 2020, from https://www.desmos.com/calculator

Hardison, R. (2020, August 15). 2.5: B-Form, A-Form, and Z-Form of DNA. Retrieved November 03, 2020, from https://bio.libretexts.org/Bookshelves/Genetics/Book%3A_Working_with_Molecular_Genetics_(Hardison)/Unit_I%3A_Genes_Nucleic_Acids_Genomes_and_Chromosomes/2%3A_Structures_of_Nucleic_Acids/2.5%3A_B-Form_A-Form_and_Z-Form_of_DNA

Johnson, Dr. David A. DNA, Samford University, 2017, www2.samford.edu/~djohnso2/jlb/333/(06)dna.html.

Levitt, M. “How many base-pairs per turn does DNA have in solution and in chromatin? Some theoretical calculations.” Proceedings of the National Academy of Sciences of the United States of America vol. 75,2 (1978): 640-4. doi:10.1073/pnas.75.2.640

Libretexts. (2020, August 12). 2.2: Arc Length in Space. Retrieved November 12, 2020, from https://math.libretexts.org/Bookshelves/Calculus/Supplemental_Modules_(Calculus)/Vector_Calculus/2:_Vector-Valued_Functions_and_Motion_in_Space/2.2:_Arc_Length_in_Space

Lin, C., Tritschler, F., Lee, K., Gu, M., Rice, C., & Ha, T. (2017, March 06). Single‐molecule imaging reveals the translocation and DNA looping dynamics of hepatitis C virus NS3 helicase. Retrieved November 13, 2020, from https://onlinelibrary.wiley.com/doi/abs/10.1002/pro.3136

T. (2013, August 18). Curvature. Retrieved November 12, 2020, from https://www.britannica.com/science/curvature

Weisstein, Eric W. "Arc Length." From MathWorld--A Wolfram Web Resource. https://mathworld.wolfram.com/ArcLength.html

Weisstein, Eric W. "Helix." From MathWorld--A Wolfram Web Resource. https://mathworld.wolfram.com/Helix.html

3D Calculator. (n.d.). Retrieved November 13, 2020, from https://www.geogebra.org/3d?lang=en

[1] Johnson, Dr. David A. DNA, Samford University, 2017, www2.samford.edu/~djohnso2/jlb/333/(06)dna.html.

[2] Weisstein, Eric W. "Helix." From MathWorld--A Wolfram Web Resource. https://mathworld.wolfram.com/Helix.html

[3] Hardison, R. (2020, August 15). 2.5: B-Form, A-Form, and Z-Form of DNA. Retrieved November 03, 2020, from https://bio.libretexts.org/Bookshelves/Genetics/Book%3A_Working_with_Molecular_Genetics_(Hardison)/Unit_I%3A_Genes_Nucleic_Acids_Genomes_and_Chromosomes/2%3A_Structures_of_Nucleic_Acids/2.5%3A_B-Form_A-Form_and_Z-Form_of_DNA

[6] Weisstein, "Helix."

[7] Weisstein, Eric W. "Arc Length." From MathWorld--A Wolfram Web Resource. https://mathworld.wolfram.com/ArcLength.html

[8] Libretexts. (2020, August 12). 2.2: Arc Length in Space. Retrieved November 12, 2020, from https://math.libretexts.org/Bookshelves/Calculus/Supplemental_Modules_(Calculus)/Vector_Calculus/2:_Vector-Valued_Functions_and_Motion_in_Space/2.2:_Arc_Length_in_Space

[9] Levitt, M. “How many base-pairs per turn does DNA have in solution and in chromatin? Some theoretical calculations.” Proceedings of the National Academy of Sciences of the United States of America vol. 75,2 (1978): 640-4. doi:10.1073/pnas.75.2.640

[10] E. (2007). DNA sequencing. Retrieved November 13, 2020, from https://www.britannica.com/science/DNA-sequencing

[11] Lin, C., Tritschler, F., Lee, K., Gu, M., Rice, C., & Ha, T. (2017, March 06). Single‐molecule imaging reveals the translocation and DNA looping dynamics of hepatitis C virus NS3 helicase. Retrieved November 13, 2020, from https://onlinelibrary.wiley.com/doi/abs/10.1002/pro.3136

[12] Curvature: Definition of Curvature by Oxford Dictionary on Lexico.com also meaning of Curvature. (n.d.). Retrieved November 13, 2020, from https://www.lexico.com/definition/curvature

[13] T. (2013, August 18). Curvature. Retrieved November 12, 2020, from https://www.britannica.com/science/curvature

[14] Bailey, R. (n.d.). Why Is DNA Twisted? Retrieved November 14, 2020, from https://www.thoughtco.com/double-helix-373302

[15] Abnormal DNA Structures May Hold Key to Early Cancer Detection, Treatment. (1998, February 1). Retrieved November 13, 2020, from https://www.cancernetwork.com/view/abnormal-dna-structures-may-hold-key-early-cancer-detection-treatment