
Sine Waves

Here's a collection of different ways to generate sine waves in software.

1. y = sin(x)

This is by far the simplest technique. The library function "sin()" does all of the work. However, unless you have dedicated co-processor hardware, this is also one of the slowest ways.

2. Table Look Up

In the table look up approach, you first generate an array of the sine values and store them in memory. Typically, one makes use of the symmetry properties of the sine wave to minimize the memory storage requirements. For example, suppose we wish to obtain y = sin(x) in one degree steps. For x between zero and 90 degrees, we could create the array:
float sine[91], pi = 3.141592653;

for(int i=0; i<=90; i++)
  sine[i] = sin(pi/180 * i);

Then, if we wanted the sine of 45 degrees, we simply write


y = sine[45];

Now, to obtain the other 3/4's of the circle, we can use the symmetry of sine wave. Each quadrant is obtained as follows:
y =  sine[180 - x];   /*  90 <= x <= 180 */
y = -sine[x - 180];   /* 180 <= x <= 270 */
y = -sine[360 - x];   /* 270 <= x <= 360 */
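Collected into one routine, a minimal sketch might look like the following. The wrapper name sin_deg() and the table-building helper are illustrative only; they are not part of the original example.

#include <math.h>

static float sine[91];

/* fill the 0..90 degree table once, exactly as above */
void build_sine_table(void)
{
    const float pi = 3.141592653f;
    for (int i = 0; i <= 90; i++)
        sine[i] = (float)sin(pi/180 * i);
}

/* full-circle lookup using the quadrant symmetry relations */
float sin_deg(int x)
{
    x %= 360;                                /* reduce to one full circle */
    if (x < 0) x += 360;

    if (x <=  90) return  sine[x];
    if (x <= 180) return  sine[180 - x];     /*  90 <= x <= 180 */
    if (x <= 270) return -sine[x - 180];     /* 180 <= x <= 270 */
    return               -sine[360 - x];     /* 270 <= x <= 360 */
}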

3. Table Look Up Plus First Order (Linear) Interpolation

The table look up technique works very well; however, it is limited to relatively few values (only 360 in the previous example). The apparent dynamic range may be increased by interpolating between successive values. For example, if we wanted the sine of 45.5 degrees, we write
y = sine[45] + (sine[46] - sine[45])/2;

We have taken the sine of 45 degrees and added to it the one half degree interpolation between it and the sine of 46 degrees. In general, the first order interpolation formula goes as follows:

y = sine[x] + (sine[x+1] - sine[x]) * delta_x
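As a minimal sketch, the general formula can be wrapped in a small C function. It assumes the table has been extended to cover the full range of whole-degree angles you intend to pass in (so that sine[x+1] is always a valid entry); nothing here is prescribed by the text beyond the formula itself.

/* table look up plus first order interpolation */
float sin_interp(float x_deg, const float sine[])
{
    int   x       = (int)x_deg;        /* integer degrees               */
    float delta_x = x_deg - x;         /* fraction between table steps  */

    return sine[x] + (sine[x + 1] - sine[x]) * delta_x;
}

For example, sin_interp(45.5f, sine) reproduces the sine[45] + (sine[46] - sine[45])/2 calculation above.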

A natural question to ask is, "How much error does Linear Interpolation introduce?" You can either use the Cauchy Remainder Theorem for Polynomial Interpolation (1) or you can do a brute force calculation for linear interpolation. I used the brute force approach (I only learned about Cauchy later), and calculated a relative error, where I define relative error to be:

    relative error = approximated sine / actual sine

While this formula appears to blow up when the sine is zero, my calculations (which may be prone to error) indicate the max relative error is:

    max relative error = cos(w*delta_t/2)

where w is the sinusoid's frequency and delta_t is the separation between the tabulated interpolation points. The frequency, as shown in the example below, is really the normalized one. In other words, assuming the table approximates one full sine wave, the normalized frequency is 2*pi.

As an example, suppose you want to know how much error can be expected if we have a table of 256 values and our frequency is 60 Hz. w, which is in radians, is related to f by:

    w = 2*pi*f = 120*pi

delta_t is related to the period, T (= 1/f):

    delta_t = T/N = 1/(f*N) = 1/(60*256)

When w and delta_t are multiplied, notice that the frequency drops out:

    w * delta_t = (2*pi*f) * (1/(f*N)) = 2*pi/N

Substituting back into the formula:

    max relative error = cos(2*pi/N/2) = cos(pi/N)

Thus the max error depends only on the number of samples in the table, N, as one might intuitively expect. For this example:

    max relative error = cos(pi/256) = 0.9999247

or about 0.03%. Since a relative error of 1 would mean no error, we are very close with 256 values in our table. In fact, this may be too good. So next we might ask, "How many samples do I need in my table to keep the error below 1%?" (or whatever percent you desire). We can work backwards and show:

    1 - rel error < 1%
    1 - cos(pi/N) < 1%
        cos(pi/N) > 0.99
                N > pi / arccos(0.99) ~ 22.19

Since N is an integer, we would need a table containing 23 samples of the sine wave. We would probably increase the size to 32 for practical reasons, and also divide it by four by using the symmetry arguments described above. So, if we have 8 equally spaced samples of the first quadrant of a sine wave, we can use interpolation to find the value of the sine wave to within 1%. That's not bad!

4. Trigonometric Identity 1

Suppose you need a consecutive sequence of evenly spaced sine values like:

    sin(1), sin(1.1), sin(1.2), ..., sin(9.9)

Furthermore, suppose you don't feel like creating a table to store them. Consider the trigonometric identities:

    sin(a+b) = sin(a)cos(b) + cos(a)sin(b)
    cos(a+b) = cos(a)cos(b) - sin(a)sin(b)

We can define the starting angle as a and the step in angles as b, and precompute the following:
s1 = sin(a); c1 = cos(a); sd = sin(b); cd = cos(b);

For the first iteration, we have


sin(a+b) = s1 * cd + c1 * sd; cos(a+b) = c1 * cd - s1 * sd;

Now, let s1 = sin(a+b) and c1 = cos(a+b).

For the second iteration:


sin(a+2*b) = sin((a+b) + b) = sin(a+b)*cd + cos(a+b)*sd = s1 * cd + c1 * sd;
cos(a+2*b) = cos((a+b) + b) = cos(a+b)*cd - sin(a+b)*sd = c1 * cd - s1 * sd;

Again, let s1 = sin(a+2*b) and c1 = cos(a+2*b). The third and successive steps are similar. These steps can be collected into a simple program:
s1 = sin(a);  c1 = cos(a);     // current sine and cosine
sd = sin(b);  cd = cos(b);     // sine and cosine of the angle step

for(i=0; i<N; i++) {
  temp = s1 * cd + c1 * sd;    // sin(a + (i+1)*b)
  c1   = c1 * cd - s1 * sd;    // cos(a + (i+1)*b)
  s1   = temp;
}

Each iteration requires four multiplications and two additions. While this is a substantial savings compared to computing sines from scratch, the method shown below can optimize this approach one step further and cut the computation in half. In addition, notice that we have obtained the cosine values for free (a matter of perspective; it could be a burden if you don't need them). There are cases when having both the sine and cosine values is handy (e.g. to draw a circle). However, the next algorithm can produce sines only, or sines and cosines, more efficiently than this one.

5. Trigonometric Identity 2: Goertzel Algorithm

The previous section required both sines and cosines to generate a series of equally spaced (in angle) sine values. However, if you only want sines (or cosines) there is a more efficient way to obtain them. Using the nomenclature of the previous section, we can define the starting angle as a and the step in angles as b. We wish to compute:
sin(a), sin(a + b), sin(a + 2*b), sin(a + 3*b), ..., sin(a + n*b)

The goal is to recursively compute sin(a+n*b) using the two previous values in the sine series, sin(a + (n-1)*b) and sin(a + (n-2)*b):

sin(a + n*b) = x * sin(a + (n-2)*b) + y * sin(a + (n-1)*b)

Here, x and y are two constants that need to be determined. Note that the next value or the n'th value depends on the previous two values, n-2 and n-1. Re-arranging and simplifying:
sin(a + n*b) = x * sin(a + n*b - 2*b)  +  y * sin(a + n*b - 1*b)

             = x * [ sin(a + n*b)*cos(2*b) - cos(a + n*b)*sin(2*b) ]
             + y * [ sin(a + n*b)*cos(b)   - cos(a + n*b)*sin(b)   ]

             = [x * cos(2*b) + y * cos(b)] * sin(a + n*b)
             - [x * sin(2*b) + y * sin(b)] * cos(a + n*b)

For this to be true for all n, we must have the two expressions in brackets satisfy:
[x * cos(2*b) + y * cos(b)] = 1
[x * sin(2*b) + y * sin(b)] = 0

which, when solved yields


x = -1
y = 2*cos(b)

Finally, substituting into the original equation we get:


sin(a + n*b) = -sin(a + (n-2)*b) + 2*cos(b) * sin(a + (n-1)*b)

The fortuitous "x=-1" reduces a multiplication to a simple subtraction. Consequently, we only have to do one multiplication and one addition (subtraction) per iteration. Here's a simple program to implement the algorithm:
c    = 2*cos(b);
snm2 = sin(a + b);          // seed: sin(a + (n-2)*b)
snm1 = sin(a + 2*b);        // seed: sin(a + (n-1)*b)

for(i=0; i<N; i++) {
  s    = c * snm1 - snm2;   // next sine in the series
  snm2 = snm1;
  snm1 = s;
}

Incidentally, the cosine function satisfies an identical recursive relationship:


cos(a + n*b) = -cos(a + (n-2)*b) + 2*cos(b) * cos(a + (n-1)*b)

Putting this into the program yields:

cb   = 2*cos(b);
snm2 = sin(a + b);   snm1 = sin(a + 2*b);
cnm2 = cos(a + b);   cnm1 = cos(a + 2*b);

for(i=0; i<N; i++) {
  s    = cb * snm1 - snm2;   // next sine
  c    = cb * cnm1 - cnm2;   // next cosine
  snm2 = snm1;   cnm2 = cnm1;
  snm1 = s;      cnm1 = c;
}

Two multiplications and two additions are required at each step. This is twice as fast as the previous algorithm. When using this algorithm (or the previous one), be aware that round off errors will accumulate. This may not be too serious of a problem for a series of, say, 50 terms when floating point arithmetic is used. However, if we want 8 bit accuracy and start off with one bit of error, the accumulated error rapidly deteriorates the useful dynamic range.

6. Taylor Series Approximation

The Taylor series of the sine function is:

    sin(x) = x - (x^3)/3! + (x^5)/5! - (x^7)/7! + ...

Keep in mind that x is in radians and not in degrees! For small values of x, only the first few terms are needed. However, as x approaches pi/2 many terms are needed (assuming you wish to have more than just a few significant figures). The number of needed terms can easily be computed since each successive term is smaller than its predecessor. For example, if we want to compute sin(pi/4) to 10 significant figures, then we have to compute all terms up to the one satisfying the inequality:

    10^(-10) < ((pi/4)^n)/n!,    where n is odd.

While it is possible to use the Taylor series for the entire range (+ and - infinity), it is practical to limit the range to -pi/2 < x < pi/2. For angles outside this range, use the periodicity and symmetry of the sine function to extrapolate an answer. Also, notice that it is not practical to compute x^3, x^5, x^7, ... Instead, re-write the Taylor series in a nested form

    sin(x) = x*(1 - (x^2)/3! + (x^4)/5! - (x^6)/7! + ...)

    sin(x) = x*(1 - (x^2)*(1/3! - (x^2)/5! + (x^4)/7! - ...))

Or, dropping the higher order terms:

    sin(x) ~ x*(1 - (x^2)*(1/3! - (x^2)*(1/5! - (x^2)/7!)))

Finally, the factorial terms may be factored:

    sin(x) ~ x*(1 - (x^2)*(1 - (x^2)*(1 - (x^2)/(6*7)) / (4*5)) / (2*3))

Thus, we only need to calculate x^2 as opposed to x^7, and we don't have to calculate all of the factorial terms. Note that this technique of factoring a series is called "Horner's Rule".
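As a minimal sketch, the factored form above translates directly into a few lines of C. It keeps terms through x^7 and assumes x has already been reduced to -pi/2 .. pi/2, as the text recommends; the function name is just for illustration.

/* Taylor series sine evaluated with Horner's rule, terms through x^7 */
double sin_taylor(double x)
{
    double x2 = x * x;     /* only x^2 is ever needed */

    return x * (1.0 - x2*(1.0 - x2*(1.0 - x2/(6.0*7.0)) / (4.0*5.0)) / (2.0*3.0));
}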

7. Small Angle Approximation

8. Polynomial Approximation

9. Infinite Product

10. Padé Rational Approximation

A Padé approximation is a rational function whose power series agrees with the power series of the function it is approximating. For example, the ratio of two polynomials N(x) and D(x):

    R(x) = N(x) / D(x)

is said to be a Padé approximation to P(x) if

    R(0)   = P(0)
    R'(x)  = P'(x),   evaluated at x=0
    R''(x) = P''(x),  evaluated at x=0,  etc. up to the order of interest

Here, the primes refer to differentiation with respect to x. So in each case above, we differentiate R and P and then evaluate them at x=0. The Padé approximation for the sine function is

    sin(x) ~ x * num(x) / den(x)

    num(x) = 1 + n1*x^2 + n2*x^4 + n3*x^6
    den(x) = 1 + d1*x^2 + d2*x^4 + d3*x^6

    n1 = -325523/2283996
    n2 = 34911/7613320
    n3 = 479249/11511339840

    d1 = 18381/761332
    d2 = 1261/4567992
    d3 = 2623/1644477120

References:

1) Interpolation & Approximation, Philip J. Davis. Dover Publications, Inc., 1975.
2) Numerical Mathematics and Computing, Ward Cheney and David Kincaid. Brooks/Cole Publishing Company.
3) The Art of Computer Programming, Vol. 1, Donald E. Knuth. Addison-Wesley Publishing Company.

Square Waves

101_2 Uses of Square Waves


Square waves have an interesting mix of practice and theory. In practice, they are extremely simple. In their simplest form, they consist of an alternating sequence of amplitudes, e.g. high/low or 1's and 0's. They're found in numerous applications. In theory, however, they are somewhat difficult to analyze.

001 Harmonics

Oftentimes square waves are used as a stimulus in a system. For example, in many PWM applications the square waves are low-pass filtered so that only the DC component is used. In this case, the system consists of an imperfect low pass filter; imperfect in the sense that high frequencies are not totally suppressed. If we understood the harmonic content of the square wave we could ascertain which high frequency components are present and what their magnitudes are.

There are numerous ways to obtain the harmonic content of a square wave. I'll use Laplace Transforms and provide a few extra details along the way (in other words, the equations are derived as opposed to being referenced to another source). Consider the square wave shown below:
^ v(t) | tau | <----> | A| +----+ +----+ | | | | | -+-----+ +------------------------+ +---------> t phi <-----> <-------------- T ------------>

There are four parameters that describe the square wave:

  A   - amplitude
  phi - initial phase
  tau - pulse width of the high pulses (less than T)
  T   - period or pulse rate

There are other parameters that we derive from these. For example:

  f = 1/T    is the pulse frequency
  d = tau/T  is the duty cycle

It's convenient for later analysis to define the square wave with all of these parameters. For example, if we wanted to see how the harmonics vary as a function of the duty cycle then we only need to change one parameter: tau. The equation of the square wave can be written in terms of step functions. The first cycle, v0(t), can be written as
v0(t) = A(u(t-phi) - u(t-phi-tau)) for 0 <= t < T

Or graphically,
| u(t-phi) A| +-------------------------------------------| | -+-----+----------------------------------------------> phi t <-----> | u(t-phi-tau) A| +--------------------------------------| | -+-----------+----------------------------------------> phi + tau t <----------->

The whole square wave function can be expressed as a repetition of v0(t):


        -----
v(t) =  \     v0(t - n*T),        n = 0, +/-1, +/-2, ...
        /____
          n

Before diving into the details of finding the Laplace Transform of this square wave, it might be worthwhile studying the general theory behind the Laplace Transforms of Periodic Wave Forms. As discussed there, we only need to find the Laplace transform of a single cycle of the square wave. The Laplace Transform of v(t) is then found from the following formula
             V0(s)
V(s) = ------------------
        1 - g * e^(-s*T)

(g=1 for this type of symmetry)

                       V0(s)
     = --------------------------------------
        e^(-s*T/2) * (e^(s*T/2) - e^(-s*T/2))

                       V0(s)
     = e^(s*T/2) * ---------------
                    2*sinh(s*T/2)

Where V0(s) is the Laplace transform of v0(t), and g=1 (recall that this is a factor that describes the symmetry of the wave form as discussed in the LTPW page). The Laplace Transform of v0(t) is:
L[v0(t)] = L[A*(u(t-phi) - u(t-phi-tau))]

             A
         =  --- * ( e^(-s*phi) - e^(-s*(phi+tau)) )
             s

             A
         =  --- * e^(-s*(phi+tau/2)) * ( e^(s*tau/2) - e^(-s*tau/2) )
             s

             2*A
         =  ----- * e^(-s*(phi+tau/2)) * sinh(s*tau/2)
              s

And finally after inserting the expression for V0(s), we get the Laplace Transform of a square wave:
                                      2*A*sinh(s*tau/2)     1
V(s) = e^(s*(T/2 - phi - tau/2))  *  ------------------  *  ---
                                       2*sinh(s*T/2)         s

                                        sinh(s*tau/2)
     = A * e^(s*(T/2 - phi - tau/2)) * ----------------
                                        s*sinh(s*T/2)

It probably goes without saying that this is certainly a complicated way to express something as simple as a square wave. Fortunately, the powerful tools associated with Laplace Transforms can efficiently utilize these closed-form analytical expressions. Our immediate goal is to find the harmonic content of the square wave, and you could argue that this is what we achieved with the Laplace Transform. However, let's take this one step further and find the discrete frequency expansion like that obtained from a Fourier Transform.

010 Fourier Series

The technique of finding the discrete transform amounts to finding the poles of the Laplace transform and the residues associated with those poles. Poles and residues are layman's terms for describing certain mathematical features of a function. Specifically, the poles are those values of the independent variable that cause the function to evaluate to infinity. For example, 1/(x-5) approaches infinity when x, the independent variable, is equal to 5. Residues describe the magnitude of the function evaluated at the pole after the pole has been removed or canceled. A pole is "removed" by multiplying the function by a term (s-p)^a, where p is the pole and a is the order of the pole.

The poles of the Laplace transform of the square wave are at those values of s for which the denominator evaluates to zero. This occurs for s=0 (those of you familiar with LT theory may note the indeterminacy at s=0, but if you look closely through the lens of L'Hopital you will find that indeed s=0 is a pole) and for sinh(s*T/2) = 0. The latter case is only zero for imaginary frequencies:
   sinh(s*T/2)     =  0
   -j*sin(j*s*T/2) =  0
   j*s*T/2         =  arcsin(0) = n*pi,         n = 0, +/-1, +/-2, ...
   s               =  j*2/T * (n*pi)

                              2*pi
                   =  j*n * -------,            n = 0, +/-1, +/-2, ...
                                T

Now that we have all of the possible poles, it's time to find the residues. I say possible because it may turn out that in the process of finding the residues we discover that some of the poles are canceled. Let's find out. (For convenience, the amplitude A is taken to be 1 from here on.) The formula for finding the residues is:
                      |          (s - s_n) * sinh(s*tau/2) * e^(s*((T-tau)/2 - phi))  |
  L[v(t)] * (s - s_n) |       =  ---------------------------------------------------- |
                      |s=s_n                      s * sinh(s*T/2)                     |s=s_n

Note that both the numerator and the denominator evaluate to zero. This indeterminacy is resolved with L'hopital's rule. After applying one iteration of L'hopital's rule we get:
      sinh(s_n*tau/2) * e^(s_n*((T-tau)/2 - phi))
  =  ---------------------------------------------
             (T/2) * s_n * cosh(s_n*T/2)

              sinh(j*n*pi*d)
  =  -------------------------------------- * e^(j*n*2*pi/T*((T-tau)/2 - phi))
      (T/2) * (j*n*2*pi/T) * cosh(j*n*pi)

        j*sin(n*pi*d)
  =  ------------------- * e^(j*n*pi) * e^(-j*n*2*pi/T*(tau/2 + phi))
      j*n*pi*cos(n*pi)

The exp(j*n*pi) term cancels the cos(n*pi) term leaving:


       sin(n*pi*d)
R_n = ------------- * e^(-j*n*2*pi/T*(tau/2 + phi))
          n*pi

And this is the expression for the residues of the square wave. There is yet one more problem to be solved. Note that n=0 results in an indeterminacy. This is resolved by again applying L'hopital's rule. Differentiating the numerator and denominator with respect to n and then evaluating at n=0 results:
R_0 = d

In other words, the residue associated with the DC term is simply d, the duty cycle. Perhaps your intuition and knowledge of square waves is telling you that this is really a tremendous waste of effort going through this mathematical exercise to show something you already know. Appease any frustration by calling the duty cycle result a sanity check. One more good sanity check is for the 50% duty cycle case. In other words, you may be aware that no even harmonics are present in the expansion of a square wave with a 50% duty cycle. So as a check, substitute d=0.5 in the R_n equation
     |        sin(n*pi/2)
R_n  |     = ------------- * e^(-j*n*pi/2) * e^(-j*n*phi*2*pi/T)
     |d=0.5      n*pi

And sure enough, if n is even the sin(n*pi/2) term in the numerator evaluates to zero.

At this point, we have got the poles and residues. We could consider our job done. The results are an infinite series expansion:
             infinity
             -----      sin(n*pi*d)
L[v(t)]  =   \          ------------ * e^(-j*n*2*pi/T*(tau/2+phi)) * del(s - j*n*2*pi/T)
             /____           n*pi
           n=-infinity

Where the function del() is my way of denoting a delta function (in ASCII art). There is one more step that may be applied to convert this rather intimidating series into a more familiar form. Consider the Laplace transform of the cosine function:
                         s
L[cos(w*t + ph)] =  ------------ * e^(s*ph/w)
                     s^2 + w^2

                            s
                 =  --------------------- * e^(s*ph/w)
                     (s + j*w)*(s - j*w)

The poles are seen to be:


s_p = +/- j*w

And the residues associated with them are:


   |           j*w
R  |       =  ------- * e^(j*ph)
   |s=j*w      2*j*w

           =  0.5 * e^(j*ph)

or writing both residues in one expression:


R_p = 0.5 * e^(+/-j*ph)

Finally, the Laplace Transform can be written as:


L[cos(w*t+ph)] = 0.5*( e^(j*ph) * del(s - j*w)  +  e^(-j*ph) * del(s + j*w) )

To apply this result to our square wave expansion we need to pair up the poles of opposite signs. This amounts to folding the infinite series about n=0.
           infinity
           -----    /                                                            \
L[v(t)] =  \        |  R_n * del(s - j*n*2*pi/T)  +  R_n* * del(s + j*n*2*pi/T)  |
           /____    \                                                            /
            n=0

Where R_n is the n'th residue:


       sin(n*pi*d)
R_n = ------------- * e^(-j*n*2*pi/T*(tau/2 + phi))
          n*pi

If we make the substitution n=-n, then we get the residues:


            sin(-n*pi*d)
R_(-n)  =  -------------- * e^(j*n*2*pi/T*(tau/2+phi))
               -n*pi

            sin(n*pi*d)
        =  ------------- * e^(j*n*2*pi/T*(tau/2+phi))   =   R_n*
               n*pi

Substituting this result back into the series expansion of the square wave, we get:
                       infinity
                       -----    2*sin(n*pi*d)
L[v(t)] = d*del(0)  +  \        -------------- * L[ cos( n*(2*pi/T)*(t - tau/2 - phi) ) ]
                       /____         n*pi
                        n=1

And finally (really), we get the series expansion of the square wave in terms of cosine waves:
             infinity
             -----    2*sin(n*pi*d)
v(t) = d  +  \        -------------- * cos( n*(2*pi/T)*(t - tau/2 - phi) )
             /____         n*pi
              n=1
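As a quick numerical check of this series, the short C program below sums the first NH harmonics of a pulse train (amplitude A = 1, matching the residue formulas) and prints the reconstructed waveform over one period. The particular duty cycle, sample count, and harmonic count are arbitrary choices for illustration.

#include <stdio.h>
#include <math.h>

int main(void)
{
    const double pi  = 3.14159265358979;
    const double T   = 1.0, tau = 0.25, phi = 0.0;   /* 25% duty cycle     */
    const double d   = tau / T;
    const int    NH  = 100;                          /* harmonics to keep  */

    for (int i = 0; i < 40; i++) {
        double t = i * T / 40.0;
        double v = d;                                /* DC term            */
        for (int n = 1; n <= NH; n++)
            v += 2.0*sin(n*pi*d)/(n*pi) * cos(n*2.0*pi/T*(t - tau/2.0 - phi));
        printf("%6.3f  %8.4f\n", t, v);
    }
    return 0;
}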

Here's a screen shot of the first 100 harmonics of the repeating bit stream 10010001110000111100010101101010. If you want to create this plot yourself, then grab fftsq.m. This is an Octave script.

010 Square Waves and Circuits

As an example, let's consider the response of a low-pass RC filter to a pulse-width-modulated square wave. You perhaps know from experience that the output has a DC component that's proportional to the width of the pulses. There's also a little ripple riding on the DC. But let's quantify it.
R +----/\/\/----------+ | | /---+---\ | | x(t) | ----\---+---/ ----| | +-------------------+

+ y(t) -

The square wave input, x(t), is applied to the RC network and the output, y(t) is measured across the capacitor. We can use Laplace transforms to find y(t).
Y(s) = H(s) * X(s)

We know X(s) from above. H(s) is found from circuit analysis to be

           1/RC
H(s) = ------------
         s + 1/RC

So Y(s) is:

                                        sinh(s*tau/2)       1/RC
Y(s) = A*e^(s*(T/2 - phi - tau/2))  *  ---------------  *  ----------
                                        s*sinh(s*T/2)       s + 1/RC

It's going to be slightly more useful to manipulate this expression with hyperbolic sines expressed in terms of sums of exponentials. Feel free to do the arithmetic and feel better if you derive this result:
                        1 - e^(-s*tau)      1        1/RC
Y(s) = A*e^(-s*phi) *  ----------------  *  ---  *  ----------
                        1 - e^(-s*T)         s       s + 1/RC

Now we have an expression for Y(s). However, we also want y(t). Perhaps the best approach here is to use partial fraction expansion.
           K1          K2         P_o(s)
Y(s) = ----------  +  ----  +  --------------
        s + 1/RC       s        1 - e^(-s*T)

The first two terms are the natural response while the last term is the forced response. The constants associated with the natural response are solved by multiplying both sides of the equation by the poles and evaluating the resulting expression at the pole.
                          |
K1 = Y(s) * (s + 1/RC)    |
                          |s = -1/RC

                            1 - e^(tau/RC)
   = -A*RC * e^(phi/RC) *  ----------------
                            1 - e^(T/RC)

And similarly for K2 (after applying L'Hopital's once):


                 |
K2 = Y(s) * s    |        =  tau/T
                 |s = 0

The only thing left is to find the expression for the periodic portion of the response, P_o(s).
                      /          K1         K2  \
P_o(s) = (1 - e^(-s*T)) * | Y(s) - ---------- - ---- |
                      \        s + 1/RC      s  /

                         /  RC        RC     \
       = A*e^(-s*phi) *  | ----  -  ---------- | * (1 - e^(-s*tau))
                         \   s       s + 1/RC /

            /    K1        K2 \
         -  | ---------- + --- | * (1 - e^(-s*T))
            \  s + 1/RC     s /

This ominous looking expression is very close to being in a standard form for inversion. But before we go back to the time domain, let's make an observation that will save some effort. Note that p(t) is a periodic expression with a period of T. In other words, p(t) = p(t-T). The expression above defines p_0(t) which is p(t) over one period. The significance is that the very last exponential term, exp(-s*T), can be dropped. In other words, this exponential term induces a time shift in p_0(t) beyond which p_0(t) is defined. Finally, after taking the inverse Laplace transform of the above expression we get:
P_0(t) = A*RC*(1 - e^(-t/RC))*(u(t-phi) - u(t-phi-tau))  -  (K1 + K2*e^(-t/RC))*u(t)

Wouldn't it be interesting to see the plot of this?

011 Harmonic Content of a Periodic Pulse Train

011.1 Harmonic Cancellation

Suppose you wish to generate a stream of pulses devoid of certain frequencies. For example, Don Lancaster's Magic Sinewave algorithm attempts to produce a pulse stream that when low-pass filtered will leave behind a relatively spectrally pure sine wave. His algorithm minimizes the number of pulses over one cycle of the sinewave as contrasted to similar PWM algorithms that attempt to maximize the number of pulses over one cycle. Another example is for tone detection. Here you may be interested in DTMF decoding or perhaps a 60Hz digital phase locked loop. In either case, it's important to minimize or optimize the detection algorithm so that not all of the CPU's resources are consumed. So how do you go about cancelling harmonics? Certainly there are hardware techniques that use analog filters. But the focus here is on algorithms and mathematics. Here is a completely arbitrary square wave:
^ v(t) | tau | <----> | A| +----+ +----+ | | | | | -+-----+ +------------------------+ +---------> t phi <-----> <-------------- T ------------>

And just to be clear, here is a description of each parameter:

  A   - amplitude
  phi - initial phase
  tau - pulse width of the high pulses (less than T)
  T   - period or pulse rate

Two useful parameters derived from these are:

  f = 1/T    is the pulse frequency
  d = tau/T  is the duty cycle

Now, it can be shown that(1) the Fourier series is given by:
       sin(n*pi*d)
R_n = ------------- * e^(-j*n*2*pi/T*(tau/2 + phi))
          n*pi

Where n is the harmonic number. (More specifically, this is the n'th term of the Fourier series; the whole series is in fact the infinite summation of each term.) Before diving into the details, here are a couple of Octave programs that you may use to study the harmonic content of square waves: ms.m and ms2.m.

The expression above is a complicated way to express a square wave, so let's look at this equation more closely. First let's see how we can reconcile it with what we know (or have been taught) about square waves. You may be aware that a square wave with a 50% duty cycle does not have even harmonics and that the strength of each remaining (odd) harmonic is inversely proportional to the harmonic number. A 50% duty cycle means that the square wave is high for half of its cycle. In terms of the parameters above, tau = T/2 and d = 1/2. We can make phi be zero:

^ v(t) | tau=T/2 |<-------------> | A +--------------+ +------------+ | | | | --+ +--------------+ +-> t <------------- T ------------>

And the expression for the harmonics:


       sin(n*pi/2)
R_n = ------------- * e^(-j*n*2*pi/T*(T/4 + 0))
          n*pi

       sin(n*pi/2)
    = ------------- * e^(-j*n*pi/2)
          n*pi

A closer examination of sin(n*pi/2) reveals:


    n    |  sin(n*pi/2)  |  exp(-j*n*pi/2)
---------+---------------+-----------------
    1    |       1       |      -j1
    2    |       0       |      -1
    3    |      -1       |       j1
    4    |       0       |       1
    5    |       1       |      -j1
    6    |       0       |      -1
    7    |      -1       |       j1
    8    |       0       |       1

The sine term is responsible for suppressing the even harmonics. Combining each of the terms yields:
    n    |     R_n
---------+--------------
    1    |  -j(1/pi)
    2    |   0
    3    |  -j(1/pi/3)
    4    |   0
    5    |  -j(1/pi/5)
    6    |   0
    7    |  -j(1/pi/7)
    8    |   0
    9    |  -j(1/pi/9)

The even harmonics have been suppressed and the strength of the odd harmonics vary inversely proportional to harmonic number. In general, we can suppress any given harmonic by varying the width of the pulse. For example, to suppress the n'th harmonic:
              sin(n*pi*d)
R_n = 0  =   ------------- * e^(-j*n*2*pi/T*(tau/2 + phi))
                 n*pi

      0  =   sin(n*pi*d)

n*pi*d   =   arcsin(0) = k*pi,       k = 1, 2, ..., n-1

   tau   =   k*T/n  =  T/7, 2*T/7, 3*T/7, ...      (for the 7th harmonic)

Note that suppressing one harmonic will cause others to reappear. For example, if we do suppress the 7th harmonic, the 2nd,3rd, etc. harmonics will be present.
011.10 Canceling Multiple Harmonics

So how would you suppress two harmonics that are not multiples of one another? As long as there's just one pulse per period, you can't. But if you introduce additional pulses over the period, then you can! This is where the phase term becomes important. It's probably easiest to see how this works by starting with an example.

Suppose the period, T, is divided into 12 equal portions and that we would like to construct a waveform consisting of several pulses, each 1/12 of T. Using the parameters for the square wave we've got:

  M   = 12, this is the number of pulses for one period
  tau = T/M = T/12
  d   = tau/T = 1/12
  phi = m * tau = m * T/12,    where m = 0,1,2,...,11

In other words, each pulse has the same width, tau, and duty cycle, d. The phase of the pulse, phi, depends on its quantized position. The m'th pulse is the one that is 'm' units from the beginning of the pulse stream. And of course, the pulse stream repeats with a period, T. The Fourier series for the m'th pulse can be written

          sin(n*pi*d)
R_n,m  = ------------- * e^(-j*n*2*pi/T*(tau/2 + phi))
             n*pi

          sin(n*pi*d)
       = ------------- * e^(-j*n*2*pi/T*(T/(2*M) + m*T/M))
             n*pi

          sin(n*pi/12)
       = -------------- * e^(-j*n*2*pi/T*(T/24 + m*T/12))
             n*pi

          sin(n*pi/12)
       = -------------- * e^(-j*n*pi*(1 + 2*m)/12)
             n*pi

          sin(n*pi/12)
       = -------------- * e^(-j*n*pi/12) * e^(-j*n*pi*m/6)
             n*pi

From this we can observe that the magnitude at a given harmonic depends only on the harmonic number, n, and not on the pulse position, m. The phase, on the other hand, depends on both the harmonic and the pulse position. Now let's look at how two pulses interact with one another. Let's call them m1 and m2 for right now.
^ v(t) | tau | <--> | m1 m2 m1 m2 A| +--+ +--+ +--+ +--+ | | | | | | | | | -+-----+ +--+ +---------T-----+ +--+ +-------> t phi1 <-----> phi2 <----------->

<----------- T ----------> phi1 = m1 * tau phi2 = m2 * tau

(m1 and m2 are integers between 0 and 11)

To obtain the Fourier series of the two-pulse pulse train we only need to add together the Fourier series obtained from the pulses when they're taken individually:
Rn = Rn,m1 + Rn,m2

Which says that the nth Fourier term of the composite wave is the sum of the nth terms of the m1 and m2 pulses.

        sin(n*pi/12)                               sin(n*pi/12)
R_n  = -------------- * e^(-j*n*pi*(1+2*m1)/12) + -------------- * e^(-j*n*pi*(1+2*m2)/12)
           n*pi                                       n*pi

        sin(n*pi/12)    /                                                       \
     = -------------- * |  e^(-j*n*pi*(1+2*m1)/12)  +  e^(-j*n*pi*(1+2*m2)/12)  |
           n*pi         \                                                       /

        sin(n*pi/12)                     /                                       \
     = -------------- * e^(-j*n*pi/12) * |  e^(-j*n*pi*m1/6)  +  e^(-j*n*pi*m2/6) |
           n*pi                          \                                       /

The term outside of the brackets depends only on the harmonic number (and not on the position or phase of the pulse). As seen earlier, this can be made zero only by varying the pulse width. However, the pulse width has been fixed to T/12. Consequently, we need to study the term in the brackets if we wish to investigate harmonic cancellation. Before simplifying the equation, it's worthwhile observing the tabulation of the exponential terms for the first few harmonics:
 pulse phase
     m      |   exp(-j*n*pi*m/6)
------------+-------------------------------------------------------------------
     0      |   cos(n*pi*0/6)  - j*sin(n*pi*0/6)   =  1
     1      |   cos(n*pi*1/6)  - j*sin(n*pi*1/6)
     2      |   cos(n*pi*2/6)  - j*sin(n*pi*2/6)   =  cos(n*pi/3) - j*sin(n*pi/3)
     3      |   cos(n*pi*3/6)  - j*sin(n*pi*3/6)   =  cos(n*pi/2) - j*sin(n*pi/2)
     4      |   cos(n*pi*4/6)  - j*sin(n*pi*4/6)
     5      |   cos(n*pi*5/6)  - j*sin(n*pi*5/6)
     6      |   cos(n*pi*6/6)  - j*sin(n*pi*6/6)   =  cos(n*pi)
     7      |   cos(n*pi*7/6)  - j*sin(n*pi*7/6)
     8      |   cos(n*pi*8/6)  - j*sin(n*pi*8/6)
     9      |   cos(n*pi*9/6)  - j*sin(n*pi*9/6)
    10      |   cos(n*pi*10/6) - j*sin(n*pi*10/6)
    11      |   cos(n*pi*11/6) - j*sin(n*pi*11/6)

For the fundamental, n=1, this table can be re-written:


 pulse phase
     m      |   exp(-j*n*pi*m/6)
------------+----------------------------------
     0      |    1
     1      |    cos(pi/6)   - j*sin(pi/6)
     2      |    cos(pi/3)   - j*sin(pi/3)
     3      |    cos(pi/2)   - j*sin(pi/2)
     4      |    cos(pi*2/3) - j*sin(pi*2/3)
     5      |    cos(pi*5/6) - j*sin(pi*5/6)
     6      |   -1
     7      |   -cos(pi/6)   + j*sin(pi/6)
     8      |   -cos(pi/3)   + j*sin(pi/3)
     9      |   -cos(pi/2)   + j*sin(pi/2)
    10      |   -cos(pi*2/3) + j*sin(pi*2/3)
    11      |   -cos(pi*5/6) + j*sin(pi*5/6)

Or if you prefer, the phasor graph is shown below. The number next to each '*' is the pulse position, 'm', from the table above. (A phasor diagram is essentially a vector plot. The 'real component' of the phase is the magnitude on the x-axis and the 'imaginary component' is the magnitude on the y-axis. The vector (or phasor) starts at the origin of the graph and extends to the (x,y) coordinate defined by the real and imaginary portions of the phasor.) As we shall see below, the phasor diagram is a convenient way to visualize harmonic cancellation. Phasors are combined vectorially. So two phasors that point in opposite directions (and have the same magnitude) will cancel one another.
j ^ | 9 8 * 10 * | * 7 \ | / 11 *. \ | / .* 6 ' . \|/. ' 0 --*----------+----------*---------> . '/ |\' . *' / | \ ' * 5 / | \ 1 * | * 4 * 2 | 3

For the second harmonic, the table and graph are:


     m      |   exp(-j*n*pi*m/6)
------------+----------------------------------
     0      |    1
     1      |    cos(pi/3)   - j*sin(pi/3)
     2      |    cos(pi*2/3) - j*sin(pi*2/3)
     3      |   -1
     4      |   -cos(pi/3)   + j*sin(pi/3)
     5      |    cos(pi/3)   + j*sin(pi/3)
     6      |    1
     7      |    cos(pi/3)   - j*sin(pi/3)
     8      |    cos(pi*2/3) - j*sin(pi*2/3)
     9      |   -1
    10      |   -cos(pi/3)   + j*sin(pi/3)
    11      |    cos(pi/3)   + j*sin(pi/3)

^ | 4,10 | 5,11 * | * \ | / \ | / 3,9 \|/ 0,6 --*----------+----------*---------> / |\ / | \ / | \ * | * 2,8 * 1,7 |

It's interesting to see how the pulses 6 through 11 are aliased to pulses 0 through 5. This trend of aliasing the phase continues as long as the harmonic, n, divides evenly into the number of pulses, M. Finally, for n=3:
     m      |   exp(-j*pi*m/2)
------------+------------------
     0      |    1
     1      |   -j
     2      |   -1
     3      |    j
     4      |    1
     5      |   -j
     6      |   -1
     7      |    j
     8      |    1
     9      |   -j
    10      |   -1
    11      |    j

^ | 3,7,11 * | | | 2,6,10 | --*----------+----------*---------> | 0,4,8 | | | * | 1,5,9

Example two-pulse pulse trains for M=12

By either inspecting the table or examining the graph, we can easily see how two pulses may be paired such that their phases cancel for a given harmonic. For n=1, we could pair m=0 and m=6.

This would cancel the fundamental (i.e. n=1 corresponds to the fundamental). In fact, if you draw the pulse train containing just pulses 0 and 6, you'll notice that the frequency is doubled. It's not our objective to cancel the fundamental, so let's look at the table and graph for n=2. We can see that 0 and 6 have exactly the same phase and that they are opposite of 3 and 9. So if we formed a pulse train consisting of one pulse at the m=0 position and one at the m=3 position, the second harmonics would cancel!

Here are several examples.

Pulses at positions 1 and 7. This cancels the 'fundamental', which means that we created a new square wave with twice the frequency of the first.
^ v(t) | A| +--+ +--+ +--+ | | | | | | | -+--+ +--------------+ +-----------T--+ +-----> t 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3

Pulses at positions 1 and 4. This suppresses the 2nd harmonic. The fundamental and third harmonics are sqrt(2) times larger than the fundamental and third for just a single pulse.
^ v(t) | A| +--+ +--+ +--+ | | | | | | | -+--+ +-----+ +--------------------T--+ +-----> t 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3

Pulses at positions 1 and 3. This suppresses the 3rd harmonic.


^ v(t) | A| +--+ +--+ +--+ | | | | | | | -+--+ +--+ +-----------------------T--+ +-----> t 0 1 2 3 4 5 6 7 8 9 10 11 0 1 2 3

Now let's return to the expression for the two pulse Fourier Series.

        sin(n*pi/12)                     /                                        \
R_n  = -------------- * e^(-j*n*pi/12) * |  e^(-j*n*pi*m1/6)  +  e^(-j*n*pi*m2/6)  |
           n*pi                          \                                        /

In general, we can induce phase cancellation when the term in the brackets is zero:

   0   =   e^(-j*n*pi*m1/6)  +  e^(-j*n*pi*m2/6)

   e^(-j*n*pi*m1/6)   =   -e^(-j*n*pi*m2/6)

   e^(-j*n*pi*m1/6)   =   e^(-j*n*pi*m2/6) * e^(j*pi)

                      =   e^(j*pi*(6 - n*m2)/6)

Resist the temptation of taking the natural log and saying that m1 = (m2 - 6/n). The reason is that there's the multiplication by 'n'. This has the effect in some instances of advancing the phase by more than 2*pi radians. And when this happens you'll need to account for the wrapping around. So we can cover ourselves by re-writing the equation slightly. First note:
e^(j*pi*x)  =  e^(j*pi*(x % 2))

And apply this to the expression above:


e^(-j*pi*((n*m1/6) % 2))  =  e^(-j*pi*(((-6 + n*m2)/6) % 2))

The exponential implicitly performs the mod operation. We just made that explicit, so now it's safe to take the logarithm:
-(n*m1/6)%2 = -((6 - n*m2)/6)%2

If n evenly divides into M/2, or stated equivalently, (M/2) % n = 0, then we may re-write this equation:
  (n*m1/6) % 2  =  ((6 - n*m2)/6) % 2

  0  =  (n*m1/6 + n*m2/6 + 1) % 2

  0  =  (-n*m1/6 + n*m2/6 + 6/6) % 2

          -n*m1 + n*m2 + 6
  0  =  ------------------- % 2
                 6

          n*(-m1 + m2 + 6/n)
  0  =  --------------------- % 2
                 6

          -m1 + m2 + 6/n
  0  =  ----------------- % 2
               6/n

          (-m1 + m2 + 6/n) % (2*6/n)
  0  =  -----------------------------
                    6/n

  0  =  (-m1 + m2 + 6/n) % (2*6/n)

Re-iterating, the last step assumes that 6 % n is zero (i.e., n = 1, 2, 3, or 6). Before generalizing this result, let's apply it to the tabulated results and see how well the equation works. For n = 1, we noticed that pulses separated by 6 units would cancel. So assigning n = 1:

0 = (-m1 + m2 + 6) % 12

This equation is satisfied for m1=0 and m2 = 6, (1,7), etc. What about m1 = 7 and m2 =1?
0 = (-7 + 1 + 6) % 12 = 0 % 12, yep

Let's examine the third harmonic, n=3. Our observations from the tabulated results concluded that pulses separated by two positions cancelled the 3rd harmonic.
0 = (-m1 + m2 + 6/3) % (2*6/3)
0 = (-m1 + m2 + 2) % 4

Which agrees with the observation. As long as m1 and m2 differ by 2, this equation is satisfied.
Generalizing the results for two pulses

Most of the time, derivations are presented generically and then specific examples are studied afterwards. The opposite approach has been taken here. Deriving a specific case allowed us to continuously compare our derivation to our observations. Reality checks are always invaluable. Extrapolating to the general case turns out to be fairly straightforward. While we could rigorously repeat all of the above in generic terms, let's instead pick out the salient portions. In the general case, there are M pulses. We're interested in suppressing the n'th harmonic by having the phases of two pulses cancel one another. The two pulses have been parameterized as m1 and m2. So here are the requirements for generalizing the above:
a)  0 = M % 2
b)  0 = (M/2) % n

The first says that M must be even. The second says that the n'th harmonic must evenly divide half of M. If these two conditions are met, then the harmonics due to the m1 and m2 pulses cancel when:
+---------------------------------------+
|   0 = (-m1 + m2 + (M/2)/n) % (M/n)    |
+---------------------------------------+
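As a sanity check on the boxed condition, the short program below brute-forces the two-pulse phasor sum for the M=12 case studied above and compares it against the condition for every pair of positions. The program and its loop limits are illustrative, not part of the original derivation.

#include <stdio.h>
#include <math.h>

int main(void)
{
    const int M = 12;
    const double pi = 3.14159265358979;

    for (int n = 1; n <= 3; n++)                     /* harmonics with (M/2) % n == 0 */
        for (int m1 = 0; m1 < M; m1++)
            for (int m2 = 0; m2 < M; m2++) {
                /* brute force: do the two phasors e^(-j*n*pi*m/(M/2)) cancel? */
                double re = cos(-n*pi*m1/(M/2.0)) + cos(-n*pi*m2/(M/2.0));
                double im = sin(-n*pi*m1/(M/2.0)) + sin(-n*pi*m2/(M/2.0));
                int cancels   = (fabs(re) < 1e-9 && fabs(im) < 1e-9);
                int predicted = ((-m1 + m2 + (M/2)/n) % (M/n) == 0);
                if (cancels != predicted)
                    printf("mismatch: n=%d m1=%d m2=%d\n", n, m1, m2);
            }
    printf("done\n");
    return 0;
}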

Observations

If we wish to cancel the 2nd through n'th harmonics, it's necessary for each of these in turn to evenly divide M/2. The following table lists the minimum value for M such that the 2nd through n'th harmonics are cancelled:
   n   |    M
 ------+--------
   2   |     4
   3   |    12
   4   |    24
   5   |   120
   6   |   120
   7   |   840
   8   |  1680
   9   |  5040

100 Tone Detection

101 Dual Tone Multi-Frequency (DTMF) Detection

And here's more software.

Pulse Width Modulation Techniques

Introduction

Let's face it, PWM is cool! It bridges the gap between the digital and analog regimes, allowing one to exchange time for voltage. Spew a stream of 1's and 0's into a filter and out comes a DC voltage! Most people use hardware to generate PWM waveforms. I guess that's their right, but the fun lies in trying to generate them in software. So here are a few different techniques for generating Pulse Width Modulated square waves using software. It turns out that many of the techniques described here may also be implemented in VHDL.

Phase Shifted Counters

In this version, there are two roll-over counters that count at the same rate, but are shifted relative to one another. You can think of them as the "rising edge" and "falling edge" counters. The operation is fairly simple: when the rising edge counter rolls over, make the PWM output high. When the falling edge counter rolls over, make it low. In pseudo-code:
// Initialize
PWM_output = 1;
rising_edge = 0;                          // Rising edge counter
falling_edge = MAX_COUNT - duty_cycle;    // Falling edge counter

// Loop forever:
while(1) {

  // If the rising_edge counter rolls over, make the output high
  if(rising_edge++ > MAX_COUNT) {
    rising_edge = 0;
    PWM_output = 1;
  }

  // If the falling_edge counter rolls over, make the output low
  if(falling_edge++ > MAX_COUNT) {
    falling_edge = 0;
    PWM_output = 0;
  }
}

In this un-optimized example, we start the PWM output in the high state and initialize the falling edge counter with the desired duty cycle (or actually MAX_COUNT - duty_cycle, since the algorithm increments the counters). When the counters reach their maximum count, they're cleared back to zero and the PWM output is updated. Perhaps a simple picture illustrates it better:

       |---------|---------|---------|---------|---------
RE     01234567890123456789012345678901234567890123456789

       ----|---------|---------|---------|---------|-----
FE     67890123456789012345678901234567890123456789012345

PWM    ----______----______----______----______----______

RE  = Rising Edge counter
FE  = Falling Edge counter
PWM = PWM output

In this example, the counters roll over after 10 counts. In the beginning, the rising edge counter is cleared and the output is driven high. Also, the duty cycle is 4, so the falling edge counter is initialized to (10 - 4) = 6. When the rising edge counter counts from 9 to 10, it is "rolled over" back to zero and the output is driven high. Similarly, the falling edge counter drives the output low when it rolls over. Perhaps one thing to notice is that the difference (modulo 10) between the two counters is always equal to the duty cycle. That's no coincidence. In fact, that's why this technique is called the "Phase Shifted Counters".
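The small program below simulates the two counters for the MAX_COUNT = 10, duty_cycle = 4 example and prints '-' for each high sample and '_' for each low one, reproducing the PWM row of the diagram. It is only an illustration; the roll-over test (>= MAX_COUNT) is chosen so the period comes out to exactly MAX_COUNT samples, a slight variation on the pseudo-code above.

#include <stdio.h>

int main(void)
{
    const int MAX_COUNT = 10, duty_cycle = 4;
    int pwm = 1;                                  /* start high            */
    int rising_edge  = 0;
    int falling_edge = MAX_COUNT - duty_cycle;

    for (int t = 0; t < 50; t++) {
        putchar(pwm ? '-' : '_');
        if (++rising_edge  >= MAX_COUNT) { rising_edge  = 0; pwm = 1; }
        if (++falling_edge >= MAX_COUNT) { falling_edge = 0; pwm = 0; }
    }
    putchar('\n');
    return 0;
}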
A couple of optimizations

The pseudo code shown above can be simplified by taking advantage of counters that "automatically" roll over. Binary counters are certainly the most common. For example, suppose we're dealing with 8-bit bytes. These range from 0 to 255. If we were to put one of these counters into an infinite loop, it would start at 0, count up to 255, roll over to 0, count up to 255, etc. If we wanted a 5-bit wide counter, then we could logically "and" the counter with 2^5 - 1 = 31 each time after we increment it. In the example pseudo code shown below, the binary counters are N bits wide.
MAX_COUNT = 2^N;
MASK = 2^N - 1;

PWM_output = 1;
rising_edge = 0;                          // Rising edge counter
falling_edge = MAX_COUNT - duty_cycle;    // Falling edge counter

// Loop forever:
while(1) {

  // If the rising_edge counter rolls over, make the output high
  rising_edge = (rising_edge + 1) & MASK;
  if(!rising_edge)
    PWM_output = 1;

  // If the falling_edge counter rolls over, make the output low
  falling_edge = (falling_edge + 1) & MASK;
  if(!falling_edge)
    PWM_output = 0;
}

Multiple PWM outputs

So far this seems like a lot of overhead to generate a single PWM output. Fortunately, this technique scales up quite nicely for multiple PWM outputs. For example, all of the PWM outputs could be synchronized so that their rising edges all occur at the same time. The falling edges are of course dependent on the duty cycles. If you want a real example with 8 PWM outputs on a PIC, then check out pwm8.asm. The pseudo code below also illustrates the point:
MAX_COUNT = 2^N;
MASK = 2^N - 1;

PWM_output = 0xff;                          // all outputs start high
rising_edge = 0;                            // Rising edge counter
falling_edge1 = MAX_COUNT - duty_cycle1;    // Falling edge counter #1
falling_edge2 = MAX_COUNT - duty_cycle2;    // Falling edge counter #2
falling_edge3 = MAX_COUNT - duty_cycle3;    // Falling edge counter #3

// Loop forever:
while(1) {

  pwm_temp = PWM_output;                    // carry the current output state

  // If the rising_edge counter rolls over, make all outputs high
  rising_edge = (rising_edge + 1) & MASK;
  if(!rising_edge)
    pwm_temp = 0xff;

  // If a falling_edge counter rolls over, make its output low
  falling_edge1 = (falling_edge1 + 1) & MASK;
  if(!falling_edge1)
    pwm_temp &= ~1;

  falling_edge2 = (falling_edge2 + 1) & MASK;
  if(!falling_edge2)
    pwm_temp &= ~(1<<1);

  falling_edge3 = (falling_edge3 + 1) & MASK;
  if(!falling_edge3)
    pwm_temp &= ~(1<<2);

  PWM_output = pwm_temp;
}

Phase Accumulators

Cascaded Counters

Isochronous Code

Vertical Counters

Square root theory


Here's a collection of a couple of algorithms that can be used to find the square root of a number.

1. Iterative Algorithm

Up until some time in the 1960's, school children were taught a really clever algorithm for taking the square root of an arbitrarily large number. The technique is quite similar to long division. Here's how it works for decimal square roots.

a) Starting at the decimal point, pair off the digits.

b) Find the largest square that subtracts from the left-most pair and still yields a positive result. The difference is the remainder that will be used in the next step; note that it is only two (or one) digits. The square root of this largest square is the first digit of the square root of the whole number.

c) Concatenate the next pair of digits with the remainder.

d) Multiply the square root developed so far by 20. Note that the least significant digit is a zero.

e) The next digit in the square root is the one that satisfies the inequality:

     (20 * current + digit) * digit <= remainder

   where "current" is the current square root and "digit" is the next digit produced by the algorithm.

f) Form the new positive remainder by subtracting the left side of the equation in the previous step from the right side.

g) Go to step (c).

As an example, suppose we want to find the square root of 31415.92653:

a) Pair off the digits
3 14 15.92 65 3 ^ ^ ^ ^ ^

b) The left most pair is 03 and the largest square that subtracts from it is 1. And since the square root of 1 is 1, the first digit of our answer will be 1. c-g)
          1  7  7 .  2
        ----------------
        ) 03 14 15 . 92
         -1
         ---
   27 |   2 14             1*20 = 20;     (20+x)*x   <= 214   ==> x=7
         -1 89
         -----
  347 |     25 15          17*20 = 340;   (340+x)*x  <= 2515  ==> x=7
           -24 29
           ------
 3542 |        86 92       177*20 = 3540; (3540+x)*x <= 8692  ==> x=2
              -70 84
              ------
               16 08
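A minimal C sketch of steps (a)-(g), restricted to integer input, is shown below. The function name dec_sqrt() and the fixed pair buffer are illustrative assumptions, not anything from the original text.

#include <stdio.h>

unsigned long dec_sqrt(unsigned long N)
{
    unsigned long pairs[12];
    int np = 0;

    /* (a) split N into base-100 "digit pairs", least significant first */
    do {
        pairs[np++] = N % 100;
        N /= 100;
    } while (N > 0);

    unsigned long root = 0, rem = 0;
    for (int i = np - 1; i >= 0; i--) {
        rem = rem * 100 + pairs[i];                  /* (c) bring down the next pair */
        unsigned long d = 0;
        while ((20*root + d + 1)*(d + 1) <= rem)     /* (e) largest digit that fits  */
            d++;
        rem  -= (20*root + d)*d;                     /* (f) new remainder            */
        root  = root*10 + d;                         /* append the digit to the root */
    }
    return root;
}

int main(void)
{
    printf("%lu\n", dec_sqrt(31415));    /* prints 177, since 177*177 = 31329 */
    return 0;
}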

The reason this works is easy to show. We can rewrite the number we wish to root:
N = A_n * 10^(2n)  +  A_(n-1) * 10^(2(n-1))  +  . . .  +  A_1

(Not finished...)
Binary Square Roots

In general, the procedure consists of taking the square root developed so far, appending 01 to it and subtracting it, properly shifted, from the current remainder. The 0 in 01 corresponds to multiplying by 2; the 1 is a new trial bit. If the resulting remainder is positive, the new root bit developed is truly 1; if the remainder is negative, the new bit developed is 0 and the remainder must be restored (thus the name "restoring") by adding the quantity just subtracted. The example Flores gives takes the square root of 01011111:
          1  0  0  1 .
        ----------------
        ) 01 01 11 11 . 00
         -1
         --
          00 01              <--- positive: first bit is a 1
         -1 01
         -----
          11 00              <--- negative: 2nd bit is a 0
         +1 01               <--- restore the wrong guess
         -----
          00 01 11
         -   10 01
         ---------
          11 11 10           <--- negative: 3rd bit is a zero
         +   10 01           <--- restore the wrong guess
         ---------
          01 11 11
         - 1 00 01
         ---------
          0  11 10           <--- positive: 4th bit is a one

etc... The other method does not restore the subtraction if the result was negative. Instead, it appends a 11 to the root developed so far and on the next iteration it performs an addition. If the addition causes an overflow, then on the next iteration you go back to the subtraction mode. Before I botch the explanation much further, let me quote Flores again:

As long as the remainder is positive, we proceed as in the previous section; we enter a 1 in the corresponding root bit being developed; we append 01 to this number; we shift it the correct number of times and _subtract_ it from the previous remainder. When the remainder goes negative, we do not restore as in the previous section. First we enter a 0 as the next root bit developed. To this we append 11. This result is shifted left the proper number of times and "added" to the present remainder. Again, the same example:
          1  0  0  1 .
        ----------------
        ) 01 01 11 11 . 00
         -1
         --
          00 01              <--- positive: first bit is a 1
         -1 01               <--- Developed root is "1"; append 01; subtract
         -----
          11 00 11           <--- negative: 2nd bit is a 0
         +   10 11           <--- Developed root is "10"; append 11 and add
         ---------
          11 11 10 11        <--- negative (i.e. didn't cause overflow): 3rd bit is a 0
         +    1 00 11        <--- Developed root is "100"; append 11 and add
         ------------
        1 00 00 11 10        <--- Overflow: 4th bit is a one
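For reference, here is a small C sketch of the restoring binary square root described above (testing before subtracting is equivalent to subtracting and then restoring). The function name and the 16-bit-in/8-bit-out sizes are illustrative assumptions.

unsigned char bin_sqrt(unsigned short N)
{
    unsigned long rem  = 0;
    unsigned long root = 0;

    for (int i = 7; i >= 0; i--) {
        rem  = (rem << 2) | ((N >> (2*i)) & 3);   /* bring down the next bit pair    */
        root = root << 1;                         /* shift the root; trial bit is 0  */
        unsigned long trial = (root << 1) | 1;    /* "append 01": 4*old_root + 1     */
        if (rem >= trial) {                       /* subtraction stays non-negative? */
            rem -= trial;
            root |= 1;                            /*   then the new bit is a 1       */
        }                                         /* else the 0 stands (no restore)  */
    }
    return (unsigned char)root;                   /* bin_sqrt(0x5F) returns 9        */
}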

Another Iterative Algorithm

Here's another iterative algorithm. It's something I did by brute force (I'm sure I wasn't the first...) before I knew about the above algorithm. After implementing code for each I discovered that the two algorithms are quite similar. If you didn't know about the techniques described above, you'd probably be inclined to implement the square root routine like so:
unsigned char sqrt(unsigned int N)
{
  unsigned int x = 0, j;

  for(j = 1<<7; j != 0; j >>= 1) {
    x = x + j;          // trial: set this bit in the root
    if( x*x > N)        // too big?
      x = x - j;        //   then clear it again
  }

  return(x);
}

In other words, x is built up one bit at a time, starting with the most significant bit. Then it is squared and compared to N. This algorithm works quite well for processors that have a multiply instruction. Unfortunately, the PIC doesn't fall into that category. However, it is possible to efficiently multiply a number by itself (i.e. square it).

For example suppose you had an 8 bit number:


y = a*2^7 + b*2^6 + c*2^5 + d*2^4 + e*2^3 + f*2^2 + g*2^1 + h*2^0

If you square it and collect terms, you get:


y^2 =   (a^2+ab)*2^14 + ac*2^13 + (b^2+ad+bc)*2^12 + (ae+bd)*2^11
      + (c^2+af+be+cd)*2^10 + (ag+bf+ce)*2^9 + (d^2+ah+bg+cf+de)*2^8
      + (bh+cg+df)*2^7 + (e^2+ch+dg+ef)*2^6 + (dh+eg)*2^5
      + (f^2+eh+fg)*2^4 + fh*2^3 + (g^2+gh)*2^2 + h^2

There are several things to note in this expression:

1) The bits a-h can only be zero or one. So, a^2 == a.

2) If we are trying to build y iteratively by starting with a, then b, etc., then all of the least significant bits are assumed to be zero. Thus most of the terms do not need to be computed at each iteration.

3) The bit that is being squared is always multiplied by an even power of two.

The following un-optimized algorithm implements this squared polynomial:
s1 = 1<<14            // square of the bit being tried, starting with bit 7
s2 = 0                // 2 * (root so far) * (bit being tried)
s  = 0                // square of the root developed so far

do {
  s = s + s1 + s2     // trial: square of the root with the new bit set

  if (s > N)
    s = s - s1 - s2   // too big: back the trial bit out
  else
    s2 = s2 | (s1<<1) // keep the bit; fold it into the 2*root*bit term

  s1 = s1 >> 2
  s2 = s2 >> 1
} while(s1)

return(s2)            // s2 now holds the square root of N

And a more optimized version:


s1 = 1<<14
s2 = 0

do {
  if((s1 + s2) <= N) {
    N  = N - s2 - s1
    s2 = s2 | (s1<<1)   // root accumulates in the high byte of s2
  }

  s1 = s1 >> 1
  N  = N << 1
} while(s1 > 2^6)

return(s2>>8)

If you follow the details of this last algorithm, you will notice that all of the arithmetic is being performed on bits 8-15 i.e. the most significant bits. The one exception is the very last iteration where s1 = 2^7. Another thing you will notice is that under certain situations, N will be larger than 2^16. But, it will never be larger than 2^17 - 1. This extra overflow bit can be handled with a relatively simple trick. Instead of shifting N one position to the left, roll N one position. The difference is that the most significant bit of N will get moved to the least significant position after the roll operation. (A simple shift zeroes the least significant bit.)
2. Newton's Method

To be filled in later... The only thing I'd like to say about Newton's Method for square roots is that it works well for a processor with hardware multiplication and division support (e.g. a Pentium) and that it converges quite quickly in just a few iterations. However, for microcontrollers and even DSPs (without hardware division), iterative algorithms like those discussed above are more suitable. Here's a square root routine implemented on a PIC. And here's more software.
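For completeness, a minimal sketch of the standard Newton iteration, x_{k+1} = (x_k + N/x_k)/2, is shown below. The initial guess and the stopping test are arbitrary choices for illustration, not anything prescribed by the text.

#include <stdio.h>
#include <math.h>

double newton_sqrt(double N)
{
    if (N <= 0.0) return 0.0;

    double x = N > 1.0 ? N : 1.0;            /* crude initial guess            */
    while (fabs(x*x - N) > 1e-12 * N)        /* stop at ~12 significant digits */
        x = 0.5 * (x + N / x);               /* Newton step                    */
    return x;
}

int main(void)
{
    printf("%f\n", newton_sqrt(31415.92653));    /* ~177.245 */
    return 0;
}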

Logarithms
Here's a collection of different ways to compute logarithms.

Nomenclature:

  ln(x)  - natural logarithms:  if x = e^t  then ln(x) = t,  and x = e^ln(x)
  log(x) - common logarithms:   if x = 10^t then log(x) = t
  lg(x)  - binary logarithms:   if x = 2^t  then lg(x) = t

Identities, where f() = ln(), log(), or lg():

  f(a*b) = f(a) + f(b)
  f(a/b) = f(a) - f(b)
  f(a^b) = b*f(a)

We can use the identities to show how ln, log, and lg are related. Suppose we are given x and it is greater than zero. We can always find the three exponents that satisfy this equation:

  x = e^t1 = 10^t2 = 2^t3

Using natural logarithms:

  ln(x) = t1 = t2*ln(10) = t3*ln(2)

Using common logarithms:

  log(x) = t1*log(e) = t2 = t3*log(2)

And finally binary logarithms:

  lg(x) = t1*lg(e) = t2*lg(10) = t3

Combining these equations:

  ln(x)  = log(x)/log(e) = lg(x)/lg(e)
  log(x) = ln(x)/ln(10)  = lg(x)/lg(10)
  lg(x)  = ln(x)/ln(2)   = log(x)/log(2)

Thus if we know the logarithm of x in any base, we can easily convert to another base by multiplying by a constant.

  1. Math Library
  2. Borchardt's Algorithm
  3. Factoring
  4. Table Look-up + Linear Interpolation
  5. Series Expansions

1. Math Library

The easiest way to compute logarithms is to use the built in math libraries. Most contain code that will find the natural logarithm and the common logarithm. Converting to binary logarithms or any other base logarithm requires calling one of the library functions and multiplying (or dividing) by a constant. This is too simple. If you want to complicate things, then consider the following algorithms.

2. Borchardt's Algorithm (1)

  ln(x) ~ 6*(x-1) / (x + 1 + 4*(x^0.5))

3. Factoring

The identity f(a*b) = f(a) + f(b) indicates that the logarithm of any number can be expressed as a sum of logarithms of the factors of that number. For example, the natural log of 72 can be expressed as:

  ln(72) = ln(8) + ln(9) = 3*ln(2) + 2*ln(3)

In general,
      /  _____      \        -----
      |   | |       |        \
   f  |   | |  x_i  |   =     \      f(x_i)
      |   | |       |         /
      \    i        /        /____
                                i

where the x_i are the factors of N, the number whose logarithm we want. However, this leaves two problems: 1) How can N be factored? and 2) Suppose N is not an integer? Well, fortunately both of these stones can be killed by one bird. Finding the prime factors of an arbitrary integer will kill only one stone, not to mention the fact that it is a very complicated bird. So let's assume that this approach is beyond the scope of the task at hand. Now consider noninteger factors. For example, 72 can be factored in infinitely many ways. One that's conducive to finding the natural logarithm:
72 = (2.718281828459)^4 * 1.318725999989 ln(72) = 4 * ln(2.718281828459) + ln(1.318725999989) = 4 + 0.276666119016

Another for binary logarithms:


72 = (2^6) * 1.125 lg(72) = 6*lg(2) + lg(1.125) = 6 + 0.1699250014423

In both of these examples, 72 was reduced to two factors:


X = (b^n) * c

where b is the base of the logarithm, n is an integer, and c is a real number that is in the range 1.0 <= c <= b. And the logarithm is
log_b(X) = n + log_b(c)

Thus the problem has been reduced somewhat to just finding n and the logarithm of c. Here are two methods for doing this for binary logarithms:

3(a) Factoring for binary logarithms, case 1 (2)

Here are the steps one can go through to apply the factoring method to binary logarithms. (Actually, with a simple modification you can also apply this method to other based logarithms.)

  Input:  16 bit unsigned integer x;  0 < x < 65536  (or 8 bit unsigned integer...)
  Output: g, lg(x), the logarithm of x with respect to base 2.
3(a).i) Create a table of logarithms of the following constants:
log2arr[i] = lg(2^i / (2^i-1))

i = 1..M, M == desired size of the table.

This array will have to be scaled in such a way that the fixed point arithmetic that uses it below will work. (A convenient scale factor is 2^n, where n is the location of the decimal point in the fixed point arithmetic.) The first few values of the array are lg(2/1), lg(4/3), lg(8/7), lg(16/15),... Note, if you wish to compute the logarithms in a different base, then simply use that base in computing the array. For example, use
log2arr[i] = log_b(2^i / (2^i-1))

where log_b() is the base b logarithm function.


3(a).ii)Scale x to a value between 1 and 2.

In essence, you need to find the most significant bit of x. Let g, the output, equal lg(MSB(x)). For example, if bit 13 is the most significant bit of x then set g equal to lg(2^13) = 13. (Note, g must be scaled in such a way to make the fixed point arithmetic work.) Then, shift x left until the MSB occupies the bit 15. Here's some psuedo code:
g = 15;
while(g && !(x & 0x8000)) {   // loop until the msb of x occupies bit 15
  x <<= 1;
  g--;
}

An efficiency comment: since the MSB of x is always set after the scaling process, we could shift x one more position to the left. This is analogous to how most normalized floating point numbers are stored in memory. If you are not computing base 2 logarithms, then the scaling will have to be performed differently. One way is to change g to a floating point number and apply the following pseudo code:
g = log_b(2^15);
while(g && !(x & 0x8000)) {   // loop until the msb of x occupies bit 15
  x <<= 1;
  g -= log_b(2);
}

Another way is to scale as though if binary logarithms were being computed. Then after scaling, convert from binary logarithms to base_b logarithms:
g = g * log_b(2);

3(a).iii) Changing Perspective

Now, change your perspective on x. Consider it a fixed point number with the decimal point between bit positions 15 and 14. Since bit 15 is set, x is equal to one plus a fraction. The fraction is (x & 0x7fff) / 0x8000.
3(a).iv) Factor x.

The goal here is to find the set of numbers a,b,c,d,... such that: x = a * b * c * d * .... Then we can take the logarithm of x and use the identity: lg(x) = lg(a) + lg(b) + lg(c) + .... The difficulty is how do we find the factors? Well, suppose we consider the numbers:
2^i --------2^i - 1 i=1: i=2: i=3: i=4: i = 2,3,4,...

2 1.333333333333 1.142857142857 1.066666666667

etc. If x is greater than one of these factors, then divide that factor out. As an example, suppose x = 1.9
a) 1.9 > 1.333,       so x = x/1.333 = 1.425
b) 1.425 > 1.3333,    so divide again: x = x/1.333 = 1.06875
c) 1.06875 < 1.1428,  so don't divide
d) 1.06875 > 1.06666, so x = 1.06875/1.06666 = 1.001953125
   etc.

So, x ~= 1.3333^2 * 1.0667 etc., and its base 2 logarithm is lg(1.9) ~= 2*lg(1.3333) + lg(1.0667), or in terms of the log2arr[] created in step (i), lg(1.9) ~= 2*log2arr[2] + log2arr[4]. Note that these are not perfect factors. But who cares? The reason for these particular factors is that they lend themselves to an efficient division algorithm.
3(a).v) Efficiency comments

Recall that multiplication or division by a power of 2 is equivalent to shifting left or shifting right. So that,

x * 2^n = x << n

In the previous step, the division can be converted to shifting:


x / ( 2^i / (2^i - 1) ) = x * (2^i - 1) / 2^i
                        = (x * 2^i - x) / 2^i
                        = x - x / 2^i
                        = x - (x >> i)

3(a).vi) A word of caution

The first several entries of log2arr[] created in step (i) are:

    1             .4150375      .1926451      9.310941E-02
    4.580369E-02  2.272008E-02  1.131531E-02  5.646563E-03
    2.820519E-03  1.40957E-03   7.04613E-04   3.522635E-04
    1.76121E-04   8.80578E-05   4.402823E-05  2.201395E-05
    1.100693E-05

If all of the entries are added together, you get about 1.791906. In the process of iterating towards lg(x), the algorithm adds entries of log2arr[] together. Now suppose the logarithm of x is greater than 1.791906. It would appear that the algorithm must underestimate lg(x) in this case. However, there are instances when a factor is used more than once. (Recall the example of factoring 1.9 in step (iv), where the factor 4/3 was divided out twice.)
3(a).vii) Putting it all together

The preceding steps may be expressed succinctly with the following algorithm. (Note that this comes from Knuth (2), section 1.2.3, exercise 25.) Given x (scaled as in step (ii), so that 1 <= x < 2), compute y = log_b(x), where log_b is the base b logarithm.

L1. [Initialize.] Set y = 0, z = x >> 1, k = 1.
L2. [Test for end.] If x = 1, stop.
L3. [Compare.] If x - z < 1, go to L5.
L4. [Reduce values.] Set x = x - z, z = x >> k, y = y + log_b(2^k/(2^k-1)), go to L2.
L5. [Shift.] Set z = z >> 1, k = k + 1, go to L2.
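The following untested C sketch puts case 1 together, reusing the log2arr[] table and the FBITS scaling assumed above; the function name and loop bounds are my own choices, not from the original.

/* Return lg(x) scaled by 2^FBITS.  x is a 16-bit unsigned integer, 0 < x. */
unsigned short lg_fixed(unsigned short x)
{
    unsigned short g, z;
    int k;

    if (x == 0)
        return 0;                       /* lg(0) is undefined; bail out */

    /* Step (ii): pull out the integer part of the logarithm and normalize x
     * so that its most significant bit is bit 15.                          */
    g = 15 << FBITS;
    while (!(x & 0x8000))
    {
        x <<= 1;
        g -= 1 << FBITS;
    }

    /* Steps (iv)-(vii): x is now a 1.15 fixed point number, 1.0 <= x < 2.0.
     * Divide out the factors 2^k/(2^k - 1); as shown in the efficiency
     * comments above, such a division is just x - (x >> k).                */
    k = 1;
    z = x >> 1;
    while (x != 0x8000 && k < 16)       /* stop when the mantissa reaches 1.0;
                                           past k = 15 the correction x>>k is 0 */
    {
        if (x - z >= 0x8000)            /* does the factor still divide out? */
        {
            x -= z;                     /* x = x / (2^k/(2^k - 1))           */
            z  = x >> k;
            g += log2arr[k];            /* accumulate lg(2^k/(2^k - 1))      */
        }
        else
        {
            k++;                        /* move on to the next, smaller factor */
            z = x >> k;
        }
    }
    return g;                           /* any residual error is below the
                                           precision of the table            */
}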

3(b) Factoring for binary logarithms, case 2

This case is very similar to the previous one. The only difference is in how the input is factored. Like before we are given:

Input: 16 bit unsigned integer x; 0 < x < 65536 (or 8 bit unsigned integer...)
Output: g, lg(x) the logarithm of x with respect to base 2.
3(b).i) Create a table of logarithms of the following constants:
log2arr[i] = lg(1 + 2^(-(i+1)))

i = 0..M, M == desired size of the table. The first few values of the array are lg(3/2), lg(5/4), lg(9/8), lg(17/16),... Recall that in the previous case the factors were lg(2/1), lg(4/3), lg(8/7), lg(16/15),... Again, if you wish to compute logarithms in a different base, then substitute the lg() function with the logarithm function for that base.
3(b).ii) Scale x to a value between 1 and 2.

This is identical to 3(a).ii.


3(b).iii) Changing Perspective

Again, this is identical to 3(a).iii.


3(b).iv) Factor x.

This is very similar to step (iv) above. However, we now have different factors. Using the same example, x = 1.9, we can find the factors for this case.
    a) 1.9 > 1.5, so x = x/1.5 = 1.266666
    b) 1.266666 > 1.25, so x = x/1.25 = 1.0133333
    c) 1.0133333 < 1.125, so don't divide
    d) 1.0133333 < 1.0625, so don't divide
    e) etc.

So, x ~= 1.5 * 1.25 * etc. Like the previous case, these factors are not perfect. Note also that the product 1.5*1.25*1.125*1.0625*... spans a range that is larger than 2 (~2.38423 for i <= 22). So unlike the previous factoring method, this one will not have repeated factors. Here's some pseudo code:
for(i = 1, d = 0.5; i < M; i++, d /= 2)
{
    if( x > 1 + d )
    {
        x /= (1 + d);
        g += log2arr[i-1];   // log2arr[i-1] = lg(1 + d)
    }
}

Here, d takes on the values of 0.5, 0.25, 0.125, ... , 2^(-i). Then 1+d is the trial factor at each step. If x is greater than this trial factor, then we divide the trial factor out and add to g (ultimately the logarithm of x) the partial logarithm of the factor. At this point, the answer is in the variable g.
3(b).v) An efficiency comment

Note that the division by 1 + 2^(-i) can be rearranged into a multiplication (and ultimately into shifts and adds). Using the expansion
x / (v + e) = (x/v) * ( 1 - (e/v) + (e/v)^2 - (e/v)^3 + ... )

let x = 1, v = 1, e = 2^(-i):

1 / (1 + 2^(-i)) = 1 - 2^(-i) + 2^(-2*i) - 2^(-3*i) + ...

So, if you want to divide x by 1+2^(-i) :


x / (1 + 2^(-i)) = x - (x>>i) + (x>>(2*i)) - (x>>(3*i)) + ...
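As a rough sketch (not from the original; it reuses the M/FBITS assumptions and 1.15 fixed point layout of the earlier case 1 sketch), here is how this shift-and-add division and the case 2 factor loop might look in C. The table log2arr_b[] is assumed to hold the case 2 constants lg(1 + 2^(-(i+1))) from step 3(b).i, scaled by 2^FBITS.

extern const unsigned short log2arr_b[];   /* case 2 table, scaled by 2^FBITS */

/* Divide a 1.15 fixed point x by (1 + 2^-i) with shifts and adds only,
 * using the alternating series above.  Each shift truncates, so the
 * result is approximate.                                                */
static unsigned short div_by_one_plus(unsigned short x, int i)
{
    int q = x, term = x >> i, sign = -1;

    while (term)
    {
        q += sign * term;       /* x - (x>>i) + (x>>2i) - (x>>3i) + ... */
        sign = -sign;
        term >>= i;
    }
    return (unsigned short)q;
}

/* Case 2 factor loop: x is the normalized 1.15 mantissa (0x8000 <= x) and
 * g already holds the integer part of the logarithm, scaled by 2^FBITS.  */
unsigned short lg_factor2(unsigned short x, unsigned short g)
{
    int i;
    for (i = 1; i <= M; i++)
    {
        unsigned short factor = 0x8000 + (0x8000 >> i);   /* 1 + 2^-i in 1.15 */
        if (x > factor)
        {
            x  = div_by_one_plus(x, i);
            g += log2arr_b[i - 1];      /* lg(1 + 2^-i), scaled */
        }
    }
    return g;
}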

The number of terms needed depends on i. For example, on the first pass, i = 1, so you would need to keep 15 terms in the expansion; however, for i = 4 you would only need to keep the first four terms.

4. Table Look-up + Linear Interpolation

The logarithm function is a monotonic, slowly increasing function. This suggests that a straight line may be a good approximation to it. And in fact, with a little work, it's an excellent approximation. The general linear interpolation formula is
f(x) ~= ( (b - x)/(b - a) ) * f(a) + ( (x - a)/(b - a) ) * f(b),     a <= x <= b

Where in our case, f() is one of the logarithm functions. This approximates f(x) when x is in between a and b. This can be arranged into the perhaps more familiar point-slope form:
f(x) = x * ( f(b) - f(a) )/(b - a) - ( a*f(b) - b*f(a) )/(b - a)

Or into a significantly more useful form:


f(x) = (x - a) * ( f(b) - f(a) )/(b - a) + f(a)                      eq(4.3)

There are a couple of steps required before we can use this equation in a program. First, we need to map x (the range) into a sequence of integers: x -> 0..n. This is necessary to index the table. Second, we need to choose the spacing between the tabulated function values so that the division by (b - a) is simplified. Fortunately, we can kill both of these stones with one bird.

For example, suppose that x is a 16-bit integer. We obviously don't want a 65536-entry look-up table (at least not with this algorithm). So a real simple way to map x into a smaller range is to shift it right. If we have decided to tabulate only 16 values of f(x), then we need only the 4 most significant bits of x (because 2^4 = 16). This is easily obtained by shifting x right 12 bit positions. Then all we need is to tabulate the 16 values: f(0x0000), f(0x1000), f(0x2000), etc. (BTW, the f(0x0000) value is undefined, but there's a way around that problem... see below.) This kills one stone, now for the other.

Note that in the point-slope equation we divide by (b - a). However, this is always an integer power of 2. For the example of 16 tabulated values, a and b take on the values of 0, 0x1000, 0x2000, etc. Their difference (assuming we interpolate between consecutive table entries) is always 0x1000. So the division reduces to a shift right!

As an example, suppose we wish to find lg(0x3456) given that we have a table of lg(0x1000*i) for i = 0..15. Using equation 4.3:
lg(0x3456) ~= (0x3456 - 0x3000) * ( lg(0x4000) - lg(0x3000) ) / ( 0x4000 - 0x3000 ) + lg(0x3000)
            = 0x456 * lg(4/3) / 0x1000 + lg(0x3000)
            = ( (0x456 * lg(4/3)) >> 12 ) + lg(0x3000)
            = 13.697

This is to within 0.1% of the true answer (~13.710).

There is another trick we can perform: reduce the range of x like we did above in case 3. While not necessary, it does have the benefit of reducing the dynamic range of the tabulated function values. This has two effects. First, the arithmetic can use smaller variables, e.g. 16-bit integers instead of 32-bit integers. Second, if we decide to interpolate with many line segments, then the memory to store the array of points that approximate the function is smaller. If we are working with 16-bit integers then x ranges from 1 to 2^16-1 = 65535. The range reduction method discussed above can scale x to a value between 1.0 and 2.0. (Recall that x is shifted left until its most significant bit occupies the most significant bit location of the memory word.)
Putting it all together - with integer arithmetic

So like the algorithms discussed in case 3, let's assume we are given:

Input: 16 bit unsigned integer x; 0 < x < 65536 (or 8 bit unsigned integer...)
Output: g, lg(x) the logarithm of x with respect to base 2.

The following (untested!) C program implements table look-up plus first order interpolation. It's been designed to use integer arithmetic.
g = 15;
while( g && !(x & 0x8000) )   // loop until msb of x occupies bit 15
{
    x <<= 1;
    g--;
}

// If g==0, then we're done.
if(!g)
    return(0);

// We have the integer portion of the log(x). Now get the fractional part.
x <<= 1;                      // "normalize" x, get rid of the MSB.

j = x >> 12;                  // Get the array index
x_minus_a = x & 0xfff;        // The lower bits of x are all that's left
                              // after subtracting "a".

// Note the parentheses around the shifted term: '+' binds tighter than '>>' in C.
gf = log_table[j] + ((x_minus_a * (log_table[j+1] - log_table[j])) >> 12);
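The sketch above leaves the table itself implicit. One plausible reading, given the 12-bit shift, is that log_table[] holds the 17 values lg(1 + j/16) for j = 0..16, scaled by 2^12, and that the final answer is g (the integer bits) combined with the fraction gf. Here is how such a table might be generated; the names and the scale factor are assumptions, not from the original.

#include <math.h>

unsigned short log_table[17];   /* log_table[j] = lg(1 + j/16) * 2^12 (assumed scaling) */

void build_log_table(void)
{
    int j;
    for (j = 0; j <= 16; j++)
        log_table[j] = (unsigned short)( (log(1.0 + j/16.0) / log(2.0)) * 4096.0 + 0.5 );
}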

Check out PIC Logarithms for an implementation of this method in PIC assembly.

5. Series Expansions

Here are some handy series expansions. (3)
ln(1+x) = x - x^2/2 + x^3/3 - x^4/4 + ... - (-x)^n/n + ...,          for -1 < x < 1

ln(x) = (x-1)/x + (1/2)*((x-1)/x)^2 + (1/3)*((x-1)/x)^3 + ...,       for x >= 1/2

ln( (x+1)/(x-1) ) = 2*( 1/x + 1/(3*x^3) + 1/(5*x^5) + ... ),         for |x| >= 1

Working software.

Other Theoretical stuff.

References

(1) Doerfler, Ronald W., "Dead Reckoning: Calculating Without Instruments", Gulf Publishing Company, Houston. ISBN 0-88415-087-9
(2) Knuth, Donald E., "The Art of Computer Programming, Vol. 1", Addison-Wesley Publishing Company. ISBN 0-201-03822-6
(3) Abramowitz and Stegun, "Handbook of Mathematical Functions", Dover. ISBN 0-486-61272-4

DTMF - Decoding with a 1-bit A/D converter


The purpose of DTMF decoding is to detect sinusoidal signals in the presence of noise. There is a plethora of cost effective integrated circuits on the market that do this quite well. In many (most?) cases, the DTMF decoder IC interfaces with a microcontroller. In these instances, why not use the microcontroller to decode the sinusoids? The answer is that the typical microcontroller based decoder requires an A/D converter, and the signal processing associated with the decoding is usually beyond the scope of the microcontroller's capabilities. So the designer is forced to use the dedicated IC or upgrade the microcontroller to perhaps a more costly digital signal processor.

However, there is yet another way to decode DTMF signals with a microcontroller and a 1-bit A/D converter (i.e. a comparator). The theory is quite similar to the "classical" signal processing technique alluded to above, so let's briefly consider what is involved there before we continue.

One brute force way to detect DTMF signals is to digitize the incoming signal and compute 8 DFT's (discrete Fourier transforms) centered around the 8 DTMF frequencies. DFT's are preferred over FFT's because the frequencies are not equally spaced (in fact they are logarithmically spaced). In its simplest form, the DFT goes something like so:
DFT(x) = sum( x(k) * W(k) ),   k = 0..N

where x(k) are the time samples and W(k) is the infamous kernel function:
W(k) = e^(j*2*pi*f*k/N) = cos(2*pi*f*k/N) + j*sin(2*pi*f*k/N)

All this says is that we multiply the samples by sine waves and cosine waves and add them together. And when you're done, you will end up with 8 complex numbers. The magnitudes of these numbers tell us roughly how much energy is present for each frequency of the input signal. In other words, we have computed the frequency spectrum at the 8 DTMF composite frequencies. The reason this works so well is because of the "orthogonality" of the sine waves. In other words, if you performed the DFT on two sine waves:
DFT = sum( sin(f_1*t) * sin(f_2*t) )

You will get a "large" number if the two frequencies are the same and a "small" number or zero if they're different. Now, I realize that if you have never seen this before then you probably have no idea what I'm talking about, and if you have, you're probably wondering why I left out so much important stuff. But the main point I'm trying to make is to show how the DFT concept can be generalized to nonsinusoidal signals; square waves in particular.
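For comparison, here is a small floating point sketch of that classical per-frequency correlation. The function name, the explicit sample_rate parameter and the sample count are my own illustrative choices, not from the original.

#include <math.h>

#define PI 3.141592653589793

/* Correlate n samples of x[] against a sine and a cosine at frequency f (Hz)
 * and return the magnitude, i.e. roughly the energy present at f.           */
double dft_bin(const double *x, int n, double f, double sample_rate)
{
    double re = 0.0, im = 0.0;
    int k;

    for (k = 0; k < n; k++)
    {
        double phase = 2.0 * PI * f * k / sample_rate;
        re += x[k] * cos(phase);
        im += x[k] * sin(phase);
    }
    return sqrt(re*re + im*im);
}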
"DFT" With Square waves:

The orthogonality concept applies equally well to square waves too. In fact, it's even easy to illustrate with ASCII art! Consider the two examples:

Ex 1
  +----+    +----+    +----+    +----+    +----+
  |    |    |    |    |    |    |    |    |    |
--+    +----+    +----+    +----+    +----+    +--

  +----+    +----+    +----+    +----+    +----+
  |    |    |    |    |    |    |    |    |    |
--+    +----+    +----+    +----+    +----+    +--

+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1 => 25

Ex 2
  +----+    +----+    +----+    +----+    +----+
  |    |    |    |    |    |    |    |    |    |
--+    +----+    +----+    +----+    +----+    +--

  +---------+         +---------+         +-------
  |         |         |         |         |
--+         +---------+         +---------+

+1+1+1+1-1-1-1-1-1+1+1+1+1+1-1-1-1-1-1+1+1+1+1+1-1 => 2

In the first example, the two square waves have the same frequency and phase. When the individual samples are "multiplied" and summed together, you get a large number: 25. In the second case, the square waves differ in frequency by a factor of two. And as expected, when you "multiply" the individual samples and add them up you get a small number: 2. If you look closely, you'll notice that the multiplication is really an exclusive OR operation: the XOR of the two sample bits is 0 when the product is +1 and 1 when the product is -1. In PIC parlance,
sum_of_products     equ     0x20        ;A register variable
input_sample        equ     0x21        ;In LS bit
square_wave_sample  equ     0x22        ;In LS bit

        movf    input_sample,W
        xorwf   square_wave_sample,W    ;Z is set if the two samples match
        skpz                            ;add +1 for a match, -1 for a mismatch
         movlw  -2
        addlw   1
        addwf   sum_of_products,F

Quadrature:

In the DFT we used both sine and cosine waves. The two are obviously related by a 90 degree phase shift. An analogously shifted square wave is needed for the DTMF decoding too. The reason is that it's possible to end up with a small sum-of-products even if the two waveforms have the same frequency. For example,
  +-----+     +-----+     +-----+     +-----+     +-----+
  |     |     |     |     |     |     |     |     |     |
--+     +-----+     +-----+     +-----+     +-----+     +--

     +-----+     +-----+     +-----+     +-----+     +-----+
     |     |     |     |     |     |     |     |     |     |
-----+     +-----+     +-----+     +-----+     +-----+

+1-1-1+1+1-1+1-1-1+1+1-1+1-1-1+1+1-1+1-1-1+1+1-1+1-1-1+1+1 => 0

So to protect against this situation, we must perform two sum-of-products operations: one between the digitized input signal and a square wave at the detection frequency, the other between the digitized input signal and a square wave shifted 90 degrees with respect to the first square wave.

Dot Product

The DFT operation is a dot product operation. You can imagine the signal and the kernels as vectors whose indices are the sample number. The vectors could be very large, e.g. 4096 samples.

Signal Strength: 1-norm

If we were computing DFT's with sines and cosines, then the signal strength of a particular frequency is easily ascertained by finding the familiar magnitude:
strength ~ sqrt( real(DFT)^2 + imaginary(DFT)^2)

In other words, the result of a DFT is a complex number when the complex kernel is used. And the magnitude of a complex number is the square root of the sum of the real part squared plus the imaginary part squared. This is a cumbersome operation that I would rather avoid. But if you wish to consider it, then check out the square root theory and square roots with a PIC pages. The square-root-of-the-sum-of-the-squares normalization is called the "square norm". It is a special case of the general family of norms on linear spaces called the "p-norm". In our case, the linear space consists of two components: the real and imaginary parts of the DFT. The p-norm for our case is:

strength ~ (abs(real(DFT))^p + abs(imaginary(DFT))^p) ^ (1/p)

And this is the same as the square norm when p is 2. Now if p is 1 we end up with an extremely simple formula, the 1-norm:
strength ~ abs(real(DFT)) + abs(imaginary(DFT))

Digitization: +1 and -1 OR +1 and 0?

So far we've been digitizing to +1 and -1. However, in a real program we'll probably want to digitize to +1 and 0, since these are the numbers microcontrollers are most comfortable with. (If the subjective feelings of the microcontroller don't sway you, then read on... there are some real efficiency reasons too.) The next question is how does this impact the DFT calculations? Suppose we have two square waves that are digitized to +1 and -1. What happens to their dot product if we digitize them to +1 and 0 instead? Let f1 and f2 be the two square waves:

f1 = -1 or +1
f2 = -1 or +1

Convert to 1's and 0's:

q1 = (f1 + 1)/2  =>  f1 = 2*q1 - 1
q2 = (f2 + 1)/2  =>  f2 = 2*q2 - 1

So, q1 and q2 represent the re-digitized f1 and f2 square waves. Now the dot product:
DFT = sum( f1*f2 )
    = sum( (2*q1 - 1) * (2*q2 - 1) )
    = sum( 4*q1*q2 - 2*q2 - 2*q1 + 1 )
    = 4*sum( q1*q2 ) - 2*sum( q2 ) - 2*sum( q1 ) + sum( 1 )

The last term evaluates to N. In other words, 1+1+1+ ... +1 = N. The middle two terms require a closer examination. Assume that f1 and f2 contain no DC component:
sum( f1 ) = 0

Then the sum of q1 is:


sum( q1 ) = sum( (f1 + 1)/2 ) = 0 + N/2 = N/2

And similarly, the sum of q2 is N/2. Combining these results yields:

sum( f1*f2 ) = 4*sum( q1*q2 ) - 2*(N/2) - 2*(N/2) + N
             = 4*sum( q1*q2 ) - N

So the conclusion is that digitization with 0's and 1's is essentially the same as digitizing with +1's and -1's. The only differences are that a new "DC" term of N has been introduced and the dot product has been scaled.
Question all assumptions

I made the assumption that the square waves f1 and f2 have no DC component. But think about how these square waves are created. One of them, say f1, is the digitized DTMF signal; the other is software synthesized. We certainly (should) have control over the synthesized square wave. However, the other one is subject to the errors of reality. First, there's the issue that the comparator introduces an asymmetry in the digitization. This can be remedied by a properly designed comparator circuit. Second, there's the issue of having a sample window of finite width. For example, if the sampling window is 20 milliseconds wide and there is a 60 Hz component present in our input signal, we will introduce a DC term. In fact, if there is any frequency present that does not have an integer number of cycles over the sampling window, a DC error term is present. There are two ways to fix this "problem". The first is to make the window as wide as possible. This unfortunately slows the detection down. The other fix is to measure the DC component.
Measuring the DC component

The DC component of a square wave is really easy to measure. And no, you are not going to need an A/D converter. Consider a square wave that has a 50% duty cycle. If we were to sample this at a high rate (e.g. 10 times the frequency of the square wave), we should get the same number of high and low samples. The DC component in this case is one-half of the amplitude of the square wave. If more high samples were accumulated than low samples then the DC component would be larger. In other words, the duty cycle of the square wave is proportional to the DC component.
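As a small sketch (the names and the window size are mine, not from the original), counting the high samples over the window gives the duty cycle, and hence the DC component, directly:

#define N_SAMPLES 256                 /* width of the sampling window (an assumption) */

/* Count the 1 samples in a window of 0/1 comparator outputs.  A result of
 * N_SAMPLES/2 corresponds to a 50% duty cycle, i.e. no DC offset.          */
int count_high(const unsigned char *input)
{
    int k, highs = 0;
    for (k = 0; k < N_SAMPLES; k++)
        highs += input[k] & 1;
    return highs;
}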
Returning to the tone detection application...

In our application, one of the q's in the dot-product is the 0/1 digitized input. The other is a software generated square wave. There will always be two dot-products for each tone; one for each quadrature.
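Putting these pieces together, one detector "bin" might look like the following sketch. The function and array names are illustrative, the 0/1 samples are assumed to be stored one per byte, and N_SAMPLES is the window size from the sketch above.

/* Correlate the 1-bit input against a software square wave and its 90 degree
 * shifted copy, then take the 1-norm of the two sums as the signal strength. */
int tone_strength(const unsigned char *input,   /* 0/1 comparator samples          */
                  const unsigned char *ref_i,   /* 0/1 square wave at the tone     */
                  const unsigned char *ref_q)   /* same square wave, shifted 90deg */
{
    int k, sum_i = 0, sum_q = 0;

    for (k = 0; k < N_SAMPLES; k++)
    {
        /* XOR is 0 when the samples match (product +1), 1 when they differ (-1). */
        sum_i += (input[k] ^ ref_i[k]) ? -1 : +1;
        sum_q += (input[k] ^ ref_q[k]) ? -1 : +1;
    }

    /* 1-norm: abs(in-phase sum) + abs(quadrature sum) */
    if (sum_i < 0) sum_i = -sum_i;
    if (sum_q < 0) sum_q = -sum_q;
    return sum_i + sum_q;
}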
Applying the 1-norm to the +1 and 0 Digitization:

Not complete... It's easy enough to apply the 1-norm formula to the new digitization.

Error Analysis

A square wave is hardly equivalent to a sine wave. So the question naturally becomes: How much error does this technique introduce? Probably the best way to try to answer this question is by performing a harmonic analysis on the digitization process. Consider the dual tone signal:
g(t) = A1*cos(w1*t) + A2*cos(w2*t)

The frequencies w1 and w2 are certainly different (otherwise we don't have a dual tone). The amplitudes are also different, but in many cases they are within a few percent of one another. Let's consider the case where the amplitudes are the same for right now:
g(t) = A1*cos(w1*t) + A1*cos(w2*t)
     = A1*( cos(w1*t) + cos(w2*t) )
     = 2*A1 * cos( ((w1 + w2)/2) * t ) * cos( ((w1 - w2)/2) * t )

This signal is digitized to 1's and 0's by a comparator. The digitization process can be mathematically described by:
gd(t) = (1 + sign(g(t)))/2

Where sign() is a function that returns the sign of its argument. sign() by itself produces the +1/-1 digitization. g(t) is positive when the two cosine terms have the same sign. Now consider the following sign() identity:
sign(f1(t) * f2(t)) = sign(f1(t)) * sign(f2(t))

and apply this to our formula


sign(g(t)) = sign( cos( ((w1 + w2)/2) * t ) ) * sign( cos( ((w1 - w2)/2) * t ) )

The sign() function with a sinusoidal argument produces a 50% duty cycle square wave with the same frequency as the sinusoid. Not complete... The Square Waves page provides even more theory. And here's more software.
