ECE6580 Lecture 9


1 ECE6580 Lecture 9

2 MyFirAsm.asm

    .global _MyFirAsm;
    _MyFirAsm:
        entry;
        dm(b2Save) = b2;
        dm(i2Save) = i2;
        l4 = reads(1);      // save length of filter in l4
        l2 = l4;            // save length in l2
        b4 = r8;            // pointer to states goes in b4 and i4
        b2 = r12;           // pointer to coefs goes in b2 and i2
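The setup above loads a base (b) and a length (l) register for each buffer, which is what enables circular addressing on the index (i) registers. As a rough illustration only, the following C sketch models a post-modify access with wrap-around. The struct and function names are hypothetical and not part of the lecture, and the modifier values 0 and 1 stand in for m5 and m6 as the slide comments describe them ("do not inc" versus increment).

```c
/* Hypothetical model of one index register with circular addressing.
   base, len and idx play the roles of b4, l4 and i4 on the slide. */
typedef struct {
    int base;  /* first address of the buffer (B register) */
    int len;   /* buffer length (L register); 0 means linear addressing */
    int idx;   /* current address (I register) */
} CircIndex;

/* Post-modify access: return the address to use now, then advance the
   index by mod, wrapping it back into [base, base + len). */
static int post_modify(CircIndex *r, int mod)
{
    int addr = r->idx;
    int next = r->idx + mod;

    if (r->len > 0) {
        if (next >= r->base + r->len) {
            next -= r->len;          /* ran off the end: wrap to the start */
        } else if (next < r->base) {
            next += r->len;          /* ran off the start: wrap to the end */
        }
    }
    r->idx = next;
    return addr;
}
```

With a four-element buffer at address 100, repeated calls with a modifier of 1 visit 100, 101, 102, 103 and then return to 100, while a modifier of 0 reads the current address without moving.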

3 Unrolled Loop

    f8 = f4;            // x needs to be in f8

    f1 = dm(i4,m5);     // fetch states[0], but do not inc i4
    dm(i4,m6) = f8;     // states[0] = x
    f2 = dm(i2,m6);     // fetch coefs[0]
    f4 = f1*f2;         // coefs[0]*states[0]
    f0 = f0 + f4;       // acc = acc + coefs[0]*states[0]

    f8 = dm(i4,m5);     // fetch states[1], but do not inc i4
    dm(i4,m6) = f1;     // states[1] = states[0]
    f2 = dm(i2,m6);     // fetch coefs[1]
    f4 = f8*f2;         // coefs[1]*states[1]
    f0 = f0 + f4;       // acc = acc + coefs[1]*states[1]

    f1 = dm(i4,m5);     // fetch states[2], but do not inc i4
    dm(i4,m6) = f8;     // states[2] = states[1]
    f2 = dm(i2,m6);     // fetch coefs[2]
    f4 = f1*f2;         // coefs[2]*states[2]
    f0 = f0 + f4;       // acc = acc + coefs[2]*states[2]

    f8 = dm(i4,m5);     // fetch states[3], but do not inc i4
    dm(i4,m6) = f1;     // states[3] = states[2]
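The data flow of the unrolled code can be sketched in C as a single pass over the delay line: each slot is read once, overwritten with the value carried from the previous slot, and its fetched value is multiplied by the matching coefficient. This is a minimal model, not the lecture's code. The function name is hypothetical, and it assumes the accumulator starts at zero, which the slide does not show.

```c
#include <stddef.h>

/* One tap per iteration. The assembly avoids the final copy by letting
   f1 and f8 swap roles from one tap to the next. */
float shift_and_mac(float x, float *states, const float *coefs, size_t ntaps)
{
    float acc = 0.0f;   /* assumption: accumulator cleared before the taps */
    float carry = x;    /* value to store into the current slot */

    for (size_t i = 0; i < ntaps; i++) {
        float fetched = states[i];   /* fetch states[i], index not advanced */
        states[i] = carry;           /* overwrite the slot, then advance */
        acc += coefs[i] * fetched;   /* multiply-accumulate with coefs[i] */
        carry = fetched;             /* the fetched value moves one slot down */
    }
    return acc;
}
```

For example, with states {1, 2, 3}, coefs {10, 20, 30} and x = 5, the call returns 140 and leaves states as {5, 1, 2}.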

4 How Can We Roll It Up?

    f8 = f4;            // x needs to be in f8

    f1 = dm(i4,m5);     // fetch states[0], but do not inc i4
    dm(i4,m6) = f8;     // states[0] = x
    f2 = dm(i2,m6);     // fetch coefs[0]
    f4 = f1*f2;         // coefs[0]*states[0]
    f0 = f0 + f4;       // acc = acc + coefs[0]*states[0]

    f8 = dm(i4,m5);     // fetch states[1], but do not inc i4
    dm(i4,m6) = f1;     // states[1] = states[0]
    f2 = dm(i2,m6);     // fetch coefs[1]
    f4 = f8*f2;         // coefs[1]*states[1]
    f0 = f0 + f4;       // acc = acc + coefs[1]*states[1]

    f1 = dm(i4,m5);     // fetch states[2], but do not inc i4
    dm(i4,m6) = f8;     // states[2] = states[1]
    f2 = dm(i2,m6);     // fetch coefs[2]
    f4 = f1*f2;         // coefs[2]*states[2]
    f0 = f0 + f4;       // acc = acc + coefs[2]*states[2]

    f8 = dm(i4,m5);     // fetch states[3], but do not inc i4
    dm(i4,m6) = f1;     // states[3] = states[2]

5 Rolled and Ready to Go

    f2 = dm(i2,m6);     // fetch coefs[0]
    f0 = f2*f4;         // acc = coefs[0]*x
    f8 = f4;
    lcntr = r2, do MyFirAsmEnd until lce;
        f1 = dm(i4,m5);     // fetch states[i], but do not inc i4
        dm(i4,m6) = f8;     // states[i] = states[i-1]
        f2 = dm(i2,m6);     // fetch coefs[i]
        f4 = f1*f2;         // coefs[i]*states[i]
        f0 = f0 + f4;       // acc = acc + coefs[i]*states[i]
        f8 = dm(i4,m5);     // fetch states[i+1], but do not inc i4
        dm(i4,m6) = f1;     // states[i+1] = states[i]
        f2 = dm(i2,m6);     // fetch coefs[i+1]
        f4 = f8*f2;         // coefs[i+1]*states[i+1]
    MyFirAsmEnd:
        f0 = f0 + f4;       // acc = acc + coefs[i+1]*states[i+1]
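The rolled loop handles two taps per iteration because the two temporaries alternate roles from one tap to the next. The C sketch below models that intended data flow with two variables standing in for f1 and f8. It is an illustration under stated assumptions, not the lecture's implementation: the function name is hypothetical, the tap count is assumed odd so that the taps after coefs[0] split evenly into pairs, and states holds the ntaps - 1 previous inputs.

```c
#include <stddef.h>

/* Two taps per loop iteration, with a and b ping-ponging as the
   "just fetched" and "about to be stored" values. */
float fir_rolled(float x, float *states, const float *coefs, size_t ntaps)
{
    float acc = coefs[0] * x;        /* first tap uses the new sample directly */
    float a;                         /* stands in for f1 */
    float b = x;                     /* stands in for f8 */
    size_t s = 0;                    /* position in states (role of i4) */
    size_t c = 1;                    /* position in coefs (role of i2) */
    size_t pairs = (ntaps - 1) / 2;  /* assumption: ntaps is odd */

    for (size_t n = 0; n < pairs; n++) {
        a = states[s];               /* fetch the old value of this slot */
        states[s++] = b;             /* overwrite it with the newer value */
        acc += a * coefs[c++];       /* accumulate the fetched value */

        b = states[s];               /* same steps with the roles swapped */
        states[s++] = a;
        acc += b * coefs[c++];
    }
    return acc;
}
```

With states {1, 2, 3, 4}, coefs {1, 2, 3, 4, 5} and x = 10, the call returns 50 and leaves states as {10, 1, 2, 3}. A second call with x = 7 returns 53.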

6 Benchmark Numbers

    Routine              Cycles
    MyFir                8535
    MyFir (optimized)    1076
    MyFirAsm             1304
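Expressed as ratios, and assuming all three figures time the same workload (the slide does not say), the cycle counts compare as follows:

```latex
\[
\frac{8535}{1076} \approx 7.93, \qquad
\frac{8535}{1304} \approx 6.55, \qquad
\frac{1304}{1076} \approx 1.21
\]
```

On these numbers the hand-written MyFirAsm is about 6.5 times faster than the baseline MyFir but still takes roughly 21% more cycles than the optimized MyFir.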

