… off chip. The entries labeled Altera U-bit-serial refer to […]-tap filters built from unsigned bit-serial multipliers, while those labeled Altera S-bit-serial refer to the use of signed bit-serial multipliers. For the signed filters the factors are […] and […] for the […]-bit and […]-bit […]-tap FIR filters respectively: signed bit-serial arithmetic maps inefficiently, which increases the system chip count. The two LDLMU entries refer to custom multiplier chips used together with an FPGA that implements the filter logic. The FPGA implements the necessary data delays, data paths, multiplier chip control, and the product accumulation required by the multiply-accumulate loop of the FIR filter. Again, a chip delay of […] ns is assumed. For comparison, equivalent implementations that realize the multiplier in an FPGA, with one FPGA possibly dedicated to the multiplier in the […]-bit version, are included as the fast parallel entries (among them Xilinx fast parallel). The next entries in the table are the results for the previously discussed Xilinx constant-coefficient distributed-arithmetic multipliers. Listed last are the results for two custom FIR filter ASICs: the Logic Devices LF […]x[…]-bit digital filter and the GEC Plessey PDSP/A Programmable FIR filter.

Comparisons and conclusions

Comparing all of the listed filter implementations, it can be seen that the ASIC-based implementations achieve the highest performance. However, their performance is nearly matched by the implementations based on Xilinx constant multipliers. This clearly shows the advantage of the distributed-arithmetic approach to multiplication. Using this approach, the […]-bit and […]-bit versions of the filter achieve speedup factors of […] and […] respectively over the DSP processor. The disadvantages of this approach are that the design is dedicated to one particular filter, since each multiplier is a constant, and that all of the multiplications must be performed in parallel. For the […]-bit filter this results in a larger chip count than for the ASICs.

Performing worse than the DSP processor are the systems that use only a single FPGA-based multiplier to execute the entire filter loop (the entries labeled fast parallel). In these systems a single multiplier computes every iteration of the filter's multiply-accumulate loop. This approach is the closest to the method the DSP processor itself uses for filtering, but because the FPGA-based multiplier is slower than the VLSI multiplier of the DSP processor, the resulting performance is poorer. Accordingly, when a custom VLSI multiplier chip is used together with the FPGA (the table entries labeled LDLMU), this architecture again exceeds the performance of the DSP processor.

Table. Complex radix-[…] FFT performance.

System       Precision   Chips   Calculation time ([…] pt / […] pt / […] pt)
TI TMSCx     […] bit     […]     […] μs / […] μs / […] ms
PDSP/A       […] bit     […]     […] μs / […] μs / […] ms
PDSPA        […] bit     […]     - / - / […] μs
Sharp LH     […] bit     […]     - / - / […] μs

Radix-[…] fast Fourier transform

A comparison using the FFT algorithm was also completed and appears in the table. The Precision column of the table gives the number of bits used for each of the real and imaginary parts of the input data words. The FPGA-based implementation uses one Altera FPGA and a PDSP/ complex multiplier chip from GEC Plessey Semiconductors. The FPGA is used to control the algorithm and to implement the radix-[…] butterfly unit. The FFT is computed by using the same hardware to compute each column of the radix-[…] FFT in succession. A faster implementation can be obtained by using an FPGA and a complex multiplier for each column of the FFT.

From the table it can be seen that the FPGA implementation is faster than the TMSCx DSP processor by factors of […], […] and […] for FFT lengths of […], […] and […] points respectively. This implementation of the algorithm is primarily compute-bound, so further speedup can be achieved through greater parallelism as described above, or by using a faster complex multiplier. Using an external multiplier chip with the FPGA provides an order-of-magnitude increase in performance over the TI DSP chip. ASIC-based systems can deliver additional performance; however, if one chip set is used for each FFT column, the FPGA implementation can approach or exceed the performance of the ASIC-based systems. For example, using one PDSP/A chip and one FPGA per FFT column, a […]-point radix-[…] FFT can be computed in […]/[…] = […] seconds.

Conclusions

The results for the FPGA multiplier implementations in the table show that, for most types of multiplier, the FPGA is significantly slower than custom chips. Therefore, for FPGAs to achieve better performance than DSP processors and ASICs, extensive specialization and increased concurrency must be used. The distributed-arithmetic approach to constant multiplication was shown to be a specialization technique that yields large performance gains where it is applicable. The results for the […]-tap FIR filter and the radix-[…] FFT show that an order-of-magnitude performance improvement over DSP processors is not unreasonable for FPGA-based DSP systems. This can be considered a significant enough improvement to warrant further application of FPGAs to DSP.

In addition, the FPGA offers the advantage of reconfigurability over the ASIC. A properly designed ASIC offers only limited flexibility, whereas the FPGA can be changed completely, in both functionality and I/O, through reconfiguration. This allows designs to be customized through specialization and increased concurrency to obtain the best performance, while the cost is reduced by amortizing the silicon over many different applications.

References

1. Raymond J. Andraka. FIR filter fits in an FPGA using a bit serial approach. In Proceedings of the Third PLD Design Conference and Exhibit.
2. N. W. Bergmann and J. C. Mudge. Comparing the performance of FPGA-based custom computers with general-purpose computers for DSP applications. In Proceedings of the IEEE Workshop on FPGAs for Custom Computing Machines, Napa, CA.
3. Kenneth David Chapman. Fast integer multipliers fit in FPGAs. EDN Magazine.
4. Semiconductor Group. Digital Signal Processing Products and Applications Primer. Texas Instruments.
5. Kai Hwang. Computer Arithmetic: Principles, Architecture, and Design. John Wiley & Sons.
6. R. F. Lyon. Two's complement pipeline multipliers. IEEE Transactions on Communications.

I. English original

An Assessment of the Suitability of FPGA-Based Systems for use in Digital Signal Processing ★ ★★

Russell J. Petersen and Brad L. Hutchings
Brigham Young University, Dept. of Electrical and Computer Engineering, 459 CB, Provo UT 84602, USA

★ To be published in the International Workshop on Field-Programmable Logic and Applications, Oxford, England, Aug.
★★ This work was supported by ARPA/CSTO under contract number DABT--C- under a subcontract to National Semiconductor.

Abstract. FPGAs have been proposed as high-performance alternatives to DSP processors. This paper quantitatively compares FPGA performance against DSP processors and ASICs using actual applications and existing CAD tools and devices. Performance measures were based on actual multiplier performance with FPGAs, DSP processors and ASICs. This study demonstrates that FPGAs can provide an order of magnitude better performance than DSP processors and can in many cases approach or exceed ASIC levels of performance.

Introduction

To meet the intensive computation and I/O demands imposed by DSP systems, many custom digital hardware systems utilizing ASICs have been designed and built. Custom hardware solutions have been necessary due to the low performance of other approaches such as microprocessor-based systems, but have the disadvantages of inflexibility and a high cost of development. The DSP processor attempts to overcome the inflexibility and development costs of custom hardware. It provides flexibility through software instruction decoding and execution while providing high-performance arithmetic components, such as fast array multipliers, and multiple memory banks to increase data throughput. The FPGA has also recently generated interest for use in implementing digital signal processing systems due to its ability to implement custom hardware solutions while still maintaining flexibility through device reprogramming []. Using the FPGA it is hoped that a significant performance improvement can be obtained over the DSP processor without sacrificing system flexibility. This paper is an attempt to quantify the ability of the FPGA to provide an acceptable performance improvement over the DSP processor in the area of digital signal processing.

Multiplication and digital signal processing

A core operation in digital signal processing algorithms is multiplication. Often, the computational performance of a DSP system is limited by its multiplication performance, hence the multiplication rate of the system must be maximized. Custom hardware systems based on ASICs and DSP processors maximize multiplication performance by using fast parallel-array multipliers either singly or in parallel. FPGAs also have the ability to implement multipliers singly or in parallel according to the needs of the application. Thus, in order to understand the performance of the FPGA relative to the ASIC and the DSP processor, a comparison of FPGA multiplication alternatives and their performance relative to custom multiplier solutions is needed. This section presents the basic alternatives for multiplier implementations and their performance when implemented on FPGAs.

Multiplier architecture alternatives

When implementing multipliers in hardware two basic alternatives are available. The multiplier can be implemented as a fully parallel-array multiplier or as a fully bit-serial multiplier, as shown in the figure. The advantage of the fully parallel approach is that all of the product bits are produced at once, which generally results in a faster multiplication rate. The multiplication rate of a parallel multiplier is set by the delay through the combinational logic (rate = 1/delay). However, parallel multipliers also require a large amount of area to implement. Bit-serial multipliers, on the other hand, generally require only 1/Nth the area of an equivalent parallel multiplier but take N bit times to compute the entire product (N is the number of bits of multiplier precision). This often leads one to believe that the bit-serial approach is thus N times slower than an equivalent parallel multiplier, but this is not true. The bit times (clock cycles for synchronous bit-serial multipliers) are very short in duration due to the reduced size, and hence shorter propagation paths, of the multiplier. This results in a bit-serial multiplier achieving about […] the multiplication rate of an equivalent parallel multiplier on average, even exceeding the performance of the parallel multiplier in some cases.

Fig. Block diagrams of basic multiplier alternatives.

FPGA multiplication results

The table lists the performance of several multipliers implemented on three different FPGAs. The FPGAs used were a Xilinx device, an Altera Flex device, and a National Semiconductor CLAy device. The first two FPGAs can be characterized as medium-grained architectures and are approximately equivalent in logic density, while the last FPGA is a fine-grained architecture utilizing smaller but more numerous cells. The multiplication rate of each multiplier is listed in MHz, as well as the percentage of the FPGA required to implement the multiplier. For the bit-serial multipliers both the clock rate (bit rate) and the effective multiplication rate (clock rate/N) are listed.

Multiplier table contents

The majority of the multipliers in this study used common archi…
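The trade-off described above (a parallel multiplier whose rate is fixed by its combinational delay, against a bit-serial multiplier that needs N short bit times per product) can be illustrated with a small Python sketch. It is only a behavioural model under stated assumptions: the function names are hypothetical, the shift-and-add loop stands in for the serial principle rather than for any circuit measured in the paper, and the delay and clock figures in the usage lines are made-up illustrative values, not results from the table.

```python
def parallel_rate_mhz(delay_ns: float) -> float:
    """Multiplication rate of a combinational multiplier: one product per delay."""
    return 1000.0 / delay_ns  # 1 / (delay in ns) expressed in MHz


def bit_serial_rate_mhz(clock_mhz: float, n_bits: int) -> float:
    """Effective multiplication rate of a bit-serial multiplier: clock rate / N."""
    return clock_mhz / n_bits


def bit_serial_multiply(a: int, b: int, n_bits: int) -> int:
    """Unsigned shift-and-add model: one bit of b is consumed per bit time."""
    if not (0 <= a < (1 << n_bits) and 0 <= b < (1 << n_bits)):
        raise ValueError("operands must fit in n_bits unsigned bits")
    product = 0
    for bit_time in range(n_bits):        # N bit times per product
        if (b >> bit_time) & 1:           # current serial bit of the multiplier
            product += a << bit_time      # add the shifted multiplicand
    return product


if __name__ == "__main__":
    # Hypothetical figures chosen only to show the arithmetic.
    parallel = parallel_rate_mhz(delay_ns=100.0)             # slow, wide logic
    serial = bit_serial_rate_mhz(clock_mhz=80.0, n_bits=16)  # fast, narrow logic
    print(parallel, serial, serial / parallel)
    print(bit_serial_multiply(13, 11, n_bits=4))
```

With these assumed figures the serial design is slower than the parallel one by much less than a factor of N = 16, because its clock period is far shorter than the parallel multiplier's combinational delay; that is the point the text makes about short bit times.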