Project Topic

Project Number, Department, Student Names, Email, Supervisor Names

Development of an FPGA-Based Hardware Accelerator for a Neural Network

FPGA-based Hardware Accelerator for DNN

Abstract in Hebrew (translated)

As part of the rapid development of artificial intelligence, deep neural networks have become one of the main fields of interest today.

The purpose of the project is to maximize and optimize the performance of neural network accelerators while reducing latency and power consumption, in an environment that supports a simple software interface.

The goal of this project is to develop and optimize a Long Short-Term Memory (LSTM) network, a type of Recurrent Neural Network (RNN), in a RISC environment based on an FPGA board, and to enable a simple and accessible machine learning software interface using TensorFlow Lite for Microcontrollers (TFLM).

The idea of accelerating an LSTM network with FPGA-based hardware is relatively new. We aim to explore innovative tools to develop, analyze and optimize such an accelerator.

We propose to achieve these goals by implementing an FPGA-based accelerator. First, we will add software support for the LSTM network to the adapted TFLM model. Second, we will implement the LSTM network on the FPGA of an Ultra96 board using the Vitis HLS development platform. Next, we will research and apply various hardware optimizations to the accelerator we developed. Finally, we will integrate the hardware accelerator with the TFLM software environment.

By using the accelerator and implementing the neural network in hardware, we expect to see a significant speedup in the performance of the LSTM network.

Abstract in English

As part of the rapid development of AI, Deep Neural Networks (DNNs) have become one of today's main fields of interest.

The purpose of this project is to optimize the performance of DNN accelerators while minimizing latency and power consumption, in an environment that supports a simple software interface.

Our objective is to develop and optimize a hardware accelerator for a Long Short-Term Memory (LSTM) network, a type of Recurrent Neural Network (RNN), in a low-power Reduced Instruction Set Computer (RISC) environment built on a Field Programmable Gate Array (FPGA), and to provide a simple and accessible Machine Learning (ML) software interface using TensorFlow Lite for Microcontrollers (TFLM).

The idea of accelerating an LSTM network using FPGA-based hardware is relatively new. We wish to explore innovative tools in order to develop, analyze and optimize such an accelerator.

We propose to achieve our objective by implementing an FPGA-based accelerator. First, we will add software support for the LSTM network to our modified TFLM model. Second, we will implement the LSTM network on the FPGA of the Ultra96 board, using the Vitis HLS development tool. Third, we will research and apply different hardware optimizations to our accelerator. Lastly, we will integrate the newly developed acceleration layers into the TFLM software.

By using an accelerator and implementing the neural network in hardware, we expect to see a substantial speedup in LSTM performance.
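To make the target of the acceleration concrete, the following is a minimal sketch of a single LSTM time step in plain C++, the kind of loop nest that could later be handed to a high-level synthesis flow. It uses the standard LSTM formulation (input, forget and output gates plus a candidate cell value). The layer sizes, type names and function names are hypothetical and chosen only for illustration; HLS directives, fixed-point arithmetic and TFLM integration are deliberately left out, and none of this is taken from the project's actual implementation.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical sizes, chosen for illustration only.
constexpr std::size_t kInputSize = 4;
constexpr std::size_t kHiddenSize = 3;
constexpr std::size_t kConcatSize = kInputSize + kHiddenSize;

using InputVec = std::array<float, kInputSize>;
using HiddenVec = std::array<float, kHiddenSize>;
using WeightRow = std::array<float, kConcatSize>;
// One weight row per hidden unit, applied to the concatenation [x_t, h_{t-1}].
using GateWeights = std::array<WeightRow, kHiddenSize>;

// Weights and biases of the four gates (all zero-initialized by default).
struct LstmParams {
    GateWeights w_input{}, w_forget{}, w_cell{}, w_output{};
    HiddenVec b_input{}, b_forget{}, b_cell{}, b_output{};
};

// Hidden state h and cell state c carried between time steps.
struct LstmState {
    HiddenVec h{};
    HiddenVec c{};
};

inline float sigmoid(float v) { return 1.0f / (1.0f + std::exp(-v)); }

// Dot product of one weight row with [x, h_prev], plus the bias.
inline float gate_preactivation(const WeightRow& row, float bias,
                                const InputVec& x, const HiddenVec& h_prev) {
    float acc = bias;
    for (std::size_t i = 0; i < kInputSize; ++i) {
        acc += row[i] * x[i];
    }
    for (std::size_t j = 0; j < kHiddenSize; ++j) {
        acc += row[kInputSize + j] * h_prev[j];
    }
    return acc;
}

// One LSTM time step: computes the next hidden and cell state.
inline LstmState lstm_step(const LstmParams& p, const InputVec& x,
                           const LstmState& prev) {
    LstmState next;
    for (std::size_t u = 0; u < kHiddenSize; ++u) {
        const float i_gate =
            sigmoid(gate_preactivation(p.w_input[u], p.b_input[u], x, prev.h));
        const float f_gate =
            sigmoid(gate_preactivation(p.w_forget[u], p.b_forget[u], x, prev.h));
        const float o_gate =
            sigmoid(gate_preactivation(p.w_output[u], p.b_output[u], x, prev.h));
        const float candidate =
            std::tanh(gate_preactivation(p.w_cell[u], p.b_cell[u], x, prev.h));

        // Cell state keeps part of the old value and adds the gated candidate.
        next.c[u] = f_gate * prev.c[u] + i_gate * candidate;
        // Hidden state is the gated, squashed cell state.
        next.h[u] = o_gate * std::tanh(next.c[u]);
    }
    return next;
}
```

Calling `lstm_step` once per input sample and feeding the returned state back in as `prev` processes a sequence. The multiply-accumulate loops in `gate_preactivation` dominate the work, which is why they are the natural candidates for hardware optimization.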

Key Words: RISC, FPGA, Accelerator, XILINX, HLS, RNN, LSTM, TFLM