Title |
Improving ILP with Semantic-Based Loop Unrolling Mechanism in the Hyperscalar Architecture |
DOI |
10.29428/9789860544169.201801.0190 |
Authors |
Jih-Ching Chiu;Shu-Jung Chao;Yi-Xuan Lu |
Keywords |
ILP of loop ; semantic of loop ; loop unrolling ; hyperscalar |
Journal Title |
NCS 2017 National Computer Symposium (全國計算機會議) |
Volume/Issue / Publication Date |
2017(2018 / 01 / 01) |
Pages |
1012 - 1017 |
Language |
English |
Abstract |
In this paper, we propose a semantic-based loop unrolling mechanism for the hyperscalar architecture. The mechanism unrolls loops automatically in the instruction analyzer (IA): it parses the semantics of the instructions to find the closed interval of loop-body instructions that matches the patterns we formulate, then analyzes the information gathered about that interval. The proposed architecture consists of three units: the loop detect unit (LDU), the loop unrolling unit (LUU), and the loop controller. The LDU finds the closed interval of loop-body instructions by parsing instruction semantics against the formulated patterns and collects information about this interval. The LUU unrolls the loop based on the information collected by the LDU. The unrolling procedure of the LUU is as follows: (1) Decide the number of times to unroll the loop based on the number of available cores, and add the SEQ tag to the instructions. (2) Rename registers and eliminate the iteration dependences of the unrolled loop. (3) Generate tags for the instructions and add compensation tags to ensure data correctness. (4) Rearrange the issue order so that instructions whose iteration dependences have been eliminated are issued first, and generate the instruction tag dispatch table, the loop VSRF mapping table, the loop M tag mapping table, and the loop-specific instruction flush table. The loop controller decides whether the LUU holds the dispatch right, based on mispredicted branch instructions and on the loops that have finished the unrolling procedure. If a mispredicted branch instruction is identical to an unrolled loop's conditional check branch instruction, the dispatch right is handed over to the LUU. When execution of the unrolled loop finishes, the loop controller hands the dispatch right back to the IA. In this paper, the ARM instructions used for verification are generated by the Keil μVision5 compiler.
The results show that eliminating iteration dependences can improve ILP by 20% to 100%, and that flushing specific instructions can decrease the total execution time of loops whose bodies contain internal branch instructions. |
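As a rough software analogy for steps (1) and (2) of the unrolling procedure, the sketch below unrolls a summation loop by a factor equal to an assumed core count, gives each unrolled copy its own accumulator (standing in for register renaming), and derives each copy's index from a shared base so that no copy waits on another copy's increment. The function names, the `cores` parameter, and the remainder loop are illustrative assumptions; the abstract does not describe the hardware mechanism at this level of detail, and the tags and tables of steps (3) and (4) are not modeled.

```python
def rolled_sum(data):
    """Reference loop: each iteration depends on the previous i and total."""
    total = 0
    i = 0
    while i < len(data):
        total += data[i]
        i += 1
    return total


def unrolled_sum(data, cores=4):
    """Same loop unrolled `cores` times (hypothetical unroll factor)."""
    n = len(data)
    partial = [0] * cores  # one "renamed" accumulator per unrolled copy
    i = 0
    while i + cores <= n:
        # Every copy computes its index from the shared base i (i+0, i+1, ...),
        # so the copies carry no dependence on each other's update of i.
        for k in range(cores):
            partial[k] += data[i + k]
        i += cores  # a single induction-variable update per unrolled group
    total = sum(partial)
    # Leftover iterations that do not fill a whole unrolled group.
    while i < n:
        total += data[i]
        i += 1
    return total
```

Because the unrolled copies inside one group touch disjoint accumulators and indices, they could in principle be issued in parallel, which is the source of the ILP gain the abstract reports; the remainder loop keeps the result correct when the trip count is not a multiple of the unroll factor.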
Subject Classification |
Basic and Applied Sciences >
Information Science |