New Type of High-Performance GPU
Horizontal operation compared to conventional vertical operation
- Unlike the conventional approach of executing commands cycle by cycle, the same command is executed as one action over several cycles.
- More registers are required, but the gains in performance and cost
can offset the larger register count.
- Regarding performance, the same command is executed continuously, which relaxes timing conditions, and high-level functions can be supported through multi-level pipelines.
- Regarding cost, a simple circuit without a command cache can be implemented, and low-cost SRAM can be used.
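
The difference between the two execution orders can be sketched in software. This is only an illustrative model, not the hardware design: the function names, the three-command program, and the pixel values are all hypothetical. It assumes that "vertical" means running every command on one data element before moving to the next, and "horizontal" means issuing each command once and repeating it over all elements.

```python
def run_vertical(program, pixels):
    """Conventional order: for each pixel, step through every command."""
    issues = 0
    out = []
    for p in pixels:
        acc = p                      # a single working register is reused
        for op in program:
            acc = op(acc)
            issues += 1              # a command is issued at every step
        out.append(acc)
    return out, issues, 1            # results, command issues, live registers


def run_horizontal(program, pixels):
    """Horizontal order: issue each command once, repeat it over all pixels."""
    regs = list(pixels)              # one register per pixel must stay live
    issues = 0
    for op in program:
        issues += 1                  # the command is issued once ...
        for i in range(len(regs)):   # ... and applied on consecutive cycles
            regs[i] = op(regs[i])
    return regs, issues, len(regs)


# Hypothetical three-command program: scale, offset, wrap to 8 bits.
program = [lambda x: x * 2, lambda x: x + 3, lambda x: x & 0xFF]
pixels = [10, 200, 255, 0]

v_out, v_issues, v_regs = run_vertical(program, pixels)
h_out, h_issues, h_regs = run_horizontal(program, pixels)
print(v_out == h_out, v_issues, h_issues, v_regs, h_regs)
```

In this model both orders produce the same results, but the horizontal order issues each command once instead of once per pixel, at the price of keeping one register live per pixel. That is the trade-off the bullets above describe.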
Performance advantage over processor technology / program simplification
- The reason for the high performance is the elimination of data dependencies,
which are inherent to conventional processor technology.
However, this is restricted to image processing, where the same operations are
- Because the data are independent, a very long pipeline can be constructed, the operating frequency can be raised, and trigonometric functions can be computed using CORDIC.
- Since data dependencies need no attention, software can be implemented easily, with fewer hardware-specific restrictions.
- Compact architecture that can also be implemented on an FPGA (approximately
10K ALMs per processor element on an Altera Cyclone V).
- Number of processor elements can be freely set (User operator can also