United States Patent Application Publication
Colenbrander
Pub. No.: US 2020/0242723 A1
Pub. Date: Jul. 30, 2020
Filed: Jan. [date truncated in source]
Int. Cl. / U.S. Cl. (CPC): G06T 1/20 (2013.01); [remaining classification codes illegible in source]

[Drawing sheets 1-7, FIGS. 1-14. Labels recoverable from the sheets:
FIG. 1: consumer electronic devices (display, speakers, processor, network interface, input device, camera, port, medium, GPS, Bluetooth, near field communication element, auxiliary sensors, infrared, climate sensors, biometric sensors) and a cable or satellite source.
FIG. 2: cloud gaming management server, client game console, storage servers 1 to N with RAM.
FIG. 3: non-uniform memory access; two CPU/GPU pairs, each with its own memory controller and memory (RAM).
FIG. 4: separate dies; CPUs and GPUs with one memory controller and memory.
FIG. 5: extra die with extra APU; shared memory controller and memory (RAM).
FIG. 6: video encoder, scanout unit, registers, buffer IDs.
FIG. 7: NUMA technique A, GPUs render different frames. Assign memory regions as frame buffers (e.g., two buffers); program registers of scanout unit to point to buffer managed by different GPU; cycle through buffers to output HDMI.
FIG. 8: NUMA technique B, GPUs render different frames. Assign memory regions as frame buffers; program registers to point only to local buffers; receive frame from other GPU via direct memory access; cycle through frames to output HDMI.
FIG. 9: GPUs render different portions of each frame (NUMA technique 1). Generate N of M lines from buffer 1; generate M-N lines from buffer 2; output HDMI frame lines 1-M.
FIG. 10: GPUs render different portions of each frame (NUMA technique 2). Generate N of M lines from buffer 1; receive M-N lines from second GPU via DMA; output HDMI frame lines 1-M.
FIG. 11: GPUs render different portions of each frame (shared memory controller). GPU 1 renders N lines to frame buffer; GPU 2 renders M-N lines to same buffer; output HDMI frame lines 1-M.
FIG. 12: which GPU controls HDMI output (technique 1). Physically connect HDMI port to particular GPU.
FIG. 13: which GPU controls HDMI output (technique 2). Each GPU has its own output port; multiplexer toggles between ports.
FIG. 14: GPUs, multiplexer, HDMI.]

SCALABLE GAME CONSOLE CPU/GPU DESIGN FOR HOME CONSOLE AND CLOUD GAMING

FIELD

[0001] The application relates generally to scalable game console CPU/GPU designs for home consoles and cloud gaming.

BACKGROUND

[0002] Simulation consoles such as computer game consoles typically use a single chip, referred to as a "system on chip" (SoC), that contains a central processing unit (CPU) and a graphics processing unit (GPU). Due to semiconductor scaling challenges and yield issues, multiple small chips can be linked by high-speed coherent busses to form big chips. While such a scaling solution is slightly less optimal in performance compared to building a huge monolith chip, it is less costly.

SUMMARY

[0003] As understood herein, SoC technology can be applied to video simulation consoles such as game consoles, and in particular a single SoC may be provided for a "light" version of the console while plural SoCs may be used to provide a "high-end" version of the console with greater processing and storage capability than the "light" version. The "high-end" system can also contain more memory, such as random-access memory (RAM), and other features, and may also be used for a cloud-optimized version using the same game console chip with more performance.

[0004] As further understood herein, however, such a "high-end" multiple-SoC design poses challenges to the software and simulation (game) design, which must scale accordingly. As an example, challenges arise related to non-uniform memory access (NUMA) and thread management, as well as providing hints to software to use the hardware in the best way. In the case of GPUs working in concert, the framebuffer management and control of high definition multimedia (HDMI) output may be addressed. Other challenges as well may be addressed herein.

[0005] Accordingly, an apparatus includes at least a first graphics processing unit (GPU), and at least a second GPU communicatively coupled to the first GPU.
The GPUs are programmed to render respective portions of video, such that the first GPU renders first portions of video and the second GPU renders second portions of the video, with the first and second portions being different from each other.

[0006] Stated differently, the first GPU may be programmed for rendering first frames of video to provide a first output, while the second GPU is programmed for rendering some, but not all, frames of the video to provide a second output. The frames rendered by the second GPU are different from the frames rendered by the first GPU. The first and second outputs may be combined to render the video. In addition, or alternatively, the first GPU may be programmed for rendering all of some, but not all, lines of a frame of video to provide a first line output and the second GPU may be programmed for rendering some, but not all, lines of the frame of the video to provide a second line output. The lines rendered by the second GPU are different from the lines rendered by the first GPU. The first and second line outputs can be combined to render the frame.

[0007] In some embodiments, the first and second GPUs are implemented on a common die. In other embodiments, the first and second GPUs are implemented on respective first and second dies. The first GPU may be associated with a first central processing unit (CPU) and the second GPU may be associated with a second CPU.

[0008] In some implementations, a first memory controller and first memory are associated with the first GPU and a second memory controller and second memory are associated with the second GPU. In other implementations, the GPUs share a common memory controller controlling a common memory.

[0009] In some examples, each GPU is programmed to render all of some, but not all, frames of video different from frames of the video rendered by the other GPU to provide a respective output. The outputs of the GPUs can be combined to render the video.
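The frame-level and line-level divisions of work described above can be illustrated with a minimal Python sketch. It is not taken from the application: the function names, the round-robin assignment of frames, and the contiguous split of lines at a boundary N are all assumptions chosen for illustration. The text itself only requires that each GPU render some, but not all, of the frames or lines and that the two sets differ.

```python
from typing import List, Tuple


def assign_frames(num_frames: int, num_gpus: int = 2) -> List[List[int]]:
    """Hypothetical frame split: GPU g gets frames g, g + num_gpus, ...

    Round-robin is an assumption; any disjoint partition of the frames
    would satisfy the description.
    """
    return [list(range(g, num_frames, num_gpus)) for g in range(num_gpus)]


def combine_frames(per_gpu: List[List[int]]) -> List[int]:
    """Merge the per-GPU frame outputs back into display order."""
    return sorted(frame for frames in per_gpu for frame in frames)


def split_lines(m: int, n: int) -> Tuple[range, range]:
    """Hypothetical line split of one M-line frame at boundary N.

    The first GPU renders lines 1..N and the second renders N+1..M
    (1-based line numbers).
    """
    if not 0 < n < m:
        raise ValueError("N must satisfy 0 < N < M so both GPUs get lines")
    return range(1, n + 1), range(n + 1, m + 1)


if __name__ == "__main__":
    per_gpu = assign_frames(6)
    print("frames per GPU:", per_gpu)
    print("combined order:", combine_frames(per_gpu))
    first, second = split_lines(m=1080, n=540)
    print("GPU 1 lines:", first[0], "to", first[-1])
    print("GPU 2 lines:", second[0], "to", second[-1])
```

In both cases the two GPUs cover disjoint parts of the video, and the union of their outputs is the complete sequence of frames or the complete frame.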
In other examples, each GPU is programmed to render all of some, but not all, lines of a frame of video, with lines of a frame of video rendered by a GPU being different from lines of the frame rendered by the other GPU. The outputs of the GPUs can be combined to render the video.

[0010] In an example technique, the first GPU includes at least one scanout unit pointing to at least one buffer managed by the second GPU. The first GPU can be programmed to cycle through buffers to output a complete sequence of frames of the video. In another example, the first GPU includes at least one scanout unit pointing only to buffers managed by the first GPU and is programmed to receive frames of the video from the second GPU via direct memory access (DMA) to output a complete sequence of frames of the video.

[0011] In yet another example technique, the first GPU includes at least one scanout unit pointing to at least a first buffer managed by the first GPU and a second buffer managed by the second GPU. In this technique, the first GPU is programmed to cycle through buffers to output a complete sequence of frames of video using 1-N lines associated with the first buffer and (N+1)-M lines associated with the second buffer. The 1-N lines are different lines of the same frame associated with the (N+1)-M lines.

[0012] Yet again, the first GPU can include at least one scanout unit pointing to at least a first buffer managed by the first GPU and not to a second buffer managed by the second GPU. In this implementation, the first GPU may be programmed to cycle through buffers to output a complete sequence of frames of video using 1-N lines associated with the first buffer and (N+1)-M lines associated with the second buffer and received by the first GPU via direct memory access (DMA). The 1-N lines and (N+1)-M lines are different lines of the frame of video.

[0013] In still another technique, the first GPU includes at least one scanout unit pointing to at least a first buffer communicating with the common memory controller.
The second GPU includes a second buffer communicating with the common memory controller. The first GPU is programmed for rendering 1-N lines associated with the first buffer and the second GPU is programmed for rendering (N+1)-M lines associated with the second buffer.

[0014] In some examples, the first GPU manages video data output from the first and second GPUs. This may be effected by physically connecting an HDMI port to the first GPU. In other examples, the GPUs output video data to a multiplexer that multiplexes the frames and/or lines together to output video.

[0015] In another aspect, in a multi-graphics processing unit (GPU) simulation environment, a method includes causing plural GPUs to render respective frames of video, or to render respective portions of each frame of video, or both to render respective frames and respective portions of frames of video. The method includes controlling frame output using a first one of the GPUs receiving frame information from at least one other of the GPU(s), or multiplexing outputs of the GPUs together, or both using a first one of the GPUs receiving frame information from at least one other of the GPU(s) and multiplexing outputs of the GPUs together.

[0016] In another aspect, a computer simulation apparatus includes at least a first graphics processing unit (GPU) programmed for rendering a respective first portion of simulation video, and at least a second GPU programmed for rendering a respective second portion of simulation video. At least the first GPU is programmed to combine the first and second portions and to render an output establishing a complete simulation video.

[0017] The details of the present application, both as to its structure and operation, can best be understood in reference to the accompanying drawings, in which like reference numerals refer to like parts, and in which:

BRIEF DESCRIPTION OF THE DRAWINGS

[0018] FIG. 1 is a block diagram of an example system including an example in accordance with present principles;

[0019] FIG. 2 is a schematic diagram of a cloud-based gaming system;

[0020] FIG. 3 is a block diagram of an example non-uniform memory access (NUMA) architecture, in which two APUs are shown on a single fabric, it being understood that the NUMA architecture may be implemented by APUs on separate fabrics and that more than two APUs may be implemented;

[0021] FIG. 4 is a block diagram of a shared memory architecture in which two APUs are shown with each processor being implemented on its own respective die, it being understood that the architecture may be implemented on fewer or even one die and that more than two APUs may be implemented;

[0022] FIG. 5 is a block diagram of a shared memory architecture in which two APUs are shown with each APU being implemented on its own respective fabric and with the shared memory controller being implemented on one of the fabrics, it being understood that the architecture may be implemented on one fabric and that more than two APUs may be implemented on one or more dies;

[0023] FIG. 6 is a block diagram of an example GPU with scanout unit;

[0024] FIG. 7 is a flow chart of example logic of a NUMA embodiment in which each GPU renders complete frames with each GPU rendering different frames of the same video than the other GPU, with one of the GPUs having registers pointing to buffers of the other GPU(s);

[0025] FIG. 8 is a flow chart of example logic of a NUMA embodiment in which each GPU renders complete frames with each GPU rendering different frames of the same video than the other GPU, with one of the GPUs receiving frames via DMA from the other GPU(s);

[0026] FIG. 9 is a flow chart of example logic of a NUMA embodiment in which each GPU renders portions (e.g., lines) of frames with each GPU rendering different portions of the same frame than the other GPU;

[0027] FIG. 10 is a flow chart of example logic of a NUMA embodiment in which each GPU renders portions (e.g., lines) of frames with each GPU rendering different portions of the same frame than the other GPU, with one of the GPUs receiving lines via DMA from the other GPU(s);

[0028] FIG. 11 is a flow chart of example logic of a shared memory embodiment in which each GPU renders portions (e.g., lines) of frames with each GPU rendering different portions of the same frame than the other GPU;

[0029] FIG. 12 is a flow chart of example logic for controlling video output using a single GPU connected to an HDMI port;

[0030] FIG. 13 is a flow chart of example logic for controlling video output using a multiplexer; and

[0031] FIG. 14 is a block diagram associated with FIG. 13.

DETAILED DESCRIPTION

[0032] This disclosure relates generally to computer ecosystems including aspects of consumer electronics (CE) device networks such as but not limited to distributed computer game networks, video broadcasting, content delivery networks, virtual machines, and machine learning applications. A system herein may include server and client components, connected over a network such that data may be exchanged between the client and server components. The client components may include one or more computing devices including game consoles such as Sony PlayStation and related motherboards, portable televisions (e.g., smart TVs, Internet-enabled TVs), portable computers such as laptops and tablet computers, and other mobile devices including smart phones and additional examples discussed below. These client devices may operate with a variety of operating environments. For example, some of the client computers may employ, as examples, Orbis or Linux operating systems, operating systems from Microsoft, or a Unix operating system, or operating systems produced by Apple Computer or Google.
These operating environments may be used to execute one or more browsing programs, such as a browser made by Microsoft or Google or Mozilla or other browser program that can access websites hosted by the Internet servers discussed below. Also, an operating environment according to present principles may be used to execute one or more computer game programs.

[0033] Servers and/or gateways may include one or more processors executing instructions that configure the servers to receive and transmit data over a network such as the Internet. Or, a client and server can be connected over a local intranet or a virtual private network. A server or controller may be instantiated by a game console and/or one or more motherboards thereof such as a Sony PlayStation®, a personal computer, etc.

[0034] Information may be exchanged over a network between the clients and servers. To this end and for security, servers and/or clients can include firewalls, load balancers, temporary storages, and proxies, and other network infrastructure for reliability and security. One or more servers may form an apparatus that implements methods of providing a secure community such as an online social website to network members.

[0035] As used herein, instructions refer to computer-implemented steps for processing information in the system. Instructions can be implemented in software, firmware or hardware and include any type of programmed step undertaken by components of the system.

[0036] A processor may be any conventional general-purpose single- or multi-chip processor that can execute logic by means of various lines such as address lines, data lines, and control lines and registers and shift registers.

[0037] Software modules described by way of the flow charts and user interfaces herein can include various sub-routines, procedures, etc. Without limiting the disclosure, logic stated to be executed by a particular module can be redistributed to other software modules and/or combined together in a single module and/or made available in a shareable library.

[0038] Present principles described herein can be implemented as hardware, software, firmware, or combinations thereof; hence, illustrative components, blocks, modules, circuits, and steps are set forth in terms of their functionality.

[0039] Further to what has been alluded to above, logical blocks, modules, and circuits described below can be implemented or performed with a general purpose processor, a digital signal processor (DSP), a field programmable gate array (FPGA) or other programmable logic device such as an application specific integrated circuit (ASIC), discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A processor can be implemented by a controller or state machine or a combination of computing devices.

[0040] The functions and methods described below, when implemented in software, can be written in an appropriate language such as but not limited to Java, C# or C++, and can be stored on or transmitted through a computer-readable storage medium such as a random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), compact disk read-only memory (CD-ROM) or other optical disk storage such as digital versatile disc (DVD), magnetic disk storage or other magnetic storage devices including removable thumb drives, etc. A connection may establish a computer-readable medium. Such connections can include, as examples, hard-wired cables including fiber optics and coaxial wires and
