June 21, 21
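Slide overview
Note: this talk is a Japanese-dubbed version of the talk given at Unite Seoul 2020.
Zhenzhong Yi, Technical Director at miHoYo, talks about the rendering pipeline of Genshin Impact and cross-platform development on console.
We handle sales, support, community activities, research and development, and educational support in Japan for Unity, the world's leading platform for creating and operating real-time 3D content. Creators of all kinds, from game developers to artists, architects, automotive designers, and filmmakers, use Unity to bring their imagination to life.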
From Mobile to Console: Genshin Impact’s rendering technology on Console Zhenzhong Yi Studio Technical Director, miHoYo Head of Genshin Impact Console Development
Genshin Impact • Open-world action RPG • Features an “anime” rendering style • Multiplatform support, including PS4, PC, and mobile versions • Long game life-cycle with numerous updates over time
About Myself • Zhenzhong Yi • Over 10 years of console game development experience. Worked for companies both in China and the US before joining miHoYo, including: • Microsoft Xbox - Seattle, WA • Avalanche Studios - New York City, NY • Zindagi Games - Los Angeles, CA • Ubisoft Shanghai • Returned to China and joined miHoYo in early 2019 to build the console development team
Shipped Titles
Presentation Outline • Console rendering pipeline overview • Analysis of key technical points • Summary and conclusion
• Unity is an extremely flexible engine • Unity’s concise coding style allowed us to more conveniently customize the development of Genshin Impact's rendering pipeline • Unity China’s technical support also provided great assistance and cooperation, so we truly appreciate their help
• More than half of our efforts were focused on making full use of the console’s hardware architecture in development and optimization • We accumulated tons of experience through this process and had several technical breakthroughs • However… According to Sony NDA restrictions: • We cannot share hardware-related content • We cannot share information concerning low-level optimization • We will not be discussing the CPU, I/O, or other such modules today
Presentation Outline • Console rendering pipeline overview • Analysis of key technical points • Summary and conclusion
Console Rendering Pipeline Overview • Genshin Impact’s engine has two different customized rendering pipelines • PC and console ➔ “console rendering pipeline” • Android and iOS ➔ “mobile rendering pipeline” • The overall tone is PBR based stylized rendering • The game’s versions are under simultaneous development, with mobile serving as the primary development platform, and console development following close behind • The selection of technologies must consider resource production and possible runtime costs of all platforms
Console Rendering Pipeline Overview • Based on PBR, which keeps the light and shadow effects of the entire environment uniform • Not completely energy conserving • The lighting models of different materials are altered based on art requirements • The game’s light and shadow effects are calculated in real-time • Time of day • Dynamic weather
Console Rendering Pipeline Overview • High-resolution output • PS4 Pro – native 4K resolution • Standard PS4 – 1440p rendering resolution, 1080p output resolution • A clearer image is more conducive to our game’s art style • Extensive use of compute shaders • Over half of the features in the console pipeline were implemented via compute shaders
Console Rendering Pipeline Overview • Lighting and materials appear very different from more realistic-looking games, since the art team had very special requirements for them • Dirty, noisy, dark • Fresh, bright, clean, anime-like • We kept these 2 groups of words in mind at all times
What We Started With • A deeply customized Unity engine for mobile • We couldn’t just rely on Unity's own PS4 platform implementation • Nearly non-existent development resources • Initially, the console team consisted of one member: me! • Whole studio severely lacked console development experience • Many other things to handle: Sony Accounts, PSN Store, PS4 TRC, etc. • A tight development deadline • Starting from zero with a timeline of approximately a year and a half
Principles We Followed • When transforming the rendering pipeline for console, we adhered to the following ideas: • Avoid excessive features • Cool-looking technologies do not necessarily match the style of our game • Choose practical and mature technologies whenever possible • No time for trial and error • It’s best to have interaction between multiple technologies • Systematic transformation allows the resulting picture to appear more unified • The benefits of improved picture are exponential: 1 + 1 > 2
Presentation Outline • Console rendering pipeline overview • Analysis of key technical points • Summary and conclusion
• Selected a few technical points from scene lighting and shadow effects • Shared ideas on how to upgrade Genshin Impact’s visual quality • Focused on methodology
Shadows • To match the art style, shadows needed to provide enough details up close while still covering a large enough area
Shadows • Cascaded shadow map + Poisson noise soft shadows • Did not use the usual 4 cascades, instead we used up to 8 cascades • More cascades bring better shadow effects, but also bring more performance overhead • More draw calls resulting in more CPU overhead • More cascades resulting in more GPU overhead
Shadows • CPU optimization • Used shadow caching to reduce draw call count • First 4 cascades are updated with every frame • Last 4 cascades are updated via interleaving • Total 5 cascades updated each frame • All cascades are updated at least once every 8 frames
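The update pattern above can be illustrated with a minimal sketch. It assumes a plain round-robin over the last 4 cascades, which the slide does not spell out; `cascades_to_update` and the constant names are hypothetical. Round-robin refreshes each far cascade every 4 frames, which stays within the 8-frame bound stated on the slide; the shipped schedule may weight cascades differently.

```python
NUM_CASCADES = 8     # "up to 8 cascades" from the slide
ALWAYS_UPDATED = 4   # the first 4 cascades are re-rendered every frame


def cascades_to_update(frame_index):
    """Return the cascade indices re-rendered on the given frame."""
    near = list(range(ALWAYS_UPDATED))
    far_count = NUM_CASCADES - ALWAYS_UPDATED
    # Assumption: one far cascade per frame, chosen round-robin.
    far = ALWAYS_UPDATED + (frame_index % far_count)
    return near + [far]


if __name__ == "__main__":
    for frame in range(4):
        print(frame, cascades_to_update(frame))
```

Every frame touches 5 cascades in total, and the cached shadow maps of the other 3 far cascades are reused as they are.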
Shadows • GPU optimization • We used a screen space shadow map • Each pixel will do 11 samples based on Poisson disc to generate soft shadows, and to eliminate banding, sampling patterns are rotated • The cost of the entire pass is about 2 to 2.6ms • Do we really need to do such intensive calculations for every single pixel?
Shadows • GPU optimizations • Soft shadows are only useful at edges of shadows • A full-screen mask map is generated to mark the shadow, non-shadow, and penumbra areas • Soft shadows are only calculated for pixels marked in penumbra areas • 2 – 2.6ms ➔ 1.3 – 1.7ms
Shadows
Mask Generation • Low resolution calculation • ¼ x ¼ resolution • 16 pixels correspond to a mask value • Calculate each pixel to determine if it’s in shadow or not, merge the results to get the mask pixels • Accurate, but slow, and must be done 16 times • Optimization: Select a small number of sample points to calculate, get approximate results • A few samples cannot perfectly represent the 4x4 tile • Blur the mask map to enlarge the penumbra area
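As a rough CPU-side illustration of the mask idea (the real pass runs on the GPU), the sketch below samples a few points per 4x4 tile, blurs the quarter-resolution mask, and classifies each tile. The sample offsets, the 3x3 box blur, and all function names are assumptions made for this example only.

```python
# Hypothetical sample positions inside a 4x4 tile (dx, dy).
SAMPLE_OFFSETS = [(0, 0), (3, 1), (1, 3), (2, 2)]


def build_coarse_mask(shadow, tile=4):
    """shadow[y][x] is 1 if the pixel is shadowed, else 0. Returns a 1/4 x 1/4 mask."""
    tiles_y, tiles_x = len(shadow) // tile, len(shadow[0]) // tile
    return [
        [
            sum(shadow[ty * tile + dy][tx * tile + dx] for dx, dy in SAMPLE_OFFSETS)
            / len(SAMPLE_OFFSETS)
            for tx in range(tiles_x)
        ]
        for ty in range(tiles_y)
    ]


def blur3x3(mask):
    """Box blur so tiles next to a shadow edge are also treated as penumbra."""
    h, w = len(mask), len(mask[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            values = [
                mask[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = sum(values) / len(values)
    return out


def classify(value, eps=1e-6):
    if value <= eps:
        return "lit"
    if value >= 1.0 - eps:
        return "shadow"
    return "penumbra"  # only these tiles pay for the Poisson soft-shadow samples


if __name__ == "__main__":
    # 32 x 8 image whose left part (x < 14) is in shadow.
    image = [[1 if x < 14 else 0 for x in range(32)] for _ in range(8)]
    mask = blur3x3(build_coarse_mask(image))
    print([classify(v) for v in mask[0]])
```

Only tiles classified as penumbra would run the expensive per-pixel soft-shadow sampling; the other tiles can take a single cheap shadow result.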
Shadow Optimization OFF
Shadow Optimization ON
Multiple AO Technologies • Characters and scenery that are already in shadow appear to be floating • Using ambient occlusion (AO), characters and scenery in the shadows will cast faint soft shadows around them • The game uses 3 different kinds of AO technology • HBAO provides more details in small areas • AO Volume provides a wide range of AO for static objects • Capsule AO provides a wide range of AO for characters
HBAO OFF
HBAO ON
AO Volume OFF
AO Volume ON
AO Volume • AO Volume generates a larger range of AO than HBAO • For example, a tabletop can cast large AO on the floor • HBAO cannot provide such effects • Occlusion information for each object is generated offline in object local space • AO value is calculated during runtime via this local space occlusion information
Capsule AO OFF
Capsule AO ON
Capsule AO • AO Volume solves the large range AO problem for static objects • But the game’s characters have skeletal animations • The occlusion information cannot be just generated offline • We used capsule AO technology for characters • Skeletal animations are used to update capsules • The occlusion is directional
Applying AO • ½ x ½ resolution AO render target, bilaterally upsampled to the full-resolution AO texture • A bilateral-filtered Gaussian blur is applied to eliminate noise. Remember, no noise! • Points for further optimization: • The bilateral filter has lots of repeated calculations • 2 blur passes + 1 upsample pass = multiple AO reads and writes • Solution: • Complete the blur and upsample in a single compute shader pass
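To make the "bilateral upsample" step concrete, here is a minimal depth-aware upsample for one full-resolution pixel. It is a simplified sketch: it weights the neighbouring half-resolution AO samples only by depth similarity and leaves out the bilinear distance weights a production filter would normally combine with them. The function name and the `eps` value are assumptions.

```python
def bilateral_upsample(full_res_depth, low_res_samples, eps=1e-4):
    """low_res_samples: iterable of (ao, depth) pairs from the half-res target.

    Samples whose depth is close to the full-res pixel's depth dominate the
    result, so AO does not bleed across depth discontinuities.
    """
    weight_sum = 0.0
    ao_sum = 0.0
    for ao, depth in low_res_samples:
        weight = 1.0 / (eps + abs(full_res_depth - depth))
        ao_sum += ao * weight
        weight_sum += weight
    return ao_sum / weight_sum


if __name__ == "__main__":
    # One neighbour lies on the same surface (depth 10), three are far behind it.
    neighbours = [(0.2, 10.0), (0.9, 50.0), (0.9, 50.0), (0.9, 50.0)]
    print(bilateral_upsample(10.0, neighbours))
```

In the example the result stays very close to 0.2, the AO of the only neighbour on the same surface.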
Local Lighting • Implemented clustered deferred lighting, supports 1024 lights in view • Screen is divided into 64 x 64 pixel tiles, with each tile sliced into 16 clusters in depth direction
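A small sketch of how a pixel could be mapped to its cluster under these numbers. The 64 x 64 tile size and the 16 depth slices come from the slide; the logarithmic depth distribution, the near/far values, and the function names are assumptions, since the talk does not say how the slices are spaced.

```python
import math

TILE_SIZE = 64      # pixels per tile side, from the slide
DEPTH_SLICES = 16   # clusters per tile along depth, from the slide


def cluster_grid_size(width, height):
    """Number of clusters along x, y and depth for a given resolution."""
    return (
        math.ceil(width / TILE_SIZE),
        math.ceil(height / TILE_SIZE),
        DEPTH_SLICES,
    )


def cluster_coords(px, py, view_depth, near, far):
    """Map a pixel and its view-space depth to (tile_x, tile_y, slice)."""
    tile_x = px // TILE_SIZE
    tile_y = py // TILE_SIZE
    # Assumption: logarithmic slicing between the near and far planes.
    depth = min(max(view_depth, near), far)
    t = math.log(depth / near) / math.log(far / near)
    depth_slice = min(DEPTH_SLICES - 1, int(t * DEPTH_SLICES))
    return tile_x, tile_y, depth_slice


if __name__ == "__main__":
    print(cluster_grid_size(1920, 1080))
    print(cluster_coords(960, 540, 12.0, near=0.1, far=1000.0))
```

Each cluster then holds the list of lights that overlap it, so a pixel only evaluates the lights in its own cluster instead of all lights in view.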
Local Light Shadows
Local Light Shadows • Nearly 100 lights in view can cast real-time shadows • We could support more, but these are enough • Dynamically adjust shadow map resolution based on distance and priorities • Baked static shadow texture + dynamic object shadows • With many local lights, the large number of baked shadow textures puts a heavy load on both the game’s storage size and I/O
Local Light Shadows • Static scene shadow texture is baked offline and then compressed • A compute shader decompresses it at runtime, and the decompression is very fast • On a base PS4, about 0.05ms to decompress a 1k x 1k shadow texture
Local Light Shadow Texture Compression • Each 2 x 2 block (4 depth values) is encoded into 32 bits • Plane equation mode or packed floating-point mode • Optional high-precision compression uses 64 bits • Quadtree is used to merge encoded data, further increasing compression rate • 16 x 16 blocks form a single tile, each tile has 1 quadtree • Reference: [Bo Li, 2019]
Local Light Shadow Texture Compression • Compression rate: in default precision mode, the compression rate of a typical indoor scene is about 20:1 – 30:1; in high-precision mode it is about 40 – 70% of that • Compressing the shadow textures is essential: it reduces their size by an order of magnitude
Local Light Shadow Texture Compression Default precision compression
Local Light Shadow Texture Compression High-Precision compression
Local Light Shadow Texture Compression • The size of a 2k x 2k texture is reduced from 8MB to 274.4KB • The default precision compression rate reaches up to 29.85:1 • The high-precision compressed texture is indistinguishable from the uncompressed texture to the naked eye • In high-precision mode, the resulting texture size is 583.5KB, an approximate compression ratio of 14:1
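The figures on this slide can be checked with a few lines of arithmetic. The sketch assumes binary units (1 MB = 1024 KB), which is the reading under which the quoted 29.85:1 ratio comes out exactly; the variable names are illustrative.

```python
KB_PER_MB = 1024  # assumption: binary units

uncompressed_kb = 8 * KB_PER_MB   # 2k x 2k shadow texture, 8MB
default_kb = 274.4                # default precision result
high_precision_kb = 583.5         # high-precision result

default_ratio = uncompressed_kb / default_kb
high_precision_ratio = uncompressed_kb / high_precision_kb
relative = high_precision_ratio / default_ratio

# 8MB over 2048 x 2048 texels implies 2 bytes (16 bits) per texel.
bytes_per_texel = 8 * 1024 * 1024 / (2048 * 2048)

if __name__ == "__main__":
    print(f"default: {default_ratio:.2f}:1")
    print(f"high precision: {high_precision_ratio:.2f}:1")
    print(f"high precision relative to default: {relative:.0%}")
    print(f"uncompressed bytes per texel: {bytes_per_texel}")
```

The high-precision ratio works out to a bit under half of the default one, which sits inside the 40 – 70% range quoted for typical indoor scenes.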
Volumetric Fog • Volumetric fog can be illuminated by a light source
Volumetric Fog • If the local light has a projection texture, volumetric fog will produce the corresponding effect
Volumetric Fog • Physically-based light scattering • The volumetric fog can be controlled with different parameters in different areas • Volumetric fog is illuminated by the light source • Temporal filter is used for multi-frame blending which results in a smoother and more stable fog image • The GPU cost is less than 1ms
Volumetric Fog • Camera view based • The view frustum is divided into voxels and aligned with the clusters of clustered deferred lighting • The fog parameters are saved in texture and loaded into the world via streaming • Injected into voxel while calculating • Take local lights into account • Ray marching to get volumetric information
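The final ray-marching step can be sketched as a front-to-back walk through the voxels along one view ray. This is a generic single-scattering integration, not the game's shader: each slice is assumed to already carry its extinction and its lit in-scattered radiance (the result of the light injection step), and `march_fog` is a hypothetical name.

```python
import math


def march_fog(slices):
    """slices: list of (extinction, in_scattered_radiance, length), front to back.

    Returns (accumulated scattering, remaining transmittance) for one view ray.
    """
    transmittance = 1.0
    scattered = 0.0
    for extinction, radiance, length in slices:
        step_transmittance = math.exp(-extinction * length)
        # Light scattered inside this slice, dimmed by the fog in front of it.
        scattered += transmittance * radiance * (1.0 - step_transmittance)
        transmittance *= step_transmittance
    return scattered, transmittance


if __name__ == "__main__":
    uniform_fog = [(0.1, 1.0, 1.0)] * 10
    print(march_fog(uniform_fog))
```

The scene color behind the fog would then be combined as `scene * transmittance + scattered`.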
God Rays • Occluded directional light produces the god ray effect
God Rays • A separate pass to generate god rays • ½ x ½ resolution • Generated via ray marching, sampling up to 5 shadow cascades • Provides adjustable parameters for the art team to add god rays on top of volumetric fog
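A minimal sketch of the ray-marched god ray idea: step along the view ray and measure how much of it is lit by the directional light. The `is_lit` callback is a stand-in for the cascaded shadow map lookup (the real pass samples up to 5 cascades), and the function name and parameters are assumptions.

```python
def god_ray_intensity(origin, direction, max_distance, steps, is_lit):
    """Fraction of sample points along the ray that the light reaches."""
    step = max_distance / steps
    lit = 0
    for i in range(steps):
        t = (i + 0.5) * step  # sample at the middle of each segment
        point = tuple(o + d * t for o, d in zip(origin, direction))
        if is_lit(point):
            lit += 1
    return lit / steps


if __name__ == "__main__":
    # Hypothetical occluder: everything with x >= 5 is in shadow.
    value = god_ray_intensity((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 10.0, 10,
                              lambda p: p[0] < 5.0)
    print(value)
```

An artist-facing intensity parameter could scale this value before it is added on top of the volumetric fog.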
God Rays • Volumetric fog can also generate god rays • But the resulting effect in-game was not satisfactory to the art team • The voxels’ resolution wasn’t enough • The intensity of god rays relies on the density of the volumetric fog, but dense fog on screen looks dirty • Separately generated god rays have a higher resolution, sharper edges, and more room for adjustment by the art team
God Rays God Ray From Volumetric Fog
God Rays God Ray From GodRay Pass
Image Based Lighting
Reflection Probes
Reflection Probes • Used reflection probes to provide reflection information for the scene • Time of day + dynamic weather means we cannot use baked cubemap • Offline bake scene data into a mini G-Buffer • Runtime generates cubemaps using real-time lighting condition • The art team can place multiple reflection probes wherever necessary
Reflection Probes • Reflection probes are updated at runtime • The process consists of three steps in total: relight, convolve, then compress • Compute shader is used to simultaneously process all 6 faces of a cubemap • Calculations are done in multiple frames with one probe being processed at a time, looping continuously
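The time-sliced update can be pictured as a generator that hands out one unit of work per frame. Spending exactly one frame on each step is an assumption made for this sketch; the slide only states that the work is spread over multiple frames, one probe at a time, in a continuous loop.

```python
from itertools import islice

STEPS = ("relight", "convolve", "compress")  # the three steps from the slide


def probe_update_schedule(num_probes):
    """Yield (probe_index, step) forever, one entry per frame."""
    probe = 0
    while True:
        for step in STEPS:
            yield probe, step
        probe = (probe + 1) % num_probes


if __name__ == "__main__":
    for frame, work in enumerate(islice(probe_update_schedule(2), 7)):
        print(frame, work)
```

With this scheme the per-frame cost stays flat regardless of how many probes the art team places, at the price of a longer refresh cycle.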
Reflection Probes • Relight: Mini G-Buffer is lit based on the current lighting condition
Reflection Probes • Convolve: Generate the cubemap's mipmap chain, convolution applied on each mip level • Compress: Performs a BC6H compression on cubemap, 4x4 block ➔ 128 bits
Ambient Probes
Ambient Probes • After relighting, the reflection probes contain the current lighting condition, from which we can extract ambient information • This is then saved as 3-band SH • After the reflection probes are updated, the corresponding ambient probes are then automatically updated • Implemented using compute shader
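For reference, projecting radiance onto 3-band SH (9 coefficients) can be sketched as below. The sketch works on a single scalar channel and samples directions with a Fibonacci sphere instead of iterating cubemap texels; both simplifications, and all function names, are assumptions for illustration. The basis constants are the standard real SH constants for bands 0 to 2.

```python
import math


def sh9_basis(x, y, z):
    """Real spherical harmonics, bands 0-2, for a unit direction."""
    return [
        0.282095,
        0.488603 * y,
        0.488603 * z,
        0.488603 * x,
        1.092548 * x * y,
        1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z,
        0.546274 * (x * x - y * y),
    ]


def fibonacci_sphere(count):
    """Roughly uniform unit directions; stands in for cubemap texel directions."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    for i in range(count):
        z = 1.0 - 2.0 * (i + 0.5) / count
        r = math.sqrt(max(0.0, 1.0 - z * z))
        phi = i * golden_angle
        yield r * math.cos(phi), r * math.sin(phi), z


def project_sh9(radiance, samples=4096):
    """Project a scalar radiance function radiance(x, y, z) onto 9 SH coefficients."""
    coeffs = [0.0] * 9
    for direction in fibonacci_sphere(samples):
        value = radiance(*direction)
        for i, basis in enumerate(sh9_basis(*direction)):
            coeffs[i] += value * basis
    scale = 4.0 * math.pi / samples  # solid angle per sample
    return [c * scale for c in coeffs]


if __name__ == "__main__":
    print(project_sh9(lambda x, y, z: 1.0))
```

For constant radiance only the first coefficient is significantly non-zero, which is a quick sanity check for the projection.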
Improvements on Image-Based Lighting • Relight does not consider shadows for the sake of performance and size of data • Both reflection probes and ambient probes will leak light • Ground located within a shadow will appear too bright • Shadows are baked offline and saved as shadow SH • In the same way, we save the local light's lighting information into a local light SH • Finally, during the relight step we add the shadow SH and local light SH
Improvements on Image-Based Lighting Shadow SH OFF
Improvements on Image-Based Lighting Shadow SH ON
Improvements on Image-Based Lighting Shadow SH OFF Local Light SH OFF
Improvements on Image-Based Lighting Shadow SH OFF Local Light SH ON
Interior Marks • Reflection probes are divided into indoor and outdoor types in order to handle different indoor and outdoor lighting conditions • Our art team uses the interior mesh to mark which pixels are affected by an indoor lighting environment • The ambient probes will accordingly generate different ambient lighting for indoor and outdoor environments
Interior Marks OFF
Interior Marks ON
Interior Marks Mask
Screen Space Reflection (SSR) • Different technique than that used for water surface reflections • On a PS4 Pro, the GPU overhead is around 1.5ms • A temporal filter is used to increase stability • A Hi-Z buffer is used for acceleration and allows rays to trace across the full screen • During the reflection, the color buffer of the previous frame is sampled
Screen Space Reflection OFF
Screen Space Reflection ON
Runtime Reflection System • As was visible in the previous images, even without SSR, the game will still have the reflection information provided by reflection probes • Deferred reflection pass simultaneously performs the reflection and ambient calculations • The AO value is considered when reflection information is added to the lighting calculation, which then effectively reduces issues of leaking light
HDR Display • HDR: High Dynamic Range Display • SMPTE ST 2084 Transfer Function • Supports a max brightness of 10,000 nits • WCG: Wide Color Gamut • The Rec. 2020 color space covers 75.8% of the CIE 1931 color space, compared to Rec. 709, which is commonly used by HDTV and only covers 35.9% • (Image source: Wikipedia)
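The SMPTE ST 2084 (PQ) transfer function mentioned above can be written out directly from the constants in the standard. This is a reference-style sketch for scalar luminance in nits, not the game's shader code; the function names are illustrative.

```python
# SMPTE ST 2084 (PQ) constants.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PEAK_NITS = 10000.0  # maximum luminance PQ can represent


def pq_encode(nits):
    """Absolute luminance in nits -> PQ signal in [0, 1] (inverse EOTF)."""
    y = max(nits, 0.0) / PEAK_NITS
    y_pow = y ** M1
    return ((C1 + C2 * y_pow) / (1.0 + C3 * y_pow)) ** M2


def pq_decode(signal):
    """PQ signal in [0, 1] -> absolute luminance in nits (EOTF)."""
    s_pow = signal ** (1.0 / M2)
    return PEAK_NITS * (max(s_pow - C1, 0.0) / (C2 - C3 * s_pow)) ** (1.0 / M1)


if __name__ == "__main__":
    for nits in (100.0, 1000.0, 10000.0):
        print(nits, pq_encode(nits))
```

A 100-nit SDR white lands at roughly half of the PQ signal range, which leaves the upper half for highlights up to 10,000 nits.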
HDR Display • Here’s a breakdown of Genshin Impact’s console rendering pipeline for SDR and HDR modes:
HDR Display • Starting from 1.2, Genshin Impact will support HDR10 on PS4 • No SDR tone mapping, color grading or gamma correction • WCG color grading LUT is made in software like DaVinci • WCG color LUT pass takes less than 0.05ms • Including white balance, WCG color grading and color expansion • ACES pipeline (RRT + ODT) was not used as it does not fit our game’s style • By blending HDR output with tone mapping results, the scene within the SDR brightness range looks closer to the SDR version
HDR Display • The issue of consistency within the SDR brightness range • SDR: EOTF_BT1886 (OETF_sRGB (color)) != color • HDR: Inverse_PQ(PQ(color)) = color • An OOTF curve is added at the end of the HDR pipeline to simulate the error caused by EOTF_BT1886 (OETF_sRGB (color)) conversion
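The SDR mismatch can be reproduced numerically. The sketch below uses the standard sRGB OETF and a simplified BT.1886 EOTF (pure 2.4 gamma, zero black level, normalized white); treating the display this way is an assumption, and the function names are illustrative.

```python
def oetf_srgb(linear):
    """Linear scene value in [0, 1] -> sRGB-encoded signal."""
    if linear <= 0.0031308:
        return 12.92 * linear
    return 1.055 * linear ** (1.0 / 2.4) - 0.055


def eotf_bt1886(signal, gamma=2.4):
    """Simplified BT.1886 display response: zero black level, normalized white."""
    return signal ** gamma


def sdr_round_trip(linear):
    """What an SDR display actually shows for a given linear value."""
    return eotf_bt1886(oetf_srgb(linear))


if __name__ == "__main__":
    for value in (0.05, 0.18, 0.5, 1.0):
        print(value, sdr_round_trip(value))
```

Mid-grey (0.18) comes back at about 0.156, so the SDR image is slightly darker and more contrasty than the linear values suggest. The PQ round trip has no such error, which is why the HDR path needs an extra OOTF curve shaped like `sdr_round_trip` to match the SDR look.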
Addressing the Issue of Hue Shift • Many art resources rely on hue shift generated by the tone mapping curve • There is no tone map in a hue-preserving HDR pipeline • (Image: SDR with hue shift does not match HDR with hue preserving)
Addressing the Issue of Hue Shift • Blackbody radiation solution • Temperature is used to calculate color • Requires the art team to modify their assets • Genshin Impact uses a method of simulating hue shift in the shader • Does not require any modification of assets • Final color grading now comes with a tone mapping curve, so there’s no need for mixing tone mapping as we mentioned before • Calculations are combined in the LUT
Presentation Outline • Console rendering pipeline overview • Analysis of key technical points • Summary and conclusion
• Global players' acceptance of Genshin Impact on the PlayStation 4 has far exceeded our expectations • However, there are still many areas that require further optimization, and we will continue to make more improvements • Better performance • Faster loading times • More stability • More graphics features that suit the game’s style
• Genshin Impact is only the beginning and is our first venture in console development • Time is limited, resources are limited • The team requires talent of all sorts • We’ve gained experience in combining realistic rendering technology with anime rendering style • There is still much we’d like to do, especially with the next generation of consoles arriving now • We look forward to more opportunities to exchange development experiences
We Are Hiring! • We will continue to develop future products • The game programming team is continually recruiting new members • Especially hiring console development-related positions • Join our team of tech otakus saving the world!
With Special Thanks To: • The entire Genshin Impact engine team • Every member who contributed to Genshin Impact’s console development ♥ • Special member of the console team, Lulu🐱 • A very special thanks to Wenli Chen, Terry Liu, and all the global tech support specialists at Sony for all the support they have given us
References: • Josiah Manson, 2016, “Fast Filtering of Reflection Probes” • Bo Li, SIGGRAPH 2019, “A Scalable Real-Time Many-Shadowed-Light Rendering System” • Michal Iwanicki, SIGGRAPH 2013, “Lighting Technology of The Last Of Us” • Paul Malin, 2018, “HDR Display in Call of Duty®” • Yasin Uludag, GPU Pro 5, 2014, “Hi-Z Screen-Space Cone-Traced Reflections” • Nathan Reed, GDC 2012, “Ambient Occlusion Fields and Decals in Infamous 2” • Fabian Bauer, SIGGRAPH 2019, “Creating the Atmospheric World of Red Dead Redemption 2: A Complete and Integrated Solution”