May 08, 18
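Slide overview
Speaker: Kerry Turner (Unity Technologies)
Recommended for
• Programmers who want to optimise their applications
What attendees will learn
• Best practices for performance analysis and optimisation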
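We handle sales, support, community activities, research and development, and educational support in Japan for Unity, the world-leading platform for creating and operating real-time 3D content. A wide range of creators, from game developers to artists, architects, automotive designers, and filmmakers, use Unity to put their imagination to work.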
Kerry Turner Developer Relations Engineer Unity Technologies
Real world performance analysis and optimisation
What we’ll cover • Memory usage • Load times • CPU optimisation: animations
What we’ll cover FIRST • Memory usage • Load times • CPU optimisation: animations • Profiling
Profiling: Best practice • Profile in real-world conditions • Don’t profile in the Unity Editor • Profile on target hardware • Profile in a typical environment • Profile the whole state of your game • Find the cause of your problem • Understand your resource budget • Profile before and after you make a change
Profiling: Unity Profiler Window • Useful for CPU cost of Unity’s internal systems, managed heap size, GC allocs • Added in 2017.3: • Experimental support for Deep Profiling in standalone players using Mono • Profile threaded code using Profiler.BeginThreadProfiling()
Profiling: Unity Frame Debugger • Useful for examining the commands sent to the graphics API without platform-specific tools, learning why draw calls have not been batched • Added in 5.6: • Batch breaking information
Profiling: Unity Memory Profiler • Useful for identifying assets that are inappropriately large, or that should not be resident in memory • Download from bitbucket.org/Unity-Technologies/memoryprofiler • 2017.3: • Support for Mono .NET 3.5 runtime
Profiling: Platform-specific tools
Runtime memory usage: Asset settings • Create asset rules • Enforce asset rules using AssetPostprocessor scripts • Download Asset Auditor from github.com/MarkUnity/AssetAuditor
Runtime memory usage: Asset complexity • Overly large and complex source assets are a very common cause of excessive runtime memory usage • Identify overly large or complex source assets by examining a memory snapshot and auditing assets • Reducing source asset size and complexity has other benefits • Faster asset load times • Meshes with fewer vertices = reduced vertex processing cost on GPU • Lower resolution textures = reduced texture read cost on GPU • Animations with fewer curves = reduced animation cost on CPU • This is a good example of where asset rules can prevent human error
Runtime memory usage: Read/Write Enabled Textures • Read/Write Enabled = 2 copies of texture • 1 in GPU memory, as usual • 1 additional copy in CPU memory • Enable Read/Write Enabled only if: • You get pixels in code • You set pixels in code • Disabling Read/Write Enabled reduces texture size in memory by 50%
Runtime memory usage: Read/Write Enabled Textures • Texture2D instances created in code are read/write enabled by default • Make Texture2D instances read-only by calling texture2D.Apply(updateMipmaps, true); • This uploads the texture from main RAM to the GPU • If makeNoLongerReadable is true, the copy in main RAM is then discarded
Runtime memory usage: Mip maps • Mip maps add 33% to texture size • Generate Mip Maps defaults to true • Enable Generate Mip Maps only if: • Texture distance from camera varies • This is another great example of where an asset rule can fix incorrect settings
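As a rough worked example of the two rules of thumb on the preceding slides (a Read/Write Enabled texture keeps a second copy, and a full mip chain adds about a third), the Python sketch below estimates the footprint of an uncompressed texture. The function name, its parameters and the 4-bytes-per-pixel figure are assumptions made for illustration and are not Unity API; compressed formats give different absolute numbers, but the ratios carry over.

```python
def texture_bytes(width, height, bytes_per_pixel=4, mipmaps=False, readable=False):
    """Estimate memory for an uncompressed texture (illustrative only)."""
    total = 0
    w, h = width, height
    while True:
        total += w * h * bytes_per_pixel
        # Stop after the base level if no mip chain, or once the 1x1 level is reached.
        if not mipmaps or (w == 1 and h == 1):
            break
        w, h = max(1, w // 2), max(1, h // 2)
    # A readable texture keeps one extra copy in CPU memory.
    return total * 2 if readable else total


base = texture_bytes(1024, 1024)
with_mips = texture_bytes(1024, 1024, mipmaps=True)
readable_with_mips = texture_bytes(1024, 1024, mipmaps=True, readable=True)

print(base)                      # base level only
print(with_mips / base)          # close to 4/3, i.e. about 33% extra
print(readable_with_mips / with_mips)  # exactly 2 copies
```

The 33% figure is the sum of the geometric series 1/4 + 1/16 + ..., since each mip level has a quarter of the pixels of the level above it.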
Runtime memory usage: Read/Write Enabled Meshes • Read/Write Enabled = 2 copies of mesh • 1 copy in GPU memory, as usual • 1 additional copy in CPU memory • This can more than double the size of a mesh in memory • Enable Read/Write Enabled only if: • You access the mesh properties in code • You are using a MeshCollider and the mesh transform has negative scaling • You are using a MeshCollider and the mesh transform is skewed or sheared
Runtime memory usage: Read/Write Enabled Meshes • Mesh instances created in code are read/write enabled by default • Make mesh instances read-only by calling mesh.UploadMeshData(true); • This uploads the mesh data from main RAM to the GPU • If markNoLongerReadable is true, the copy in main RAM is then discarded
Runtime memory usage: Vertex Compression • Applied in Player Settings • Uses half precision (16-bit floats) for selected vertex channels • Applied to all eligible meshes in project, including those generated by static batching, except when it is overridden • Not compatible with SkinnedMeshes until 2018.2 • From 2018.2, SkinnedMeshes can use Vertex Compression for texture co-ordinates only
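To give a feel for what half precision saves per vertex, here is a minimal Python sketch. The vertex layout and the choice of which channels are compressed are hypothetical, and the calculation ignores any alignment or padding the engine may apply, so treat the result as an upper bound on the saving rather than an exact figure.

```python
FLOAT32_BYTES = 4
FLOAT16_BYTES = 2


def vertex_bytes(channels, compressed=()):
    """channels maps a channel name to its component count.

    Channels named in `compressed` are stored as 16-bit floats,
    all others as 32-bit floats. Padding is ignored (assumption).
    """
    total = 0
    for name, components in channels.items():
        size = FLOAT16_BYTES if name in compressed else FLOAT32_BYTES
        total += components * size
    return total


# Hypothetical layout: position (xyz), normal (xyz), one UV set (uv).
layout = {"position": 3, "normal": 3, "uv0": 2}

full = vertex_bytes(layout)
packed = vertex_bytes(layout, compressed={"normal", "uv0"})
print(full, packed)
```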
Runtime memory usage: Vertex Compression • Vertex Compression cannot be applied when: • A mesh has Mesh Compression applied • Mesh Compression is a lossy compression that affects size on disk only • Mesh Compression is applied to individual mesh assets at import time • To use Vertex Compression on a mesh, disable Mesh Compression • A mesh is read/write enabled • When a mesh is read/write enabled, 2 uncompressed copies are resident in memory
Runtime memory usage: Animation Compression • Animation Compression allows some control over how Unity processes and represents a clip’s curve and keyframe data • Adjust precision using Animation Compression Errors settings
Runtime memory usage: Animation Compression • Off (default) • Keyframe reduction • Applied after import, iterates over each curve and removes redundant keyframes • Optimal (Generic and Humanoid only) • Applied at build time, allows for use of Dense curve type for additional file size reduction • Recommended settings: • Legacy: Keyframe reduction • Generic or Humanoid: Optimal
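The idea behind keyframe reduction can be sketched in a few lines of Python. This is a simplified illustration, assuming linear interpolation and a single absolute error tolerance; it is not Unity's actual algorithm, and the function name and data layout are made up for the example.

```python
def reduce_keyframes(keys, tolerance):
    """Drop keys that linear interpolation can reproduce within `tolerance`.

    `keys` is a list of (time, value) pairs with strictly increasing times.
    """
    if len(keys) <= 2:
        return list(keys)
    kept = [keys[0]]
    anchor = 0  # index of the last key we decided to keep
    for i in range(1, len(keys) - 1):
        t0, v0 = keys[anchor]
        t1, v1 = keys[i + 1]
        # Key i may be dropped only if every key between the anchor and
        # the next key is still reproduced by the straight line between them.
        redundant = True
        for j in range(anchor + 1, i + 1):
            t, v = keys[j]
            interpolated = v0 + (v1 - v0) * (t - t0) / (t1 - t0)
            if abs(interpolated - v) > tolerance:
                redundant = False
                break
        if not redundant:
            kept.append(keys[i])
            anchor = i
    kept.append(keys[-1])
    return kept


curve = [(0, 0.0), (1, 1.0), (2, 2.0), (3, 3.0), (4, 0.0)]
print(reduce_keyframes(curve, tolerance=0.01))
```

Raising the tolerance removes more keys at the cost of precision, which is the trade-off the Animation Compression Errors settings expose.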
Runtime memory usage: Audio Load Type • Recommended settings: • Streaming if >1 MB • Compressed in memory if >200 KB and <1MB • Decompress on Load if <200 KB
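The thresholds above translate directly into a small decision rule, sketched here in Python. The function name is hypothetical, 1 MB is assumed to mean 1024 KB, and because the slide does not say what happens at exactly 200 KB or exactly 1 MB, the sketch assigns both boundary cases to Compressed In Memory as an assumption.

```python
KB = 1024
MB = 1024 * KB


def recommended_load_type(size_bytes):
    """Map a clip size to the load type recommended on the slide."""
    if size_bytes > MB:
        return "Streaming"
    if size_bytes >= 200 * KB:
        # Boundary values (exactly 200 KB or 1 MB) land here by assumption.
        return "Compressed In Memory"
    return "Decompress On Load"


for size in (100 * KB, 500 * KB, 2 * MB):
    print(size, recommended_load_type(size))
```

A rule like this could be the core of an asset rule that flags clips whose import settings disagree with the recommendation.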
Audio compression format: Compression ratio • ADPCM: 27.5% • Vorbis 100%: 31.0% • MP3 100%: 22.2% • Vorbis 50%: 11.0% • MP3 50%: 11.0%
Audio compression format: CPU time to load • ADPCM: 1.4% • Vorbis 100%: 12.0% • MP3 100%: 6% • Vorbis 50%: 7.8% • MP3 50%: 7.5%
Audio compression format: Conclusions • ADPCM is by far the fastest to load, but offers a relatively poor compression ratio • At 100% quality, MP3 significantly outperforms Vorbis in terms of compression and load time • At 50% quality, Vorbis and MP3 have very similar performance • Recommended settings: • Short clips: ADPCM • Long clips: Vorbis or MP3
Load times: GetScriptingClass() • 1110 ms
Load times: GetScriptingClass() • 1110 ms • 20 ms
Load times: GetScriptingClass() • MonoManager::GetScriptingClass() searches assemblies for class types during application startup • Performance regression led to this function taking up to 50% of application startup time in IL2CPP projects • This was due to inefficient string operations • Fixed in 2018.2 • String operations have been replaced with a hash map • Patched to all versions of Unity 2017
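The general shape of that fix, replacing repeated string comparisons with a single hash lookup, can be illustrated with a short Python sketch. This is not Unity's implementation; the assembly data, function names and class names are hypothetical and only show why a prebuilt index makes each lookup cheaper.

```python
def find_class_linear(assemblies, full_name):
    """Scan every class name in every assembly, comparing strings each time."""
    for assembly in assemblies:
        for class_name in assembly["classes"]:
            if class_name == full_name:
                return assembly["name"]
    return None


def build_class_index(assemblies):
    """Build a hash map once so that each later lookup is a single probe."""
    index = {}
    for assembly in assemblies:
        for class_name in assembly["classes"]:
            # Keep the first match, mirroring the linear search order.
            index.setdefault(class_name, assembly["name"])
    return index


# Hypothetical data for illustration only.
assemblies = [
    {"name": "Assembly-A", "classes": ["Game.Player", "Game.Enemy"]},
    {"name": "Assembly-B", "classes": ["UI.Menu", "Game.Player"]},
]

index = build_class_index(assemblies)
print(find_class_linear(assemblies, "UI.Menu"), index.get("UI.Menu"))
```

The linear version does work proportional to the number of classes on every call, while the indexed version pays that cost once at build time.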
Load times: ETC Crunch Textures • Crunch compression is a lossy texture compression format that provides additional file size savings • Unity moved to a new Crunch library in 2017.3 • Before 2017.3, Unity could apply Crunch to DXT only • From 2017.3, Unity now allows for Crunch compression of • ETC_RGB4 • ETC2_RGBA8
Load times: ETC Crunch Textures • ETC: 6.7 MB • ETC Crunch: 1.8 MB
Load times: ETC Crunch Textures • ETC: 6.7 MB, 49 ms • ETC Crunch: 1.8 MB, 133 ms
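The trade-off in these figures is easier to see as ratios. The Python sketch below only does arithmetic on the numbers shown on the slide (6.7 MB and 49 ms for ETC, 1.8 MB and 133 ms for ETC Crunch); the variable names are illustrative, and reading the millisecond values as load time is an assumption based on the slide title.

```python
etc = {"size_mb": 6.7, "load_ms": 49}
etc_crunch = {"size_mb": 1.8, "load_ms": 133}

# Fraction of file size saved by Crunch compression.
size_saving = 1 - etc_crunch["size_mb"] / etc["size_mb"]

# How many times longer the Crunch texture takes to load.
load_slowdown = etc_crunch["load_ms"] / etc["load_ms"]

print(f"size saving: {size_saving:.1%}")
print(f"load time multiplier: {load_slowdown:.2f}x")
```

In other words, Crunch trades a much smaller download and build size for extra work at load time, so it suits projects where file size matters more than texture load time.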
Animation CPU optimisation: 100 animations with 12 curves • Legacy: 4 ms • Generic: 9 ms
Animation CPU optimisation: 100 animations with 640 curves • Legacy: 19 ms • Generic: 13 ms • Humanoid: 26 ms
Animation CPU optimisation: Conclusions • Unity Animation System becomes more efficient with a higher number of curves • Recommended settings for lower-end devices: • Unity Animation System for clips with >300 curves • Legacy Animation System for clips with <300 curves
Animation CPU optimisation: Humanoid or Generic? • Humanoid performs operations related to retargeting, root motion and IK every frame, regardless of whether those features are used • Recommended settings: • Humanoid when using retargeting and IK • Generic in all other cases
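The recommendations on the last two slides can be combined into one decision rule, sketched here in Python. The function name and return strings are hypothetical, the 300-curve threshold is the slide's guidance for lower-end devices, and treating exactly 300 curves as Legacy is an assumption because the slide leaves that case open.

```python
def recommended_animation_setup(curve_count, uses_retargeting_or_ik=False):
    """Pick an animation setup following the slide recommendations."""
    if uses_retargeting_or_ik:
        # Humanoid pays for retargeting, root motion and IK every frame,
        # so it is only worth it when those features are actually used.
        return "Unity Animation System (Humanoid)"
    if curve_count > 300:
        return "Unity Animation System (Generic)"
    # 300 curves or fewer: Legacy is cheaper on lower-end devices.
    return "Legacy Animation System"


print(recommended_animation_setup(12))
print(recommended_animation_setup(640))
print(recommended_animation_setup(640, uses_retargeting_or_ik=True))
```

This matches the measurements earlier in the deck: Legacy was faster at 12 curves, Generic was faster at 640 curves, and Humanoid was the slowest of the three at 640 curves.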
Animation CPU optimisation: Culling Mode • Culling Mode allows you to configure how an Animator behaves when offscreen • Always Animate (default) • Performs all operations when culled • Cull Completely • Performs no operations when culled • Cull Update Transforms • Updates internal state but skips transform writes, retargeting and IK when culled
Animation CPU optimisation: Animator bindings • Before 2018.1, Animators discard buffers and bindings when GameObject is deactivated • This results in CPU spikes when GameObject is reactivated • From 2018.1, retain buffers and avatar bindings when GameObject is deactivated by setting the Animator property: keepAnimatorControllerStateOnDisable = true; • This can be set via script only, and is not visible in the Inspector • Be aware of memory usage of deactivated Animators with this set to true
One last tip: More talks like this • Unite Europe 2017 - Squeezing Unity • Unite 2016 - Let's Talk (Content) Optimization • Unite Europe 2016 - Optimizing Mobile Applications
Thank you!