Here are some general guidelines for content creation and level design to improve performance.
Minimize the number of elements per object.
Combine models to have a reasonable amount of triangles per element (e.g. 300+ per element).
Opaque materials are fastest due to having the best z buffer culling, masked are a bit slower, and translucent materials are slowest because of overdraw.
Limit UV seams and hard edges as it results in more vertices for the hardware. In the worst case, high poly meshes with hard edges result in 3 times more vertices than reported in modelling applications.
The vertex processing cost for skinned meshes is higher than for Static Meshes.
The vertex processing cost gets larger when adding morph targets or when using WorldPositionOffset. Texture lookups can be quite slow due to caching.
Tessellation adds quite a lot of extra cost and should be avoided when possible. Pretessellated meshes are usually faster.
Very large meshes can be split for better culling. This is for view culling, as light culling is usually done at a finer granularity.
Smaller texture formats result in faster materials (e.g. DXT1 is 4 bits per pixel, DXT5 is 8 bpp, ARGB uncompressed is 32 bpp).
Lower resolution textures can be faster (when magnified). Sometimes they can be even smoother, as bilinear filtering can result in more shades than the texture format can express.
Materials with few shader instructions and texture lookups run faster. To optimize materials, use Material Editor stats and the view mode ShaderComplexity.
Never disable mip maps if the texture can be seen in a smaller scale, to avoid slowdowns due to texture cache misses.
Some material expressions are more costly (sin, pow, cos, divide, Noise). The fastest expressions include: multiply, add, subtract, as well as clamp() when using 0 and 1.
Shading models have a cost: Unlit is fastest, Lit is what should be used mostly, and other models are more expensive.
Limit the number of Stationary and Dynamic Lights.
Area light sources are a bit more expensive so avoid them where possible.
Adjust the draw distance on smaller objects to allow for better culling.
Verify LOD are setup with aggressive transition ranges. A LOD should use vertex count by at least 2x. To optimize this, check the wireframe, and solid colors indicate a problem. Using the Simplygon integration, it only takes minutes.
Try to combine lights that have a similar origin. For example, car head lights can use a single light with a light function to make it appear to be two lights.
Static lights are the fastest, Stationary lights are less optimal, and Dynamic lights are the most costly.
Limit the attenuation radius and light cone angle to the minimum needed.
Dynamic/Stationary point lights are the most expensive, directional lights are a little cheaper, and spot lights are best. Shadowmap generation cost scales with shadow casting objects' light frustum.
Light functions cost extra (actual cost depends on the material) and can prevent the light being rendered as a tiled light.
IES profiles cost extra (less than light functions) and can prevent the light being rendered as a tiled light. Do not use IES where spotlight cone angles could create the same effect.
Billboards, imposter meshes, or skybox textures can be used to efficiently fake detailed geometry.
Good level design takes occlusion culling into account (add visibility blockers for better performance). Use r.VisualizeOccludedPrimitives to check this.
Avoid Light Propagation Volumes or limit its cost by using the GIReplace material expression or disabling it on most objects.
Disable shadow casting where possible, either per object or per light.
Use ProfileGPU (Ctrl + Shift + ,) in the editor to get quick information on what is slow.
Decal performance cost scales with the number of pixels they cover.
CONSOLE: r.VisualizeOccludedPrimitives 1