Full Photogrammetry Guide for 3D Artists

Vlad Kuzmin shared a lot of tips on photogrammetry. Learn how to scan real-world objects, clean them up and render.

PLANNING

The success of a good photogrammetry scan depends on the right planning.

This means a lot more than fully charged camera batteries and bringing the camera tripod along, and more than good overcast weather or a proper studio scan setup.

Before you start shooting like crazy, first you will need to estimate how, and from where you will shoot your object.

You will need to examine your object, the lighting conditions and the available routes around the object if you shoot outdoors.

It is good to have enough moving space around the object you are planning to shoot. You should have about 3 to 5 meters of available space around the object, and for the close-up shoot, you’ll need to get to about 50cm from the object (for objects up to 3 meters in height).

ACQUIRING

Good scanning practice: at least one loop around the object in 10-degree increments (approx. 36 images), where the whole object fits in the image. For better results, you can make more loops from different elevation levels. These images give the photogrammetry software a “basic frame”. And this “frame” will be used for alignment and depth calculation of the more close-range shots.

Next is the mid-range and close-range shots: if the first loop is at 3m away from the object, the mid-range is at 1.5m and the close-range is about 50cm. The mid-range shots connect the far distance loop and close-range images.
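As a rough sketch, the far loop can be written out as a list of camera positions. The 10-degree step and the 3m distance come from the text; the function name `ring_positions`, the camera height of 1.5m, and placing the object at the origin are assumptions made only for illustration.

```python
import math

def ring_positions(radius_m, step_deg=10.0, height_m=1.5):
    """Return (x, y, z) camera positions on a circle around an object at the origin."""
    count = round(360.0 / step_deg)
    positions = []
    for i in range(count):
        angle = math.radians(i * step_deg)
        x = radius_m * math.cos(angle)
        y = radius_m * math.sin(angle)
        positions.append((x, y, height_m))
    return positions

far_loop = ring_positions(3.0)
print(len(far_loop))  # 36 shots for one loop in 10-degree increments
```

Calling the same function with a second `height_m` gives the additional loop from another elevation level.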

On the far-range images, a coin looks like a dot, and in the close-range images, the coin is seen with clear details. The feature (tie point) detection algorithms are not able to recognize a dot in one image and a big coin in the other images. That is why the intermediary step is necessary.

On the middle and close-range shots, we need to remember about ALL surfaces. Every side of the scanned object needs to be shot: back, top, underneath, inside, etc. And every point of every surface needs to be seen in at least 2-3 images, taken one straight and two from a slight (10°-15°) angle to surface.

I have seen many scans made by hobbyists where the back of knees, underarms, the bottom sides of chairs and tables, the bottom sides of the roofs, or even the tops of sculpture’s heads (very common problem) have been completely forgotten.

DO NOT FORGET about proper overlap: every shot needs to overlap with the next by at least 60% (80% or more is even better).
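To get a feeling for how small the side step has to be, here is a minimal sketch. It assumes a flat surface facing the camera and a known horizontal field of view; both are simplifications, and the 60-degree value in the example is only a placeholder, not a figure from this guide.

```python
import math

def side_step_m(distance_m, hfov_deg, overlap):
    """Sideways move between two shots that keeps the given overlap (0..1).

    Assumes a flat surface facing the camera at distance_m.
    """
    footprint = 2.0 * distance_m * math.tan(math.radians(hfov_deg) / 2.0)
    return footprint * (1.0 - overlap)

for overlap in (0.6, 0.8):
    step = side_step_m(0.5, 60.0, overlap)
    print(f"{overlap:.0%} overlap at 0.5 m: step of about {step * 100:.0f} cm")
```

On curved objects the real overlap is lower than this estimate, so it is safer to step less than the number suggests.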


AVOID panoramic shots (where the camera stays on the tripod, and you rotate only the camera). Photogrammetry software sees them as a single panoramic image. Such images have no parallax changes and, as a result, none of the depth information that is required for calculating the mesh details. Or the software just does not align these images correctly.

When you have a basic plan for movement around the object and already know the part of the object that needs additional attention, you can start shooting.

Shoot, small side step, shoot, small side step, shoot… After finishing the loop, raise the camera and start the next loop, and so on.

In some lighting conditions, for example an object in a cave with one side darker than the other, it is better to use another technique. Shoot from below, move the camera up, shoot again, and so on. After that, take a small sidestep and repeat the vertical movement, but from top to bottom. Sidestep, and go again from bottom to top.

If you are shooting like this, you can adjust camera settings so that any backlit side of the object is not underexposed and a light side is not overexposed. You should avoid changing the ISO settings on your camera during this process. Images with different ISO settings tend to have a different “noise pattern” or noise level. And images with different ISO settings need to be pre-processed separately.

You should always follow the shape and curves of the object. The shooting needs to be straight or from only a slight angle in relation to the surface.

When you know that you have all the necessary images, like the top of the sculpture’s head, the back side, the surfaces that look down etc., it is time for the final “kill” shots. These can be random shots of the most complex surfaces, or the surfaces with fine detail like face, eyes, ears, palms etc. And this is done to ensure that we have not forgotten anything.

We need to remember that the images need to be as sharp as possible. Also, the depth of field needs to be as big as possible. If you are using a DSLR or a mirrorless camera, you should already know how to achieve this.
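For readers who want numbers, the depth of field can be estimated with the standard thin-lens hyperfocal formulas. This is only a sketch: the 35mm lens, the 1.5m distance, and the 0.03mm circle of confusion (a common full-frame convention) are assumptions, and photogrammetry usually calls for a stricter value.

```python
def depth_of_field_mm(focal_mm, f_number, subject_mm, coc_mm=0.03):
    """Return (near, far) limits of acceptable sharpness in millimeters.

    coc_mm is the circle of confusion; 0.03 mm is an assumed convention.
    """
    hyperfocal = focal_mm ** 2 / (f_number * coc_mm) + focal_mm
    near = subject_mm * (hyperfocal - focal_mm) / (hyperfocal + subject_mm - 2 * focal_mm)
    if subject_mm >= hyperfocal:
        # Focused at or beyond the hyperfocal distance: sharp to infinity.
        return near, float("inf")
    far = subject_mm * (hyperfocal - focal_mm) / (hyperfocal - subject_mm)
    return near, far

for f_number in (4, 8, 11):
    near, far = depth_of_field_mm(35, f_number, 1500)
    print(f"f/{f_number}: sharp from {near:.0f} mm to {far:.0f} mm")
```

Closing the aperture widens the sharp zone, which is why small apertures (with a tripod or more light) are the usual choice for scanning.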

You can even use a good smartphone. The iPhone has a good camera that can be used for high-resolution photogrammetry scans with sub-millimeter resolution. Only getting a good texture takes more work in post-processing.

I know it is always great to have good results from just 40-90 images, but let’s pass this for sports competitions.

You will get better results from more images: 300-500 images with a camera, or twice as many if you are using a smartphone. A small number of images is not our choice.

After taking all the necessary images, it is time to get back to the office.

Remove the rotation tag

Almost all photo-editing software automatically rotates the image depending on the camera rotation you have used. So you will see the image the “correct way up” on the screen.

Generally speaking, rotated images are not a problem for modern photogrammetry software, but vertical photos can be tagged as rotated by 90° or 270° and are then treated as different lens groups, which is not always desirable.

That is why it is smart to remove the camera rotation EXIF tag by using ExifTool or ExiftoolGui.

exiftool -Orientation=1 -n image.ext

or

exiftool -Orientation=1 -n *

for batch removing this tag from all images in the selected folder.
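If you prefer to script this step, the same call can be wrapped in a few lines of Python. This is a sketch, not part of the original workflow: the function names and the `*.jpg` pattern are assumptions, and ExifTool must be installed and available on the PATH. Note that ExifTool keeps a backup copy of each changed file by default.

```python
import subprocess
from pathlib import Path

def orientation_reset_command(image_paths):
    """Build the exiftool call from the text: set Orientation to 1, numerically (-n)."""
    return ["exiftool", "-Orientation=1", "-n", *[str(p) for p in image_paths]]

def reset_orientation(folder, pattern="*.jpg"):
    """Run exiftool on every matching file in a folder; return the file count."""
    images = sorted(Path(folder).glob(pattern))
    if not images:
        return 0
    subprocess.run(orientation_reset_command(images), check=True)
    return len(images)

print(orientation_reset_command(["IMG_0001.jpg", "IMG_0002.jpg"]))
```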

If you have a dedicated camera that you use only for photogrammetry, you can disable image rotation in the camera settings.

Pre-processing

What if I told you that a pre-processed dataset is ALWAYS better than one that has not been pre-processed?

Well, your camera is very good, your photos look amazing on Instagram. Yes, it is possible to shoot and make a proper scan with a native iPhone camera app.

But it is better to shoot in RAW and pre-process the images in order to get better alignment, meshes, and textures.

The human eye can see far more detail than any photogrammetry software. That is why we pre-process: we help the photogrammetry software identify details, especially in the shadows and highlights.

My personal preference is to use CameraRAW in Adobe Photoshop in that step. Lightroom, DxO Optics, Affinity Photo or any other free or paid image-processing tools also can be used.

We’ll need to fix the exposure, to lighten the shadowy parts and darken the lighted areas. We also can remove the chromatic aberrations.

Remove the noise and sharpen the image. But be careful as the image CANNOT be distorted in any way.

Fixing lens distortion is STRICTLY PROHIBITED! Un-distortion done in image-processing tools is not photogrammetrically correct.

You need to forcibly uncheck the lens distortion corrections in CameraRAW, Lightroom, DxO or the other photo-editing software you use. If not, you will be corrupting the data necessary for performing a correct photogrammetry process.

For most outdoor scans, I use these settings. Yes, they are strong:

Exposure: from +0.0 to +1.5
(depends on the lighting condition. Usually, images are a bit underexposed)

Highlights: from -50 to -100
Shadows: from +50 to +100
Whites: from -50 to -100
Blacks: from +20 to +100

You can add more micro contrast with Clarity, Vibrance, and Dehaze… For good low ISO images, the default denoise and sharpen settings are enough. But it is better to set the “Details” setting to zero.

I prefer batch image processing through Adobe Photoshop with action script that opens images with CameraRAW. The script can run some additional filters, actions or plugins. And I can slightly adjust the script for each dataset.

The basic idea of preprocessing is to increase the visibility of details in the shadows and highlights and to increase micro contrast.

NOTE: All processes need to be done in 16 bit and exported in 16-bit tiff/png.

Export can also be 8-bit JPG with 100% quality, which is enough for 16-bit textures (yes, 8-bit JPG images can give 16-bit textures, if you know mathematics and image processing basics).
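One possible reading of this remark, offered here as an assumption since the text does not spell it out: every texel is blended from many photos, and the average of many noisy 8-bit samples lands between the 8-bit steps. The toy simulation below illustrates only that idea; the noise level and sample count are made up.

```python
import random

def averaged_texel(true_value, samples, noise_sigma=1.0, seed=0):
    """Average many noisy 8-bit readings of one surface point."""
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        noisy = true_value + rng.gauss(0.0, noise_sigma)
        total += min(255, max(0, round(noisy)))  # quantize to 8 bits
    return total / samples

# A single 8-bit reading can only say 100 or 101; the average gets much closer.
print(averaged_texel(100.3, 2000))
```

The result is a fractional value, which needs more than 8 bits to store without losing it again.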

REALITYCAPTURE:

Sorry if this word makes you cringe, but no other photogrammetry tool can match the abilities of RealityCapture.

And not only because of its superior speed, highly detailed meshes or the lowest hardware requirements compared to other tools.

All calculations in RealityCapture except camera alignment are done out of core. Imagine a 3 billion polygon mesh on just 16Gb of RAM!

And this power is also available to non-enterprise users.

Maybe the subscription is considered a bit high: 3 months for 99 Euros (33euro/month). But in some countries, this is like having dinner in a restaurant.

The other competitors cost as much as a jet plane or are as slow as molasses.

That is why we’ll use RealityCapture: we want all possible details from the scan, and we want them this week, not next month.

Note: even though 16Gb of RAM is enough, 32Gb is more comfortable for working.

– Dude, I don’t like cats!
– You just don’t know how to cook them!

There are two reasons why RealityCapture can give bad results:

The first reason – you don’t know how to “Cook” RealityCapture!
The second reason – you don’t know how to “Shoot” images for photogrammetry!

And the second is the main reason why people get bad results using RealityCapture. Garbage in – garbage out!

If you have a good dataset, the results should be incredible. You can get up to 0.5mm resolution for objects 3 meters in height with just an iPhone 6 camera.
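A quick sanity check of that claim is the size of one pixel on the object surface. The sketch below uses the usual pinhole relation; the iPhone 6 figures (1.5 micron pixels, 4.15mm focal length) are commonly published values taken here as assumptions, not numbers from this guide.

```python
def surface_sample_mm(distance_mm, pixel_pitch_mm, focal_mm):
    """Size on the object covered by one sensor pixel (pinhole camera model)."""
    return distance_mm * pixel_pitch_mm / focal_mm

# Assumed iPhone 6 camera: 1.5 micron pixels, 4.15 mm focal length.
for distance_mm in (500, 1500, 3000):
    size = surface_sample_mm(distance_mm, 0.0015, 4.15)
    print(f"{distance_mm / 1000:.1f} m: one pixel covers about {size:.2f} mm")
```

At the 50cm close range one pixel covers roughly 0.18mm, so a 0.5mm result is plausible only if the close-range shots are actually taken.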

You can find the basic workflow of RealityCapture on the developer’s website, or on the SketchFab blog. You can also find more advice in Help in the app itself. The Help menu is context sensitive, and you can easily find explanations about any selected tool.

RealityCapture (RC) offers full control of all the photogrammetry steps. The default settings are enough for obtaining good results from a good dataset. However, imperfect datasets need some fine-tuning, and this can be done by tailoring the settings to each dataset.

REALITYCAPTURE SETTINGS

Cache location:

By default, RC uses the system Temp folder for all temporary files (which can reach 120-150Gb for an average dataset of 12-24Mpx images). Since these files get reused within the project, and you may work on more than one project at the same time, it is better to set the Cache location to a dedicated big SSD (512Gb or more) or to a fast HDD.

RC runs 90% of all operations out of core, and all temporary files are written to the Cache directory. These temporary files can be reused by RealityCapture in later steps and can drastically speed up calculations.

My recommendation is to set a cache location to another drive, for example, Z:/RC_TEMP. After this, RC will need to be reloaded.
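Before pointing RC at a drive, it is worth checking that the drive really has room for the cache. A minimal sketch using only the Python standard library; the 150Gb threshold comes from the estimate in the text, and the function name is an assumption.

```python
import shutil

def has_room_for_cache(path, needed_gb=150):
    """Return True if the drive holding `path` has at least needed_gb free."""
    free_gb = shutil.disk_usage(path).free / 1024 ** 3
    return free_gb >= needed_gb

# Check the current drive; replace "." with the planned cache folder.
print(has_room_for_cache("."))
```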

Camera grouping by EXIF

There is no reason not to use this setting. Enabled grouping can help you avoid the wrong camera and lens estimation. RealityCapture calculates these parameters from all images in one lens group. I strongly recommend this for “one camera with prime lens” turntable scans. Grouping is not necessary for multi-camera rigs.

There are two ways to group images.

1st: Select the “Image” branch in the top left 1D view and on the window that will appear, press group.

2nd: Enable grouping on image import.

Image overlap:

It defines how well the scanned object is covered by the images taken. By default, this is set to Medium.

Medium overlap can be set if all the images in the dataset have at least 70-80% overlap with the nearest images. This setting is not good for most hobbyists, as it requires up to 800-1000 images with no gaps in between even for a simple shape. If you have a small dataset, it will likely be split into many components (groups of connected and aligned images).

That is why, for most uses, the Low overlap is preferred.

Sensor sensitivity:

It defines how fine and how many features (unique spots of the object) RC will try to detect.

Example: for rich granite texture, you can use Low sensitivity. For subtle white marble texture, you will have to use High sensitivity.

Ultra-high sensitivity is rarely used. It can be used in studio facial or body scans. It is better not to set a higher sensitivity than required, or you can get too many false positives.

Console view:

In one of the RC windows, I recommend enabling the console view. It shows useful information that can help when we need to contact support or other experts.

REALITYCAPTURE WORKFLOW

ALIGNMENT

If this is an older project, or we are not sure about grouped images, we can tap on “Images” on the top left 1D view and click the “Clear Calibration Groups” – Ungroup and “Group Calibrations w.r.t. EXIF” – Group.

Now, when all the images are grouped by EXIF, change image overlap to Low and check required detector sensitivity.

Next, we can run alignment.

Depending on the number of images and their size, this can take several minutes.

While any operation is running in RC, we can follow it in the console view.

If we use the default setting of 40,000 features per image and see some images with fewer than 40,000 features in the console view, this usually points to problems with image quality. Later, we may find that these images were not properly aligned, were lost, or are the cause of problems in the mesh or texture.

We can also see a detailed time report for the process.
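If you note the feature counts down, picking out the suspicious images is a one-liner. The sketch below does not read RC’s console output (its format is not assumed here); the dictionary of file names and counts is hypothetical input you would fill in yourself.

```python
def low_feature_images(feature_counts, expected=40000):
    """Return the names of images that produced fewer features than expected."""
    return sorted(name for name, count in feature_counts.items() if count < expected)

# Hypothetical counts copied by hand from the console view.
counts = {"IMG_0001.jpg": 40000, "IMG_0002.jpg": 12500, "IMG_0003.jpg": 39980}
print(low_feature_images(counts))  # ['IMG_0002.jpg', 'IMG_0003.jpg']
```

These are the first images to inspect for blur, noise or exposure problems.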

If we have shot the dataset correctly, 99% or all of the images align in the first step, and we lose only a couple of images due to some problem with them. Bad images are most likely the ones with fewer detected features than usual, which we have seen in the console view.

If we see more than one component in the left window but our dataset was good, RealityCapture can align all images into one component in the next step. We will need to delete the small components, leaving only the biggest, and run alignment again.

At that point, RC will reuse the features detected in the first step (from data stored in the cache directory), refine the main component camera placement, and find a proper place for cameras from other components we had in the first align step.

If necessary, this can be repeated several times. Even for one component, RealityCapture will refine camera orientation and lens settings, which can increase quality.

If every time after alignment, you see the same count of separate components with the same image count, this means that we’ll need to use the manual control points.

The use of manual control points and the components workflow in RC is a separate task. It is used, for example, for scanning huge objects like buildings and castles, where you scan both the interior and exterior. The control point workflow is a separate technique that does not fit into this tutorial, as it requires more time to explain.

The basic idea of the control points (CP) workflow:

In a small component, create about 3-4 control points and place them on at least 3-4 images for every CP by dragging from the Control Point window to the selected images.

For refining their placement, click on a placed CP, and while holding this point, by using the zoom wheel, adjust the zoom level and more precisely place the point, then release.

Place the same control points for the biggest components on at least 3 to 4 images.

The left and right arrows keyboard keys can be used for switching between images in the dataset.
Enabling Residuals in the Scene tab for the selected 2D viewport will show lines from the current CP to the correct place, as estimated by the current component’s camera alignment.

A short video by the developers, showing the control points workflow can be seen here:

After this is done, delete the small component and run alignment again. If there are no more errors in the dataset, RC will merge and align all the cameras into one component. You can also delete all the old components if you don’t need them anymore.

Now we have almost all of the images aligned in one component, and we can finally refine the camera alignment. To do this, we can ungroup all images and run a final alignment. During this step, RC will treat all the cameras as different lenses and adjust for the small deviations from different zooms or sensor/lens shifts from optical stabilization (if you used it).

During this step, we can also decrease the maximal re-projection error. By default, it is 2px. While that is good enough for most datasets, my personal preference is to refine cameras down to an error of 1px. Setting it lower gives a more precise alignment in terms of mean and median errors. Fewer errors mean a better mesh, and a better mesh means less work in post-processing.

MESHING

First, we will need to align our scan to the ground by defining a ground plane. The switching between views can be done via the 4, 6, and 2 number keys on the keyboard, or via the scene tab.

Next, we’ll need to define the reconstruction region. It is better to set it as small as possible while still fitting the scanned object or scanning area.

You’ll need to save your project about now (even though crashes are rare, this is still a good practice).

Next, we can run the meshing in Normal or High mode.

If you have default settings in Normal details mode, RC will calculate the depth maps from 2x downsampled images. The High details mode uses images without downsampling. RealityCapture can reconstruct details up to 0.25px, meaning that meshing in Normal mode sometimes is enough.

If you are sure that the dataset is clean and sharp, meshing in High Detail mode will give you more real details. However, this can take more time.

For a test, we can set 4x down-sample for Normal details mode. This will give enough details to estimate the final mesh quality in High Detail and will be finished pretty fast (20-30 minutes for 5-10mln poly mesh on decent hardware).

The most stressful step for the PC hardware is the Depth maps calculation. At this moment, GPU, CPU and SSD/HDD usage can be up to 100%. And because the system can’t control GPU priority, while this is going on, you will not be able to use the PC (it is better to run the meshing at night, especially in High mode).

The meshing will be complete after a certain amount of time, depending on your hardware configuration.

Never forget to save your project!

You can go several different ways from here. Many users opt to decimate the mesh to 1-1.5mln polygons, texture it, and upload it to SketchFab directly from RC.

But a raw, untouched mesh from any photogrammetry app is rarely perfect.

That is why we need to fix the raw mesh.

SIMPLIFYING

The usual polygon count for a High detail mesh made from 500-700 images of 24Mpx each is about 100-200mln polygons. It is safe to say that we don’t need all of them.

4K textures can store micro details in a normal map from about 16mln polygons; 8K textures go up to about 67mln polygons. To get the best results, the details of a 16mln poly mesh need to be stored on an 8K normal map.

This means that we can simplify a raw 100-200mln poly mesh to 8-16mln polygons without any issues. The decimation algorithm will preserve the sharp details, and we don’t need to worry about this.
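The polygon limits quoted above follow from simple arithmetic if we assume, as a rule of thumb, that one texel of the normal map can hold the detail of about one polygon:

```python
def texel_count(size_px):
    """Number of texels in a square texture of the given edge length."""
    return size_px * size_px

for name, size_px in (("4K", 4096), ("8K", 8192)):
    print(f"{name}: {texel_count(size_px) / 1e6:.1f} mln texels")
# 4K: 16.8 mln texels
# 8K: 67.1 mln texels
```

Storing a 16mln poly mesh on an 8K map leaves about four texels per polygon, which is why that combination gives the cleanest result.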

For exporting the mesh, OBJ is the most versatile format.

FIX RAW MESH TOPOLOGY ERRORS (Obsolete)

If you are using external software like ZBrush, you can skip this step.

Very common problems in the raw mesh are the holes and non-manifold edges/vertices (intersections) from the decimation step and/or the T-vertices (zero area polygons).

Because RealityCapture rarely has twisted intersected polygons, we’ll only need to fill the holes. For this purpose, we can use MeshLab or MeshMixer.

In MeshLab, use Close Holes. In some cases, there is a hole with two polygons connected by only one vertex in the middle, and MeshLab can’t close it. You will need to delete one or both polygons and run the tool again.

For eliminating the T-vertices, we need to run the tool «Remove T-vertices with edge flip» with a value of 10-20.

After we have closed all the holes and fixed all the t-vertices, we can import the mesh into a surface sculpting app. My personal preference is 3D-Coat.

This app can work with an unfixed mesh, but it is better to fix the topology errors before import.

3D-COAT

This app is amazing for 3D scanned meshes. And it is made by Humans. Compared to ZBrush, it has a straight linear workflow where a tool is a tool and an object is an object. It is simple to use, and you won’t need to Google how to import OBJ files, only to find that your object is a tool and you need a tool to fix the tool inside the tool…

One important note: Because we have already fixed the mesh topology, we won’t have to import it as “repair scanned mesh”. This is usually slow and requires watertight meshes.

We need to only import the mesh for SURFACE Sculpting.

Another note: 3D-Coat is amazing with its voxel sculpting. But we must avoid all SurfaceVoxel transformations and Decimation in 3D-Coat. These operations will destroy all our micro-details with some fast screened Poisson mesh reconstruction algorithm.

3D-Coat has a huge advantage in the dynamic mesh tessellation (that you can and need to disable sometimes). It’s called “remove stretching”. This will increase the mesh resolution on the fly and only on areas you are working. The other mesh parts remain untouched.

When importing for surface sculpting, you’ll get a dialog where you need to enable “import without voxelization”, respect negative volume and leave rotated axis. Then press Auto Scale, and press Apply.

After this, another pop-up dialog will show, where we’ll need to select Yes to store the original scale, rotation, etc. This keeps the original mesh position in 3D space. It should not change; otherwise, when we later import the fixed mesh back into RC, it will not be in the same place as the original mesh.

SCULPTING

This is the most creative part of the work. 3D-Coat offers a wide array of different tools. The most used are Fill, Flatten, Punch and Brush with alphas and with or without dynamic tessellation.

The 3DCoat mesh sculpting basics:

Simple Brush – rise; Fill – fill pits or canals; Flat – flatten hills, etc.

When holding the CTRL button pressed, most tools are reversed.

Brush – dig; Fill – smooth flatten hills (fill them from the other side of the polygons); Flat – rise (careful with this), etc.

A special CTRL + SHIFT will switch any tool to Relax polygons. This works like smooth, is faster and does not increase the polygon density.

Another useful tool is the Tangent smooth tool. It can relax edges between big and small polygons but retain a shape of a surface.

We can fix big polygons with the “Draw” tool with default smooth “alpha”, set depth to 0.5 or less and enable “remove stretching”. Simply paint over the big polygons with or without CTRL and these will be subdivided. Next, select a “random” alpha and increase the depth up to 1 to add some noise similar to a correctly reconstructed surface.

If we have some shifted surface, we can fill the cracks with the Fill tool, or flatten it with CTRL + Fill. Or even smooth them with relaxation. After that, recover the original noise with “Draw” tool with alphas.

A video explaining the basics of how this works can be seen here:

The basic idea of fixing: Increasing the mesh resolution of big polygons from weak surfaces. Fix alignment errors. And recover the missing parts.

After fixing the shape, add some small noise details similar to the original untouched surfaces (in case you are scanning an organic or a stone object). Noise is not necessary for smooth surfaces, but sculpting can require some attention on sharp edges and flat surfaces.

Sometimes, you can split the scanned object into parts, recover the parts separately and later merge it back.

After all these steps, we have obtained a “clean” high-resolution mesh. This needs to be exported back as OBJ.

QUALITY CHECK

We’ll have to check the quality of our clean mesh. Same as with the raw original mesh, we can check for errors in MeshLab or MeshMixer. Check for the same errors: holes, intersected polygons, t-vertices and more that can appear during the sculpting step.

After repair, export the clean mesh.

IMPORT BACK TO REALITYCAPTURE

Now we need to import the clean mesh into RealityCapture from the reconstruction tab. If we worked carefully, the clean mesh will be placed in exactly the same place as the original raw mesh. If the mesh differs a lot from the raw mesh in position, orientation or shape, the photogrammetry software will not be able to calculate the textures correctly.

TEXTURING

The next step in line is texturing the cleaned high-resolution mesh.

For objects that are 2-3 meters in height, it is good to generate a 16k texture that will later be reduced to a final 4k or 8k texture.

RC has a 16k resolution and a “maximal textures count” for UV unwrapping and 16K texture resolution for imported mesh by default. You will first need to check these settings.

For improved texturing results, we need to change the weight in texturing for cameras from the outer loop to a really small value.

By using the camera lasso tool, select these outer cameras, or select all middle and close range cameras and invert the selection. Change the weight to 0.01 (1%) or less for all selected cameras at once in image properties.

Next, we can run Unwrap and then Texturize. Or just Texturize (it will Unwrap automatically if mesh did not have UV maps).

During this step, RC uses the GPU, so your system may become unresponsive due to the high GPU load.

After finishing, we can simply export the clean, high-resolution mesh with textures in the desired format.

DECIMATION

What if I told you that ALL the automatic or semi-automatic quad re-topology tools are bad?

Simple low poly topology rules: low poly mesh edges need to follow the high-resolution mesh as much as possible. But lips, eyes, sharp edges, all of these are a problem for the automatic re-topology tools.

Of course, you can use guides. This will certainly increase the final low poly mesh quality. But the more guides you use, the more time you’ll need to get the work done, and sometimes even many guides and methods fail. In many cases, manual retopology is faster than fighting with auto methods that mess up the edges or run edges diagonally through eyes, lips, etc.

Since all real-time engines use triangles, static objects like stones, sculptures, etc. that we don’t need to animate (morph) DON’T NEED quads at all.

We will use Quadric Edge Collapse Decimation from MeshLab to decimate the high-res mesh to a low poly mesh in 50% steps: from 10mln poly to 5mln poly, from 5 to 2.5, etc.

Don’t worry, as this step does not require too much attention. Just set 0.5 in settings and press “OK” several times.

With this method, we preserve more details than by simply converting a 10mln poly mesh to 30k in one go, or by using any quad auto-retopology.
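How many times is “several times”? The halving schedule is easy to write out. The sketch below only does the arithmetic; it does not call MeshLab, and the function name is an assumption.

```python
def halving_schedule(start_polys, target_polys, ratio=0.5):
    """List the polygon counts after each 50% step, stopping before going below target."""
    counts = [start_polys]
    while counts[-1] * ratio >= target_polys:
        counts.append(int(counts[-1] * ratio))
    return counts

schedule = halving_schedule(10_000_000, 100_000)
print(schedule)
# [10000000, 5000000, 2500000, 1250000, 625000, 312500, 156250]
print(len(schedule) - 1, "decimation steps")  # 6 decimation steps
```

So a 10mln poly mesh needs six passes at 0.5 to reach roughly 150k polygons.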

When the decimation reaches 100-150k polygons, it is good to check the mesh for topology errors and fix any that exist. Highly noisy surfaces can have small bridges or spikes in high resolution, and when edges collapse, these can create self-intersecting polygon groups. This can cause issues during the last decimation steps.

This is easily fixable in MeshLab, in MeshMixer or any other 3D app you use. 100k poly meshes are not a problem to work with.

When all the problems are fixed, decimate the mesh again by 50%.

We now have a 50-75k mesh to work with.

If the object is a simple, uniform scan like stone, and it has no parts we wish to preserve, we can decimate it another 1 to 2 times and get a 10k-30k polygon mesh.

If our object is a statue or a statue with a base, we can use selective decimation.

Select the parts high in detail that are important: face, palms, fingers… We can use the marquee tool for rough selection, and the Brush selection tool for polygon selection.

After we have all the parts we want to preserve selected, we can apply Select Inverse.

And apply 50% decimate only to these parts with “Simplify only selected faces” option ON.

If this is a scan of a statue on a base, we can once again select only the base and decimate it by an additional 50%.

You can select the bottom surface of the base and decimate it again, especially if this surface is not seen in the scan.

We should make a final check of the mesh for any topology errors, including the t-vertices.

UV

One of the main reasons why many people prefer quads is because the edge loops allow for a much faster UV Unwrap. Decimated meshes don’t have these, but this is not a problem.

The 3D-Coat app is very good for scans, and not only because of the dynamic tessellation but also because it has an amazing UV Path tool. This allows paths to be “drawn” through any edges, and their flow can be easily adjusted.

So, we’ll just open low poly mesh with “UV map mesh” in 3D-Coat.

Here is a video explaining how UV Path works:

Another app with similar abilities is Blender. It can be used to “draw” Path between two selected vertices, but it does not allow simple adjustments as the 3D-Coat does.

Depending on the topology, unwrapping the mesh with the UV Path tool takes several minutes.

Please follow a well-known basic: cut UV seams in hidden areas, in caves or on sharp edges.

Pressing Unwrap gives us the UV unwrap. We can now pack the UV islands for better utilization of the UV space, using several tools available in 3D-Coat.

We’ll need to switch to “Tweak room” for a while so the app will store all the UV changes. And switch back to the UV room to export the mesh in your preferred format.

BAKING

After this, we’ll need to prepare our low-poly mesh. For most scans, it is a better option to set all the vertices or all polygons smoothed, depending on the 3D app used. The real-time rendering engine will treat the whole mesh as smoothed. This is the simplest and easiest way to avoid issues with the tangent space normal map later.

For baking, I recommend xNormal.

It is free and has many advantages compared to commercial tools. It allows setting the smoothness of the polygons on both high-resolution and low poly meshes. It does not require 4-8Gb of VRAM for baking 8k textures. It can bake extremely high-resolution meshes and huge textures using CPU and RAM.

Open xNormal and load the high-resolution mesh by drag-and-drop into the high-resolution meshes window. Set “Average normals”. Right-click, select “Base texture for bake” and add the texture.

Load the low-resolution mesh into the low-resolution meshes window and set “Average normals”.

We can now switch to the Tools window and select the ray distance calculator. Running this will calculate the min/max distance between polygons of low poly and high-resolution meshes.

If we have done everything right in the previous steps, we should expect really small numbers like 0.05…

Sometimes a really big maximum distance may come up; this can be due to a clash of one or two polygons, or spikes on the high-resolution mesh. In that case, it is better to check the baked texture for errors. Fix the texture in Photoshop if there are only small errors, or adjust the distance settings in xNormal and bake again.

In most of my scans, I just set 0.1 for the min and max distance in the low poly mesh window setting.

After this, we can select the baking settings: texture size, render bucket size and anti-aliasing.

It is best to set a maximal 4x sampling for the normals and base color.

For an ambient occlusion and cavity maps, because these are less visible and require too much computation time, we can set a value of 1x.

Enable Normal map baking (check that it will calculate in tangent space) and Base color map.

In the xNormal global settings, you can select between 8/16/32 bits for different image formats. I prefer 16-bit TIFF.

Bake the normal map and the base color map.

After this, open the resulting normal map in Photoshop. Check all the areas with small details, cavities or multiple small elements, like the palm of a hand. If the min/max distances were set wrong, the colors in these areas will not be continuous, and the min/max settings need to be adjusted. If everything is fine, remove the alpha channel and save the image as PNG with maximum compression (PNG is lossless, supports 16-bit and compresses better than ZIP-compressed TIFF).

For use in Sketchfab or Marmoset Viewer, it is better to down-sample the map to 4k (if the original is 8k) and save a copy as JPEG with maximum quality (the image will be converted to 8-bit during saving).

Back in xNormal, disable the normal and base color maps, and enable the ambient occlusion and cavity maps. Change the anti-aliasing from 4x to 1x, then bake these maps. This process is very slow, so you can minimize xNormal and decrease its priority in the Task Manager.

In the meantime, you can start work on the base texture.

After xNormal finishes baking the AO and cavity maps, open the map in Photoshop, delete the alpha channel and convert RGB to Grayscale.

Double check for errors in areas with small details. Next, save it as an 8k PNG for the archive, then down-sample it to 2k or 1k for Sketchfab or Marmoset (shadow details don’t require high resolution, and a smaller map loads faster) and save it as a maximum quality JPG.
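
If you prefer to see what the grayscale conversion and down-sampling steps do to the pixels, here is a minimal sketch in plain Python. It is not what Photoshop does internally: the Rec. 601 luma weights and the simple box filter are assumptions chosen for clarity, and the function names are hypothetical. Images are plain lists of rows.

```python
def to_grayscale(rgb_rows):
    """Convert rows of (r, g, b) tuples to rows of single gray values.

    Uses Rec. 601 luma weights (an assumption). For a baked AO map,
    where r == g == b, any weights that sum to 1 give the same result.
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_rows
    ]


def box_downsample(gray_rows, factor):
    """Shrink a grayscale image by averaging factor x factor blocks."""
    height = len(gray_rows)
    width = len(gray_rows[0])
    if height % factor or width % factor:
        raise ValueError("image size must be a multiple of the factor")
    result = []
    for y in range(0, height, factor):
        out_row = []
        for x in range(0, width, factor):
            block = [
                gray_rows[y + dy][x + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            out_row.append(round(sum(block) / len(block)))
        result.append(out_row)
    return result
```

Going from 8k to 2k corresponds to a factor of 4, and from 8k to 1k to a factor of 8.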

BASE COLOR TEXTURE

If your shots are correct and your cameras aligned correctly, the texture should be quite good and will only need fixing in areas where the camera captured the surface from a steep angle, or where invisible surfaces had to be reconstructed.

The best tools for this purpose are the Healing Brush and the Patch tool, from the Content-Aware tools in Photoshop. These allow you to clone and fuse a texture from another part of the image into the part we want to restore or recreate.

This process requires technical and artistic skills, and especially attention to details. You’ll need to find similar, good parts of the image that can be used as a source, and follow (as much as possible) the UV map polygon distortions.

Remember that all UV islands have extended borders around them, and these need to be filled properly. A 1-pixel black border on a polygon edge will be visible in a close-up render. In real-time engines, the smaller MIP-map levels can make such an edge highly visible. This is why the border is so important (and needs to be 8px or even bigger).
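
The arithmetic behind the 8px advice can be sketched in a few lines of Python. Each MIP level halves the texture resolution, so the border shrinks with it. This is a simplified model that ignores texture filtering, and the function names are hypothetical.

```python
def padding_at_mip(padding_px, mip_level):
    """Width of the island border, in texels, at a given MIP level."""
    return padding_px / (2 ** mip_level)


def last_safe_mip(padding_px):
    """Highest MIP level where the border is still at least 1 texel wide."""
    if padding_px < 1:
        raise ValueError("padding must be at least 1 pixel")
    # For a positive integer, bit_length() - 1 equals floor(log2(n)).
    return padding_px.bit_length() - 1


def mip_size(base_size, mip_level):
    """Edge length of the texture at a given MIP level."""
    return base_size >> mip_level
```

For an 8px border, `last_safe_mip(8)` is 3, and `mip_size(8192, 3)` is 1024: on an 8k map the border still covers a full texel down to the 1024px level. A 1px border is already below one texel at the first MIP level.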

For recovering large areas, you can use the Content-Aware Fill or Patch tool.

Recovering a texture that is split between UV islands can be a bit of a problem. In that situation, you will need to create another, temporary UV layout where the seam is merged into one UV island, so that you can “paint” continuously on a new texture. First, bake the texture from the high-resolution mesh to the temporary UV layout, and then recover the parts. After that, bake the corrected texture from the temporary UV map back to the working UV map.

Remember that every texture conversion removes small details. That is why it is good to use a higher resolution for temporary UV maps. You can even enlarge the UV island that needs fixing, and use double the resolution: a 16k temporary map for an 8k working map.

Afterwards, merge the fixed and original textures in Photoshop using layers and masks, masking out all the areas that were not touched in the temporary map.
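
The layer-and-mask merge is a plain per-pixel blend. A minimal sketch in Python, assuming single-channel 8-bit images stored as lists of rows and a mask where 255 means "use the fixed texture" and 0 means "keep the original" (the function name and this data layout are assumptions for the example):

```python
def merge_with_mask(original, fixed, mask):
    """Blend two images of equal size using an 8-bit mask.

    mask == 255 takes the pixel from `fixed`, mask == 0 keeps the pixel
    from `original`, and values in between mix the two linearly.
    """
    merged = []
    for orig_row, fixed_row, mask_row in zip(original, fixed, mask):
        merged.append([
            round((f * m + o * (255 - m)) / 255)
            for o, f, m in zip(orig_row, fixed_row, mask_row)
        ])
    return merged
```

Untouched areas get a mask value of 0, so they come straight from the original texture and do not lose detail to the extra bake.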

Because this is mostly done in Photoshop (or another preferred tool), we will not cover it in this tutorial, but I hope you understand the basic idea.

ASIDE

You have probably found it a bit strange to use 8bit source images and texture on the high-resolution mesh and baking it to 16bit. Yes, you saw that right.

If you de-shadow and de-light the image in the pre-processing step, then the difference between the 16 and 8bit is just the noise in the LSB part of the image. That is why we can use 8bit images for texturing.

Any texture generation or baking step involves sub-pixel transformations, especially if we bake from a 16/32k texture down to 8/4k. This step always adds extra bits of precision, enough to justify 16-bit.

All the possible details from the shadows that could require a 16-bit pipeline are already recovered in the preprocessing step, and we don’t add any further detail to the shadows that exist on the baked texture.
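
To see where the extra bits come from, here is a small worked example in Python. Averaging a 2x2 block of 8-bit values (a 2x down-sample) produces quarter steps, which is 2 extra bits; a 4x down-sample averages 16 pixels and gives 4 extra bits. The helper names are hypothetical, and a simple box average stands in for whatever filter the baker really uses.

```python
from fractions import Fraction


def average_block(values):
    """Exact average of a block of 8-bit pixel values."""
    return Fraction(sum(values), len(values))


def to_8bit(value):
    """Store a value on the 0..255 scale in 8 bits (fraction is lost)."""
    return round(value)


def to_16bit(value):
    """Store a value on the 0..255 scale in 16 bits.

    65535 == 255 * 257, so an 8-bit level v maps to 16-bit level v * 257.
    """
    return round(value * 257)


block = [100, 101, 101, 101]       # four neighboring 8-bit pixels
exact = average_block(block)       # 100.75 on the 8-bit scale
stored_8 = to_8bit(exact)          # the fraction is rounded away
stored_16 = to_16bit(exact)        # the fraction survives
error_8 = abs(exact - stored_8)
error_16 = abs(exact - Fraction(stored_16, 257))
```

The 8-bit result is off by a quarter of a level, while the 16-bit result is off by about a thousandth of a level.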

Most likely, due to this workflow, I didn’t have success with the Unity Unlit tool, because the textures are already de-lighted and require only small touches and color corrections.

For any color correction of textures, it is better to work in 16-bit.

If you know how to use the 16/32bit pipeline better, why read this tutorial? 

SKETCHFAB

At last, we have our low poly OBJ mesh.

For Sketchfab we’ll use 1-2K maps for the AO and cavity, 2-4K for normals and albedo (and 2-4K maps for specular/glossiness, if used).

We can pack all the files with 7-Zip (as a .zip or .7z archive) and upload them to Sketchfab. Or simply pack the OBJ and MTL files, or only the FBX mesh file. You can add all the textures later via the Sketchfab 3D Editor.

A useful trick: if you want to upload a “scene” to Sketchfab containing more than one object, you will need to export all the meshes as OBJ, create an empty file named sketchfab.zbrush, and upload everything as one archive. In that case, Sketchfab will treat the separate objects as one scene file.
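
If you build such archives often, the packing can be scripted. A minimal sketch using the Python standard library `zipfile` module; the function name and the way the OBJ contents are passed in are assumptions for this example, and only the empty sketchfab.zbrush marker comes from the trick above.

```python
import io
import zipfile


def build_scene_archive(obj_files):
    """Pack several OBJ files plus an empty sketchfab.zbrush marker.

    `obj_files` maps the file name inside the archive to its content
    as bytes. Returns the finished .zip archive as bytes.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in obj_files.items():
            archive.writestr(name, data)
        # The empty marker file described in the text.
        archive.writestr("sketchfab.zbrush", b"")
    return buffer.getvalue()
```

Write the returned bytes to a .zip file on disk and upload that file as usual.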

SKETCHFAB 3D EDITOR

I hope you have successfully added all the required materials to your model.

Normals from xNormal usually don’t require a Y-flip.

Ambient Occlusion looks better with “Occlude specularity” ON and at 80-90%. Cavity with 50%.

Next, we need to talk about the lights and background.

Because our scene is a 3D scan, the lights, environment, and background need to be selected wisely. We want to show a good picture or mood, but at the same time show the scanned object and all its details. Sometimes seeing a good scan is the only way for people around the world to see some unique objects or world heritage artifacts.

This means that the background should not clash with the hero of our story: the scanned object. The lights used need to bring out the details of the object.

Light can be generated from an environment map and can create a shadow, so you don’t need to use additional light.

But if you can set up proper lights, your model will look very realistic.

I personally love the clean dark or light grey Sketchfab background. But interesting combinations of photos of the scanned places with the objects can look good too.

The Blurred Environment map as a background usually looks cheap and boring, and you’ll have to work on creating something better.

POST-PROCESSING

One post-processing filter that can be used on scans without causing problems is Sharpen, and its default settings are usually enough.

SSAO (Screen space ambient occlusion) can look very interesting, but it is a bit strong for my taste. Also, as we have pre-baked AO and cavity, the additional SSAO effect is not needed.

I strongly recommend avoiding using DOF, aberrations, or other strong post-processing effects that can destroy the charm of a good scan.

To be honest: If you see any of these effects in combination with the Shadeless mode, this most likely means that the scan has gone bad or has a low quality, and the author needs to hide as much of it as possible.

The same goes for camera restrictions. In my opinion, if you don’t want to show all 360 degrees of the scanned object – simply use still renders or video.

Next, save the scene. This can take some time if we have added textures (they need to be uploaded).

Then exit to view mode.

Do not skip adding a nice description, and maybe some details about the software and hardware used, such as cameras, lenses, number of images, etc.

Also, add tags with the names of the software used. This helps other users who are searching for photogrammetry tools, and lets them understand the pros and cons of modern photogrammetry software by browsing through Sketchfab.

Finally, scale and rotate the scene in the view window, and save the correct view with the Save View button at the top left.

Double check all the text, press Publish and wait for likes or maybe a Staff Pick.

RESOURCES

If you want more information about RealityCapture, photogrammetry or 3D scanning, you are welcome at the Capturing Reality Forum.

Or find more on Facebook in these amazing groups:

Here you can find not only me but also many other experts in photogrammetry and 3D Scanning, VFX and VR/AR.

Be sure to search and read first, as you can find many useful things there, and most questions have already been asked and answered.

CONTACTS

If you like my work and tutorials you can follow me on:

Vlad Kuzmin, Photogrammetry, 3D, UI/UX & Graphic Design

For any inquiries and ideas: dalics@gmail.com

ABOUT THE AUTHOR

Vlad Kuzmin

Born in Leningrad (Saint Petersburg), USSR, in the 1970s. Expert in the fields of design, UI/UX design, ads, photogrammetry, and 3D visuals. Certified Adobe ACE / RealityCapture and photogrammetry expert. Quickly finds and mixes different techniques from various branches of design to achieve success in the most efficient way.

Has lived and worked in Japan for the last 5 years. Worked as a UI/UX designer in a Japanese video game company. Over the last year, has devoted his time to photogrammetry at the Avatta studio (avatta.net) in Tokyo. Avatta’s clients include Sega, Sony Computer Entertainment, NHK, Amazon Japan, etc. The company has the best full body and facial camera rigs in the country.