Despite the recent advances in automatic methods for computing skinning weights, manual intervention is still indispensable to produce high quality character deformation. However, current modeling software does not provide efficient tools for the manual definition of skinning weights. A widely used paint-based interface gives users a high degree of freedom, but at the expense of significant effort and time. This paper presents a novel interface for editing skinning weights based on splines, which represent the isolines of skinning weights on a mesh. When a user drags a small number of spline anchor points, our method updates the shape of the isolines and smoothly interpolates or propagates the weights while respecting the given iso-value on the spline. We introduce several techniques to enable the interface to run in real-time, and propose a particular combination of functions that generates appropriate skinning weight distribution over the surface given splines. Users can create skinning weights from scratch by using our method. In addition, we present the spline and the gradient fitting methods that closely approximate an initial weight made by an automatic method or a paint interface, so that a user can modify the given weight with our spline interface. We show the effectiveness of our spline-based interface through a number of test cases.
We propose FaceVR, a novel image-based method that enables video teleconferencing in VR based on self-reenactment. State-of-the-art face tracking methods in the VR context are focused on the animation of rigged 3D avatars [Li et al. 2015; Olszewski et al. 2016]. While they achieve good tracking performance, the results look cartoonish and unrealistic. In contrast to these model-based approaches, FaceVR enables VR teleconferencing using an image-based technique that results in nearly photo-realistic outputs. The key component of FaceVR is a robust algorithm to perform real-time facial motion capture of an actor who is wearing a head-mounted display (HMD), as well as a new data-driven approach for eye tracking from monocular videos. Based on reenactment of a prerecorded stereo video of the person without the HMD, FaceVR incorporates photo-realistic re-rendering in real time, thus allowing artificial modifications of face and eye appearances. For instance, we can alter facial expressions or change gaze directions in the prerecorded target video. In a live setup, we apply these newly introduced algorithmic components.
We present the first computational approach that can transform 3D meshes, created by traditional modeling programs, directly into instructions for a computer-controlled knitting machine. Knitting machines are able to robustly and repeatably form knitted 3D surfaces from yarn, but have many constraints on what they can fabricate. Given user-defined starting and ending points on an input mesh, our system incrementally builds a helix-free, quad-dominant mesh with uniform edge lengths, runs a tracing procedure over this mesh to generate a knitting path, and schedules the knitting instructions for this path in a way that is compatible with machine constraints. We demonstrate our approach on a wide range of 3D meshes.
We present an efficient spacetime optimization method to automatically generate animations for a general volumetric, elastically deformable body. Our approach can model the interactions between the body and the environment and automatically generate active animations. We model the frictional contact forces using contact invariant optimization and the fluid drag forces using a simplified model. To handle complex objects, we use a reduced deformable model and present a novel hybrid optimizer to search for the local minima efficiently. This allows us to use long-horizon motion planning to automatically generate animations such as walking, jumping, swimming, and rolling. We evaluate the approach on different shapes and animations, including deformable body navigation and combining with an open-loop controller for realtime forward simulation.
Retouching can significantly elevate the visual appeal of photos, but many casual photographers lack the expertise to do this well. To address this problem, previous works have proposed automatic retouching systems based on supervised learning from paired training images acquired before and after manual editing. As it is difficult for users to acquire paired images that reflect their retouching preferences, we present in this paper a deep learning approach that is instead trained on unpaired data, which is much easier to collect. Our system is formulated using deep convolutional neural networks that learn to apply different retouching operations on an input image. Network training with respect to various types of edits is enabled by modeling these retouching operations in a unified manner as resolution-independent differentiable filters. To apply the filters in a proper sequence and with suitable parameters, we employ a deep reinforcement learning approach that learns to make decisions on what action to take next, given the current state of the image. In contrast to many deep learning systems, ours provides users with an understandable solution in the form of conventional retouching edits, rather than just a "black box" result. Through quantitative comparisons and user studies, we show that our retouching results surpass those of strong baselines.
We present an integrated approach for reconstructing high-fidelity 3D models using consumer RGB-D cameras. RGB-D registration and reconstruction algorithms are prone to errors from scanning noise, making it hard to perform 3D reconstruction accurately. The key idea of our method is to assign a probabilistic uncertainty model to each depth measurement, which then guides the scan alignment and depth fusion. This allows us to effectively handle inherent noise and distortion in depth maps while keeping the overall scan registration procedure under the iterative closest point (ICP) framework for simplicity and efficiency. We further introduce a local-to-global, submap-based, and uncertainty-aware global pose optimization scheme to improve scalability and guarantee global model consistency. Finally, we have implemented the proposed algorithm on the GPU, achieving real-time 3D scanning frame rates and updating the reconstructed model on-the-fly. Experimental results on simulated and real-world data demonstrate that the proposed method outperforms state-of-the-art systems in terms of the accuracy of both recovered camera trajectories and reconstructed models.
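The idea of letting per-measurement uncertainty guide depth fusion can be illustrated with a minimal sketch. It assumes independent Gaussian noise on each depth sample and uses plain inverse-variance weighting; this is a textbook stand-in, not the uncertainty model of the abstract above, and the function name `fuse_depths` and the sample values are hypothetical.

```python
import numpy as np

def fuse_depths(depths, sigmas):
    """Fuse repeated depth measurements of one surface point.

    Assumption: each measurement has independent Gaussian noise with
    standard deviation sigma, so the optimal weight is 1 / sigma^2.
    """
    depths = np.asarray(depths, dtype=float)
    weights = 1.0 / np.asarray(sigmas, dtype=float) ** 2
    fused = np.sum(weights * depths) / np.sum(weights)
    # The fused estimate is more certain than any single measurement.
    fused_sigma = np.sqrt(1.0 / np.sum(weights))
    return fused, fused_sigma

# A confident reading (sigma = 1 cm) and a noisy one (sigma = 5 cm), in meters.
fused, fused_sigma = fuse_depths([1.00, 1.10], [0.01, 0.05])
```

The fused value stays close to the confident reading, which is the behavior one would want when noisy samples should not distort the reconstructed surface.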
In geometry processing, smoothness energies are commonly used to model scattered data interpolation, dense data denoising, and regularization during shape optimization. The squared Laplacian energy is a popular choice of energy and has a corresponding standard implementation: squaring the discrete Laplacian matrix. For compact domains, when values along the boundary are not known in advance, this construction bakes in low-order boundary conditions. This causes the geometric shape of the boundary to strongly bias the solution. For many applications, this is undesirable. Instead, we propose using the squared Frobenius norm of the Hessian as a smoothness energy. Unlike the squared Laplacian energy, this energy's natural boundary conditions (those that best minimize the energy) correspond to meaningful high-order boundary conditions. These boundary conditions model free boundaries where the shape of the boundary should not bias the solution locally. Our analysis begins in the smooth setting and concludes with discretizations using finite-differences on 2D grids or mixed finite elements for triangle meshes. We demonstrate the core behavior of the squared Hessian as a smoothness energy for various tasks.
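As a rough illustration of how the two energies differ, the sketch below evaluates both on a regular 2D grid with central finite differences. It is a simplified sketch under stated assumptions: only interior nodes are used (so it says nothing about boundary conditions), the factor of 1/2 is a chosen convention, and the function name `smoothness_energies` is hypothetical.

```python
import numpy as np

def smoothness_energies(u, h=1.0):
    """Return (squared Laplacian energy, squared Hessian energy) of grid function u.

    u is indexed as u[y, x]; derivatives use central differences on interior nodes.
    """
    c = u[1:-1, 1:-1]
    uxx = (u[1:-1, 2:] - 2.0 * c + u[1:-1, :-2]) / h**2
    uyy = (u[2:, 1:-1] - 2.0 * c + u[:-2, 1:-1]) / h**2
    uxy = (u[2:, 2:] - u[2:, :-2] - u[:-2, 2:] + u[:-2, :-2]) / (4.0 * h**2)
    cell_area = h**2
    laplacian_energy = 0.5 * np.sum((uxx + uyy) ** 2) * cell_area
    # Squared Frobenius norm of the Hessian: uxx^2 + 2 uxy^2 + uyy^2.
    hessian_energy = 0.5 * np.sum(uxx**2 + 2.0 * uxy**2 + uyy**2) * cell_area
    return laplacian_energy, hessian_energy

# The saddle u = x * y is harmonic, so the Laplacian energy does not see it,
# while the Hessian energy still penalizes its twist.
ys, xs = np.mgrid[0:5, 0:5]
lap_e, hess_e = smoothness_energies((xs * ys).astype(float))
```

For this saddle the squared Laplacian energy vanishes while the squared Hessian energy does not, which shows that the two energies measure smoothness differently even away from the boundary.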
Simulating (elastically) deformable models that can collide with each other and with the environment remains a challenging task. The resulting contact problems can be elegantly approached using Lagrange multipliers to represent the unknown magnitude of the response forces. Typical methods construct and solve a Linear Complementarity Problem (LCP) to obtain the response forces. This requires the inverse of the generalized mass matrix, which is in general hard to obtain for deformable-body problems. In this paper we tackle such contact problems by directly solving the Mixed Linear Complementarity Problem (MLCP) and omitting the construction of an LCP matrix. Since the MLCP is equivalent to a convex quadratic program subject to inequality constraints, we propose to use a Conjugate Residual (CR) solver as the backbone of our collision-response system. We also propose a simple yet efficient preconditioner that ensures faster convergence. Finally, our approach is faster than existing methods (at the same accuracy), and it allows accurate treatment of friction.
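For readers unfamiliar with complementarity problems, the following sketch solves a tiny LCP: find multipliers lam >= 0 such that w = A lam + b >= 0 and lam and w are complementary. It uses projected Gauss-Seidel, a common baseline, and is explicitly not the Conjugate Residual solver or the MLCP formulation described above. The function name, the 2x2 matrix, and the vector are made-up illustration data, and the method assumes A has a positive diagonal (for example, a symmetric positive definite matrix).

```python
import numpy as np

def projected_gauss_seidel(A, b, iterations=100):
    """Approximately solve the LCP: lam >= 0, A lam + b >= 0, lam . (A lam + b) = 0."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    lam = np.zeros_like(b)
    for _ in range(iterations):
        for i in range(len(b)):
            residual = A[i] @ lam + b[i]
            # Gauss-Seidel update for row i, clamped so the multiplier stays non-negative.
            lam[i] = max(0.0, lam[i] - residual / A[i, i])
    return lam

A = np.array([[2.0, 1.0], [1.0, 2.0]])
b = np.array([-1.0, 1.0])
lam = projected_gauss_seidel(A, b)
w = A @ lam + b
```

In contact terms, a positive entry of `lam` corresponds to an active contact pushing back, and a positive entry of `w` to a separating contact with zero response force.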
Sony Imageworks' implementation of the Arnold renderer is a fork of the commercial product of the same name, which has evolved independently since around 2009. This paper focuses on the design choices that are unique to this version and have tailored the renderer to the specific requirements of film rendering at our studio. We detail our approach to subdivision surface tessellation, hair rendering, sampling and variance reduction techniques, as well as a description of our open source texturing and shading language components. We also discuss some ideas we once implemented but have since discarded to highlight the evolution of the software over the years.
Walt Disney Animation Studios has transitioned to path-traced global illumination as part of a progression of brute-force physically based rendering in the name of artist efficiency. To achieve this without compromising our geometric or shading complexity, we built our Hyperion renderer based on a novel architecture that extracts traversal and shading coherence from large, sorted ray batches. In this paper we describe our architecture and discuss our design decisions. We also explain how we are able to provide artistic control in a physically based renderer, and we demonstrate through case studies how we have benefited from having a proprietary renderer that can evolve with production needs.
Accurate simulation of light transport in participating media is expensive. However, given the band-limiting effect of scattering in media, the radiance signal is in general smooth. This makes this kind of light transport very suitable for adaptive sampling and reconstruction techniques. In this work we present a novel algorithm for volumetric light transport. We adaptively sample or interpolate radiance from sparse points in the medium using a second-order Hessian-based error metric to determine when interpolation is appropriate. We derive our metric from each point's incoming light field, computed by using a proxy triangulation-based representation of the radiance reflected by the surrounding medium and geometry. We use this representation to efficiently compute the first- and second-order derivatives of the radiance at the cache points while accounting for occlusion changes. To validate the quality of our approach, we propose a self-contained two-dimensional model for light transport in media: We demonstrate in this space with reduced dimensionality that our work significantly outperforms previous radiance caching algorithms in terms of both derivative estimates and final radiance extrapolation. Then we show how our results generalize to practical three-dimensional scenarios, where we show much better results while reducing computation time by up to 30%.
We present the first marker-less approach for temporally coherent 3D performance capture of a human with general clothing from monocular video. Our approach reconstructs articulated human skeleton motion as well as medium-scale non-rigid surface deformations in general scenes. Human performance capture is a challenging problem due to the large range of articulation, potentially fast motion, and considerable non-rigid deformations, even from multi-view data. Reconstruction from monocular video alone is drastically more challenging, since strong occlusions and the inherent depth ambiguity lead to a highly ill-posed reconstruction problem. We tackle these challenges by a novel approach that employs sparse 2D and 3D human pose detections from a convolutional neural network using a batch-based pose estimation strategy. Joint recovery of per-batch motion allows us to resolve the ambiguities of the monocular reconstruction problem based on a low dimensional trajectory subspace. In addition, we propose refinement of the surface geometry based on fully automatically extracted silhouettes to enable medium-scale non-rigid alignment. We demonstrate state-of-the-art performance capture results that enable exciting applications such as video editing and free viewpoint video, previously infeasible from monocular video. Our qualitative and quantitative evaluation demonstrates that our approach significantly outperforms previous monocular methods in terms of accuracy, robustness and scene complexity that can be handled.
This article presents an iterative backward-warping technique and its applications. It predictively synthesizes depth buffers for novel views. Our solution is based on a fixed-point iteration that converges quickly in practice. Unlike previous techniques, our solution is a purely backward warping without using bidirectional sources. To efficiently seed the iterative process, we also propose a tight bounding method for motion vectors. Non-convergent depth holes are inpainted via deep depth buffers. Our solution works well with arbitrarily distributed motion vectors under moderate motions. Many scenarios can benefit from our depth warping. As an application, we propose a highly scalable image-based occlusion-culling technique, achieving a significant speedup compared to the state of the art. We also demonstrate the benefit of our solution in multi-view soft-shadow generation.
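The fixed-point idea behind backward warping can be sketched in one dimension. Given a target position p in the novel view and a motion field v defined on the source view, the source position q must satisfy q + v(q) = p, which suggests iterating q <- p - v(q). The 1D setting, the sinusoidal motion field, the seeding with q = p, and the names `backward_warp_1d` and `example_motion` are all assumptions made for illustration; the article's seeding and hole handling are not reproduced here.

```python
import math

def backward_warp_1d(p, motion, iterations=50, tol=1e-9):
    """Find the source position q with q + motion(q) == p by fixed-point iteration.

    Returns (q, converged). The iteration is guaranteed to converge when the
    motion field is a contraction, i.e. its slope has magnitude below 1.
    """
    q = p  # simple seed: start the search at the target position itself
    for _ in range(iterations):
        q_next = p - motion(q)
        if abs(q_next - q) < tol:
            return q_next, True
        q = q_next
    return q, False  # a non-convergent position would need separate hole handling

def example_motion(q):
    # Hypothetical smooth motion field with slope magnitude at most 0.5.
    return 0.5 * math.sin(q)

q, converged = backward_warp_1d(2.0, example_motion)
```

Once q is found, the value stored at q in the source buffer can be fetched for the target position p, which is what makes the warp a backward (gather) operation.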
A large number of imaging and computer graphics applications require localized information on the visibility of image distortions. Existing image quality metrics are not suitable for this task as they provide a single quality value per image. Existing visibility metrics produce visual difference maps, and are specifically designed for detecting just noticeable distortions but their predictions are often inaccurate. In this work, we argue that the key reason for this problem is the lack of large image collections with a good coverage of possible distortions that occur in different applications. To address the problem, we collect an extensive dataset of reference and distorted image pairs together with user markings indicating whether distortions are visible or not. We propose a statistical model that is designed for the meaningful interpretation of such data, which is affected by visual search and imprecision of manual marking. We use our dataset for training existing metrics and we demonstrate that their performance significantly improves. We show that our dataset with the proposed statistical model can be used to train a new CNN-based metric, which outperforms the existing solutions. We demonstrate the utility of such a metric in visually lossless JPEG compression, super-resolution and watermarking.
Quadrotor drones equipped with high quality cameras have rapidly emerged as novel, cheap and stable devices for filmmakers. While professional drone pilots can create aesthetically pleasing videos in a short time, the smooth and cinematographic control of a camera drone remains challenging for most users, despite recent tools that either automate part of the process or enable the manual design of waypoints to create drone trajectories. This paper proposes to move a step further towards more accessible cinematographic drones by designing techniques to automatically or interactively plan quadrotor drone motions in 3D dynamic environments that satisfy both cinematographic and physical quadrotor constraints. We first propose the design of a Drone Toric Space as a dedicated camera parameter space with embedded constraints and derive some intuitive on-screen viewpoint manipulators. Second, we propose a specific path planning technique which ensures both that cinematographic properties can be enforced along the path, and that the path is physically feasible for a quadrotor drone. Finally, we build on the Drone Toric Space and the specific path planning technique to coordinate the motion of multiple drones around dynamic targets. Our results demonstrate the interactive and automated capabilities of our approaches on a number of use cases.
Pixar's RenderMan renderer is used to render all of Pixar's films, and by many film studios to render visual effects for live-action movies. RenderMan started as a scanline renderer based on the Reyes algorithm, and was extended over the years with ray tracing and several global illumination algorithms. This paper describes the modern version of RenderMan, a new architecture for an extensible and programmable path tracer with many features that are essential to handle the fiercely complex scenes in movie production. Users can write their own materials using a bxdf interface, and their own light transport algorithms using an integrator interface -- or they can use the materials and light transport algorithms provided with RenderMan. Complex geometry and textures are handled with efficient multi-resolution representations, with resolution chosen using path differentials. We trace rays and shade ray hit points in medium-sized groups, which provides the benefits of SIMD execution without excessive memory overhead or data streaming. The path-tracing architecture handles surface, subsurface, and volume scattering. We show examples of the use of path tracing, bidirectional path tracing, VCM, and UPBP light transport algorithms. We also describe our progressive rendering for interactive use and our adaptation of denoising techniques.
Distributing a simulation across many machines can drastically speed up computations and increase detail. The computing cloud provides tremendous computing resources, but weak service guarantees force programs to manage significant system complexity: nodes, networks, and storage occasionally perform poorly or fail. We describe Halo, a system that automatically distributes grid-based and hybrid simulations across cloud computing nodes. The main simulation loop is written as simple sequential code and launches distributed computations across many cores. The simulation on each core runs as if it is stand-alone: Halo automatically stitches these multiple simulations into a single, larger one. To do this efficiently, Halo introduces a four-layer data model that translates between the contiguous, geometric objects used by simulation libraries and the replicated, fine-grained objects managed by its underlying cloud computing runtime. Using PhysBAM particle level-set fluid simulations, we demonstrate that Halo can run higher detail simulations faster, distribute simulations on up to 512 cores and run enormous simulations ($1024^3$ cells). Halo automatically manages these distributed simulations, balancing load across nodes and recovering from failures. Implementations of PhysBAM water and smoke simulations as well as an open source heat-diffusion simulation show Halo is general and can support complex simulations.
Subtractive manufacturing technologies, such as 3-axis milling, add a useful tool to the digital manufacturing arsenal. However, each milling pass using such machines can only carve a single height-field, defined with respect to the machine tray. We enable fabrication of general shapes using 3-axis milling, providing a robust algorithm to decompose any shape into a few height-field blocks. Such blocks can be manufactured with a single milling pass and then assembled to form the target geometry. Computing such decompositions requires solving a complex discrete optimization problem, variants of which are known to be NP-hard. We propose a two-step process, based on the observation that if the height-field directions are constrained to the major axes we can guarantee a valid decomposition starting from a suitable surface segmentation. Our method first produces a compact set of large, possibly overlapping, height-field blocks that jointly cover the model surface. We then compute an overlap-free decomposition via a combination of cycle elimination and topological sorting on a graph. The algorithm produces a compact set of height-field blocks that jointly describe the input model and satisfy all manufacturing constraints. We demonstrate our method on a range of inputs, and showcase several real life models manufactured using our technique.
The Manuka rendering architecture has been designed to enable visually rich computer generated imagery for visual effects in movie production. This means supporting extremely complex geometry, texturing and shading. In the current generation of renderers, it is essential to support very accurate global illumination as a means to naturally tie together different assets in a picture. This is achieved with Monte Carlo path tracing, using a paradigm often called shade on hit, in which the renderer alternates tracing rays with running shaders on the various ray hits. The shaders take the role of generating the inputs of the local material structure which is then used by path sampling logic to evaluate contributions and to inform what further rays to cast through the scene. We propose a shade before hit paradigm instead and minimise I/O strain on the system, leveraging locality of reference by running pattern generation shaders before we execute light transport simulation by path sampling. We describe a full architecture built around this approach, featuring spectral light transport and a flexible implementation of multiple importance sampling, resulting in a system able to support a comparable amount of extensibility to what made the Reyes rendering architecture successful over many decades.
Arnold is a physically-based renderer for feature-length animation and visual effects. Arnold has been created to take on the challenge of making the simple and elegant approach of brute-force Monte Carlo path tracing practical for production rendering. Achieving this requires building a robust piece of ray tracing software that can ingest large amounts of geometry with detailed shading and lighting and produce images with high fidelity, while scaling well with the available memory and processing power. Arnold's guiding principles are to expose as few controls as possible, provide rapid feedback to artists, and adapt to various production workflows. In this paper we describe its architecture with a focus on the design and implementation choices made during its evolutionary development to meet the aforementioned requirements and goals. Arnold's workhorse is a unidirectional path tracer that avoids the use of hard to manage and artifact-prone caching and sits on top of a ray tracing engine optimized to shoot and shade billions of spatially incoherent rays throughout a scene. A comprehensive API provides the means to configure and extend the system's functionality, to describe a scene, render it, and write the results to dis
Blue noise sampling has been adopted for many graphics applications, but mostly in low dimensional spaces. Extensions to higher dimensional spaces remain under-explored. We present a blue noise sampling approach with good quality and performance across different dimensions. This is achieved by spoke-dart sampling, combining the locality of advancing front with the dimension-reduction of hyperplanes: samples are chosen from radial line segments through prior samples. New sampling rules over these segments create step blue noise, with a flat frequency spectrum, which avoids artifacts in high-frequency regions. Our algorithm creates blue noise directly and scales well to high dimensions. This enables several applications such as Delaunay graph construction, global optimization, rendering, and motion planning.
Everyone uses the sense of touch to explore the world, and roughness is one of the most important qualities in tactile perception. Roughness is a major identifier for judgments of material composition, comfort and friction, and it is tied closely to manual dexterity. The advent of high-resolution 3D printing technology provides the ability to fabricate arbitrary 3D textures with surface geometry that confers haptic properties. In this work, we address the problem of mapping object geometry to tactile roughness. We fabricate a set of carefully designed stimuli and use them in experiments with human subjects to build a perceptual space for roughness. We then match this space to a quantitative model obtained from strain fields derived from elasticity simulations of the human skin contacting the texture geometry, drawing from past research in neuroscience and psychophysics. We demonstrate how this model can be applied to predict and alter surface roughness, and we show several applications in the context of fabrication.