News Blog Paper China
SAFIN: Arbitrary Style Transfer With Self-Attentive Factorized Instance Normalization2021-05-13   ${\displaystyle \cong }$
Artistic style transfer aims to transfer the style characteristics of one image onto another image while retaining its content. Existing approaches commonly leverage various normalization techniques, although these face limitations in adequately transferring diverse textures to different spatial locations. Self-Attention-based approaches have tackled this issue with partial success but suffer from unwanted artifacts. Motivated by these observations, this paper aims to combine the best of both worlds: self-attention and normalization. That yields a new plug-and-play module that we nameSelf-Attentive Fac-torized Instance Normalization(SAFIN). SAFIN is essentially a spatially adaptive normalization module whose parameters are inferred through attention on the content and style image. We demonstrate that plugging SAFIN into the base network of another state-of-the-art method results in enhanced stylization. We also develop a novel base network composed of Wavelet Transform for multi-scale style transfer, which when combined with SAFIN, produces visually appealing results with lesser unwanted textures.
Style Transfer by Rigid Alignment in Neural Net Feature Space2019-09-26   ${\displaystyle \cong }$
Arbitrary style transfer is an important problem in computer vision that aims to transfer style patterns from an arbitrary style image to a given content image. However, current methods either rely on slow iterative optimization or fast pre-determined feature transformation, but at the cost of compromised visual quality of the styled image; especially, distorted content structure. In this work, we present an effective and efficient approach for arbitrary style transfer that seamlessly transfers style patterns as well as keep content structure intact in the styled image. We achieve this by aligning style features to content features using rigid alignment; thus modifying style features, unlike the existing methods that do the opposite. We demonstrate the effectiveness of the proposed approach by generating high-quality stylized images and compare the results with the current state-of-the-art techniques for arbitrary style transfer.
Fast Patch-based Style Transfer of Arbitrary Style2016-12-13   ${\displaystyle \cong }$
Artistic style transfer is an image synthesis problem where the content of an image is reproduced with the style of another. Recent works show that a visually appealing style transfer can be achieved by using the hidden activations of a pretrained convolutional neural network. However, existing methods either apply (i) an optimization procedure that works for any style image but is very expensive, or (ii) an efficient feedforward network that only allows a limited number of trained styles. In this work we propose a simpler optimization objective based on local matching that combines the content structure and style textures in a single layer of the pretrained network. We show that our objective has desirable properties such as a simpler optimization landscape, intuitive parameter tuning, and consistent frame-by-frame performance on video. Furthermore, we use 80,000 natural images and 80,000 paintings to train an inverse network that approximates the result of the optimization. This results in a procedure for artistic style transfer that is efficient but also allows arbitrary content and style images.
3DSNet: Unsupervised Shape-to-Shape 3D Style Transfer2020-11-26   ${\displaystyle \cong }$
Transferring the style from one image onto another is a popular and widely studied task in computer vision. Yet, learning-based style transfer in the 3D setting remains a largely unexplored problem. To our knowledge, we propose the first learning-based generative approach for style transfer between 3D objects. Our method allows to combine the content and style of a source and target 3D model to generate a novel shape that resembles in style the target while retaining the source content. The proposed framework can synthesize new 3D shapes both in the form of point clouds and meshes. Furthermore, we extend our technique to implicitly learn the underlying multimodal style distribution of the individual category domains. By sampling style codes from the learned distributions, we increase the variety of styles that our model can confer to a given reference object. Experimental results validate the effectiveness of the proposed 3D style transfer method on a number of benchmarks.
Neural arbitrary style transfer for portrait images using the attention mechanism2020-02-17   ${\displaystyle \cong }$
Arbitrary style transfer is the task of synthesis of an image that has never been seen before, using two given images: content image and style image. The content image forms the structure, the basic geometric lines and shapes of the resulting image, while the style image sets the color and texture of the result. The word "arbitrary" in this context means the absence of any one pre-learned style. So, for example, convolutional neural networks capable of transferring a new style only after training or retraining on a new amount of data are not con-sidered to solve such a problem, while networks based on the attention mech-anism that are capable of performing such a transformation without retraining - yes. An original image can be, for example, a photograph, and a style image can be a painting of a famous artist. The resulting image in this case will be the scene depicted in the original photograph, made in the stylie of this picture. Recent arbitrary style transfer algorithms make it possible to achieve good re-sults in this task, however, in processing portrait images of people, the result of such algorithms is either unacceptable due to excessive distortion of facial features, or weakly expressed, not bearing the characteristic features of a style image. In this paper, we consider an approach to solving this problem using the combined architecture of deep neural networks with a attention mechanism that transfers style based on the contents of a particular image segment: with a clear predominance of style over the form for the background part of the im-age, and with the prevalence of content over the form in the image part con-taining directly the image of a person.
Learning from Multi-domain Artistic Images for Arbitrary Style Transfer2019-04-14   ${\displaystyle \cong }$
We propose a fast feed-forward network for arbitrary style transfer, which can generate stylized image for previously unseen content and style image pairs. Besides the traditional content and style representation based on deep features and statistics for textures, we use adversarial networks to regularize the generation of stylized images. Our adversarial network learns the intrinsic property of image styles from large-scale multi-domain artistic images. The adversarial training is challenging because both the input and output of our generator are diverse multi-domain images. We use a conditional generator that stylized content by shifting the statistics of deep features, and a conditional discriminator based on the coarse category of styles. Moreover, we propose a mask module to spatially decide the stylization level and stabilize adversarial training by avoiding mode collapse. As a side effect, our trained discriminator can be applied to rank and select representative stylized images. We qualitatively and quantitatively evaluate the proposed method, and compare with recent style transfer methods. We release our code and model at https://github.com/nightldj/behance_release.
Unpaired Image Translation via Adaptive Convolution-based Normalization2019-11-29   ${\displaystyle \cong }$
Disentangling content and style information of an image has played an important role in recent success in image translation. In this setting, how to inject given style into an input image containing its own content is an important issue, but existing methods followed relatively simple approaches, leaving room for improvement especially when incorporating significant style changes. In response, we propose an advanced normalization technique based on adaptive convolution (AdaCoN), in order to properly impose style information into the content of an input image. In detail, after locally standardizing the content representation in a channel-wise manner, AdaCoN performs adaptive convolution where the convolution filter weights are dynamically estimated using the encoded style representation. The flexibility of AdaCoN can handle complicated image translation tasks involving significant style changes. Our qualitative and quantitative experiments demonstrate the superiority of our proposed method against various existing approaches that inject the style into the content.
Filter Style Transfer between Photos2020-07-15   ${\displaystyle \cong }$
Over the past few years, image-to-image style transfer has risen to the frontiers of neural image processing. While conventional methods were successful in various tasks such as color and texture transfer between images, none could effectively work with the custom filter effects that are applied by users through various platforms like Instagram. In this paper, we introduce a new concept of style transfer, Filter Style Transfer (FST). Unlike conventional style transfer, new technique FST can extract and transfer custom filter style from a filtered style image to a content image. FST first infers the original image from a filtered reference via image-to-image translation. Then it estimates filter parameters from the difference between them. To resolve the ill-posed nature of reconstructing the original image from the reference, we represent each pixel color of an image to class mean and deviation. Besides, to handle the intra-class color variation, we propose an uncertainty based weighted least square method for restoring an original image. To the best of our knowledge, FST is the first style transfer method that can transfer custom filter effects between FHD image under 2ms on a mobile device without any textual context loss.
Structured Content Preservation for Unsupervised Text Style Transfer2018-10-31   ${\displaystyle \cong }$
Text style transfer aims to modify the style of a sentence while keeping its content unchanged. Recent style transfer systems often fail to faithfully preserve the content after changing the style. This paper proposes a structured content preserving model that leverages linguistic information in the structured fine-grained supervisions to better preserve the style-independent content during style transfer. In particular, we achieve the goal by devising rich model objectives based on both the sentence's lexical information and a language model that conditions on content. The resulting model therefore is encouraged to retain the semantic meaning of the target sentences. We perform extensive experiments that compare our model to other existing approaches in the tasks of sentiment and political slant transfer. Our model achieves significant improvement in terms of both content preservation and style transfer in automatic and human evaluation.
Deformable Style Transfer2020-07-19   ${\displaystyle \cong }$
Both geometry and texture are fundamental aspects of visual style. Existing style transfer methods, however, primarily focus on texture, almost entirely ignoring geometry. We propose deformable style transfer (DST), an optimization-based approach that jointly stylizes the texture and geometry of a content image to better match a style image. Unlike previous geometry-aware stylization methods, our approach is neither restricted to a particular domain (such as human faces), nor does it require training sets of matching style/content pairs. We demonstrate our method on a diverse set of content and style images including portraits, animals, objects, scenes, and paintings. Code has been made publicly available at https://github.com/sunniesuhyoung/DST.
Fair Transfer of Multiple Style Attributes in Text2020-01-18   ${\displaystyle \cong }$
To preserve anonymity and obfuscate their identity on online platforms users may morph their text and portray themselves as a different gender or demographic. Similarly, a chatbot may need to customize its communication style to improve engagement with its audience. This manner of changing the style of written text has gained significant attention in recent years. Yet these past research works largely cater to the transfer of single style attributes. The disadvantage of focusing on a single style alone is that this often results in target text where other existing style attributes behave unpredictably or are unfairly dominated by the new style. To counteract this behavior, it would be nice to have a style transfer mechanism that can transfer or control multiple styles simultaneously and fairly. Through such an approach, one could obtain obfuscated or written text incorporated with a desired degree of multiple soft styles such as female-quality, politeness, or formalness. In this work, we demonstrate that the transfer of multiple styles cannot be achieved by sequentially performing multiple single-style transfers. This is because each single style-transfer step often reverses or dominates over the style incorporated by a previous transfer step. We then propose a neural network architecture for fairly transferring multiple style attributes in a given text. We test our architecture on the Yelp data set to demonstrate our superior performance as compared to existing one-style transfer steps performed in a sequence.
In the light of feature distributions: moment matching for Neural Style Transfer2021-03-12   ${\displaystyle \cong }$
Style transfer aims to render the content of a given image in the graphical/artistic style of another image. The fundamental concept underlying NeuralStyle Transfer (NST) is to interpret style as a distribution in the feature space of a Convolutional Neural Network, such that a desired style can be achieved by matching its feature distribution. We show that most current implementations of that concept have important theoretical and practical limitations, as they only partially align the feature distributions. We propose a novel approach that matches the distributions more precisely, thus reproducing the desired style more faithfully, while still being computationally efficient. Specifically, we adapt the dual form of Central Moment Discrepancy (CMD), as recently proposed for domain adaptation, to minimize the difference between the target style and the feature distribution of the output image. The dual interpretation of this metric explicitly matches all higher-order centralized moments and is therefore a natural extension of existing NST methods that only take into account the first and second moments. Our experiments confirm that the strong theoretical properties also translate to visually better style transfer, and better disentangle style from semantic image content.
Style is a Distribution of Features2020-07-25   ${\displaystyle \cong }$
Neural style transfer (NST) is a powerful image generation technique that uses a convolutional neural network (CNN) to merge the content of one image with the style of another. Contemporary methods of NST use first or second order statistics of the CNN's features to achieve transfers with relatively little computational cost. However, these methods cannot fully extract the style from the CNN's features. We present a new algorithm for style transfer that fully extracts the style from the features by redefining the style loss as the Wasserstein distance between the distribution of features. Thus, we set a new standard in style transfer quality. In addition, we state two important interpretations of NST. The first is a re-emphasis from Li et al., which states that style is simply the distribution of features. The second states that NST is a type of generative adversarial network (GAN) problem.
Learning to Generate Multiple Style Transfer Outputs for an Input Sentence2020-02-16   ${\displaystyle \cong }$
Text style transfer refers to the task of rephrasing a given text in a different style. While various methods have been proposed to advance the state of the art, they often assume the transfer output follows a delta distribution, and thus their models cannot generate different style transfer results for a given input text. To address the limitation, we propose a one-to-many text style transfer framework. In contrast to prior works that learn a one-to-one mapping that converts an input sentence to one output sentence, our approach learns a one-to-many mapping that can convert an input sentence to multiple different output sentences, while preserving the input content. This is achieved by applying adversarial training with a latent decomposition scheme. Specifically, we decompose the latent representation of the input sentence to a style code that captures the language style variation and a content code that encodes the language style-independent content. We then combine the content code with the style code for generating a style transfer output. By combining the same content code with a different style code, we generate a different style transfer output. Extensive experimental results with comparisons to several text style transfer approaches on multiple public datasets using a diverse set of performance metrics validate effectiveness of the proposed approach.
Unpaired Motion Style Transfer from Video to Animation2020-05-12   ${\displaystyle \cong }$
Transferring the motion style from one animation clip to another, while preserving the motion content of the latter, has been a long-standing problem in character animation. Most existing data-driven approaches are supervised and rely on paired data, where motions with the same content are performed in different styles. In addition, these approaches are limited to transfer of styles that were seen during training. In this paper, we present a novel data-driven framework for motion style transfer, which learns from an unpaired collection of motions with style labels, and enables transferring motion styles not observed during training. Furthermore, our framework is able to extract motion styles directly from videos, bypassing 3D reconstruction, and apply them to the 3D input motion. Our style transfer network encodes motions into two latent codes, for content and for style, each of which plays a different role in the decoding (synthesis) process. While the content code is decoded into the output motion by several temporal convolutional layers, the style code modifies deep features via temporally invariant adaptive instance normalization (AdaIN). Moreover, while the content code is encoded from 3D joint rotations, we learn a common embedding for style from either 3D or 2D joint positions, enabling style extraction from videos. Our results are comparable to the state-of-the-art, despite not requiring paired training data, and outperform other methods when transferring previously unseen styles. To our knowledge, we are the first to demonstrate style transfer directly from videos to 3D animations - an ability which enables one to extend the set of style examples far beyond motions captured by MoCap systems.
Diversified Patch-based Style Transfer with Shifted Style Normalization2021-01-16   ${\displaystyle \cong }$
Gram-based and patch-based approaches are two important research lines of image style transfer. Recent diversified Gram-based methods have been able to produce multiple and diverse reasonable solutions for the same content and style inputs. However, as another popular research interest, the diversity of patch-based methods remains challenging due to the stereotyped style swapping process based on nearest patch matching. To resolve this dilemma, in this paper, we dive into the core style swapping process of patch-based style transfer and explore possible ways to diversify it. What stands out is an operation called shifted style normalization (SSN), the most effective and efficient way to empower existing patch-based methods to generate diverse results for arbitrary styles. The key insight is to use an important intuition that neural patches with higher activation values could contribute more to diversity. Theoretical analyses and extensive experiments are conducted to demonstrate the effectiveness of our method, and compared with other possible options and state-of-the-art algorithms, it shows remarkable superiority in both diversity and efficiency.
Exploring Contextual Word-level Style Relevance for Unsupervised Style Transfer2020-05-05   ${\displaystyle \cong }$
Unsupervised style transfer aims to change the style of an input sentence while preserving its original content without using parallel training data. In current dominant approaches, owing to the lack of fine-grained control on the influence from the target style,they are unable to yield desirable output sentences. In this paper, we propose a novel attentional sequence-to-sequence (Seq2seq) model that dynamically exploits the relevance of each output word to the target style for unsupervised style transfer. Specifically, we first pretrain a style classifier, where the relevance of each input word to the original style can be quantified via layer-wise relevance propagation. In a denoising auto-encoding manner, we train an attentional Seq2seq model to reconstruct input sentences and repredict word-level previously-quantified style relevance simultaneously. In this way, this model is endowed with the ability to automatically predict the style relevance of each output word. Then, we equip the decoder of this model with a neural style component to exploit the predicted wordlevel style relevance for better style transfer. Particularly, we fine-tune this model using a carefully-designed objective function involving style transfer, style relevance consistency, content preservation and fluency modeling loss terms. Experimental results show that our proposed model achieves state-of-the-art performance in terms of both transfer accuracy and content preservation.
MetaStyle: Three-Way Trade-Off Among Speed, Flexibility, and Quality in Neural Style Transfer2019-03-06   ${\displaystyle \cong }$
An unprecedented booming has been witnessed in the research area of artistic style transfer ever since Gatys et al. introduced the neural method. One of the remaining challenges is to balance a trade-off among three critical aspects---speed, flexibility, and quality: (i) the vanilla optimization-based algorithm produces impressive results for arbitrary styles, but is unsatisfyingly slow due to its iterative nature, (ii) the fast approximation methods based on feed-forward neural networks generate satisfactory artistic effects but bound to only a limited number of styles, and (iii) feature-matching methods like AdaIN achieve arbitrary style transfer in a real-time manner but at a cost of the compromised quality. We find it considerably difficult to balance the trade-off well merely using a single feed-forward step and ask, instead, whether there exists an algorithm that could adapt quickly to any style, while the adapted model maintains high efficiency and good image quality. Motivated by this idea, we propose a novel method, coined MetaStyle, which formulates the neural style transfer as a bilevel optimization problem and combines learning with only a few post-processing update steps to adapt to a fast approximation model with satisfying artistic effects, comparable to the optimization-based methods for an arbitrary style. The qualitative and quantitative analysis in the experiments demonstrates that the proposed approach achieves high-quality arbitrary artistic style transfer effectively, with a good trade-off among speed, flexibility, and quality.
Learning Graph Normalization for Graph Neural Networks2020-09-24   ${\displaystyle \cong }$
Graph Neural Networks (GNNs) have attracted considerable attention and have emerged as a new promising paradigm to process graph-structured data. GNNs are usually stacked to multiple layers and the node representations in each layer are computed through propagating and aggregating the neighboring node features with respect to the graph. By stacking to multiple layers, GNNs are able to capture the long-range dependencies among the data on the graph and thus bring performance improvements. To train a GNN with multiple layers effectively, some normalization techniques (e.g., node-wise normalization, batch-wise normalization) are necessary. However, the normalization techniques for GNNs are highly task-relevant and different application tasks prefer to different normalization techniques, which is hard to know in advance. To tackle this deficiency, in this paper, we propose to learn graph normalization by optimizing a weighted combination of normalization techniques at four different levels, including node-wise normalization, adjacency-wise normalization, graph-wise normalization, and batch-wise normalization, in which the adjacency-wise normalization and the graph-wise normalization are newly proposed in this paper to take into account the local structure and the global structure on the graph, respectively. By learning the optimal weights, we are able to automatically select a single best or a best combination of multiple normalizations for a specific task. We conduct extensive experiments on benchmark datasets for different tasks, including node classification, link prediction, graph classification and graph regression, and confirm that the learned graph normalization leads to competitive results and that the learned weights suggest the appropriate normalization techniques for the specific task. Source code is released here https://github.com/cyh1112/GraphNormalization.
Review of Text Style Transfer Based on Deep Learning2020-05-11   ${\displaystyle \cong }$
Text style transfer is a hot issue in recent natural language processing,which mainly studies the text to adapt to different specific situations, audiences and purposes by making some changes. The style of the text usually includes many aspects such as morphology, grammar, emotion, complexity, fluency, tense, tone and so on. In the traditional text style transfer model, the text style is generally relied on by experts knowledge and hand-designed rules, but with the application of deep learning in the field of natural language processing, the text style transfer method based on deep learning Started to be heavily researched. In recent years, text style transfer is becoming a hot issue in natural language processing research. This article summarizes the research on the text style transfer model based on deep learning in recent years, and summarizes, analyzes and compares the main research directions and progress. In addition, the article also introduces public data sets and evaluation indicators commonly used for text style transfer. Finally, the existing characteristics of the text style transfer model are summarized, and the future development trend of the text style transfer model based on deep learning is analyzed and forecasted.