
DeepFont: Large-Scale Real-World Font Recognition from Images


1 DeepFont: Large-Scale Real-World Font Recognition from Images
Zhangyang (Atlas) Wang, joint work with Jianchao Yang, Hailin Jin, Jon Brandt, Eli Shechtman, Aseem Agarwala, and Thomas Huang

2 Problem Definition Ever seen a font in use and wanted to identify what it is?

3 Problem Definition Font recognition: automatically recognizing font style (typeface, slope, weight, etc.) from real-world photos
Why does it matter? It is a highly desirable feature for designers:
Design library collection
Design inspiration
Text editing

4 Challenges An extremely large-scale recognition problem
Over 100,000 fonts claimed in the myfonts.com collection
Beyond object recognition: recognizing subtle design styles
Extremely difficult to collect real-world training data, so training has to rely on synthetic data
BIG mismatch between synthetic training and real-world testing

5 Solution Deep convolutional neural network?
Effective at large-scale recognition
Effective at fine-grained recognition
Data-driven
Problem: huge mismatch between synthetic training and real-world testing
Data augmentation
Decomposition-based deep CNN for domain adaptation

6 The AdobeVFR Dataset Synthetic training set
2383 fonts from the Adobe Type Library (extended to 4052 classes later)
1000 synthetic English word images per font, ~2.4M training images
Real-world test set
4383 real-world labeled images, covering 671 of the 2383 fonts
The first large-scale benchmark for the task of visual font recognition, consisting of both synthetic and real-world text images
Also useful for fine-grained classification, domain adaptation, and understanding design styles

7 Deep Convolutional Neural Network
Should we simply follow a standard benchmark network structure?

8 Domain Mismatch Direct training on synthetic data and testing on real-world data (top-5 accuracy):

            Synthetic   Real-World
Training    99.16%      N/A
Testing     98.97%      49.24%

Need domain adaptation to minimize the gap between synthetic training and real-world testing!

9 Data Augmentation Common degradations
Noise, blur, warping, shading, compression artifacts, etc

12 Data Augmentation Common degradations
Noise, blur, warping, shading, compression artifacts, etc.
Special degradations
Aspect ratio squeezing: squeeze the image using a random ratio in [1.5, 3.5] in the horizontal direction
Random character spacing: render training text images with random character spacing
Inputs to the network: random 105x105 crops
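The augmentation steps above can be sketched in a few lines of NumPy. This is an illustrative reimplementation, not the authors' code: the noise level, the white-background padding, and the helper names are assumptions, while the squeeze range [1.5, 3.5] and the 105x105 crop size come from the slides.

```python
import numpy as np

def squeeze_horizontal(img, ratio):
    """Nearest-neighbor horizontal squeeze by `ratio` (>1 shrinks the width)."""
    h, w = img.shape
    new_w = max(1, int(round(w / ratio)))
    cols = np.clip((np.arange(new_w) * ratio).astype(int), 0, w - 1)
    return img[:, cols]

def add_noise(img, sigma, rng):
    """Additive Gaussian noise, clipped back to valid gray levels."""
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0, 255)

def random_crop(img, size, rng):
    """Random size x size crop; pad with white (255) if the image is smaller."""
    h, w = img.shape
    pad_h, pad_w = max(0, size - h), max(0, size - w)
    if pad_h or pad_w:
        img = np.pad(img, ((0, pad_h), (0, pad_w)), constant_values=255)
        h, w = img.shape
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return img[top:top + size, left:left + size]

def augment(img, rng):
    img = add_noise(img, sigma=4.0, rng=rng)              # a common degradation
    img = squeeze_horizontal(img, rng.uniform(1.5, 3.5))  # special degradation
    return random_crop(img, 105, rng)                     # network input size

rng = np.random.default_rng(0)
word_img = np.full((120, 600), 255.0)  # stand-in for a rendered text image
print(augment(word_img, rng).shape)    # (105, 105)
```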

13 Effects of Data Augmentation
Synthetic 1-4: common degradations
Synthetic 5-6: special degradations
Synthetic 1-6: all degradations
(Figure: MMD between synthetic and real-world data responses.)
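The MMD comparison can be reproduced in miniature. Below is a minimal sketch of the biased squared-MMD estimator with an RBF kernel on stand-in feature matrices; the kernel bandwidth and the toy data are assumptions, not values from the talk.

```python
import numpy as np

def mmd2(X, Y, gamma=0.1):
    """Biased squared MMD estimate, RBF kernel k(a,b) = exp(-gamma * ||a-b||^2)."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

rng = np.random.default_rng(0)
syn = rng.normal(0.0, 1.0, (200, 8))    # stand-in "synthetic" responses
real = rng.normal(0.5, 1.0, (200, 8))   # stand-in "real-world" responses (shifted)

# A smaller MMD means the (augmented) synthetic data sits closer to real data.
print(mmd2(syn, syn) < mmd2(syn, real))  # True: identical samples give MMD^2 = 0
```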

14 Beyond Data Augmentation
Problems:
We cannot enumerate all possible degradations, e.g., backgrounds and font decorations
Augmentation may introduce degradation bias into training
Can we design the learning algorithm itself to be robust to domain mismatch?
The mismatch already appears in the low-level features
There are tons of unlabeled real-world data

15 Network Decomposition for Domain Adaptation
Unsupervised cross-domain sub-network Cu (N layers)
Supervised domain-specific sub-network Cs (7-N layers)

16 Network Decomposition for Domain Adaptation
Train sub-network Cu in an unsupervised way using stacked convolutional autoencoders, with both synthetic data and unlabeled real-world data.
Then fix sub-network Cu and train sub-network Cs in a supervised way, using the labeled synthetic data.
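The two-stage training can be sketched end to end on toy data. As a stand-in for the stacked convolutional autoencoders, this sketch learns Cu as an unsupervised PCA projection fitted on both domains, then trains a softmax Cs on the frozen features using only the labeled synthetic data; all shapes, the class count, and the PCA substitution are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: 20-d features, 3 hypothetical font classes.
class_means = 2.0 * rng.normal(size=(3, 20))
syn_y = rng.integers(0, 3, 300)                       # labels exist only for synthetic
syn_x = class_means[syn_y] + 0.5 * rng.normal(size=(300, 20))
real_x = class_means[rng.integers(0, 3, 200)] + 0.5 * rng.normal(size=(200, 20))

# Stage 1 (Cu): unsupervised and cross-domain. A PCA projection fitted on
# BOTH domains stands in for the stacked convolutional autoencoders --
# the key point is that no labels are needed, so real-world data can be used.
both = np.vstack([syn_x, real_x])
mean = both.mean(0)
_, _, Vt = np.linalg.svd(both - mean, full_matrices=False)
Cu = Vt[:8].T                                         # frozen encoder: 20-d -> 8-d

def encode(x):
    """Apply the fixed sub-network Cu."""
    return (x - mean) @ Cu

# Stage 2 (Cs): supervised softmax classifier on frozen Cu features,
# trained only on the labeled synthetic data.
W = np.zeros((8, 3))
feats, onehot = encode(syn_x), np.eye(3)[syn_y]
for _ in range(500):
    logits = feats @ W
    p = np.exp(logits - logits.max(1, keepdims=True))
    p /= p.sum(1, keepdims=True)
    W -= 0.05 * feats.T @ (p - onehot) / len(feats)   # cross-entropy gradient step

train_acc = ((feats @ W).argmax(1) == syn_y).mean()
print(encode(real_x).shape)  # (200, 8)
```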

17 Quantitative Evaluation
4383 real-world test images collected from font forums.

Model     Augmentation?  Decomposition?  Top-1    Top-5
LFE       Y              N/A             42.56%   60.31%
DeepFont  N              N               42.49%   49.24%
DeepFont  Y              N               66.70%   79.22%
DeepFont  Y              Y               71.42%   81.79%

Varying the layer number K of the unsupervised sub-network Cu:

K         0        1        2        3        4        5
Training  91.54%   90.12%   88.77%   87.46%   84.79%   82.12%
Testing   79.28%   79.69%   81.79%   81.04%   77.48%   74.03%

18 Successful Examples

19 Failure Examples

20 Model Compression For a typical CNN, about 90% of the storage is taken up by the densely connected layers.
Matrix factorization methods can compress the parameters of linear layers by exploiting the nearly low-rank property of their parameter matrices.
(Figure: eigenvalue plot of the fc6 layer weight matrix in DeepFont; this densely connected layer alone takes up 85% of the total model size.)

21 Model Compression During training, we add a low-rank constraint (rank < k) on the fc6 layer.
In practice, we apply very aggressive compression to all fc layers and obtain a mini-model of ~40 MB in storage, with a compression ratio >18 and a (top-5) performance loss of ~3%.
Take-Home Points:
FC layers can be highly redundant; compressing them aggressively MIGHT work well.
Joint training-compression performs notably better than the two-stage approach.
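The simpler post-hoc variant of this idea (truncated-SVD factorization of a trained fc weight matrix) can be sketched as follows; the talk's stronger result, joint training-compression, is not implemented here. The matrix sizes and noise level are illustrative, not DeepFont's actual fc6 dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a dense fc weight matrix; fc6 in DeepFont is far larger,
# these shapes are illustrative only.
m, n, k = 256, 512, 10
W = rng.normal(size=(m, k)) @ rng.normal(size=(k, n))   # nearly low-rank...
W += 0.01 * rng.normal(size=(m, n))                     # ...plus small noise

# Truncated SVD: replace W (m*n params) with factors A (m*k) and B (k*n),
# so the layer is applied as x @ A @ B instead of x @ W.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :k] * s[:k]
B = Vt[:k]
W_hat = A @ B

ratio = (m * n) // (k * (m + n))
rel_err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
print(ratio)             # 17
print(rel_err < 0.05)    # True: little is lost on a nearly low-rank matrix
```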

22 In Adobe Product: Recognize Fonts from Images

23 In Adobe Product: Photoshop Prototype

24 Text Editing Inside Photoshop

25 Text Editing Inside Photoshop

26 In Adobe Product: Discover Similarity between Fonts
Font inspiration, browsing, and organization

27 In Adobe Product: Discover Similarity between Fonts
Font inspiration, browsing, and organization

28 Thank you! For more information
The full paper will be made available soon
The AdobeVFR dataset will be made available soon

