3. GM-CMU Collaborative Research
This paper is joint research between General Motors Company and CMU. We are working together to develop future autonomous vehicles, and the paper is part of that research project.
4. Sensor Setup on SRX Platform
Images from: Junqing Wei et al., “Towards a Viable Autonomous Driving Research Platform,” IEEE Intelligent Vehicles Symposium (IV), 2013
5. Sensors: Price vs Information
[Figure: camera, lidar, and radar compared by price and information.]
6. Computer Vision Applications
- Object detection (pedestrian, vehicle, bicycle…)
- Road parsing (lane/border detection, road segmentation, vanishing point estimation…)
- Localization and tracking
- Driver status monitoring
- Many other applications…
7. Motivation, Description and Goal
- Development for future driver-assistance and autonomous driving systems
- Robust detection within a 0.5 to 6 meter range
- Near 100% accuracy in daytime and over 90% at night on the rightmost lane
- Handle various scenarios, including highway entrances and exits
- Extend to a joint system with the front view

We have a monocular camera looking toward the side of the vehicle, and we use vision and learning algorithms to automatically detect the border and shoulder of a highway. By border we mean the physical end of the paved road; the red line in the image is the border returned by our algorithm. We also aim to detect the road shoulder, defined as the region between the rightmost solid lane marking and the border. In the image, the blue line is the lane marking and the green region is the shoulder.
8. High-Level Idea: Learning-Based Method
[Figure: example border types on the top row (concrete barrier, guard rail, soft shoulder) and lane marking; stages labeled densely fired scanning windows, returned voting points, structured Hough voting, and border / lane marking hypotheses.]

How does the algorithm work? We train both a border detector and a lane marking detector and run scanning-window detection. The trained detectors handle the various border types shown on the top row. Scanning-window detection returns densely triggered detections, and each triggered detection contributes a voting point indicating approximately where the border or lane marking is. Our proposed structured Hough voting model then outputs both the border and the shoulder.
10. Training Patch Alignment
- For each sample patch, fix the width-to-height ratio at 2.
- Center each patch on the y-coordinate of the ground truth.
- Take 1-3 positive samples from each image, spread along the x-axis to cover as much as possible.
- Pair each positive sample with 3 negative samples of the same size, randomly selected from the background.
[Figure: positive samples (concrete, natural, steel, lane marker) and negative samples.]
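The sampling rules on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the `(x0, y0, w, h)` box layout, and the rule that a negative patch must not overlap the positive one are all assumptions (the slide only says negatives are drawn randomly from the background).

```python
import random

def positive_patch(gt_x, gt_y, height, aspect=2.0):
    """Return (x0, y0, w, h) for a patch centered on a ground-truth point.

    The width-to-height ratio is fixed at `aspect` (2 on the slide).
    """
    width = aspect * height
    return (gt_x - width / 2.0, gt_y - height / 2.0, width, height)

def overlaps(a, b):
    """True if two (x0, y0, w, h) boxes intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def negative_patches(pos, img_w, img_h, n=3, rng=None, max_tries=1000):
    """Draw up to n same-size patches that do not overlap `pos`.

    Assumption: "background" is approximated as "anywhere that does not
    overlap the positive patch".
    """
    rng = rng or random.Random(0)
    _, _, w, h = pos
    out = []
    for _ in range(max_tries):
        if len(out) == n:
            break
        cand = (rng.uniform(0, img_w - w), rng.uniform(0, img_h - h), w, h)
        if not overlaps(cand, pos):
            out.append(cand)
    return out
```

For example, `positive_patch(320, 240, 40)` gives an 80 by 40 box centered on (320, 240), and `negative_patches` then returns three boxes of the same size elsewhere in the image.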
11. Feature Extraction
[Figure: each patch is passed through a filter bank and through HOG, giving a concatenated filter bank feature and a concatenated HOG feature; example patches that are discriminative to HOG and patches that are discriminative to filter banks.]
12. Classification & Detection
- Extract features from all training patches (as on the previous slide)
- Perform Fisher discriminant analysis
- Train an RBF kernel SVM
- Run scanning-window detection (deliberately allowing many positive firings)
[Figure: detections for guard rail, soft shoulder, concrete barrier, and lane marking.]
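The scanning-window step can be sketched as follows. This is only an illustration under stated assumptions: `score_fn` is a hypothetical stand-in for the trained classifier (features, Fisher discriminant analysis, RBF SVM), and placing the voting point at the window center is an assumption, since the slides do not say where inside a window the vote lands. A low threshold keeps many windows firing, in line with the "many positive firings" note.

```python
def scan_windows(img_w, img_h, win_w, win_h, stride):
    """Yield (x, y, w, h) for every window that fits inside the image."""
    for y in range(0, img_h - win_h + 1, stride):
        for x in range(0, img_w - win_w + 1, stride):
            yield (x, y, win_w, win_h)

def detect(img_w, img_h, score_fn, win_h=32, stride=8, threshold=0.0):
    """Return voting points (cx, cy, weight), one per window that fires.

    score_fn(x, y, w, h) is a placeholder for the trained classifier's
    decision value on that window (hypothetical interface).
    """
    win_w = 2 * win_h  # width-to-height ratio of 2, as in training
    votes = []
    for (x, y, w, h) in scan_windows(img_w, img_h, win_w, win_h, stride):
        score = score_fn(x, y, w, h)
        if score > threshold:
            # Assumption: the vote is cast at the window center.
            votes.append((x + w / 2.0, y + h / 2.0, score))
    return votes
```

The returned list of weighted points is the kind of input a Hough voting stage would consume.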
14. Structured Hough Voting: Intuitions
Basic philosophy: a model that assumes voting results are correlated rather than independent.
- Inter-frame structural information on hypotheses (temporal smoothness)
- Intra-frame structural information (geometric relationship)
- Multiple candidate hypothesis generation (proposals with diversity):
  1. Constrained Hough voting on detected voting points (detection + tracking)
  2. Arbitrary Hough voting on detected voting points (detection)
  3. Constrained Hough voting on image gradients (pure tracking)
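To make the difference between constrained and arbitrary voting concrete, here is a minimal weighted Hough voting sketch over lines x·cos θ + y·sin θ = r. It is an illustration, not the authors' implementation: the bin sizes, the `prev`, `max_dr`, and `max_dtheta` parameters, and the rectangular search window around the previous hypothesis are all assumptions, and angle wrap-around at θ = 0 and θ = π is ignored.

```python
import math
from collections import defaultdict

def hough_vote(points, r_step=2.0, n_theta=180, prev=None,
               max_dr=10.0, max_dtheta=math.radians(5)):
    """Weighted Hough voting over lines x*cos(theta) + y*sin(theta) = r.

    points: iterable of (x, y, weight) voting points.
    prev:   optional previous-frame hypothesis (r, theta). When given, only
            bins close to it may receive votes (constrained voting);
            when None, every bin is allowed (arbitrary voting).
    Returns ((r, theta), score) for the best bin, or None if no bin got votes.
    """
    acc = defaultdict(float)
    for x, y, w in points:
        for k in range(n_theta):
            theta = k * math.pi / n_theta
            r = x * math.cos(theta) + y * math.sin(theta)
            if prev is not None and (abs(r - prev[0]) > max_dr or
                                     abs(theta - prev[1]) > max_dtheta):
                continue  # outside the window around the previous hypothesis
            acc[(round(r / r_step), k)] += w
    if not acc:
        return None
    (r_bin, k), score = max(acc.items(), key=lambda kv: kv[1])
    return (r_bin * r_step, k * math.pi / n_theta), score
```

Calling `hough_vote(points)` plays the role of arbitrary voting, while `hough_vote(points, prev=last_hypothesis)` plays the role of constrained voting. When a stronger line appears far from the previous hypothesis, the two calls return different candidates.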
15. Purpose of Candidate 1
Handles most frames, where hypotheses from consecutive frames are strongly correlated.
16. Purpose of Candidate 2
Automatically corrects the result by searching for “much better” voting configurations. This is the power of detection: it avoids errors from tracking.
17. Purpose of Candidate 3
In the worst case, where Type 1 voters fail, performs tracking by image gradients, starting from the previous pose configuration.
18. Modeling under CRF: Background
- A Conditional Random Field (CRF) discriminatively defines the joint posterior probability as a product of potentials.
- The potentials are functions of the hypotheses Hi. They are modeled so that a larger potential value generally indicates a better hypothesis configuration.
- CRF inference seeks the joint hypothesis configuration H that maximizes this posterior.
[Figure: graphical model over hypotheses H1, H2, …, HN and observations X1, X2, …, XN, with unary and pairwise potentials labeled.]
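The slide's equations did not survive in this transcript. For orientation, a generic chain-structured CRF with unary and pairwise potentials is commonly written as below. The symbols ψ_u, ψ_p, and Z are notation assumed here, and the exact form used on the original slide may differ.

```latex
P(H \mid X) = \frac{1}{Z(X)} \prod_{i=1}^{N} \psi_u(H_i, X_i) \prod_{i=2}^{N} \psi_p(H_{i-1}, H_i),
\qquad
H^{*} = \arg\max_{H} P(H \mid X)
```

Here Z(X) is the normalizing constant, so maximizing the posterior is equivalent to maximizing the product of potentials.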
19. Modeling under CRF: Intuition
- What are the hypotheses Hi? In a segmentation problem, for example, they would be image pixel labels (FG/BG, object class, etc.).
- In our problem, Hi is the Hough voting hypothesis: Hi = (r, θ).
- X is the observation: the voting point coordinates and their weights.
- The unary potential is the exponential of the Hough voting weight: exp(v(Hi)).
- The pairwise potential is the inter-frame smoothness (tracking) constraint.
[Figure: graphical model over hypotheses H1, H2, …, HN and observations X1, X2, …, XN.]
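20. No Structural Information
Simplest case: frame-wise independent Hough voting.
[Figure: two separate graphical models, one over border hypotheses Hbd,1, Hbd,2, …, Hbd,N and one over lane marking hypotheses Hln,1, Hln,2, …, Hln,N, each with observations X1, X2, …, XN.]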
26. Mode Selection Potential
- A decision tree guides the mode selection.
- Mode selection forces the output to be one of the candidate hypotheses, but allows it to differ from the decision tree's prediction at the cost of a penalty.
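One way to read this potential is as a soft preference for the decision tree's predicted mode. The sketch below is an assumption-laden illustration: the mode names, the `score_fn` interface, and the additive `mismatch_penalty` (a penalty in the log domain) are hypothetical, since the slide does not give the functional form.

```python
def select_mode(candidates, predicted_mode, score_fn, mismatch_penalty=1.0):
    """Pick exactly one candidate hypothesis for the current frame.

    candidates:     dict mapping a mode name to a hypothesis (r, theta),
                    or to None if that candidate could not be generated.
    predicted_mode: the mode suggested by the decision tree.
    score_fn:       maps a hypothesis to a score (higher is better).
    Candidates that disagree with the prediction stay eligible but pay
    `mismatch_penalty` (assumed additive).
    """
    best_mode, best_val = None, float("-inf")
    for mode, hyp in candidates.items():
        if hyp is None:
            continue
        val = score_fn(hyp)
        if mode != predicted_mode:
            val -= mismatch_penalty
        if val > best_val:
            best_mode, best_val = mode, val
    if best_mode is None:
        return None, None
    return best_mode, candidates[best_mode]
```

With a small score gap the predicted mode wins. A candidate from another mode is chosen only when its score exceeds the predicted one by more than the penalty.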
27. Coupled Structure Potential
The coupled structure potential captures the two most important relations between a border hypothesis and a lane hypothesis:
- Parallelism
- Distance
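A potential of this kind could be sketched as a product of a parallelism term and a distance term. Everything specific below is an assumption for illustration: the Gaussian form, the `expected_dist`, `sigma_theta`, and `sigma_dist` parameters, and the use of |r_border − r_lane| as the distance between two near-parallel lines. The slide names only the two relations, not their formulas.

```python
import math

def coupled_potential(border, lane, expected_dist,
                      sigma_theta=math.radians(3), sigma_dist=15.0):
    """Score in (0, 1] for how well a border and a lane hypothesis agree.

    border, lane: (r, theta) for the line x*cos(theta) + y*sin(theta) = r.
    Assumes both angles lie on the same branch, away from wrap-around.
    """
    r_b, t_b = border
    r_l, t_l = lane
    # Parallelism: penalize any difference in line orientation.
    d_theta = t_b - t_l
    parallelism = math.exp(-0.5 * (d_theta / sigma_theta) ** 2)
    # Distance: for near-parallel lines, the offset difference approximates
    # the gap between them; penalize deviation from the expected gap.
    gap = abs(r_b - r_l)
    distance = math.exp(-0.5 * ((gap - expected_dist) / sigma_dist) ** 2)
    return parallelism * distance
```

A pair of parallel lines at exactly the expected gap scores 1.0, and the score falls as the lines tilt relative to each other or the gap drifts.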
28. Inference
- Running full inference each time a new frame arrives is computationally infeasible.
- Relaxation: initialize with the state variables inferred for the previous t-1 frames and infer only the current state variables, updating incrementally.

Inference procedure at t = 1:
1. Perform Hough voting for both border and lane marking.
2. Perturb the hypotheses if the geometric relationship is violated (optional).

Inference procedure at t > 1:
1. Generate the 3 candidate hypotheses for both border and lane marking.
2. Use the decision tree to help select the best candidate.
3. Perturb the candidate hypotheses if the geometric relationship is violated (optional).
4. Re-select the best candidate.
30. Experiments: Qualitative Results
Ground truth and baseline methods:
- Ground truth
- Independent Hough voting in each frame using the fired detector voting points
- Hough voting using the triggered detector voting points, constrained by the previous frame
- Adding gradient tracking to Baseline 2
- Kalman filter
- Proposed method
32. Highway Entrance Detection and Lane State Tracking
33. Summary
- Proposed the Structured Hough Voting model
- The proposed model can be formulated theoretically under a CRF
- Fast, real-time feature extraction and online inference
- Robust performance in challenging scenarios and with low-quality input from a production camera