Table Extraction Using MaxEnt, by Zonghui Lian

1 Table Extraction Using MaxEnt Zonghui Lian

2 Introduction: table extraction; table format

3 Problem: HTML tables have tags that help us understand them. What about plain-text tables?

4 An Example: title, separator, header, datarow

5 MaxEnt: how to define features; how to learn the model weights
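
The two questions on this slide can be made concrete with a small sketch. A MaxEnt classifier scores each label as a weighted sum of features and normalizes the scores with a softmax. The weights can be learned by gradient ascent on the conditional log-likelihood, where the gradient is the empirical feature count minus the expected feature count. The slides do not say which trainer was used, so everything below is an assumption for illustration only: the names `predict_proba` and `train`, the toy data, the learning rate, and the absence of regularization.

```python
import math
from collections import defaultdict


def predict_proba(weights, features, labels):
    """Return P(label | features) under a MaxEnt (softmax) model."""
    scores = {
        y: sum(weights[(y, f)] * v for f, v in features.items())
        for y in labels
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}


def train(data, labels, epochs=200, lr=0.5):
    """Fit weights by batch gradient ascent on the conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for features, gold in data:
            probs = predict_proba(weights, features, labels)
            for f, v in features.items():
                grad[(gold, f)] += v              # empirical feature count
                for y in labels:
                    grad[(y, f)] -= probs[y] * v  # expected feature count
        for key, g in grad.items():
            weights[key] += lr * g / len(data)
    return weights


# Hypothetical toy data: one real-valued feature plus a bias term.
LABELS = ["DATAROW", "NONTABLE"]
toy = [
    ({"bias": 1.0, "digit_pct": 0.60}, "DATAROW"),
    ({"bias": 1.0, "digit_pct": 0.50}, "DATAROW"),
    ({"bias": 1.0, "digit_pct": 0.00}, "NONTABLE"),
    ({"bias": 1.0, "digit_pct": 0.05}, "NONTABLE"),
]
w = train(toy, LABELS)
probs = predict_proba(w, {"bias": 1.0, "digit_pct": 0.55}, LABELS)
```

In practice an off-the-shelf optimizer with a regularization term would likely replace the plain gradient loop. The loop is kept here only because it shows the "empirical minus expected" update directly.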

6 Data Set: CS Dept., University of Massachusetts Amherst (FedStats.gov). Training data: 9321. Test data: 1200. Format.

7 Features. White space: large gaps / small gaps, four-space indents, space percentage. Text features: digit percentage, month and year.
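
As a rough illustration of the features listed on this slide, the sketch below computes them for one line of text. The slide gives no definitions, so the thresholds and patterns are assumptions: a "gap" is taken to be a run of two or more spaces between text, a "large gap" is four or more spaces, and the month and year patterns are simple guesses. The function name `line_features` is hypothetical.

```python
import re

# Assumed patterns; the slides do not define these features precisely.
MONTH_RE = re.compile(
    r"\b(jan(uary)?|feb(ruary)?|mar(ch)?|apr(il)?|may|june?|july?|"
    r"aug(ust)?|sep(t(ember)?)?|oct(ober)?|nov(ember)?|dec(ember)?)\b",
    re.IGNORECASE,
)  # note: also matches the verb "may"
YEAR_RE = re.compile(r"\b(19|20)\d{2}\b")
GAP_RE = re.compile(r"(?<=\S) {2,}(?=\S)")  # runs of 2+ spaces between text


def line_features(line, large_gap=4):
    """Compute white-space and text features for a single line."""
    n = len(line)
    gaps = [len(m.group()) for m in GAP_RE.finditer(line)]
    return {
        "large_gaps": sum(1 for g in gaps if g >= large_gap),
        "small_gaps": sum(1 for g in gaps if g < large_gap),
        "four_space_indent": line.startswith("    "),
        "space_pct": line.count(" ") / n if n else 0.0,
        "digit_pct": sum(c.isdigit() for c in line) / n if n else 0.0,
        "has_month": bool(MONTH_RE.search(line)),
        "has_year": bool(YEAR_RE.search(line)),
    }


row = line_features("    Wheat      1998   1,200    3,400")
note = line_features("Revised March 1999")
```

Here `row` has two large gaps and one small gap, starts with a four-space indent, and contains a year but no month, while `note` contains both a month and a year.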

8 Features. Special characters: -, +, =, :, |, .
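
A possible way to turn the special characters above into features is to count each one per line and flag lines made up only of such characters, which often act as separators. The slide does not say whether counts, percentages, or binary indicators were used, so the choices below and the name `special_char_features` are assumptions.

```python
SPECIAL_CHARS = "-+=:|."  # the six characters listed on the slide


def special_char_features(line):
    """Count each special character and flag separator-like lines."""
    feats = {f"count[{ch}]": line.count(ch) for ch in SPECIAL_CHARS}
    stripped = line.strip()
    # True when the line is non-empty and holds only special chars and spaces.
    feats["only_special"] = bool(stripped) and all(
        c in SPECIAL_CHARS or c == " " for c in stripped
    )
    return feats


sep = special_char_features("-----+-----")
total = special_char_features("Total: 3.5")
```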

9 Result

10 Error Analysis. TABLEFOOTNOTE -> NONTABLE DATAROW; DATAROW -> SECTIONDATAROW; TABLEHEADER -> SUPERHEADER. Most errors happened when recognizing … [TABLEFOOTNOTE : 0.2719665271966527 DATAROW : 0.12552301255230125 TABLEHEADER : 0.11715481171548117. TABLEFOOTNOTE 1 Includes Hawaii. TABLEFOOTNOTE 2 Includes processing total for dual usage crops.
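
The slide does not say how the per-label fractions were computed. One plausible reading is that each is the share of all misclassified lines that carry a given true label. The sketch below computes that share, together with the confusion pairs, from gold and predicted label sequences. The function name `error_breakdown` and the toy labels are hypothetical and are not the experiment's data.

```python
from collections import Counter


def error_breakdown(gold, predicted):
    """Return (confusion counts, share of all errors per gold label)."""
    confusions = Counter(
        (g, p) for g, p in zip(gold, predicted) if g != p
    )
    total = sum(confusions.values())
    by_gold = Counter()
    for (g, _), n in confusions.items():
        by_gold[g] += n
    share = {g: n / total for g, n in by_gold.items()} if total else {}
    return confusions, share


# Hypothetical toy labels, for illustration only.
gold = ["TABLEFOOTNOTE", "TABLEFOOTNOTE", "DATAROW", "TABLEHEADER", "DATAROW"]
pred = ["NONTABLE", "DATAROW", "SECTIONDATAROW", "SUPERHEADER", "DATAROW"]
confusions, share = error_breakdown(gold, pred)
```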

11 Future Work. Improve the performance. Features, for example: alphabet characters, previous label, next label. Data set size.
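
The previous-label and next-label ideas could be sketched as extra indicator features added to each line's feature dictionary. This is only one possible design and is not described in the slides. The names `add_context_features`, `greedy_decode`, and `toy_classify` are hypothetical. Note that in a left-to-right pass only the previous label is known at prediction time, so using the next label would need gold labels during training or some form of joint decoding.

```python
def add_context_features(line_feats, prev_label=None, next_label=None):
    """Return a copy of line_feats with neighbor-label indicator features."""
    feats = dict(line_feats)
    feats[f"prev_label={prev_label or 'START'}"] = 1.0
    if next_label is not None:
        feats[f"next_label={next_label}"] = 1.0
    return feats


def greedy_decode(lines_feats, classify):
    """Label lines left to right, feeding each prediction to the next line."""
    labels = []
    prev = None
    for feats in lines_feats:
        label = classify(add_context_features(feats, prev_label=prev))
        labels.append(label)
        prev = label
    return labels


def toy_classify(feats):
    """Stand-in for a trained model: a header first, data rows afterwards."""
    if "prev_label=TABLEHEADER" in feats or "prev_label=DATAROW" in feats:
        return "DATAROW"
    return "TABLEHEADER"


decoded = greedy_decode([{}, {}, {}], toy_classify)
```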

12 Future Work. Identify columns. Add tags. Use a table understanding algorithm.

