@aloknayak29

Added support for extracting the boolean font attributes bold and italic (from the TextFontInfo class). Note that experiments showed these attributes are reliable when True, but they can be false negatives: a False value does not guarantee that the text is not bold or italic.

… italic (surely True positive but can be false negative)
unicode name
double size
Color color
PyBool isbold
Owner

I think this should rather be the bool type, to delay type coercion to a Python boolean until it's really required.

def __cinit__(self, unicode name, double size, Color color, PyBool isbold, PyBool isitalic):
#nparts=name.split('+',1)
#self.name=nparts[-1]
self.name=name
Owner

Changing the content of name could break existing implementations. If the full name is needed, I guess it should be a new property, full_name.
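
A minimal Python sketch of the backward-compatible approach suggested here: `name` keeps returning the part after the first '+', as the existing `name.split('+',1)` code does, while a separate `full_name` property exposes the unmodified string. The class `FontInfoSketch` and the sample font name are illustrative assumptions, not the library's actual Cython class.

```python
class FontInfoSketch:
    """Plain-Python stand-in for the Cython font info class (hypothetical)."""

    def __init__(self, full_name, size):
        self._full_name = full_name
        self.size = size

    @property
    def name(self):
        # Existing behaviour: drop everything up to the first '+'.
        return self._full_name.split('+', 1)[-1]

    @property
    def full_name(self):
        # New, additive property: the name exactly as it was passed in.
        return self._full_name


info = FontInfoSketch("ABCDEF+Helvetica-Bold", 12.0)
print(info.name)       # Helvetica-Bold
print(info.full_name)  # ABCDEF+Helvetica-Bold
```

Callers that already read `name` see no change, and only code that opts in to `full_name` gets the prefixed value.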

def __cinit__(self, unicode name, double size, Color color):
nparts=name.split('+',1)
self.name=nparts[-1]
def __cinit__(self, unicode name, double size, Color color, PyBool isbold, PyBool isitalic):
Owner

Use the Python naming convention (snake case), e.g. is_bold, is_italic.

self.color=color
self.isbold=isbold
self.isitalic=isitalic

Owner

snake case

return self.isitalic
def __set__(self, PyBool val):
self.isitalic=val

Owner

snake_case

self._bboxes.append(last_bbox)
w.getColor(&r, &g, &b)
font_name=w.getFontName(i)
textfontinfo = w.getFontInfo(i)
Owner

Definitely need to handle the case when w.getFontInfo returns null.
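
A rough Python sketch of the guard being asked for. In the actual Cython code the check would be against a null pointer; here `None` plays that role. The helper `font_flags` and the test double `FakeFontInfo` are hypothetical names, and falling back to `False` for both flags is an assumed policy, not something decided in this thread.

```python
def font_flags(font_info):
    """Return (is_bold, is_italic), falling back to False when font info is missing.

    `font_info` stands in for the value returned by w.getFontInfo(i);
    None plays the role of a null pointer in this sketch.
    """
    if font_info is None:
        # No font info available: report "not known to be bold/italic".
        return (False, False)
    return (bool(font_info.isBold()), bool(font_info.isItalic()))


class FakeFontInfo:
    """Test double for illustration only; not part of poppler or the bindings."""

    def __init__(self, bold, italic):
        self._bold = bold
        self._italic = italic

    def isBold(self):
        return self._bold

    def isItalic(self):
        return self._italic


print(font_flags(None))                       # (False, False)
print(font_flags(FakeFontInfo(True, False)))  # (True, False)
```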

@izderadicka
Owner

Thanks for the PR. See the detailed comments in the code for particular issues.
If we are to add more font info from TextFontInfo, why not add the remaining ones:

GBool isFixedWidth() 
GBool isSerif() 
GBool isSymbolic() 

@izderadicka
Owner

Could you also elaborate a bit on the false negatives? When do they happen? I actually use the font name to check for bold (in Python).
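
One way such a name-based check could be combined with the new flag, sketched in Python. Since a True flag is described as reliable and a False flag as possibly wrong, the sketch trusts True and falls back to the font name otherwise. The function names and the keyword lists are assumptions chosen for illustration, not the check the owner actually uses.

```python
import re

# Heuristic keyword lists (assumed, not exhaustive).
_BOLD_HINT = re.compile(r"bold|black|heavy", re.IGNORECASE)
_ITALIC_HINT = re.compile(r"italic|oblique", re.IGNORECASE)


def probably_bold(is_bold_flag, font_name):
    """Trust a True flag; otherwise fall back to a font-name heuristic."""
    return bool(is_bold_flag) or bool(_BOLD_HINT.search(font_name or ""))


def probably_italic(is_italic_flag, font_name):
    """Same idea for italic: flag first, then the font name."""
    return bool(is_italic_flag) or bool(_ITALIC_HINT.search(font_name or ""))


print(probably_bold(False, "Helvetica-Bold"))  # True
print(probably_bold(False, "helvetica"))       # False
```

Neither signal helps when the flag is unset and the font is reported with a plain name such as 'helvetica'; that word is still classified as not bold.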

@aloknayak29
Author

Thanks for the code reviews. I will update my repo soon.
When a false negative happens, checking the font name was not helpful either. E.g. in one case I was getting 'helvetica' as the output for all words, even though some of them were visually bold. False negatives happened for is_italic as well. I know little about the significance of isFixedWidth(), isSerif() and isSymbolic(). If possible, can you give me some references to usage examples?
